[Hive] Querying and downloading table data with beeline

This post explains how to query the data in a Hive table and download that data to a client.

1. Querying the data

In beeline, query the table that was created in the Hive table creation step.

SELECT
  *
FROM speech_db.speech_internal_db
WHERE ymd between '2021-06-11' and '2021-06-11';

+-------------------------+-----------------------------+-----------------------------------+------------------------+
| speech_internal_db.indx | speech_internal_db.path_wav | speech_internal_db.utterance      | speech_internal_db.ymd |
+-------------------------+-----------------------------+-----------------------------------+------------------------+
| 1                       | /root/1.wav                 | This is an example                | 2021-06-11             |
| 2                       | /root/2.wav                 | Let us learn apache hive together | 2021-06-11             |
+-------------------------+-----------------------------+-----------------------------------+------------------------+

2. Saving the data

Save the query result to a file. The result is written to the /user/new/download directory on HDFS.

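Two settings control whether Hive gzip-compresses its output:
- hive.exec.compress.output: whether the output is compressed
- mapred.output.compression.codec: the compression codec to use, chosen from the codecs configured in io.compression.codecs in core-site.xml

-- Compression is turned off here, so the result is written as plain text.
set hive.exec.compress.output=false;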
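For comparison, a minimal sketch of what the gzip-compressed variant described above might look like. The codec class is Hadoop's standard GzipCodec. That it is registered in this cluster's io.compression.codecs is an assumption, not something the original post states.

```sql
-- Turn on compression for the query output.
set hive.exec.compress.output=true;
-- Assumption: GzipCodec is listed in io.compression.codecs on this cluster.
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
```

With these settings the files written to the output directory would be gzip-compressed, so they would need to be decompressed (for example with gunzip) after download.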


-- Save the result to /user/new/download.
INSERT OVERWRITE DIRECTORY '/user/new/download'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ESCAPED BY '\\'
STORED AS TEXTFILE
SELECT
  *
FROM speech_db.speech_internal_db
WHERE ymd between '2021-06-11' and '2021-06-11';

3. Check that the file was saved under /user/new/download and download it. Delete the file after downloading.

[hadoop] [user@user-MacBookPro-5 ~/Downloads 15:12:34] hadoop fs -get /user/new/download

[hadoop] [user@user-MacBookPro-5 ~/Downloads 15:13:59] ls download/000000_0
download/000000_0
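A larger query can produce several part files (000000_0, 000001_0, ...) rather than one. The sketch below shows one way to combine them locally into a single tab-separated file with a header row. The helper name merge_parts and the output name result.tsv are hypothetical. The header columns come from the query output in step 1, and the files are assumed to be uncompressed.

```shell
#!/bin/sh
# Hypothetical helper: merge downloaded Hive part files into one TSV with a header.
# Usage: merge_parts <download_dir> <output_file>
merge_parts() {
    dir=$1
    out=$2
    # Column names taken from the query output in step 1.
    printf 'indx\tpath_wav\tutterance\tymd\n' > "$out"
    # Part files are named 000000_0, 000001_0, ...; the glob expands in sorted order.
    for part in "$dir"/0*; do
        if [ -f "$part" ]; then
            cat "$part" >> "$out"
        fi
    done
}

# Typical sequence around the helper (commented out: requires a Hadoop client):
#   hadoop fs -ls /user/new/download      # confirm the part files exist
#   hadoop fs -get /user/new/download     # copy the directory to the local machine
#   merge_parts download result.tsv       # combine the parts locally
#   hadoop fs -rm -r /user/new/download   # delete the HDFS copy afterwards
```

The final hadoop fs -rm -r line corresponds to the "delete the file after downloading" step, which the original session does not show.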
