美文网首页
16.Hadoop:impala使用

16.Hadoop:impala使用

作者: 負笈在线 | 来源:发表于2020-07-09 09:04 被阅读0次

本节主要内容:

impala使用

impala建表

一.直接建表

        # sudu -u hdfs impala-shell

[not connected] > connect node1.hadoop.com:21000

[node1.hadoop.com:21000] >

[node1.hadoop.com:21000] >

[node1.hadoop.com:21000] > DROP TABLE IF EXISTS student;

[node1.hadoop.com:21000] > DROP TABLE IF EXISTS student;

[node1.hadoop.com:21000] > CREATE TABLE student

                        > (

                        > id INT,

                        > name STRING

                        > )

                        > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

[node1.hadoop.com:21000] > SHOW TABLES;

单条插入(hive不支持单条插入)

[node1.hadoop.com:21000] > insert into student values(1,'zhangsan');

Query: insert into student values(1,'zhangsan')

Query submitted at: 2020-07-05 22:12:20 (Coordinator: http://node1.hadoop.com:25000)

Query progress can be monitored at: http://node1.hadoop.com:25000/query_plan?query_id=b34e63473dd63cfe:7368e1bc00000000

Modified 1 row(s) in 0.67s

[node1.hadoop.com:21000] > select * from student;

Query: select * from student

Query submitted at: 2020-07-05 22:12:29 (Coordinator: http://node1.hadoop.com:25000)

Query progress can be monitored at: http://node1.hadoop.com:25000/query_plan?query_id=b54fb8a8284e60fe:cf6aab0300000000

+----+----------+

| id | name    |

+----+----------+

| 1  | zhangsan |

+----+----------+

Fetched 1 row(s) in 0.45s

[node1.hadoop.com:21000] >

查看存储位置及创建者(Node1节点)

          # sudo -u hdfs hadoop fs -ls  /user/hive/warehouse

Found 9 items

...

drwxrwxrwt  - impala supergroup          0 2020-07-05 22:12 /user/hive/warehouse/student

...

        # sudo -u hdfs hadoop fs -ls  /user/hive/warehouse/student

Found 2 items

drwxrwxrwt  - impala supergroup          0 2020-07-05 22:12 /user/hive/warehouse/student/_impala_insert_staging

-rw-r--r--  3 impala supergroup        11 2020-07-05 22:12 /user/hive/warehouse/student/b34e63473dd63cfe-7368e1bc00000000_1806132997_data.0.

以上可见impala是基于hive的。

二.通过文件直接创建表

创建文件夹

       # sudo -u hdfs hdfs dfs -mkdir -p /user/cloudera/sample_data/tab1 /user/cloudera/sample_data/tab2

配置权限

       # sudo -u hdfs hadoop fs -chmod -R 777  /user/cloudera/sample_data

创建测试文件

       # vi /tab1.csv

插入以下内容

1,true,123.123,2012-10-24 08:55:00

2,false,1243.5,2012-10-25 13:40:00

3,false,24453.325,2008-08-22 09:33:21.123

4,false,243423.325,2007-05-12 22:32:21.33454

5,true,243.325,1953-04-22 09:11:33

       # vi /tab2.csv

1,true,12789.123

2,false,1243.5

3,false,24453.325

4,false,2423.3254

5,true,243.325

60,false,243565423.325

70,true,243.325

80,false,243423.325

90,true,243.325

上传测试文件

       # cd /

       # sudo -u hdfs hdfs dfs -put tab1.csv /user/cloudera/sample_data/tab1

       # sudo -u hdfs hdfs dfs -put tab2.csv /user/cloudera/sample_data/tab2

进入impala-shell,创建表1和表2

impala使用

       # sudu -u hdfs impala-shell

[not connected] > connect node1.hadoop.com:21000

[node1.hadoop.com:21000] >

[node1.hadoop.com:21000] >

[node1.hadoop.com:21000] > DROP TABLE IF EXISTS tab1;

[node1.hadoop.com:21000] > CREATE EXTERNAL TABLE tab1

                        > (

                        >    id INT,

                        >    col_1 BOOLEAN,

                        >    col_2 DOUBLE,

                        >    col_3 TIMESTAMP

                        > )

                        > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','

                        > LOCATION '/user/cloudera/sample_data/tab1'; 数据文件的读取位置

[node1.hadoop.com:21000] > DROP TABLE IF EXISTS tab2;

[node1.hadoop.com:21000] > CREATE EXTERNAL TABLE tab2

                        > (

                        >    id INT,

                        >    col_1 BOOLEAN,

                        >    col_2 DOUBLE

                        > )

                        > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','

                        > LOCATION '/user/cloudera/sample_data/tab2';

查看:

[node1.hadoop.com:21000] > select * from tab1;

Query: select * from tab1

Query submitted at: 2020-07-05 22:19:16 (Coordinator: http://node1.hadoop.com:25000)

Query progress can be monitored at: http://node1.hadoop.com:25000/query_plan?query_id=5a4837a6f2f94e84:e372078c00000000

+----+-------+------------+-------------------------------+

| id | col_1 | col_2      | col_3                        |

+----+-------+------------+-------------------------------+

| 1  | true  | 123.123    | 2012-10-24 08:55:00          |

| 2  | false | 1243.5    | 2012-10-25 13:40:00          |

| 3  | false | 24453.325  | 2008-08-22 09:33:21.123000000 |

| 4  | false | 243423.325 | 2007-05-12 22:32:21.334540000 |

| 5  | true  | 243.325    | 1953-04-22 09:11:33          |

+----+-------+------------+-------------------------------+

Fetched 5 row(s) in 0.45s

[node1.hadoop.com:21000] > select * from tab2;

Query: select * from tab2

Query submitted at: 2020-07-05 22:19:25 (Coordinator: http://node1.hadoop.com:25000)

Query progress can be monitored at: http://node1.hadoop.com:25000/query_plan?query_id=9e49976674c988c3:b63f1a2100000000

+----+-------+---------------+

| id | col_1 | col_2        |

+----+-------+---------------+

| 1  | true  | 12789.123    |

| 2  | false | 1243.5        |

| 3  | false | 24453.325    |

| 4  | false | 2423.3254    |

| 5  | true  | 243.325      |

| 60 | false | 243565423.325 |

| 70 | true  | 243.325      |

| 80 | false | 243423.325    |

| 90 | true  | 243.325      |

+----+-------+---------------+

Fetched 9 row(s) in 6.26s

[node1.hadoop.com:21000] >

相关文章

网友评论

      本文标题:16.Hadoop:impala使用

      本文链接:https://www.haomeiwen.com/subject/endvqktx.html