美文网首页
hive dml sql

hive dml sql

作者: siyongshuai | 来源:发表于2018-09-27 18:54 被阅读0次

    INSERT

    标准语法

    Standard Syntax:INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] 

    VALUES values_row [, values_row ...]  Where values_row is:( value [, value ...] )where a value is either null or any valid SQL literal

    无分区

    CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2))  CLUSTERED BY (age) INTO 2 BUCKETS STORED AS ORC;

     INSERT INTO TABLE students  VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32); 

    有分区

     CREATE TABLE pageviews (userid VARCHAR(64), link STRING, came_from STRING)  PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC; 

    INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')  VALUES ('jsmith', 'mail.com', 'sports.com'), ('jdoe', 'mail.com', null); 

    INSERT INTO TABLE pageviews PARTITION (datestamp)  VALUES ('tjohnson', 'sports.com', 'finance.com', '2014-09-23'), ('tlee', 'finance.com', null, '2014-09-21'); 

     INSERT INTO TABLE pageviews  VALUES ('tjohnson', 'sports.com', 'finance.com', '2014-09-23'), ('tlee', 'finance.com', null, '2014-09-21');

    全列插入

    CREATE TABLE test(id STRING,name STRING)PARTITIONED BY (dt STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

    CREATE TABLE tmp_test(id STRING, name STRING) PARTITIONED BY (dt STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

    CREATE TABLE  tmp_test_2( id STRING) PARTITIONED BY (dt STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;

    列数不一致会报错

    insert into test partition(dt='20180101') select * from tmp_test

    在hive里面执行没问题,但是在spark-sql执行的时候会报错,spark-sql 会先对列数进行校验

    insert into test partition(dt='20180101') select * from tmp_test2

    部分列插入

    insert into test partition(dt='20180101') (id) select (id) from tmp_test

    insert into test partition(dt='20180101') (id,name) select (id,name) from tmp_test

    问题

    1.列数不匹配(查询列多于插入列)

    hive> insert into test partition(dt='20180101') select * from tmp_test;

    FAILED: SemanticException [Error 10044]: Line 1:12 Cannot insert into target table because column number/types are different ''20180101'': Table insclause-0 has 2 columns, but query has 3 columns. 

    解决方案

    指定列插入

    相关文章

      网友评论

          本文标题:hive dml sql

          本文链接:https://www.haomeiwen.com/subject/qkkyoftx.html