DDL
• 建表
• 建立外部表
• 建立分区表
• 建立Bucket表
• 复制一个空表
• 实例
• JDBC程序
建表
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...)
[SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]
-
CREATE TABLE
创建一个指定名字的表。如果相同名字的表已经存在,则抛出异常;用户可以用IF NOT EXIST
选项来忽略这个异常. -
EXTERNAL
关键字可以让用户创建一个外部表,在建表的同时指定一个指向实际数据的路径(LOCATION) -
LIKE
允许用户复制现有的表结构,但是不复制数据 -
COMMENT
可以为表与字段增加描述
ROW FORMAT DELIMITED
[FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char][MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
| SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]
- 用户在建表的时候可以自定义
SerDe
或者使用自带的SerDe
。如果没有指定ROW FORMAT
或者ROW FORMAT DELIMITED
,将会使用自带的SerDe
。在建表的时候,用户还需要为表指定列,用户在指定表的列的同时也会指定自定义的SerDe
,Hive 通过SerDe
确定表的具体的列的数据。
STORED AS
SEQUENCEFILE
| TEXTFILE
| RCFILE
| INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
- 如果文件数据是纯文本,可以使用
STORED AS TEXTFILE
。如果数据需要压缩,使用STORED AS SEQUENCE
。
建立外部表
CREATE EXTERNAL TABLE page_view
(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User',
country STRING COMMENT 'country of origination')
COMMENT 'This is the staging page view table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'
STORED AS TEXTFILE
LOCATION '<hdfs_location>';
建立分区表
CREATE TABLE par_table
(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(date STRING, pos STRING)
ROW FORMAT DELIMITED ‘\t’
FIELDS TERMINATED BY '\n'
STORED AS SEQUENCEFILE;
建立Bucket表
CREATE TABLE par_table
(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(date STRING, pos STRING)
CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
ROW FORMAT DELIMITED ‘\t’
FIELDS TERMINATED BY '\n'
STORED AS SEQUENCEFILE;
复制一个空表
CREATE TABLE empty_key_value_store
LIKE key_value_store;
实例
假设需要使用CREATE TABLE
语句创建一个名为employee
表。下表列出了employee
表中的字段和数据类型:
Sr.No | 字段名称 | 数据类型 |
---|---|---|
1 | Eid | int |
2 | Name | String |
3 | Salary | Float |
4 | Designation | string |
下面的数据是一个注释,行格式字段,如字段终止符,行终止符,并保存的文件类型。
COMMENT ‘Employee details’
FIELDS TERMINATED BY ‘\t’
LINES TERMINATED BY ‘\n’
STORED IN TEXT FILE
下面的查询创建使用上述数据的表名为 employee
。
hive> CREATE TABLE IF NOT EXISTS employee
>( eid int, name String, salary String, destination String)
> COMMENT ‘Employee details’
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ‘\t’
> LINES TERMINATED BY ‘\n’
> STORED AS TEXTFILE;
成功创建表后,能看到以下回应:
OK
Time taken: 5.905 seconds
hive>
JDBC 程序
以下是使用JDBC程序来创建表给出的一个例子。
import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;
public class HiveCreateTable {
private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
public static void main(String[] args) throws SQLException {
// Register driver and create driver instance
Class.forName(driverName);
// get connection
Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/userdb", "", "");
// create statement
Statement stmt = con.createStatement();
// execute statement
stmt.executeQuery("CREATE TABLE IF NOT EXISTS employee "
+" ( eid int, name String, salary String, destignation String)"
+" COMMENT ‘Employee details’"
+" ROW FORMAT DELIMITED"
+" FIELDS TERMINATED BY ‘\t’"
+" LINES TERMINATED BY ‘\n’"
+" STORED AS TEXTFILE;");
System.out.println(“ Table employee created.”);
con.close();
}
}
将该程序保存在一个名为HiveCreateDb.java文件。下面的命令用于编译和执行这个程序。
$ javac HiveCreateDb.java
$ java HiveCreateDb
输出
Table employee created.
网友评论