美文网首页大数据BigData
Hive建表并加载数据

Hive建表并加载数据

作者: geekAppke | 来源:发表于2018-11-26 20:47 被阅读16次

    hive的元数据要保存到mysql里,hdfs上就是文件,没有定义各种数据格式
    将文件格式化读取,必须要有区分字符
    hive依赖的关系数据库mysql,不用维护!

    建库

    hive> create database test;
    hive> drop database test;
    
    默认在default数据库
    

    建表

    LanguageManual DDL - Create/Drop/Truncate Table

    hive> drop table psn;
    hive> create table psn(
    id int,
    name string,
    likes array<string>,
    address map<string, string>
    )
    ROW FORMAT DELIMITED 
    FIELDS TERMINATED BY ','
    COLLECTION ITEMS TERMINATED BY '-'
    MAP KEYS TERMINATED BY ':'
    LINES TERMINATED BY '\n'; 
    
    属性通过用什么隔开?(集合、map……)
    行通过什么隔开?默认是'\n',可不写
    

    外部表和内部表的区别

    EXTERNAL_TABLE | MANAGED_TABLE
    创建时需要指定目录
        location '/usr/';
    删除时内部表将`表结构`和`数据`全部删除;外部表只删除`表结构`,不删除`数据`
        表结构:hive上的表
        数据:HDFS上数据文件
    外部表、外部表元数据都保存在mysql中
    

    查看表结构

    hive> desc formatted psn;
    OK
    # col_name              data_type               comment             
             
    id                      int                                         
    name                    string                                      
    likes                   array<string>                               
    address                 map<string,string>                          
             
    # Detailed Table Information         
    Database:               default                  
    Owner:                  root                     
    CreateTime:             Sat Nov 24 08:43:34 CST 2018     
    LastAccessTime:         UNKNOWN                  
    Protect Mode:           None                     
    Retention:              0                        
    Location:               hdfs://mycluster/user/hive/warehouse/psn     
    Table Type:             MANAGED_TABLE            
    Table Parameters:        
        COLUMN_STATS_ACCURATE   true                
        numFiles                1                   
        totalSize               498                 
        transient_lastDdlTime   1543020308          
             
    # Storage Information        
    SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe   
    InputFormat:            org.apache.hadoop.mapred.TextInputFormat     
    OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
    Compressed:             No                       
    Num Buckets:            -1                       
    Bucket Columns:         []                       
    Sort Columns:           []                       
    Storage Desc Params:         
        colelction.delim        -                   
        field.delim             ,                   
        line.delim              \n                  
        mapkey.delim            :                   
        serialization.format    ,                   
    Time taken: 0.251 seconds, Fetched: 36 row(s)
    

    上传并加载文件

    LanguageManual DML - Loading files into tables
    数据要对应,否则不能成功!

    hive> LOAD DATA LOCAL INPATH '/root/psn.data' INTO TABLE psn;
    

    Hive上查询

    hive> select * from psn;
    OK
    1 小明1 ["lol","book","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    2 小明2 ["lol","book","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    3 小明3 ["lol","book","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    4 小明4 ["lol","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    5 小明5 ["lol","book","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    6 小明6 ["lol","book","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    7 小明7 ["lol","book","game"] {"shenzhen":"luohu","shanghai":"pudong"}
    8 小明8 ["lol","book","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    9 小明9 ["lol","book","movie"] {"shenzhen":"luohu","shanghai":"pudong"}
    

    HDFS上查询

    [root@node004 ~]# hdfs dfs -cat /user/hive/warehouse/psn/*
    1,小明1,lol-book-movie,shenzhen:luohu-shanghai:pudong
    2,小明2,lol-book-movie,shenzhen:luohu-shanghai:pudong
    3,小明3,lol-book-movie,shenzhen:luohu-shanghai:pudong
    4,小明4,lol-movie,shenzhen:luohu-shanghai:pudong
    5,小明5,lol-book-movie,shenzhen:luohu-shanghai:pudong
    6,小明6,lol-book-movie,shenzhen:luohu-shanghai:pudong
    7,小明7,lol-book-game,shenzhen:luohu-shanghai:pudong
    8,小明8,lol-book-movie,shenzhen:luohu-shanghai:pudong
    9,小明9,lol-book-movie,shenzhen:luohu-shanghai:pudong
    

    参考资料

    Hadoop集群上搭建Hive - 简书

    相关文章

      网友评论

        本文标题:Hive建表并加载数据

        本文链接:https://www.haomeiwen.com/subject/hfodqqtx.html