美文网首页大数据BigData
Hive的serde,beeline,jdbc

Hive的serde,beeline,jdbc

作者: geekAppke | 来源:发表于2018-11-29 22:14 被阅读18次

    Hive SerDe - Serializer and Deserializer

    • SerDe 用于做序列化和反序列化。
    • 构建在数据存储和执行引擎之间,对两者实现解耦。
    • Hive通过ROW FORMAT DELIMITED以及SERDE进行内容的读写
    row_format
    : DELIMITED 
              [FIELDS TERMINATED BY char [ESCAPED BY char]] 
              [COLLECTION ITEMS TERMINATED BY char] 
              [MAP KEYS TERMINATED BY char] 
              [LINES TERMINATED BY char] 
    : SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]
    
    

    Hive正则匹配

    CREATE TABLE logtbl (
        host STRING,
        identity STRING,
        t_user STRING,
        time STRING,
        request STRING,
        referer STRING,
        agent STRING
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
    WITH SERDEPROPERTIES (
        "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) \\[(.*)\\] \"(.*)\" (-|[0-9]*) (-|[0-9]*)"
    )
    STORED AS TEXTFILE;
    
    把各个字段取出来,提取时间,修改提取规则
    满足读取规则就展示,不满足这行全是NULL
        写数据你随便,hive是度检查
        上传数据还检查格式,浪费时间、资源
    
    192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-upper.png HTTP/1.1" 304 -
    ([^ ]*) ([^ ]*) ([^ ]*) \\[(.*)\\] \"(.*)\" (-|[0-9]*) (-|[0-9]*)
    
    hive> load data local inpath '/root/log' into table logtbl;
    hive> select * from logtbl;
    OK
    192.168.57.4 - - 29/Feb/2016:18:14:35 +0800 GET /bg-upper.png HTTP/1.1 304 -
    192.168.57.4 - - 29/Feb/2016:18:14:35 +0800 GET /bg-nav.png HTTP/1.1 304 -
    

    Tomcat日志

    192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-upper.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-nav.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /asf-logo.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-button.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:35 +0800] "GET /bg-middle.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.css HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /asf-logo.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-middle.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-button.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-nav.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-upper.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.css HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET / HTTP/1.1" 200 11217
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.css HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /tomcat.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-button.png HTTP/1.1" 304 -
    192.168.57.4 - - [29/Feb/2016:18:14:36 +0800] "GET /bg-upper.png HTTP/1.1" 304 -
    

    Hive Beeline

    • Beeline 要与HiveServer2配合使用
    • 服务端启动hiveserver2
    • 客户的通过beeline两种方式连接到hive
    node002既是服务端,有时客户端,开多个连接
    [root@node002 ~]# hiveserver2
    让它后台启动!
    
    查看10000端口有没有启动
    # ss -nal
    
    
    [root@node002 ~]# beeline -u jdbc:hive2://node002:10000/default -n root
    0: jdbc:hive2://node002:10000/default>!quit
    
    
    默认 用户名、密码不验证(但不能不输!)权限控制!
    [root@node002 ~]# beeline
    beeline> !connect jdbc:hive2://node002:10000/default root 123
    0: jdbc:hive2://node002:10000/default> select * from psn;
    +---------+-----------+-------------------------+-------------------------------------------+--+
    | psn.id  | psn.name  |        psn.likes        |                psn.address                |
    +---------+-----------+-------------------------+-------------------------------------------+--+
    | 1       | 小明1       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 2       | 小明2       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 3       | 小明3       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 4       | 小明4       | ["lol","movie"]         | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 5       | 小明5       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 6       | 小明6       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 7       | 小明7       | ["lol","book","game"]   | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 8       | 小明8       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
    | 9       | 小明9       | ["lol","book","movie"]  | {"shenzhen":"luohu","shanghai":"pudong"}  |
    +---------+-----------+-------------------------+-------------------------------------------+--+
    9 rows selected (0.219 seconds)
    0: jdbc:hive2://node002:10000/default> !quit
    Closing: 0: jdbc:hive2://node002:10000/default
    [root@node002 ~]# 
    

    更加以易读的格式输出而已,2种不同客户端的访问方式!
    beeline的本质是jdbc

    Hive JDBC

    Hive JDBC运行方式
    服务端启动hiveserver2后,在java代码中通过调用hive的jdbc访问默认端口10000进行连接、访问
    

    相关文章

      网友评论

        本文标题:Hive的serde,beeline,jdbc

        本文链接:https://www.haomeiwen.com/subject/dlqqcqtx.html