12-Hive Advanced 02

Author: CrUelAnGElPG | Published: 2018-08-14 00:49 | Views: 0

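Hive Advanced, part 2:

***** Hive: complex data types, JDBC programming
ZK:

Compression

  compression ratio
  decompression speed

  1 GB of uncompressed data:
  1 GB of gzip-compressed data:

Codecs: we only need to register them in Hadoop's configuration files.

Using compression

core-site.xml

<property>
  <name>io.compression.codecs</name>
  <value>
    org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec
  </value>
</property>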
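The notes list compression ratio and decompression speed as the two things to weigh when choosing a codec. The sketch below gives a rough feel for that trade-off outside Hadoop. It assumes that Python's standard `gzip`, `zlib`, and `bz2` modules are acceptable stand-ins for the formats behind GzipCodec, DefaultCodec, and BZip2Codec, and it runs on synthetic tab-separated rows rather than the real page_views.dat. The helper names `make_sample`, `measure`, and `compare` are hypothetical.

```python
import bz2
import gzip
import time
import zlib


def make_sample(rows: int = 20000) -> bytes:
    """Build synthetic tab-separated rows (NOT the real page_views.dat)."""
    lines = []
    for i in range(rows):
        lines.append(
            f"2013-05-19 13:00:00\thttp://example.com/page/{i % 50}"
            f"\tSESSION{i % 997:06d}\t\\N\t10.0.{i % 256}.{i % 200}"
            f"\tuser{i % 300}\t{i % 20}"
        )
    return ("\n".join(lines) + "\n").encode("utf-8")


def measure(name, compress, decompress, data: bytes) -> dict:
    """Compress once, time one decompression, and report size ratio and time."""
    packed = compress(data)
    start = time.perf_counter()
    restored = decompress(packed)
    elapsed = time.perf_counter() - start
    assert restored == data  # the round trip must be lossless
    return {
        "codec": name,
        "ratio": len(packed) / len(data),  # smaller means better compression
        "decompress_seconds": elapsed,
    }


def compare(data: bytes) -> list:
    """Run the three stdlib codecs over the same input."""
    return [
        measure("gzip", gzip.compress, gzip.decompress, data),
        measure("zlib", zlib.compress, zlib.decompress, data),
        measure("bz2", bz2.compress, bz2.decompress, data),
    ]


if __name__ == "__main__":
    for result in compare(make_sample()):
        print(
            f"{result['codec']:5s} ratio={result['ratio']:.3f} "
            f"decompress={result['decompress_seconds'] * 1000:.2f} ms"
        )
```

The absolute numbers depend on the machine and the data, so treat them only as a relative comparison. Hadoop's own codecs may also behave differently, for example when native libraries are in use.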

mapred-site.xml

<property>
  <name>mapreduce.output.fileoutputformat.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.output.fileoutputformat.compress.codec</name>
  <value>org.apache.hadoop.io.compress.BZip2Codec</value>
</property>

Create a plain-text source table and load the sample data:

create table ruoze_page_views(track_time string, url string, session_id string, referer string, ip string, end_user_id string, city_id string)
row format delimited fields terminated by '\t';

load data local inpath '/home/hadoop/data/page_views.dat' overwrite into table ruoze_page_views;

Turn on output compression for the session, write a BZip2-compressed copy with CTAS, then turn compression off again:

set hive.exec.compress.output=true;
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec;

create table ruoze_page_views_bzip2
row format delimited fields terminated by '\t'
as select * from ruoze_page_views;

set hive.exec.compress.output=false;

-----------------------

Storage Format

STORED AS file_format

create table ruoze_b(id int)
stored as
  INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';

Row-oriented vs column-oriented storage

Row-oriented storage guarantees that all the columns of a row sit in the same block.
In big data, a table often has a very large number of fields, while most queries only use a few of them.

File formats: TEXTFILE, SEQUENCEFILE, ...

SEQUENCEFILE:

create table ruoze_page_views_seq(track_time string, url string, session_id string, referer string, ip string, end_user_id string, city_id string)
row format delimited fields terminated by '\t'
stored as SEQUENCEFILE;

LOAD DATA only moves the file into the table directory and does not convert it, so loading the raw text file is not the right way to fill a SEQUENCEFILE table:

load data local inpath '/home/hadoop/data/page_views.dat' overwrite into table ruoze_page_views_seq;

Populate the table from the text table instead:

insert into table ruoze_page_views_seq select * from ruoze_page_views;

RCFILE:

create table ruoze_page_views_rc(track_time string, url string, session_id string, referer string, ip string, end_user_id string, city_id string)
row format delimited fields terminated by '\t'
stored as rcfile;

insert into table ruoze_page_views_rc select * from ruoze_page_views;

ORC:

create table ruoze_page_views_orc(track_time string, url string, session_id string, referer string, ip string, end_user_id string, city_id string)
row format delimited fields terminated by '\t'
stored as orc;

insert into table ruoze_page_views_orc select * from ruoze_page_views;

ORC with compression turned off (ORC compresses with ZLIB by default):

create table ruoze_page_views_orc_null(track_time string, url string, session_id string, referer string, ip string, end_user_id string, city_id string)
row format delimited fields terminated by '\t'
stored as orc tblproperties ("orc.compress"="NONE");

insert into table ruoze_page_views_orc_null select * from ruoze_page_views;

Parquet (modeled on Dremel):

create table ruoze_page_views_parquet(track_time string, url string, session_id string, referer string, ip string, end_user_id string, city_id string)
row format delimited fields terminated by '\t'
stored as parquet;

insert into table ruoze_page_views_parquet select * from ruoze_page_views;

Parquet with GZIP compression:

set parquet.compression=GZIP;

create table ruoze_page_views_parquet_gzip
row format delimited fields terminated by '\t'
stored as parquet
as select * from ruoze_page_views;

The same query run against each storage format, followed by the figure recorded in the notes for that run:

select count(1) from ruoze_page_views where session_id='B58W48U4WKZCJ5D1T3Z9ZY88RU7QA7B1';
19022752

select count(1) from ruoze_page_views_orc where session_id='B58W48U4WKZCJ5D1T3Z9ZY88RU7QA7B1';
1257523

select count(1) from ruoze_page_views_parquet where session_id='B58W48U4WKZCJ5D1T3Z9ZY88RU7QA7B1';
26870773496487
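The row-oriented vs column-oriented point above can be made concrete with a toy model. The sketch below stores the same records in both layouts and counts how many bytes a filter on `session_id` has to touch in each. It is only an illustration under simplifying assumptions: real ORC and Parquet files add stripes or row groups, encodings, and statistics, none of which are modeled here, and the function names are hypothetical.

```python
COLUMNS = [
    "track_time", "url", "session_id", "referer",
    "ip", "end_user_id", "city_id",
]


def to_row_layout(rows):
    """Row-oriented: all fields of one record are stored together."""
    return ["\t".join(row).encode("utf-8") for row in rows]


def to_column_layout(rows):
    """Column-oriented: all values of one column are stored together."""
    return {
        name: [row[i].encode("utf-8") for row in rows]
        for i, name in enumerate(COLUMNS)
    }


def count_in_row_layout(records, column, value):
    """Count matches; every full record must be read to reach one field."""
    index = COLUMNS.index(column)
    target = value.encode("utf-8")
    matches = 0
    scanned = 0
    for record in records:
        scanned += len(record)
        if record.split(b"\t")[index] == target:
            matches += 1
    return matches, scanned


def count_in_column_layout(columns, column, value):
    """Count matches; only the one requested column is read."""
    target = value.encode("utf-8")
    matches = 0
    scanned = 0
    for cell in columns[column]:
        scanned += len(cell)
        if cell == target:
            matches += 1
    return matches, scanned


if __name__ == "__main__":
    # Synthetic sample rows, not taken from page_views.dat.
    sample = [
        ("2013-05-19 13:00:00", "http://example.com/a", "S1", "-", "10.0.0.1", "u1", "1"),
        ("2013-05-19 13:00:01", "http://example.com/b", "S2", "-", "10.0.0.2", "u2", "2"),
        ("2013-05-19 13:00:02", "http://example.com/c", "S1", "-", "10.0.0.3", "u3", "1"),
    ]
    print("row layout:   ", count_in_row_layout(to_row_layout(sample), "session_id", "S1"))
    print("column layout:", count_in_column_layout(to_column_layout(sample), "session_id", "S1"))
```

Both layouts return the same match count, but the column layout touches only the bytes of the filtered column, which is the reason columnar formats suit wide tables where queries use a few fields.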

Article link: https://www.haomeiwen.com/subject/lqapbftx.html