HIVE(2)

作者: 山猪打不过家猪 | 来源:发表于2023-02-20 05:05 被阅读0次

3. Hive Table Operations

3.1 Load
  • you should use load when you finish the data cleaning.


    image.png
3.1.1 filepath
image.png
3.1.2 LOCAL
  • add LOCAL: It means finding files in the local system; relative path or unrelative path file:///user/hive/project/data1..
  • No LOCAL field: It uses fs.default.name(unexpected is HDFS)
  • where is LOCAL
    local file system refers to the local Linux file system of the machine where the Hiveserver2 is located rather than the local file system where the HIVE client is located.


    image.png
3.1.3 EXERCISE
  1. load data from local files. Data is located at HS2(node1), equal to hadoop fs -put
LOAD DATA LOCAL INPATH '/root/hivedata/students.txt' INTO TABLE student_local;
  1. Load data from HDFS to partition and a specified partition. The data is in the root of the HDFS file system.
  • 1st: use put uploading the data to htfs
hadoop fs -put /root/hivedata/students.txt
  • 2end: use LOAD
LOAD DATA INPATH '/students.txt' INTO TABLE student_HDFS_p partition(country="China")

3.1.4 hive 3.0 new characters
  1. create a table
CREATE TABLE if not exists tab1 (col1 int, col2 int)
PARTITIONED BY (col3 int)
row format delimited fields terminated by ',';
>>
11,22
33,44
  1. before 3.0, loading data
LOAD DATA LOCAL INPATH '/root/hivedata/xxx.txt' INTO TABLE tab1 partition(col3="1");
  1. After 3.0
LOAD DATA LOCAL INPATH '/root/hivedata/tab1.txt' INTO TABLE tab1;
3.2 insert

Using insert like the sql in the hive will take a long time. So you can use it in
insert+select makes you insert the data which you selected.

insert into table student_from_insert 
select num, name from student
3.2.1 multiple insert
  • normal insert
insert into student_insert1
select num  from student;

insert into student_insert2
select name  from student;
  • scan once, insert multipule
--多重插入  一次扫描 多次插入
from student
insert overwrite table student_insert1
select num
insert overwrite table student_insert2
select name;
3.3 export data
  • export to hdfs,use default delimiter\001
insert overwrite directory '/tmp/hive_export/e1' select num,name,age from student limit 10;
  • export to local system
insert overwrite local directory '/root/hive_export/e1' select * from student
  • Using specified dilimiters to export data
insert overwrite directory '/tmp/hive_export/e2' row format delimited fields terminated by ','
select * from student
3.4 Select
3.4.1 Covid data exercise
    1. create a normal table
drop table if exists t_usa_covid19;
CREATE TABLE t_usa_covid19(
       count_date string,
       county string,
       state string,
       fips int,
       cases int,
       deaths int)
row format delimited fields terminated by ",";
    1. load data into the t_usa_covid19
load data local inpath '/root/hivedata/us-covid19-counties.dat' into table t_usa_covid19;

select * from t_usa_covid19;
    1. create a partitioned table based on count_data, state
CREATE TABLE if not exists t_usa_covid19_p(
     county string,
     fips int,
     cases int,
     deaths int)
partitioned by(count_date string,state string)
row format delimited fields terminated by ",";
    1. loadt_usa_covid19_p using dynamic partitioned table
set hive.exec.dynamic.partition.mode = nonstrict;

insert into table t_usa_covid19_p partition (count_date,state)
select county,fips,cases,deaths,count_date,state from t_usa_covid19;

相关文章

  • oozie-hive-ssh调用

    hive oozie调用hive action,注意,不是使用hive2 actionoozie需要上传 work...

  • hive的安装步骤

    1上传解压 2修改hive的配置文件: hive-env.sh 3 修改 hive的 hive-site.xml ...

  • java读取hive的orc文件

    hive API2hive udf 读取HDFS的orc文件

  • Hive安装环境准备

    2.1 Hive安装地址 1)Hive官网地址: http://hive.apache.org/ 2)文档查看地址...

  • hive安装

    hive安装 1. hive获取 2. 环境配置 3.hive 初始化 4.启动 Hive 客户端,测试

  • hive相关基础

    hive相关基础 1、进入到hive命令下;(linux下直接输入hive即可) hive 2、查看工作分区下的h...

  • hive(一):hive概述

    1.1 什么是Hive 1) Hive简介 2) Hive本质:将HQL转化成MapReduce程序 [图片上传失...

  • 16.apache-hadoop之hive安装

    1.在hive官网下载hive 2.修改环境变量 3.设置mysql关联hive hive-env.sh 添加以下...

  • Hive安装配置

    2. Hive 的基本概念 2.1. Hive 简介 什么是 Hive Hive是基于Hadoop的一个数据仓库工...

  • Apache Hive Essentials 2nd - 201

    Apache Hive Essentials 2nd - 2018.pdf Apache Hive Essenti...

网友评论

      本文标题:HIVE(2)

      本文链接:https://www.haomeiwen.com/subject/mpnvkdtx.html