1、impala操作命令
https://blog.csdn.net/oracle8090/article/details/53038090
https://www.cnblogs.com/zlslch/p/6785596.html
2、create table as select 时字段类型的坑
https://blog.csdn.net/littlecarton/article/details/82462408
3、判断如果为null,两个数类型需要相同
https://www.cnblogs.com/jiangshengxiang/p/9263429.html
SELECT IFNULL(NULL,'11'); -> 11
SELECT IFNULL('00','11'); -> 00
4、删除分区并插入数据
alter table xx.xxx drop if exists partition(etl_dt='date_str'
from yy.yyy
where etl_dt='$date_str';
5、连接多个字段和字符串(concat)
https://www.cnblogs.com/apolloren/p/11002449.html
concat(column_name1,'-',column_name2)
6、insert overwrite语句
insert overwrite xx.xxx(
column_name1
,column_name2
)partition(etl_dt)
select
column_name1
,column_name2
,etl_dt
from yy.yyy
where etl=''
;
7、日期加减几天
date_sub (string startdate, int days)
date_add('2020-09-01',1)
8、Impala日期时间处理:https://blog.csdn.net/weixin_42414714/article/details/107950876
获取当前日期时间,类型为timestamp
select now();
select current_timestamp();
将日期时间转换为日期字符串(string):
select to_date(now());
加减月份(https://blog.csdn.net/u012580143/article/details/101059463):
add_months(now(),-3)
加减天数:
date_add(now,-2)
9、impala获取本月有多少天
select
32-dayofmonth(date_add(now(),+32-dayofmonth(now())))
10、计算两个日期之间的月份(没有直接计算的函数)可以用下面的方法:
(year(date1)-year(date2))*12+(month(date1)-month(date2))
11、某段时间内每月的天数
select
distinct substring(etl_dt,1,7) as mon
,32-dayofmonth(
date_add(
add_months(now(),-((year(now())-year(etl_dt))*12+(month(now())-month(etl_dt))))
,32-dayofmonth(now())
)
) as day_of_this_month
from xx.xxx
where etl_dt>='' and etl_dt<=''
;
12、hive/impala获取当月第一天
trunc('2018-09-28','MM') -- 获取当月第一天
last_day('2018-09-28') -- 获取当月最后一天
13、获取前一天日期
select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP()),1)
14、impala没有write权限插入数据到hive表:impala does not have write access to at least on HDFS path hdfs://...
https://blog.csdn.net/lukabruce/article/details/84070025
15、impala分区表用insert into...select插入数据
https://blog.csdn.net/xiaoxiaojavacsdn/article/details/89515526
双字段分区,动态插入:
insert into ta partition(year,month) select * from tb
16、Impala 卸载数据语句
impala-shell -i x.x.x.x:xx -q "select * from xx.xx" -B --output_delimiter="\001" --print_header -o /home/etluser/data/xx.xx.dat
17、Impala 实现在指定位置添加列
https://blog.csdn.net/zy103118/article/details/103010632
18、ALTER TABLE向现有表中添加列的基本语法如下
ALTER TABLE name ADD COLUMNS (col_spec[, col_spec ...])
19、Impala-shell 导入导出数据
https://www.cnblogs.com/zhangyl-/p/13121492.html
20、# SQL语句 按年龄段分组统计人数问题
select nnd as '年龄段',count(*) as '人数',sex as '性别' from
(
select
case
when age>=1 and age<=10 then '1-10'
when age>=11 and age<=20 then '11-20'
when age>=21 and age<=30 then '21-30'
when age>=31 and age<=40 then '31-40' else 'other'
end
as nnd,uname,sex from #t
)
a
group by nnd,sex
21、trunc()
trunc('2021-07-07','dd') 当天日期
trunc('2021-07-07','day') 当前周周一的日期
trunc('2021-07-07','month') 当月的第一天日期
trunc('2021-07-07','mm') 当月的第一天日期
22、WARNINGS: Row of size 3.88 MB could not be materialized in plan node with id 2. Increase the max_row_size query option (currently 512.00 KB) to process larger rows.
(https://blog.csdn.net/wuxintdrh/article/details/112554737)
在impala-shell命令行执行
set max_row_size=5m
23、按行交叉对比
select
t1.extractcashbillid
,t2.extractcashbillid
,t1.s1
,t2.s2
,t1.start_dt
,t1.end_dt
,t2.start_dt
,t2.end_dt
,t1.year_month
,t2.year_mont
from (
select
row_number() over(partition by extractcashbillid order by start_dt desc) as s1
,start_dt
,end_dt
,year_month
,extractcashbillid
from xx.yyy1
) t1
join (
select
row_number() over(partition by extractcashbillid order by start_dt desc) as s1
,start_dt
,end_dt
,year_month
,extractcashbillid
from xx.yyy1
) t2
on t1.extractcashbillid =t2.extractcashbillid
and t1.year_month=t2.year_month
and t1.s1=t2.s2-1
and t1.start_dt<>t2.end_dt
网友评论