impala

作者: 无量儿 | 来源:发表于2021-12-10 10:39 被阅读0次

    1、impala操作命令
    https://blog.csdn.net/oracle8090/article/details/53038090

    https://www.cnblogs.com/zlslch/p/6785596.html

    2、create table as select 时字段类型的坑
    https://blog.csdn.net/littlecarton/article/details/82462408

    3、判断如果为null,两个数类型需要相同
    https://www.cnblogs.com/jiangshengxiang/p/9263429.html

    SELECT IFNULL(NULL,'11'); -> 11
    SELECT IFNULL('00','11'); -> 00

    4、删除分区并插入数据
    alter table xx.xxx drop if exists partition(etl_dt='date_str'); insert into xx.xxx( column_name1 ,column_name2 ,column_name3 )partition (etl_dt) select column_name1 ,column_name2 ,column_name3 ,'date_str'
    from yy.yyy
    where etl_dt='$date_str';

    5、连接多个字段和字符串(concat)
    https://www.cnblogs.com/apolloren/p/11002449.html

    concat(column_name1,'-',column_name2)

    6、insert overwrite语句
    insert overwrite xx.xxx(
    column_name1
    ,column_name2
    )partition(etl_dt)
    select
    column_name1
    ,column_name2
    ,etl_dt
    from yy.yyy
    where etl=''
    ;

    7、日期加减几天
    date_sub (string startdate, int days)
    date_add('2020-09-01',1)

    8、Impala日期时间处理:https://blog.csdn.net/weixin_42414714/article/details/107950876

    获取当前日期时间,类型为timestamp
    select now();
    select current_timestamp();

    将日期时间转换为日期字符串(string):
    select to_date(now());

    加减月份(https://blog.csdn.net/u012580143/article/details/101059463):
    add_months(now(),-3)

    加减天数:
    date_add(now,-2)

    9、impala获取本月有多少天
    select
    32-dayofmonth(date_add(now(),+32-dayofmonth(now())))

    10、计算两个日期之间的月份(没有直接计算的函数)可以用下面的方法:
    (year(date1)-year(date2))*12+(month(date1)-month(date2))

    11、某段时间内每月的天数
    select
    distinct substring(etl_dt,1,7) as mon
    ,32-dayofmonth(
    date_add(
    add_months(now(),-((year(now())-year(etl_dt))*12+(month(now())-month(etl_dt))))
    ,32-dayofmonth(now())
    )
    ) as day_of_this_month
    from xx.xxx
    where etl_dt>='' and etl_dt<=''
    ;

    12、hive/impala获取当月第一天
    trunc('2018-09-28','MM') -- 获取当月第一天
    last_day('2018-09-28') -- 获取当月最后一天

    13、获取前一天日期
    select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP()),1)

    14、impala没有write权限插入数据到hive表:impala does not have write access to at least on HDFS path hdfs://...

    https://blog.csdn.net/lukabruce/article/details/84070025

    15、impala分区表用insert into...select插入数据
    https://blog.csdn.net/xiaoxiaojavacsdn/article/details/89515526

    双字段分区,动态插入:
    insert into ta partition(year,month) select * from tb

    16、Impala 卸载数据语句
    impala-shell -i x.x.x.x:xx -q "select * from xx.xx" -B --output_delimiter="\001" --print_header -o /home/etluser/data/xx.xx.dat

    17、Impala 实现在指定位置添加列
    https://blog.csdn.net/zy103118/article/details/103010632

    18、ALTER TABLE向现有表中添加列的基本语法如下
    ALTER TABLE name ADD COLUMNS (col_spec[, col_spec ...])

    19、Impala-shell 导入导出数据
    https://www.cnblogs.com/zhangyl-/p/13121492.html

    20、# SQL语句 按年龄段分组统计人数问题
    select nnd as '年龄段',count(*) as '人数',sex as '性别' from
    (
    select
    case
    when age>=1 and age<=10 then '1-10'
    when age>=11 and age<=20 then '11-20'
    when age>=21 and age<=30 then '21-30'
    when age>=31 and age<=40 then '31-40' else 'other'
    end
    as nnd,uname,sex from #t
    )
    a
    group by nnd,sex

    21、trunc()
    trunc('2021-07-07','dd') 当天日期
    trunc('2021-07-07','day') 当前周周一的日期
    trunc('2021-07-07','month') 当月的第一天日期
    trunc('2021-07-07','mm') 当月的第一天日期

    22、WARNINGS: Row of size 3.88 MB could not be materialized in plan node with id 2. Increase the max_row_size query option (currently 512.00 KB) to process larger rows.
    https://blog.csdn.net/wuxintdrh/article/details/112554737

    在impala-shell命令行执行
    set max_row_size=5m

    23、按行交叉对比
    select
    t1.extractcashbillid
    ,t2.extractcashbillid
    ,t1.s1
    ,t2.s2
    ,t1.start_dt
    ,t1.end_dt
    ,t2.start_dt
    ,t2.end_dt
    ,t1.year_month
    ,t2.year_mont
    from (
    select
    row_number() over(partition by extractcashbillid order by start_dt desc) as s1
    ,start_dt
    ,end_dt
    ,year_month
    ,extractcashbillid
    from xx.yyy1
    ) t1
    join (
    select
    row_number() over(partition by extractcashbillid order by start_dt desc) as s1
    ,start_dt
    ,end_dt
    ,year_month
    ,extractcashbillid
    from xx.yyy1
    ) t2
    on t1.extractcashbillid =t2.extractcashbillid
    and t1.year_month=t2.year_month
    and t1.s1=t2.s2-1
    and t1.start_dt<>t2.end_dt

    相关文章

      网友评论

        本文标题:impala

        本文链接:https://www.haomeiwen.com/subject/qvqnzhtx.html