美文网首页
Hive练习(一)

Hive练习(一)

作者: hipeer | 来源:发表于2018-09-22 17:48 被阅读0次

    练习所使用的数据是之前创建的users表和train表中的
    1. caculate the event with thw most number of user associated
    select event_id,  count(*) as ucount from train group by event_id order by ucount desc limit 1;
    

    结果:

    +------------+---------+--+
    |  event_id  | ucount  |
    +------------+---------+--+
    | 955398943  | 242     |
    +------------+---------+--+
    
    

    2.how many users invied and not invied

    select invited, count(distinct `user_id`) as nums from train group by invited;
    

    结果:

    +----------+--------+--+
    | invited  |  nums  |
    +----------+--------+--+
    | 0        | 2032  |
    | 1        | 313    |
    +----------+--------+--+
    

    3.how male/female got invied?(train & users)

    select u.gender as gender, count(*) as nums
       from users u
     left join train t
       on u.user_id=t.user_id and t.invited=1
     group by gender 
    having gender!='';
    

    结果:

    +---------+--------+--+
    | gender  |  nums  |
    +---------+--------+--+
    | female  | 15001  |
    | male    | 23440  |
    +---------+--------+--+
    

    4.create another(internal) table which contains user_id, gender, brithyear,number of events

    create table temp_users as 
      select u.user_id as user_id, u.gender as gender, u.birthdyear as birthdyear, count(*) as event_nums
        from users u left join train t on u.user_id=t.user_id 
       group by u.user_id, u.gender, u.birthdyear 
       order by event_nums desc;
    

    结果:

    INFO  : Moving data to directory hdfs://sandbox-hdp.hortonworks.com:8020/temp/data/temp_users from hdfs://sandbox-hdp.hortonworks.com:8020/temp/data/.hive-staging_hive_2018-09-22_09-41-01_707_8002503432333532967-4/-ext-10001
    INFO  : Table test1.temp_users stats: [numFiles=1, numRows=38209, totalSize=895460, rawDataSize=857251]
    

    看一下temp_users表中的数据

    +---------------------+--------------------+------------------------+------------------------+--+
    | temp_users.user_id  | temp_users.gender  | temp_users.birthdyear  | temp_users.event_nums  |
    +---------------------+--------------------+------------------------+------------------------+--+
    | 2903662804          | female             | 1981                   | 91                     |
    | 3514195773          | male               | 1974                   | 55                     |
    | 3943175229          | female             | 1966                   | 49                     |
    | 1067164735          | female             | 1990                   | 49                     |
    | 3180412264          | female             | 1983                   | 48                     |
    +---------------------+--------------------+------------------------+------------------------+--+
    

    相关文章

      网友评论

          本文标题:Hive练习(一)

          本文链接:https://www.haomeiwen.com/subject/wzpznftx.html