People表明细如下:
image.png
需求:按照性别分组,求分组后年龄最大的两个年龄的人员信息
SELECT * FROM
(select id,name,age,sex,
row_number() OVER(partition by sex order by age desc) r
FROM safety.dw_hive_people_df) t
WHERE r<=2
image.png
学生技能表如下
image.png
需求:按照技能分组,求分组后年龄最大的两个学生信息
先将skills炸开
SELECT id,name,age,skill,
row_number() over(partition by skill order by age desc ) r
FROM safety.dw_hive_stu_skils_df
LATERAL VIEW explode(split(skills,'-')) tb_view as skill
image.png
如果是SparkSQL可以直接写成
SELECT id,name,age,explode(split(skills,'-')) as skill
FROM safety.dw_hive_stu_skils_df
最终统计SQL为:
SELECT id,name,age,skill FROM
(
SELECT id,name,age,skill,
row_number() over(partition by skill order by age desc ) r
FROM safety.dw_hive_stu_skils_df
LATERAL VIEW explode(split(skills,'-')) tb_view as skill
) t
where t.r<=2
image.png
网友评论