Cascading cumulative sum in Hive

Author: pamperxg | Published: 2017-08-11 21:20

Given a table of access counts, compute each user's cumulative (running) total by month.

  • Create the table and load the data
t_access_times.dat
A,2015-01,5
A,2015-01,15
B,2015-01,5
A,2015-01,8
B,2015-01,25
A,2015-01,5
A,2015-02,4
A,2015-02,6
B,2015-02,10
B,2015-02,5

create table t_access_times(username string,month string,salary int)
row format delimited fields terminated by ',';

load data local inpath '/home/hadoop/t_access_times.dat' into table t_access_times;
select * from t_access_times;
+--------------------------+-----------------------+------------------------+--+
| t_access_times.username  | t_access_times.month  | t_access_times.salary  |
+--------------------------+-----------------------+------------------------+--+
| A                        | 2015-01               | 5                      |
| A                        | 2015-01               | 15                     |
| B                        | 2015-01               | 5                      |
| A                        | 2015-01               | 8                      |
| B                        | 2015-01               | 25                     |
| A                        | 2015-01               | 5                      |
| A                        | 2015-02               | 4                      |
| A                        | 2015-02               | 6                      |
| B                        | 2015-02               | 10                     |
| B                        | 2015-02               | 5                      |
+--------------------------+-----------------------+------------------------+--+
  • Compute each user's monthly total
select username,month,sum(salary) from t_access_times group by username,month;
+-----------+----------+------+--+
| username  |  month   | _c2  |
+-----------+----------+------+--+
| A         | 2015-01  | 33   |
| A         | 2015-02  | 10   |
| B         | 2015-01  | 30   |
| B         | 2015-02  | 15   |
+-----------+----------+------+--+
  • Inner join the monthly totals with themselves
select a.*,b.* from
(select username,month,sum(salary) as salary from t_access_times group by username,month) A 
inner join 
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on
A.username=B.username;
+-------------+----------+-----------+-------------+----------+-----------+--+
| a.username  | a.month  | a.salary  | b.username  | b.month  | b.salary  |
+-------------+----------+-----------+-------------+----------+-----------+--+
| A           | 2015-01  | 33        | A           | 2015-01  | 33        |
| A           | 2015-01  | 33        | A           | 2015-02  | 10        |
| A           | 2015-02  | 10        | A           | 2015-01  | 33        |
| A           | 2015-02  | 10        | A           | 2015-02  | 10        |
| B           | 2015-01  | 30        | B           | 2015-01  | 30        |
| B           | 2015-01  | 30        | B           | 2015-02  | 15        |
| B           | 2015-02  | 15        | B           | 2015-01  | 30        |
| B           | 2015-02  | 15        | B           | 2015-02  | 15        |
+-------------+----------+-----------+-------------+----------+-----------+--+
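The self-join above repeats the same subquery on both sides. As a minimal sketch, assuming a Hive version that supports common table expressions (0.13 or later), the monthly totals can be written once and joined to themselves. The name `monthly` is illustrative and does not come from the original.

```sql
-- Sketch only: assumes Hive 0.13+ (CTE support) and the t_access_times table above.
-- "monthly" is a hypothetical name for the per-user, per-month totals.
with monthly as (
  select username, month, sum(salary) as salary
  from t_access_times
  group by username, month
)
select a.*, b.*
from monthly a
inner join monthly b
  on a.username = b.username;
```

Each user has two monthly rows, so the join should pair every month with every month of the same user, giving the same eight rows as the table above.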
  • Generate the cumulative values
select a.username,a.month,max(a.salary) as salary,sum(b.salary) as accumulate from
(select username,month,sum(salary) as salary from t_access_times group by username,month) A
inner join
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on a.username=b.username
where b.month <= a.month
group by a.username,a.month
order by a.username,a.month;

+-------------+----------+---------+-------------+--+
| a.username  | a.month  | salary  | accumulate  |
+-------------+----------+---------+-------------+--+
| A           | 2015-01  | 33      | 33          |
| A           | 2015-02  | 10      | 43          |
| B           | 2015-01  | 30      | 30          |
| B           | 2015-02  | 15      | 45          |
+-------------+----------+---------+-------------+--+

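The grouped query computes the cumulative total for each month.
Why max(a.salary)?
a.salary is not a grouping column, so it can only appear inside an aggregate function; otherwise Hive would not know which row's value to return. Every joined row in a group carries the same a.salary, so max, min, or avg all return that value. sum would not: it would add the value once per joined row.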
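Because a.salary has a single value within each (a.username, a.month) group, another way to satisfy the grouping rule is to add it to the group by list. This is a sketch of that variant, not the author's query; it should return the same four rows as the max version.

```sql
-- Sketch: a.salary is constant within each (a.username, a.month) group,
-- so grouping by it as well does not split any group and max() is not needed.
select a.username, a.month, a.salary, sum(b.salary) as accumulate
from
  (select username, month, sum(salary) as salary
   from t_access_times group by username, month) a
inner join
  (select username, month, sum(salary) as salary
   from t_access_times group by username, month) b
on a.username = b.username
where b.month <= a.month
group by a.username, a.month, a.salary
order by a.username, a.month;
```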

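The final order by makes the result globally ordered. The source data is unordered, so without it the output may come back in any order.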
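The same running total can also be computed without a self-join. The following is an alternative sketch, assuming a Hive version with windowing functions (0.11 or later); it is a different technique from the join-based approach above, with the alias `t` chosen for illustration.

```sql
-- Alternative sketch: running total with a window function instead of a self-join.
-- Assumes Hive 0.11+ (windowing support) and the t_access_times table above.
select
  username,
  month,
  salary,
  -- Sum this user's monthly totals from the first month up to the current one.
  sum(salary) over (
    partition by username
    order by month
    rows between unbounded preceding and current row
  ) as accumulate
from (
  select username, month, sum(salary) as salary
  from t_access_times
  group by username, month
) t
order by username, month;
```

The inner query still produces one row per user and month; the window then accumulates within each user, so the row count grows linearly rather than with the square of the number of months per user, as it does in the self-join.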
