-
数据示例
table_purchase.png
-
数据解释
- ID: 每条购买记录对应一个唯一的ID
- PURCHASE_TIME: 客户购买该商品的时间,以天为单位
- CLIENTID: 每位购买商品的客户对应一个唯一的ID
- SKU_TYPE: 客户购买的商品类型
(理想化假设:每位客户一天只能购买一件商品)
- 数据要求
- 各个商品类型的购买量
- 每天的新客户增长情况
4.分析
去重的常规做法是【 group by +分类项】&【 distinct +去重项】,而SQL是先运行 group by 再运行 select 中的内容,因此第二种要求则不适用这种做法:如A顾客在1/1号购买了一件商品,在1/2也购买了一件商品,如使用常规方式,则在每日增长的统计中就被计算了2次。因此需要先去重,再group by。
-
SQL语句
JULIA大神的SQL运行顺序图(来自微博博主@爱可可-爱生活).jpeg
-各个商品类型的购买人数
select p.sku_type, count(distinct p.clientid) as 'client_amount'
from purchase as p
group by p.sku_type
1.group by p.sku_type
![](https://img.haomeiwen.com/i21886267/b5d0b09412c11f6d.png)
2.
select p.sku_type, count(distinct p.clientid) as 'client_amount'
![](https://img.haomeiwen.com/i21886267/b3ac790105148067.png)
- 每天的新客户增长情况
select a.date_format(date(a.b),'%Y-%m-%d') as 'purchase_time'
, count(distinct a.clientid) as 'new_client_amount'
from
(select p.clientid, date_format(MIN(date(p.purchase_time)),'%Y-%m-%d') as b
from purchase as p
group by p.clientid) as a
group by date(a.b)
order by date(a.b) ASC
from
(select p.clientid, date_format(MIN(date(p.purchase_time)),'%Y-%m-%d') as b
from purchase as p
group by p.clientid) as a
1). group by p.clientid
![](https://img.haomeiwen.com/i21886267/c89c8b490cfd0df2.png)
2).
select p.clientid, date_format(MIN(date(p.purchase_time)),'%Y-%m-%d') as b'
![](https://img.haomeiwen.com/i21886267/81e26f5e89df4ddb.png)
-
group by date(a.b)
group by date(a.b).png
select a.date_format(date(a.b),'%Y-%m-%d') as 'purchase_time',
count(distinct a.clientid) as 'new_client_amount'
![](https://img.haomeiwen.com/i21886267/ff3414536ac01290.png)
网友评论