美文网首页数仓实战1(阿里架构)
数仓项目05:ADS数据应用层

数仓项目05:ADS数据应用层

作者: 勇于自信 | 来源:发表于2020-06-24 11:52 被阅读0次

5. ADS应用数据层

5.1 流量类指标_平台统计表【ADS】

建库:

create database ads_nshop;
use ads_nshop;

建表:

CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_platform_flow_stat (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_natives string COMMENT '所在地区',
visit_avg_duration INT COMMENT '人均页面访问时长',
visit_avg_counts INT COMMENT '人均页面访问数'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_platform_flow_stat/'

1.从ods_02_customer 表统计用户信息
2.从dwd_nshop_actlog_pdtview 用户浏览信息表查出用户浏览时长和浏览用户
3.从dws_nshop_ulog_view 用户浏览表查出访问次数和访问用户
4.用总的停留时长除以去重后的用户数量就是平均访问时长,用总访问次数除以去重后的用户数就是平均访问次数。
统计sql如下:

insert overwrite table ads_nshop.ads_nshop_platform_flow_stat partition(bdp_day='20200618')
select 
c.customer_gender,
c.customer_age_range,
c.customer_natives,
sum(pv.duration)/count(distinct pv.user_id) as visit_avg_duration,
sum(v.view_count)/count(distinct pv.user_id)as visti_avg_counts
from
ods_nshop.ods_02_customer c
join
dwd_nshop.dwd_nshop_actlog_pdtview pv
on
c.customer_id=pv.user_id
join 
dws_nshop.dws_nshop_ulog_view v
on
v.user_id=pv.user_id
where
v.bdp_day='20200618'
group by
c.customer_natives,c.customer_gender,c.customer_age_range
5.2 流量类指标_平台流量统计P/UV【ADS】

独立访客数(UV): 指来到网站或页面的用户总数,这个用户是独立的,同一用户不同时段访问网站只算作一个独立访客,不会重复累计,通常以PC端的Cookie数量作为统计依据,APP端的设备号作为统计依据。
页面访客数(PV): 页面访问次数,即当前页面的被访问的次数,即浏览量PV;举例:首页,访问次数,1000次。
人均页面访问数: 平均访问页数是用户访问网站的平均浏览页数。平均访问页数=浏览量/访问次数。平均访问页数很 少,说明访客进入你的网站后访问少数几个页面就离开了.

CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_flowpu_stat (
uv BIGINT COMMENT '独立访客数',
pv BIGINT COMMENT '页面访客数',
pv_avg INT COMMENT '人均页面访问数'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_flow/'

数据是从DWS层的用户浏览表里统计,统计sql如下:

insert overwrite table ads_nshop.ads_nshop_flowpu_stat partition(bdp_day='20200618')
select
count(distinct user_id) as uv,
sum(view_count)as pv,
sum(view_count)/count(distinct user_id)as pa_avg
from 
dws_nshop.dws_nshop_ulog_view
where
bdp_day='20200618'
5.3 平台搜索热词统计【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_search_keys (
search_keys string COMMENT '搜索内容',
gender string COMMENT '性别',
age_range string COMMENT '年龄段',
os string COMMENT '手机系统',
manufacturer string COMMENT '手机制造商',
area_code string COMMENT '地区编码',
search_users INT COMMENT '此搜索内容用户数',
search_records INT COMMENT '此搜索内容查询次数',
search_category string COMMENT '查询目标分类',
search_orders string COMMENT '查询排序',
search_targets INT COMMENT '查询目标数量'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_search_keys/'

1.在ods_02_customer表统计用户信息
2.dwd层的用户产品查询表dwd_nshop_actlog_pdtsearch统计查询相关信息
3.dws层的用户查询聚合表dws_nshop_ulog_search统计查询数量
4.对用户进行count聚合,对查询数量进行sum聚合,对产品id进行sum聚合
统计sql如下:

with tbsearch as(
select
    s1.target_keys,
    c.customer_id,
    c.customer_gender,
    c.customer_age_range,
    s1.os ,
    s1.manufacturer,
    s1.area_code ,
    s1.target_order,
    s1.target_id,
    s2.search_count
    from
    ods_nshop.ods_02_customer c
    join
    dwd_nshop.dwd_nshop_actlog_pdtsearch s1
    on
    c.customer_id=s1.user_id
    join
    dws_nshop.dws_nshop_ulog_search s2
    on
    s1.user_id=s2.user_id
    where
    s1.bdp_day='20200618'
)
insert overwrite table ads_nshop.ads_nshop_search_keys partition(bdp_day='20200618')
select
target_keys,
customer_gender,
customer_age_range,
os,
manufacturer,
area_code,
count(distinct customer_id)as search_users,
sum(search_count) as search_records,
target_order,
sum(target_id)as search_targets
from
tbsearch 
group by
target_keys,
customer_gender,
customer_age_range,
os,
manufacturer,
area_code,
target_order
5.4 用户类启动情况表【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_actlog_launch_gsets (
os string COMMENT '手机系统',
manufacturer string COMMENT '手机制造商',
carrier string COMMENT '电信运营商',
network_type string COMMENT '网络类型',
area_code string COMMENT '地区编码',
user_count INT COMMENT '用户数',
launch_count INT COMMENT '启动次数'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/actlog/ads_nshop_actlog_launch_gsets/'
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_actlog_launch_rollup (
os string COMMENT '手机系统',
manufacturer string COMMENT '手机制造商',
carrier string COMMENT '电信运营商',
network_type string COMMENT '网络类型',
area_code string COMMENT '地区编码',
user_count INT COMMENT '用户数',
launch_count INT COMMENT '启动次数'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/actlog/ads_nshop_actlog_launch_rollup/';
5.5 用户类指标统计表【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_customer_stat (
os string COMMENT '手机系统',
manufacturer string COMMENT '手机制造商',
carrier string COMMENT '电信运营商',
network_type string COMMENT '网络类型',
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range VARCHAR (10) COMMENT '年龄段',
customer_natives VARCHAR (10) COMMENT '所在地区',
total_counts INT COMMENT '总用户数',
add_counts INT COMMENT '新增用户数',
active_counts INT COMMENT '活跃用户数'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_customer_stat/'

在这里,总用户数指全网的浏览过的总用户数,新增用户数指当天新注册用户数,活跃用户数指当天浏览的用户数
1.通过用户信息表查询出用户信息
2.通过对用户产品浏览表的userid聚合统计出活跃用户
统计sql如下:

insert overwrite table ads_nshop.ads_nshop_customer_stat partition(bdp_day='20200618')
select
c.customer_gender,
c.customer_age_range,
c.customer_natives,
count(distinct c.customer_id) total_counts,
count(case when from_unixtime(cast(c.customer_ctime/1000 as bigint),'yyyyMMdd')='20200618' then c.customer_id end)  add_counts,
count(distinct p.user_id) active_counts
from 
ods_nshop.ods_02_customer c
join
dwd_nshop.dwd_nshop_actlog_pdtview p
on
p.bdp_day='20200618'
and
p.user_id=c.customer_id
group by
c.customer_gender,
c.customer_age_range,
c.customer_natives
5.7 总体运营指标统计表【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_oper_stat (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range VARCHAR (10) COMMENT '年龄段',
customer_natives VARCHAR (10) COMMENT '所在地区',
consignee_zipcode VARCHAR (10) COMMENT '收货人地区',
product_type INT COMMENT '商品类别',
order_counts INT COMMENT '订单数',
order_rate INT COMMENT '下单率',
order_amounts INT COMMENT '销售总金额',
order_discounts INT COMMENT '优惠总金额',
shipping_amounts INT COMMENT '运费总金额',
per_customer_transaction INT COMMENT '客单价'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_oper_stat/'

这里的总体运营指标统计,统计的有下单数,订单数、销售总金额、有汇总金额、运费总金额等。
1.首先,页面类型肯定是产品页的,所以在页面布局中间表中提前筛选出产品页。
2.在dwd层的产品浏览表中可以关联到用户id、产品id,关联查出用户相关信息,和产品类别
3.在dwd层的订单明细流水表中,分别对order_id,district_money(优惠金额),shipping_money(运费),payment_money进行sum聚合,再关联用户基本信息表和产品信息表查询相关信息。
4.最后,将以上关联,查出总体指标,sql如下:

with pr as(
select
    pp.page_code,
    pr.category_code
    from
    ods_nshop.dim_pub_product pr
    join
    ods_nshop.dim_pub_page pp
    on
    pp.page_target=pr.product_code
    where
    pp.page_type='4'
),
pdview as(
select
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    pr.category_code,
    count(1) pdview_count
    from
    dwd_nshop.dwd_nshop_actlog_pdtview pd
    join
    ods_nshop.ods_02_customer oc
    on
    pd.user_id=oc.customer_id
    join
    pr
    on
    pd.target_id=pr.page_code
    where
    pd.bdp_day='20200321'
    group by
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    pr.category_code
),
od as(
select
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    pr.category_code,
    count(distinct od.order_id) order_counts,
    sum(od.district_money) order_discounts,
    sum(od.shipping_money) shipping_amounts,
    sum(od.payment_money) order_amounts,
    sum(od.payment_money)/count(distinct od.customer_id) per_customer_transaction
    from
    dwd_nshop.dwd_nshop_orders_details od
    join
    ods_nshop.ods_02_customer oc
    on
    od.customer_id=oc.customer_id
    join
    ods_nshop.dim_pub_product pr
    on
    pr.product_code=od.product_code
    where
    od.bdp_day='20200321'
    group by
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    pr.category_code
)
insert overwrite table ads_nshop.ads_nshop_oper_stat partition(bdp_day='20200321')
select
od.customer_gender,
od.customer_age_range,
od.customer_natives,
od.category_code,
od.order_counts/pdview.pdview_count order_rate,
od.order_amounts,
od.order_discounts,
od.shipping_amounts,
od.per_customer_transaction
from
od
join
pdview
on
od.customer_gender = pdview.customer_gender 
and
od.customer_age_range = pdview.customer_age_range
and
od.customer_natives =pdview.customer_natives
and
od.category_code =pdview.category_code
5.8 风控类指标统计表【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_risk_mgt (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range VARCHAR (10) COMMENT '年龄段',
customer_natives VARCHAR (10) COMMENT '所在地区',
product_type INT COMMENT '商品类别',
start_complaint_counts INT COMMENT '发起投诉数',
cancel_complaint_counts INT COMMENT '撤销投诉数',
complaint_rate FLOAT COMMENT '投诉率'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_risk_mgt/'

主要是查询订单详情表的订单状态为6(投诉)的次数,和撤销的次数,然后将订单详情表关联用户信息,产品信息查出相关信息,sql如下:

insert overwrite table ads_nshop.ads_nshop_risk_mgt partition(bdp_day='20200321')
select
oc.customer_gender,
oc.customer_age_range,
oc.customer_natives,
pr.category_code,
count(distinct case when od.order_status=6 then od.order_id end) start_complaint_counts,
count(distinct case when od.order_status=6 then od.order_id end)/count(distinct od.order_id) complaint_rate
from dwd_nshop.dwd_nshop_orders_details od
join
ods_nshop.ods_02_customer oc
on
oc.customer_id=od.customer_id
join
ods_nshop.dim_pub_product pr
on
pr.product_code=od.product_code
where
od.bdp_day='20200321'
group by
oc.customer_gender,
oc.customer_age_range,
oc.customer_natives,
pr.category_code
5.9 支付类指标统计表 4X【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_pay_stat (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_natives string COMMENT '所在地区',
product_type string COMMENT '商品类别',
shop_type string COMMENT '店铺类别',
pay_status string COMMENT '支付状态',
pay_type string COMMENT '支付类型',
pay_nettype string COMMENT '支付网络方式',
pay_amounts INT COMMENT '支付金额',
pay_success INT COMMENT '支付成功数',
pay_fail INT COMMENT '支付失败数',
pay_order_counts INT COMMENT '支付订单数',
pay_user_counts INT COMMENT '支付用户数',
pay_product_counts INT COMMENT '支付商品数',
order_pay_amount_rate FLOAT COMMENT '下单-支付金额转换率',
order_pay_user_rate FLOAT COMMENT '下单-支付买家数转换率',
order_pay_duration FLOAT COMMENT '下单-支付时长'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_pay_stat/';
5.12 支付类指标统计表 4X【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_pay_stat (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_natives string COMMENT '所在地区',
product_type string COMMENT '商品类别',
shop_type string COMMENT '店铺类别',
pay_status string COMMENT '支付状态',
pay_type string COMMENT '支付类型',
pay_nettype string COMMENT '支付网络方式',
pay_amounts INT COMMENT '支付金额',
pay_success INT COMMENT '支付成功数',
pay_fail INT COMMENT '支付失败数',
pay_order_counts INT COMMENT '支付订单数',
pay_user_counts INT COMMENT '支付用户数',
pay_product_counts INT COMMENT '支付商品数',
order_pay_amount_rate FLOAT COMMENT '下单-支付金额转换率',
order_pay_user_rate FLOAT COMMENT '下单-支付买家数转换率',
order_pay_duration FLOAT COMMENT '下单-支付时长'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_pay_stat/';

1.查询dwd层的订单明细表,可以查出订单有关的订单类型、支付网络方式、订单金额等数据,并关联用户信息表。
2.查询ods层的支付记录表,可以查出支付相关的支付状态等
3.关联上面的临时表,用支付次数除以下单次数,得到转换率。
sql如下:

with orders as(
select
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    od.pay_type,
    od.pay_nettype,
    sum(od.payment_money) order_amounts, --支付金额
    count(distinct od.customer_id) order_user_counts, -- 支付用户
    count(distinct od.product_code) pay_product_counts -- 支付商品
    from
    dwd_nshop.dwd_nshop_orders_details od
    join
    ods_nshop.ods_02_customer oc
    on
    od.customer_id=oc.customer_id
    where
    od.bdp_day='20200321'
    group by
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    od.pay_type,
    od.pay_nettype
),
pay as(
select
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    op.pay_type,
    op.pay_nettype,
    sum(case when op.pay_status=1 then op.pay_amount end) pay_amounts,
    sum(case when op.pay_status=1 then 1 end) psy_success,
    sum(case when op.pay_status=0 then 1 end) psy_fail,
    count(distinct op.order_id) pay_order_counts,
    count(distinct op.customer_id) pay_user_counts
    from
    ods_nshop.ods_02_orders_pay_records op
    join
    ods_nshop.ods_02_customer oc
    on
    op.customer_id=oc.customer_id
    where
    from_unixtime(cast(op.pay_ctime/1000 as int),'yyyyMMdd')='20191102'
    group by
    oc.customer_gender,
    oc.customer_age_range,
    oc.customer_natives,
    op.pay_type,
    op.pay_nettype
)
insert overwrite table ads_nshop.ads_nshop_pay_stat_gsets partition(bdp_day='20200321')
select
pay.customer_gender,
pay.customer_age_range,
pay.customer_natives,
pay.pay_type,
pay.pay_nettype,
pay.pay_amounts,
pay.psy_success,
pay.psy_fail,
pay.pay_order_counts,
pay.pay_user_counts,
orders.pay_product_counts,
pay.pay_amounts/orders.order_amounts order_pay_amount_rate,
pay.pay_user_counts/orders.order_user_counts order_pay_user_rate
from
pay
join
orders
on
orders.customer_gender=pay.customer_gender
and
orders.customer_age_range=pay.customer_age_range
and
orders.customer_natives=pay.customer_natives
and
orders.pay_type=pay.pay_type
and
orders.pay_nettype=pay.pay_nettype
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_pay_stat_gsets (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_natives string COMMENT '所在地区',
product_type string COMMENT '商品类别',
shop_type string COMMENT '店铺类别',
pay_status string COMMENT '支付状态',
pay_type string COMMENT '支付类型',
pay_nettype string COMMENT '支付网络方式',
pay_amounts INT COMMENT '支付金额',
pay_success INT COMMENT '支付成功数',
pay_fail INT COMMENT '支付失败数',
pay_order_counts INT COMMENT '支付订单数',
pay_user_counts INT COMMENT '支付用户数',
pay_product_counts INT COMMENT '支付商品数',
order_pay_amount_rate FLOAT COMMENT '下单-支付金额转换率',
order_pay_user_rate FLOAT COMMENT '下单-支付买家数转换率',
order_pay_duration FLOAT COMMENT '下单-支付时长'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_pay_stat_gsets/'
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_pay_stat_rollup (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_natives string COMMENT '所在地区',
product_type string COMMENT '商品类别',
shop_type string COMMENT '店铺类别',
pay_status string COMMENT '支付状态',
pay_type string COMMENT '支付类型',
pay_nettype string COMMENT '支付网络方式',
pay_amounts INT COMMENT '支付金额',
pay_success INT COMMENT '支付成功数',
pay_fail INT COMMENT '支付失败数',
pay_order_counts INT COMMENT '支付订单数',
pay_user_counts INT COMMENT '支付用户数',
pay_product_counts INT COMMENT '支付商品数',
order_pay_amount_rate FLOAT COMMENT '下单-支付金额转换率',
order_pay_user_rate FLOAT COMMENT '下单-支付买家数转换率',
order_pay_duration FLOAT COMMENT '下单-支付时长'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_pay_stat_rollup/'
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_pay_stat_topn (
pay_type string COMMENT '支付类型',
customer_area_code string COMMENT '所在地区',
pay_count INT COMMENT '支付数量',
pay_amounts INT COMMENT '支付金额'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_pay_stat_topn/'
5.13 交易类指标表 4X【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_busi_stat (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_area_code string COMMENT '所在地区',
category_type string COMMENT '商品类别',
supplier_type string COMMENT '店铺类别',
busi_succ_orders INT COMMENT '交易成功订单数',
busi_succ_amounts INT COMMENT '交易成功金额',
busi_succ_users INT COMMENT '交易成功买家数',
busi_succ_products INT COMMENT '交易成功商品数',
busi_fail_orders INT COMMENT '交易失败订单数',
busi_fail_amounts INT COMMENT '交易失败金额'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_busi_stat/'

查询dwd层的订单明细流水表的订单状态并累加次数,再关联用户信息表查出用户相关信息。
关联出产品表,查出产品类别,sql如下:

insert overwrite table ads_nshop.ads_nshop_busi_stat partition(bdp_day='20200321')
select
oc.customer_gender,
oc.customer_age_range,
oc.customer_natives,
pr.category_code,
su.supplier_type,
count(distinct (case when od.order_status<7 then od.order_id end)) busi_succ_orders,
sum(case when od.order_status<7 then od.payment_money end) busi_succ_amounts,
count(distinct (case when od.order_status<7 then od.customer_id end)) busi_succ_users,
count(distinct (case when od.order_status<7 then od.product_code end)) busi_succ_products,
count(distinct (case when od.order_status>7 then od.order_id end)) busi_fail_orders,
sum(case when od.order_status>7 then od.payment_money end) busi_fail_amounts
from
dwd_nshop.dwd_nshop_orders_details od
join
ods_nshop.ods_02_customer oc
on
od.customer_id=oc.customer_id
join
ods_nshop.dim_pub_product pr
on
pr.product_code=od.product_code
join
ods_nshop.dim_pub_supplier su
on
od.supplier_code=su.supplier_code
where 
od.bdp_day='20200321'
group by
oc.customer_gender,
oc.customer_age_range,
oc.customer_natives,
pr.category_code,
su.supplier_type
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_busi_stat_gsets (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_area_code string COMMENT '所在地区',
category_type string COMMENT '商品类别',
supplier_type string COMMENT '店铺类别',
busi_succ_orders INT COMMENT '交易成功订单数',
busi_succ_amounts INT COMMENT '交易成功金额',
busi_succ_users INT COMMENT '交易成功买家数',
busi_succ_products INT COMMENT '交易成功商品数',
busi_fail_orders INT COMMENT '交易失败订单数',
busi_fail_amounts INT COMMENT '交易失败金额'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_busi_stat_gsets/'
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_busi_stat_rollup (
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_area_code string COMMENT '所在地区',
category_type string COMMENT '商品类别',
supplier_type string COMMENT '店铺类别',
busi_succ_orders INT COMMENT '交易成功订单数',
busi_succ_amounts INT COMMENT '交易成功金额',
busi_succ_users INT COMMENT '交易成功买家数',
busi_succ_products INT COMMENT '交易成功商品数',
busi_fail_orders INT COMMENT '交易失败订单数',
busi_fail_amounts INT COMMENT '交易失败金额'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_busi_stat_rollup/'
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_busi_stat_topn (
customer_natives string COMMENT '所在地区',
category_type string COMMENT '商品类别',
supplier_type string COMMENT '店铺类别',
busi_succ_orders INT COMMENT '交易成功订单数',
busi_succ_amounts INT COMMENT '交易成功金额',
busi_succ_users INT COMMENT '交易成功买家数',
busi_succ_products INT COMMENT '交易成功商品数',
busi_fail_orders INT COMMENT '交易失败订单数',
busi_fail_amounts INT COMMENT '交易失败金额'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_busi_stat_topn/'
5.14 广告投放类指标统计【ADS】
CREATE external TABLE
IF NOT EXISTS ads_nshop.ads_nshop_release_stat (
device_type string COMMENT '设备类型',
os string COMMENT '手机系统',
customer_gender TINYINT COMMENT '性别:1男 0女',
age_range string COMMENT '年龄段',
customer_natives string COMMENT '所在地区',
release_sources string COMMENT '投放渠道',
release_category string COMMENT '投放浏览产品分类',
visit_total_customers INT COMMENT '总访客数',
visit_total_counts INT COMMENT '总访问次数'
) partitioned BY (bdp_day string) stored AS parquet location '/data/nshop/ads/operation/ads_nshop_release_stat/'

总访客数:需要去重
总访问次数不需要去重
统计sql如下:

insert overwrite table ads_nshop.ads_nshop_release_stat partition(bdp_day='20200321')
select
dr.device_type ,
dr.os ,
oc.customer_gender ,
oc.customer_age_range ,
oc.customer_natives,
dr.release_sources ,
dr.release_category,
count(distinct dr.customer_id) visit_total_customers,
count(1) visit_total_counts
from
dwd_nshop.dwd_nshop_releasedatas dr
join
ods_nshop.ods_02_customer oc
on
dr.customer_id=oc.customer_id
where
dr.bdp_day='20200321'
group by
dr.device_type ,
dr.os ,
oc.customer_gender ,
oc.customer_age_range ,
oc.customer_natives,
dr.release_sources ,
dr.release_category

相关文章

  • 数仓项目05:ADS数据应用层

    5. ADS应用数据层 5.1 流量类指标_平台统计表【ADS】 建库: 建表: 1.从ods_02_custom...

  • 用户行为数仓搭建

    # 数仓分层概念 1.数仓分层 ods 存放原始数据 dwd 数据清洗 dws 数据汇总 ads 为统计报表提供数...

  • 1.理清稀里糊涂的数仓分层

    阿里巴巴大数据体系架构:数据采集层——数据计算层——数据服务层——数据应用层 数仓架构: ODS:操作数据层(Op...

  • 数仓实战07:数仓搭建-ADS层

    1.设备主题 1.1 活跃设备数(日、周、月)需求定义:日活:当日活跃的设备数周活:当周活跃的设备数月活:当月活跃...

  • 企业何时进行数据治理才是最佳时机

    找准数据治理的切入点,是关乎数据治理成败的关键。如果将数仓建设分为数仓雏形阶段、数仓迭代阶段和能力沉淀阶段,数据治...

  • ARX<==>ADS

    ADS 数据类型 ads_point ads_name ARX 数据类型 AcGePoint3dAcGePoint...

  • ELT太慢怎么行,试试PolyBase?

    传统的数仓项目,我们会用ETL把数据Extract Transform然后再Load到目标数据平台。 现在到了云和...

  • 实时数据引擎项目介绍

    项目引言 从数仓 ETL (数据仓库技术) ETL,是英文Extract-Transform-Load的缩写,用来...

  • 数仓的发展与特性

    一.何为数仓? 也就是集成化数据环境,控制数据的流入与流出,本身并不产生和消费数据。 二.数仓的作用? 1、管理数...

  • 数据中台和数仓的关系

    传统数仓 传统数仓有几个特点: 数据具有历史性 基于文件存储 以表为形态,自带元数据存储(比如Hive) 在数仓的...

网友评论

    本文标题:数仓项目05:ADS数据应用层

    本文链接:https://www.haomeiwen.com/subject/qokrfktx.html