20200903 数据蛙 (Shujuwa) Practice Problems

Author: 沐若啊 | Published: 2020-09-06 15:35

    1. For each person who owns two or more cars, compute every car's purchase price as a share of that person's total car spending

    -- Create table
    CREATE TABLE car (
    userid int,
    carid varchar(10),
    price decimal(6, 2),
    date date
    );
    -- Insert data
    INSERT INTO car
    VALUES (1, '沪A66666', 66.66, '2020-1-1'),
    (2, '沪A13256', 100.66, '2020-1-1'),
    (3, '沪A95466', 20.66, '2020-1-3'),
    (4, '沪A78945', 50.66, '2020-1-4'),
    (1, '沪A33666', 70.66, '2020-1-4'),
    (1, '沪A68886', 1006.66, '2020-1-5'),
    (2, '沪A88886', 88.66, '2020-1-5'),
    (4, '沪A45466', 123.66, '2020-6-6'),
    (1, '沪A66886', 2066.66, '2020-6-1');
    

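    Solution:
    1) First find the users who own two or more cars: GROUP BY userid, count the cars per user, and filter with HAVING. The same subquery also computes each user's total spending with SUM.
    2) To get the share, join the car table back to that subquery on userid, divide each car's price by the user's total, keep two decimal places with ROUND, and append '%' with CONCAT.
    The SQL is as follows: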

    SELECT
        a.userid,
        a.carid,
        concat(ROUND(a.price / b.all_price * 100, 2), '%') AS pct
    FROM car a
    JOIN (
        SELECT
            userid,
            count(1) AS num,
            sum(price) AS all_price
        FROM car
        GROUP BY userid HAVING num >= 2
    ) b ON a.userid = b.userid
    ORDER BY a.userid;
    

    The result is as follows:


    image.png
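The two steps of problem 1 can also be checked locally without a MySQL server. The following is a minimal sketch that assumes an in-memory SQLite database as a stand-in for MySQL (column types are simplified, the filter is written as `HAVING COUNT(*) >= 2`, and the `%` sign is added in Python instead of with CONCAT). It is an illustration under those assumptions, not the author's original setup.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (userid INTEGER, carid TEXT, price REAL, date TEXT)")
conn.executemany(
    "INSERT INTO car VALUES (?, ?, ?, ?)",
    [
        (1, "沪A66666", 66.66, "2020-01-01"),
        (2, "沪A13256", 100.66, "2020-01-01"),
        (3, "沪A95466", 20.66, "2020-01-03"),
        (4, "沪A78945", 50.66, "2020-01-04"),
        (1, "沪A33666", 70.66, "2020-01-04"),
        (1, "沪A68886", 1006.66, "2020-01-05"),
        (2, "沪A88886", 88.66, "2020-01-05"),
        (4, "沪A45466", 123.66, "2020-06-06"),
        (1, "沪A66886", 2066.66, "2020-06-01"),
    ],
)

# Step 1 (subquery b): users with at least two cars and their total spending.
# Step 2 (outer query): each car's price as a percentage of that total.
rows = conn.execute(
    """
    SELECT a.userid, a.carid, ROUND(a.price / b.all_price * 100, 2) AS pct
    FROM car a
    JOIN (
        SELECT userid, SUM(price) AS all_price
        FROM car
        GROUP BY userid
        HAVING COUNT(*) >= 2
    ) b ON a.userid = b.userid
    ORDER BY a.userid, a.carid
    """
).fetchall()

for userid, carid, pct in rows:
    print(userid, carid, f"{pct}%")
```

User 3 owns a single car and is dropped by the HAVING filter; for every remaining user the percentages add up to roughly 100.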

    2. Compute the year-to-date running total and the overall running total

    
    -- Create table
    
    CREATE TABLE temp (DATE DATETIME, VALUE INT);
    
    -- Insert data
    
    INSERT INTO temp
    VALUES
        ('2018/11/23', 10),
        ('2018/11/25', 12),
        ('2018/12/31', 3),
        ('2019/2/9', 53),
        ('2019/3/31', 23),
        ('2019/7/8', 11),
        ('2019/7/31', 10);
    
    

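    Solution:
    1) First extract the year and the month, and compute the total for each month
    2) Compute the running total within each year up to the current month
    3) Compute the overall running total up to the current month
    MySQL 5.7 has no window functions, so this is fairly awkward; the self-join below only gets halfway (the year-to-date total)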

    
    -- Monthly totals
    CREATE TABLE temp1 AS
    SELECT
        YEAR(date) AS year,
        MONTH(date) AS month,
        SUM(value) AS value_month
    FROM temp
    GROUP BY year, month;

    -- Year-to-date total: join each month to itself and all earlier months of the same year
    SELECT
        a.year,
        a.month,
        SUM(b.value_month) AS year_sum
    FROM temp1 a
    JOIN temp1 b
        ON a.year = b.year
        AND a.month >= b.month
    GROUP BY a.year, a.month;
    

    Result:


    image.png
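The self-join idea can be stretched to cover the missing half as well. The sketch below is one possible way, not the author's solution: it widens the join condition to "any earlier year, or the same year up to this month", so `SUM` over all joined rows gives the overall total, while a `CASE` restricted to the same year gives the year-to-date total. It assumes SQLite in place of MySQL, and the monthly totals are pre-aggregated in Python because SQLite does not parse `2018/11/23`-style dates.

```python
import sqlite3
from datetime import datetime

raw = [
    ("2018/11/23", 10), ("2018/11/25", 12), ("2018/12/31", 3),
    ("2019/2/9", 53), ("2019/3/31", 23), ("2019/7/8", 11), ("2019/7/31", 10),
]

# Step 1: monthly totals (the role of temp1 in the SQL above).
monthly = {}
for text, value in raw:
    dt = datetime.strptime(text, "%Y/%m/%d")
    key = (dt.year, dt.month)
    monthly[key] = monthly.get(key, 0) + value

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp1 (year INTEGER, month INTEGER, value_month INTEGER)")
conn.executemany(
    "INSERT INTO temp1 VALUES (?, ?, ?)",
    [(y, m, v) for (y, m), v in monthly.items()],
)

# Steps 2 and 3: b ranges over every month up to and including a's month.
rows = conn.execute(
    """
    SELECT
        a.year,
        a.month,
        SUM(CASE WHEN b.year = a.year THEN b.value_month ELSE 0 END) AS year_sum,
        SUM(b.value_month) AS all_sum
    FROM temp1 a
    JOIN temp1 b
        ON b.year < a.year
        OR (b.year = a.year AND b.month <= a.month)
    GROUP BY a.year, a.month
    ORDER BY a.year, a.month
    """
).fetchall()

for row in rows:
    print(row)
```

The query uses only joins and aggregation, so the same statement should also be expressible in MySQL 5.7, though that has not been verified here.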

    A complete solution using user-defined variables:

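    SELECT
        c.year_dt,
        c.month_dt,
        ROUND(c.year_sum, 0) AS year_sum,
        ROUND(c.all_sum, 0) AS all_sum
    FROM (
        SELECT
            a.year_dt,
            a.month_dt,
            -- restart the yearly total whenever the year changes
            @ysum := IF(@year = a.year_dt, @ysum + a.sum_value, a.sum_value) AS year_sum,
            -- the overall total never resets
            @asum := @asum + a.sum_value AS all_sum,
            -- remember this row's year for the next row
            @year := a.year_dt AS cur_year
        FROM (
            SELECT
                YEAR(date) AS year_dt,
                MONTH(date) AS month_dt,
                SUM(value) AS sum_value
            FROM temp
            GROUP BY year_dt, month_dt
            ORDER BY year_dt, month_dt
        ) a,
        (SELECT @year := 0, @ysum := 0, @asum := 0) b  -- initialise the variables
    ) c;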
    

    The result is as follows:


    image.png
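The variable-assignment query is essentially a loop over the monthly totals in date order. As a rough illustration, the following plain-Python sketch mirrors that logic: `prev_year`, `year_sum` and `all_sum` are hypothetical names playing the roles of `@year`, `@ysum` and `@asum`.

```python
from datetime import datetime

raw = [
    ("2018/11/23", 10), ("2018/11/25", 12), ("2018/12/31", 3),
    ("2019/2/9", 53), ("2019/3/31", 23), ("2019/7/8", 11), ("2019/7/31", 10),
]

# Monthly totals, the equivalent of the innermost subquery.
monthly = {}
for text, value in raw:
    dt = datetime.strptime(text, "%Y/%m/%d")
    key = (dt.year, dt.month)
    monthly[key] = monthly.get(key, 0) + value

prev_year = None  # plays the role of @year
year_sum = 0      # plays the role of @ysum
all_sum = 0       # plays the role of @asum
result = []

# Rows must be visited in (year, month) order, like the ORDER BY in the subquery.
for (year, month), value in sorted(monthly.items()):
    # Same year: keep accumulating; new year: restart from this month's value.
    year_sum = year_sum + value if year == prev_year else value
    all_sum += value
    prev_year = year
    result.append((year, month, year_sum, all_sum))

for row in result:
    print(row)
```

The loop makes the dependency on row order explicit: if the months were visited out of order, both running totals would be wrong, which is why the SQL version sorts the subquery first.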

    3. Rows to columns (long to wide)

    # Create table
    CREATE TABLE st_score (
        userid VARCHAR (20) NOT NULL COMMENT '用户ID',
        SUBJECT VARCHAR (20) COMMENT '科目',
        score INT (4) COMMENT '成绩'
    );
    # Insert data
    INSERT INTO st_score
    VALUES
        ('001', '语文', 90),
        ('001', '数学', 92),
        ('001', '英语', 80),
        ('002', '语文', 88),
        ('002', '数学', 90),
        ('002', '英语', 75),
        ('003', '语文', 70),
        ('003', '数学', 85),
        ('003', '英语', 90),
        ('003', '政治', 82);
    

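    Solution:
    Use CASE WHEN to spread the subject rows into columns. On its own, CASE WHEN still returns one row per original record, so aggregate with MAX and GROUP BY userid to collapse each user into a single row.

    select userid,
    max(case when SUBJECT='语文' then score else 0 end) as '语文',
    max(case when SUBJECT='数学' then score else 0 end) as '数学',
    max(case when SUBJECT='英语' then score else 0 end) as '英语',
    max(case when SUBJECT='政治' then score else 0 end) as '政治'
    from st_score
    group by userid;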
    

    The result is as follows:


    image.png
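To see why the aggregation matters for a long-to-wide pivot, here is a small sketch that runs a CASE WHEN pivot with `MAX` and `GROUP BY` against an in-memory SQLite copy of `st_score`. SQLite and the English column aliases (`chinese`, `math`, `english`, `politics`) are assumptions of this example; the data is taken from the INSERT statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE st_score (userid TEXT NOT NULL, subject TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO st_score VALUES (?, ?, ?)",
    [
        ("001", "语文", 90), ("001", "数学", 92), ("001", "英语", 80),
        ("002", "语文", 88), ("002", "数学", 90), ("002", "英语", 75),
        ("003", "语文", 70), ("003", "数学", 85), ("003", "英语", 90),
        ("003", "政治", 82),
    ],
)

# One CASE per target column; MAX + GROUP BY folds each user's rows into one.
wide = conn.execute(
    """
    SELECT userid,
        MAX(CASE WHEN subject = '语文' THEN score ELSE 0 END) AS chinese,
        MAX(CASE WHEN subject = '数学' THEN score ELSE 0 END) AS math,
        MAX(CASE WHEN subject = '英语' THEN score ELSE 0 END) AS english,
        MAX(CASE WHEN subject = '政治' THEN score ELSE 0 END) AS politics
    FROM st_score
    GROUP BY userid
    ORDER BY userid
    """
).fetchall()

for row in wide:
    print(row)
```

Each user ends up on exactly one row, and a subject the user never took (politics for 001 and 002) shows the ELSE value 0.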

    4. Columns to rows (wide to long)

    Create table

    CREATE TABLE st_score1 (
        userid VARCHAR (20) NOT NULL COMMENT '用户id',
        cn_score DOUBLE COMMENT '语文成绩',
        math_score DOUBLE COMMENT '数学成绩',
        en_score DOUBLE COMMENT '英语成绩',
        po_score DOUBLE COMMENT '政治成绩'
    );
    # Insert data
    
    INSERT INTO st_score1
    VALUES
        ('001', 90, 92, 80, 0),
        ('002', 88, 90, 75.5, 0),
        ('003', 70, 85, 90, 82);
    

    The SQL is as follows:

    
    select userid, '语文' as course,cn_score as score from st_score1 
    union all 
    select userid, '数学' as course,math_score as score from st_score1 
    union all 
    select userid, '英语' as course,en_score as score from st_score1 
    union all 
    select userid, '政治' as course,po_score as score from st_score1 ;
    

    The result is as follows:


    image.png
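The UNION ALL approach can be tried out the same way. This sketch again assumes an in-memory SQLite database in place of MySQL; it stacks one SELECT per score column, so three users and four subjects produce twelve long-format rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE st_score1 (userid TEXT NOT NULL, cn_score REAL, "
    "math_score REAL, en_score REAL, po_score REAL)"
)
conn.executemany(
    "INSERT INTO st_score1 VALUES (?, ?, ?, ?, ?)",
    [
        ("001", 90, 92, 80, 0),
        ("002", 88, 90, 75.5, 0),
        ("003", 70, 85, 90, 82),
    ],
)

# One SELECT per wide column, each tagging its rows with the subject name.
long_rows = conn.execute(
    """
    SELECT userid, '语文' AS course, cn_score AS score FROM st_score1
    UNION ALL
    SELECT userid, '数学' AS course, math_score AS score FROM st_score1
    UNION ALL
    SELECT userid, '英语' AS course, en_score AS score FROM st_score1
    UNION ALL
    SELECT userid, '政治' AS course, po_score AS score FROM st_score1
    """
).fetchall()

for row in long_rows:
    print(row)
```

UNION ALL is used instead of UNION because no deduplication is wanted: two users with the same score in the same subject must both be kept, and here the userid column already keeps rows distinct.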

    5. Count the male and female students in each department, plus the department total

    
    # Create table
    
    CREATE TABLE st (
        id VARCHAR (20),
        NAME VARCHAR (20),
        gender CHAR (1),
        birth VARCHAR (20),
        department VARCHAR (20),
        address VARCHAR (20)
    );
    # Insert data
    
    
    INSERT INTO st
    VALUES
        (
            '201901',
            '张大佬',
            '男',
            '1985',
            '计算机系',
            '北京市海淀区'
        ),
        (
            '201902',
            '郭大侠',
            '男',
            '1986',
            '中文系',
            '北京市昌平区'
        ),
        (
            '201903',
            '张三',
            '女',
            '1990',
            '中文系',
            '湖南省永州市'
        ),
        (
            '201904',
            '李四',
            '男',
            '1990',
            '英语系',
            '辽宁市阜新市'
        ),
        (
            '201905',
            '王五',
            '女',
            '1991',
            '英语系',
            '福建省厦门市'
        ),
        (
            '201906',
            '王六',
            '男',
            '1988',
            '计算机系',
            '湖南省衡阳市'
        );
    

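    Approach:
    1) First compute the number of male and female students per department: use CASE WHEN to tell the genders apart, and GROUP BY department
    2) Compute the total per department with a plain GROUP BY and COUNT, then join the two results on department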

    select a.*,b.sum 
    from 
    (
    select department,
    sum(case when gender='男' then 1 else 0 end ) as '男',
    sum(case when gender='女' then 1 else 0 end ) as '女'
    from st  
    group by department )a 
    join 
    (select department,count(1) as sum 
    from st group by department )b
    on a.department=b.department;
    

    The result is as follows:


    image.png
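Since both subqueries group by the same column, the total can also be computed in the same pass as the conditional sums, which avoids the join. The following sketch shows that variant, assuming SQLite in place of MySQL, only the three columns the query needs, and hypothetical aliases `male`, `female` and `total`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE st (id TEXT, name TEXT, gender TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO st VALUES (?, ?, ?, ?)",
    [
        ("201901", "张大佬", "男", "计算机系"),
        ("201902", "郭大侠", "男", "中文系"),
        ("201903", "张三", "女", "中文系"),
        ("201904", "李四", "男", "英语系"),
        ("201905", "王五", "女", "英语系"),
        ("201906", "王六", "男", "计算机系"),
    ],
)

# Conditional sums and the overall count share a single GROUP BY.
rows = conn.execute(
    """
    SELECT department,
        SUM(CASE WHEN gender = '男' THEN 1 ELSE 0 END) AS male,
        SUM(CASE WHEN gender = '女' THEN 1 ELSE 0 END) AS female,
        COUNT(*) AS total
    FROM st
    GROUP BY department
    """
).fetchall()

by_department = {dept: (male, female, total) for dept, male, female, total in rows}
print(by_department)
```

The join version in the original works too; the single-query form simply scans the table once and cannot produce mismatched departments between the two sides.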

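    Python practice questions
    Skip any part you have not learned yet
    1. What is PEP 8?
    Answer: PEP 8 is the style guide for Python code. It covers conventions such as naming, indentation and line length, so that code looks consistent and is easier to read
    2. Which data structures are built into Python?
    Answer: list, tuple, dict, str, set
    3. What is negative indexing in Python?
    Answer: indexing that counts from the right, that is, from the end of the sequence: -1 is the last element, -2 the second to last, and so on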
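A quick illustration of negative indexing, using a made-up list `nums`:

```python
nums = [10, 20, 30, 40, 50]

last = nums[-1]         # last element
second_last = nums[-2]  # second to last element
tail = nums[-3:]        # negative indices also work in slices: the last three

print(last, second_last, tail)
```

For a sequence of length n, index -k refers to the same element as index n - k.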
    4. How do you remove duplicates from a list?

    list1 = [2, 1, 2, 4, 5, 6, 10, 3, 3, 3, 7]
    list2 = []
    for tmp in list1:
        # keep only the first occurrence of each value
        if tmp not in list2:
            list2.append(tmp)
    print(list2)
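The loop above keeps the first occurrence of each value but rescans `list2` for every element. Two common built-in alternatives are sketched below: `dict.fromkeys` preserves the original order (dicts keep insertion order in Python 3.7+), while `set` does not guarantee any order, so its result is sorted here.

```python
list1 = [2, 1, 2, 4, 5, 6, 10, 3, 3, 3, 7]

# Order-preserving: dict keys are unique and keep insertion order.
ordered_unique = list(dict.fromkeys(list1))

# Order not preserved by set(); sort if a stable output is wanted.
sorted_unique = sorted(set(list1))

print(ordered_unique)
print(sorted_unique)
```

Both variants require the elements to be hashable, which holds for the integers used here.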
    

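    5. How should the axis parameter in pandas be understood?
    axis=0 means the operation runs across rows, with the arrow pointing down
    axis=1 means the operation runs across columns, with the arrow pointing right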


    image.png

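    1) df.sum(axis=0) # with 0 the arrow points down: the values of each column are summed
    2) df.sum(axis=1) # with 1 the arrow points right: the values of each row are summed
    3) df.drop('country',axis=1) # looks along the column labels for 'country' and drops that column
    When I change axis to 0, it raises an error, because there is no row label named 'country'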
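The three cases can be reproduced on a tiny DataFrame. The data below is invented for illustration (the original screenshot's table is not available), and the sketch assumes pandas is installed.

```python
import pandas as pd

# Hypothetical data: one label column and two numeric columns.
df = pd.DataFrame({"country": ["CN", "US"], "a": [1, 2], "b": [3, 4]})
numbers = df[["a", "b"]]

col_sums = numbers.sum(axis=0)  # moves down the rows: one sum per column
row_sums = numbers.sum(axis=1)  # moves across the columns: one sum per row

dropped = df.drop("country", axis=1)  # 'country' is looked up among column labels

# With axis=0 the label is looked up in the row index (0, 1), where it does not exist.
try:
    df.drop("country", axis=0)
    raised = False
except KeyError:
    raised = True

print(col_sums.tolist(), row_sums.tolist(), list(dropped.columns), raised)
```

A useful way to remember it: axis names the direction the operation travels along, so axis=0 collapses the rows and axis=1 collapses the columns.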

    SQL needs practice every day, otherwise you really do get rusty

    Article link: https://www.haomeiwen.com/subject/uiaqektx.html