【SQL刷题】SQL语法学习与练习题

作者: dataTONG | 来源:发表于2020-02-26 21:28 被阅读0次

【SQL刷题】SQL语法学习与练习题
sql4
MySQL 练习题
数据库学习线路图
SQL 语法大全
总结：SQL练习【SQL经典练习题】
sql刷题
SQL 刷题
刷题sql
sql 经典50题

20200701效率优化

【between and】查看一段时间范围内的数据，between and效率高于关系运算符。
【exists>in>关系运算符】
【索引】减少对磁盘的读写操作，提高查询速度。而视图的缺点是不能使用索引，这也是使用物化视图（进一步提高查询速度）的原因。【物化视图 VS 视图】同：反映某个查询的结果；异：视图仅保存SQL定义，而物化视图本身会存储数据，故名物化了的视图（需要占用存储空间存储数据；当底层表的数据发生变化时，物化视图也应相应更新到最新数据）。

20200603

【1】4种风格的嵌套：关键词in、exists（exists比in查询更快，但hive没有exists实现嵌套查询的功能）、比较运算符!=或=、any或all。
【复杂的多表查询】

纵向的表合并（焊接）：union all（速度快：仅合并，无其他附加动作）、union（合并+排重+排序）
横向的表连接（多个表中的字段合并到一张宽表中）

【通过索引提高数据的查询速度】类似书的目录；尽管有提速功能，但滥用的话会降低数据表的写操作（如降低表记录的增加insert、更新update或删除delete速度），也会占用一定的磁盘空间。
根据索引类型可分为普通索引（使用最频繁、无任何约束，不管变量的值是否重复或缺失）、唯一索引（唯一即不存在重复值）、主键索引（最严格：既不重复也不缺失）、联合索引和全文索引。
【索引的查询和删除】查询show inex from table；删除drop index indexname on table。

何时创建索引

七周成为数据分析师课后题

1、统计3月份的回购率

select count(ct),count(if(ct>1,1,null)) from
(select userid,count() as ct from orderinfo
where ispaid = '已支付' and month(paidtime)=3 
group by userid)t

2、统计3月份的复购率【date_sub(t2.m,interval 1 month)】

select t1.m,count(t1.m) ,count(t2.m)  from 
(select userid,date_format(paidtime,'%Y-%m-%01') as m
from orderinfo where ispaid='已支付'
group by userid,date_format(paidtime,'%Y-%m-%01') )t1
left join
(select userid,date_format(paidtime,'%Y-%m-%01') as m
from orderinfo where ispaid='已支付'
group by userid,date_format(paidtime,'%Y-%m-%01') )t2
on t1.userid=t2.userid and t1.m=date_sub(t2.m,interval 1 month)
group by t1.m

【滴滴面试题】快车订单order表
order_id,
user_id
start_address
end_address
call_time如20200101 08：35：26 （起始时间为2020-01-01到2020-03-31）
问题1：5分钟内的重复订单数量（起止地点相同）
问题2：20200101的用户在以后每天的留存率

【留存率 / 回购率】构造用户id和日期的中间表！！

select  t1.d,count(t1.d),count(t2.d) from
(select userid,date_format(calltime,'%Y-%m-%d') as d from order
group by userid,date_format(calltime,'%Y-%m-%d')
having date_format(calltime,'%Y-%m-%d')='2020-01-01'  )t1
left join 
(select userid,date_format(calltime,'%Y-%m-%d') as d from order
group by userid,date_format(calltime,'%Y-%m-%d'))t2
on t1.userid=t2.userid and t1.d=date_sub(t2.d,interval 1 day)
group by t1.d

3、统计男女用户的消费频次差异
4、多次消费的用户，首末次消费间隔是多少

【计算天数datediff(max,min)；而非秒数】
select userid,datediff( max(paidtime),min(paidtime))
from order where ispaid='已支付'
group by userid having count(1)>1

大小厂面试题汇总

sql 面试题(难题汇总)
1、腾讯面试题：table_A （用户userid和登录时间time）求连续登录3天的用户数

【解析：用窗口函数row_number 进行排序，日期函数DATESUB，将（日期-排序数）得到一个相等的日期flag_date，以其进行分组，将连续的日期分为一组】
select userid,date_sub(time,interval t.rn day) as flag_date,count(*)
from (
select userid,time,row_number() over(partition by userid order by time) as rn 
from table_A
)t
group by userid,flag_date
having count(*)>=3

2、原始座次表 ‘seat’如下，现需要更换相邻位置学生的座次。

select
(case
when mod(id, 2) != 0 and c!= id then id + 1
when mod(id, 2) != 0 and c = id then id
else id - 1
end) as id2,student
from seat ,(select count(*) as c from seat)as b
order by id2

3、现在需要找出语文课中成绩第二高的学生成绩。如果不存在第二高成绩的学生，那么查询应返回 null。

题目要求，如果没有第二高的成绩，返回空值，所以这里用判断空值的函数（ifnull）函数来处理特殊情况。select ifnull(第2步的sql,null) as '语文课第二名成绩';

select 
ifnull( (
select max(distinct 成绩) 
from 成绩表
where 课程='语文' and
      成绩 < (select max(distinct 成绩) 
              from 成绩表 
              where 课程='语文') ),null) 
as '语文课第二名成绩';

4、如何提高SQL查询的效率？

1. select子句中尽量避免使用*
2、为了提高效率，where子句中遇到函数或加减乘除的运算，应当将其移到比较符号的右侧。
where 成绩 > 90 – 5（表达式在比较符号的右侧）
3、 尽量避免使用in和not in【会导致数据库进行全表搜索，增加运行时间】
4. 尽量避免使用or
select 学号
from 成绩表
where 成绩 = 88 or 成绩 = 89

优化后：
select 学号 from 成绩表 where 成绩 = 88
union
select 学号 from 成绩表 where 成绩 = 89

5、用户访问次数表，列名包括用户编号、用户类型、访问量。要求在剔除访问次数前20%的用户后，每类用户的平均访问次数。（拼多多、网易面试题）

select 用户类型,avg(访问量)
from 
(select * 
from 
(select *,
       row_number() over(order by 访问量 desc) as 排名
from 用户访问次数表) as a
where 排名 > (select max(排名) from a) * 0.2) as b
group by 用户类型;

牛客网刷题

【202005】

0、【重点回顾】
给出每个员工每年薪水涨幅超过5000的员工编号emp_no、薪水变更开始日期from_date以及薪水涨幅值salary_growth，并按照salary_growth逆序排列。

0、【重点回顾】
查找所有员工自入职以来的薪水涨幅情况，给出员工编号emp_no以及其对应的薪水涨幅growth，并按照growth进行升序

select a.emp_no,c.salary-b.salary as growth from employees a
inner join salaries b 
on a.emp_no=b.emp_no and a.hire_date=b.from_date
inner join salaries c
on a.emp_no=c.emp_no and c.to_date='9999-01-01'
order by c.salary-b.salary asc

0、【重点回顾：有思路（这种场景，最重要的是学会拆分），完整写出来不易】
获取员工其当前的薪水比其manager当前薪水还高的相关信息

0、【重点回顾】dense_rank的排序（有序号） VS 最后输出显示的排序order by（无序号）
dense_rank() over()的用法以及
我错在哪order by salary desc,emp_no asc？【正确写法——按照一个标准来排序order by salary desc 】
对所有员工的当前(to_date='9999-01-01')薪水按照salary进行按照1-N的排名，相同salary并列且按照emp_no升序排列

【正确写法——按照一个标准来排序order by salary desc 】
select emp_no,salary, dense_rank() over (order by salary desc ) as rank
from salaries
where to_date='9999-01-01'
order by salary desc,emp_no asc

【错误写法：2处错误】
select emp_no,salary,dense_rank(order by salary desc,emp_no asc) as rank
from salaries where to_date='9999-01-01'
order by salary desc,emp_no asc

1、【如何去重 distinct（distinct效率不行，且大数据量的时候都禁止用distinct，建议用group by解决重复问题）和 group by】查找入职员工时间排名倒数第三的员工所有信息

【法1：groupby来去重】
select * from employees
where hire_date = (
select hire_date  from employees 
group by hire_date
order by hire_date desc
limit 2,1)

【法2：distinct来去重】
select * from employees 
where hire_date = (
    select distinct hire_date from employees order by hire_date desc limit 2,1)

2、【SQL是支持集合运算：EXPECT 集合差运算、 UNION 集合并运算、 INTERSECT 集合交运算】获取所有非manager的员工emp_no

SELECT employees.emp_no
FROM salaries
EXCEPT
SELECT dept_manager.emp_no
FROM dept_manager;

3、【MAX(SALARY) 和 emp_no 不一定对应哦！！！因为GROUP BY 默认取非聚合的第一条记录】最大最小问题要善用row_number（）
【知识点】使用group by子句时，select子句中只能有聚合键、聚合函数、常数。emp_no并不符合这个要求。
获取所有部门中当前员工薪水最高的相关信息，给出dept_no, emp_no以及其对应的salary

4、查找员工编号emp_no为10001其自入职以来的薪水salary涨幅值growth

select max(salary)-min(salary) as growth from salaries
group by emp_no
having emp_no='10001'

【严谨的思路】
SELECT ( 
(SELECT salary FROM salaries WHERE emp_no = 10001 ORDER BY to_date DESC LIMIT 1) -
(SELECT salary FROM salaries WHERE emp_no = 10001 ORDER BY to_date ASC LIMIT 1)
) AS growth

5、【累计求和】所有聚合函数都能用做窗口函数，其语法和专用窗口函数完全相同——sum(<汇总列>) over(<排序列>) as 别名；

select emp_no,salary,sum(salary) over(order by emp_no) as running_total 
from  salaries
where to_date = '9999-01-01';

【把所有小于等于当前编号的表s1和当前编号表s2联立起来，然后按照当前编号分组，计算出所有小于等于当前标号的工资总数】
select b.emp_no,b.salary,sum(a.salary) as running_total from salaries a
inner join salaries b 
on a.emp_no<=b.emp_no and a.to_date = "9999-01-01" and b.to_date = "9999-01-01"
group by b.emp_no,b.salary

6、对于employees表中，给出奇数行的first_name

【未通过】
select first_name from(
select *,row_number() over(order by first_name) as rank from employees)a
where a.rank%2=1

select e1.first_name from employees as e1
    where (select count(e2.first_name) from employees as e2
               where e1.first_name >= e2.first_name)%2 = 1;

【202004】

1、case when 的用法（离散数值映射为对应的实际含义、连续数值映射为离散区间、构建长形统计表、构建宽形统计表）

简单case函数法
case搜索函数法（推荐使用，因为其既可完成等式表达、也可实现不等式表达）

case when 的用法：映射处理、筛选计算

case 
when expr1 then expr2
when expr3 then expr4
else expr5

`case` `eb.btype`
`when ``1` `then s.salary*``0.1`
`when ``2` `then s.salary*``0.2`
`else` `s.salary*``0.3`

-- 收入区间分组
select id,
(case 
when CAST(salary as float)<50000 Then '0-5万'
when CAST(salary as float)>=50000 and CAST(salary as float)<100000 then '5-10万'
when CAST(salary as float) >=100000 and CAST(salary as float)<200000 then '10-20万'
when CAST(salary as float)>200000 then '20万以上'
else NULL end ) as qujian
from table_1;

2、 EXISTS用于检查子查询是否至少会返回一行数据，该子查询实际上并不返回任何数据，而是返回值True或False
SQL中EXISTS的用法

select * from employees
where not exists  (select * from dept_emp where emp_no=employees.emp_no)

3、LIMIT 1 OFFSET 2 -- 去掉OFFSET排名倒数第一第二的时间，取倒数第三
【limit y,x和limit x offset y等价】
【LIMIT 后的数字代表返回几条记录，OFFSET 后的数字代表从第几条记录开始返回（第一条记录序号为0），也可理解为跳过多少条记录后开始返回】
【在 LIMIT X,Y 中，Y代表返回几条记录，X代表从第几条记录开始返回（第一条记录序号为0），切勿记反】

`以下的两种方式均表示取``2``,``3``,``4``三条条数据。`
`1``.select* from test LIMIT` `1``,``3``;`
`当limit后面跟两个参数的时候，第一个数表示要跳过的数量，后一位表示要取的数量。`
`2``.select * from test LIMIT` `3` `OFFSET` `1``;(在mysql` `5``以后支持这种写法)`
`当 limit和offset组合使用的时候，limit后面只能有一个参数，表示要取的的数量,offset表示要跳过的数量 。`

4、查找排除当前最大、最小salary之后的员工的平均工资avg_salary。

select avg(salary) as avg_salary 
from salaries
where to_date = '9999-01-01' 
and salary<(select max(salary) from salaries) 
and salary>(select min(salary) from salaries)

5、【group_concat(X,Y)】
SQLite的聚合函数group_concat(X,Y)，其中X是要连接的字段，Y是连接时用的符号，可省略，默认为逗号。此函数必须与 GROUP BY 配合使用。

select dept_no,
group_concat(emp_no,',') as employees
from dept_emp
group by dept_no

6、【substr而非substring】substr(字符串，起始位置【第一个字符的位置为1，而不为0】，长度【省略，则从一直截取到字符串末尾】）
SQL中有length()函数

select first_name from employees
order by substr(first_name,length(first_name)-1,2)

7、【length()函数与replace()函数的结合灵活地解决了统计子串出现次数的问题】查找字符串'10,A,B' 中逗号','出现的次数cnt。

SELECT (length("10,A,B")-length(replace("10,A,B",",","")))/length(",") AS cnt

8、【SQLite 中用 “||” 符号连接字符串】

SELECT last_name || "'" || first_name FROM employees

9、【update的语法】将所有获取奖金的员工当前的薪水增加10%。

属于范围：exists、where in、inner join

update salaries set salary = salary*1.1  #此处不能用salary*1.1 as salary
where emp_no in (
select a.emp_no from emp_bonus a
    inner join salaries b
    on a.emp_no=b.emp_no and b.to_date='9999-01-01')

10、将id=5以及emp_no=10001的行数据替换成id=5以及emp_no=10005,其他数据保持不变，使用replace实现。

REPLACE INTO titles_test VALUES (5, 10005, 'Senior Engineer', '1986-06-26', '9999-01-01')

11、【删除语句DELETE FROM [不用加table字段] ... WHERE ... NOT IN ... 】删除emp_no重复的记录，只保留最小的id对应的记录。

delete from  titles_test
where id not in(
select min(id) from titles_test
group by emp_no)

12、针对actor表创建视图actor_name_view，只包含first_name以及last_name两列，并对这两列重新命名，first_name为first_name_v，last_name修改为last_name_v.

create view actor_name_view 
as 
select first_name as first_name_v,last_name as last_name_v
from actor

13、创建索引：对first_name创建唯一索引uniq_idx_firstname，对last_name创建普通索引idx_lastname。

create unique index uniq_idx_firstname on actor(first_name);
create index idx_lastname on actor(last_name);

14【用 WHERE 来筛选 category_id IS NULL 】使用join查询方式找出没有分类的电影id以及名称

select a.film_id ,a.title from film a
left join film_category b
on a.film_id =b.film_id 
where b.category_id is null   #不能用and b.category_id is null  !!!!!!!!!

注意：最后一句若写成 ON f.film_id = fc.film_id AND fc.category_id IS NULL，则意义变成左连接两表 film_id 相同的记录，且 film_category 原表中的 fc.category 的值为 null。显然，原表中的 fc.category 的值恒不为 null，因此（f.film_id = fc.film_id AND fc.category_id IS NULL）恒为 FALSE，左连接后则只会显示 film 表的数据，而 film_category 表的数据全显示为 null
15、给出每个员工每年薪水涨幅超过5000的员工编号emp_no、薪水变更开始日期from_date以及薪水涨幅值salary_growth，并按照salary_growth逆序排列。

SELECT s2.emp_no, s2.from_date, (s2.salary - s1.salary) AS salary_growth
FROM salaries AS s1, salaries AS s2
WHERE s1.emp_no = s2.emp_no 
AND salary_growth > 5000
AND (strftime("%Y",s2.to_date) - strftime("%Y",s1.to_date) = 1 
     OR strftime("%Y",s2.from_date) - strftime("%Y",s1.from_date) = 1 )
ORDER BY salary_growth DESC

经典题目

【 In/exist的联系与区别】子查询过程中，In和exist函数效率比较：

当进行连接的两个表大小相似，效率差不多；
如果子查询的内表更大，则exist的效率更高（exist先查询外表，然后根据外表中的每一个记录，分别执行exist语句判断子查询的内表是否满足条件，满足条件就返回ture）。
如果子查询的内表小，则in的效率高（in在查询的时候，首先查询子查询的表，然后将内表和外表做一个笛卡尔积 (表中的每一行数据都能够任意组合A表有a行，B表有b行，最后会输出a*b行)，然后按照条件进行筛选。所以相对内表比较小的时候，in的速度较快）。

【Exist的原理】使用exist时，若子查询能够找到匹配的记录，则返回true，外表能够提取查询数据；使用 not exist 时，若子查询找不到匹配记录，则返回true，外表能够提取查询数据。
【字符串常见操作函数】
concat、concat_ws、group_concat函数用法
concat(): 将多个字符串连接成一个字符串，连接符用“”包起来；
concat_ws()：是CONCAT()的特殊形式，将多个字符串连接成一个字符串，在最开始的位置指定连接符（指定一次即可；第一个参数是其它参数的分隔符）。——select concat_ws(',','11','22','33'); 　11,22,33
group concat()：【分组拼接函数】将group by产生的同一个分组中的值连接起来，返回一个字符串；

【语法】group_concat([DISTINCT] 要连接的字段 [Order BY ASC/DESC 排序字段] [Separator '分隔符'])

select id,group_concat(name order by name desc) from aa group by id;
　　|1 | 20,20,10   |
　　|2 | 20|
　　|3 | 500,200|

select id,group_concat(distinct name) from aa group by id;
　　|1 | 10,20|
　　|2 | 20   |
　　|3 | 200,500 |

like(): 需要与通配符一起使用（'%'代表任意字符出现任意次数；'_'仅能匹配单个字符）；
substr(): 用于从字段中提取相应位置的字符；
regexp() : 正则表达式匹配函数；
【列拆分为多行】lateral view explode()：按照一定的格式（比如split(food,'、')先按照顿号分割，然后铺开为多行）分裂数据

select name,sum(calorie)
from
    (select t1.name,fd,t2.calorie
     from p_food t1 lateral view explode(split(food,'、')) as fd
     left join f_calorie t2 on t1.food = t2.food)a
group by name

spark sql如何把一列拆分为多行：Lateral view explode()
【行转列：在行列互换结果表中，其他列里的值分别使用case和max来获取】
【不加聚合max（group by）的效果】有幸去华为面试数据分析岗，看到SQL后我拒绝了
 行列互换问题，怎么办？送你一个万能模版

select stu_name,
max(case when course_name='Chinese' then grades else 0 end) as Chinese,
max(case when course_name = 'English' then grades else 0 end) as  English,
max(case when course_name = 'Physics' then grades else 0 end) as  Physics,
max(case when course_name = 'Chinese' then grades else 0 end) as  Mathematics
from students_grades
group by stu_name;

语法学习

WHERE语句在GROUP BY语句之前；SQL会在分组之前计算WHERE语句。
HAVING语句在GROUP BY语句之后；SQL会在分组之后计算HAVING语句。

“Where” 是一个约束声明，使用Where来约束来之数据库的数据，Where是在结果返回之前起作用的，且Where中不能使用聚合函数。
“Having”是一个过滤声明，是在查询返回结果集以后对查询结果进行的过滤操作，在Having中可以使用聚合函数。

通配符

SQL通配符

注意null值.PNG

函数

文本处理

文本处理
数值计算

disticnt只能放在前面？？？
时间处理

时间处理.PNG

【count()与count(列)】count(列)忽略null值的个数；count()返回样本数量（含null值的个数）；
【union与union all（列数据类型必须相同）】union去重；union all 不去重。
【表的创建与删除】
drop table if exists a;
create table if not exist a as select * from b;

面试题练习

第二题

第二题.png

数据下载

数据集下载链接

编码格式转换

将其由 xlsx 转化为 utf-8 格式的csv保存，文件保存路径'C:/ProgramData/MySQL/MySQL Server 5.7/Uploads/xiangji.csv'。

数据加载过程

详见下图

加载数据至mysql图解.png

create table user
(uid varchar(10),
app_name varchar(20),
duration int(10), -- 在hive中建该表，可能需要改为 duration int，
times int(10),  -- 在hive中建该表，可能需要改为 times int，
dayno varchar(30)
);

SHOW VARIABLES LIKE "secure_file_priv";

load data infile 'C:/ProgramData/MySQL/MySQL Server 5.7/Uploads/xiangji.csv' into table user 
fields terminated by ',' ignore 1 lines;

查寻代码

select 
    a.day1,
    count(distinct a.uid) as 活跃用户数,
    count(distinct case when day2-day1=1 then a.uid end) as 次留,
    -- #体会差别：我的写法是sum(case when day2-day1=1 then 1 else 0 end)as 次留,
    count(distinct case when day2-day1=3 then a.uid end) as 3留,
    count(distinct case when day2-day1=7 then a.uid end) as 7留,
    concat(count(distinct case when day2-day1=1 then a.uid end)/count(distinct a.uid)*100,'%') as 次留率,
    concat(count(distinct case when day2-day1=3 then a.uid end)/count(distinct a.uid)*100,'%') 三日留存率,
    concat(count(distinct case when day2-day1=7 then a.uid end)/count(distinct a.uid)*100,'%') 七日留存率
 from
(SELECT uid,date_format(dayno,'%Y%m%d') as day1 FROM data.user
where app_name='相机') a
-- #用date_format把dayno的文本格式改为可计算的形式
left join
(SELECT uid,date_format(dayno,'%Y%m%d') as day2 FROM data.user
where app_name='相机') b
on a.uid=b.uid
group by a.day1

第二题结果显示.png

第三题

行转列(图中左变右)

第三题.png

建表、查寻代码

create table course (
id varchar(20),
teacher_id varchar(20),
week_day varchar(20),
has_course varchar(20));

insert into course value
(1,1,2,"Yes"),
(2,1,3,"Yes"),
(3,2,1,"Yes"),
(4,3,2,"Yes"),
(5,1,2,"Yes");

select 
    teacher_id,
    (case when week_day=  1 then 'yes' else '' end) as mon,
    (case when week_day = 2 then "Yes" else " " end) "tue",
    (case when week_day = 3 then "Yes" else " " end) "thi",
    (case when week_day = 4 then "Yes" else " " end) "thu",
    (case when week_day = 5 then "Yes" else " " end) "fri"
from course

第三题结果显示.PNG

第四题

第四题.png

建表、查寻代码

create table a1 (
name varchar(20),
english int,
maths int,
music int);

insert into a1 values
("Jim",90,88,99);

select name,'english' as subject,english as score
from a1
-- #把表格内容 english 加上字段名 subject
union
select name,'maths' as subject,maths as score
from a1
union
select name,'music' as subject,music as score
from a1;

第五题

建表、查寻代码

第六题

建表、查寻代码

解答第一问

select name,
    max(datetime) as 最近登录时间,
    count(distinct date) as 登录总次数
    -- 错误写法(需要去重后计数)：count(date) ,
    -- 另一种正确写法:count(distinct date_format(lastlogon,'%Y-%m-%d')),
from(
    select name,
        date(date_format(lastlogon,"%Y-%m-%d %H:%i:%s")) as date,
        date_format(lastlogon,"%Y-%m-%d %H:%i:%s") as datetime
        -- 分钟的表示是：%i;小时是：%H
    from userlog)t
group by name
-- 启示distinct/group by:若去重后计数，用distinct(计数)；若去重后求和等操作，用group by(求和).

【启示】去重的两种方法：distinct / group by
（1）若去重后计数，用distinct(计数)；
（2）若去重后求和等操作，用group by(求和)

6第六题结果显示1.PNG

解答第二问

-- #法1：子查询，好理解
drop table if exists tmp_table;
create temporary table tmp_table
select name,lastlogon,
    @rank:=if(@test=name,@rank+1,1) as num_logontime,
    @rank2:=if(@test=name,if(@date=date,@rank2,@rank2+1),1) as num_logonday,
    @test:=name,
    @date:=date
from(
    SELECT name,lastlogon,
        date(date_format(lastlogon,"%Y-%m-%d %H:%i:%s")) as date,
        date_format(lastlogon,"%Y-%m-%d %H:%i:%s") as datetime
    from userlog
    order by name,datetime)t,(select @rank:=0, @test:=null,@rank2:=0, @date:=null)tmp;
select name,lastlogon,num_logontime,num_logonday from tmp_table;

-- #法2：无查询，比法1稍难理解
drop table if exists tmp_table;
create temporary table tmp_table
SELECT name,lastlogon,
    date(date_format(lastlogon,"%Y-%m-%d %H:%i:%s")) as date,
    date_format(lastlogon,"%Y-%m-%d %H:%i:%s") as datetime,
    @rank:=if(@test=name,@rank+1,1) as num_logontime,
    @rank2:=if(@test=name,if(@date=date(date_format(lastlogon,"%Y-%m-%d %H:%i:%s")),@rank2,@rank2+1),1) as num_logonday,
    -- if嵌套语句，第二重判断也要有if。
    @test:=name,
    @date:=date(date_format(lastlogon,"%Y-%m-%d %H:%i:%s"))
FROM data.userlog
order by name,datetime;
select name,lastlogon,num_logontime,num_logonday from tmp_table;

--  #法3：hive的窗口函数方法
row_number() over(partition by name order by datetime) as num_logontime,
dense_rank() over(partition by name order by date) as num_logonday,

6第六题结果显示2.PNG

第七题

第七题.png

建表、查寻代码

create table tableA (
qq int(20),
game varchar(20));

insert into tableA values
(10000,"a"),
(10000,"b"),
(10000,"c"),
(20000,"c"),
(20000,"d");

第一问解答

7第七题结果显示1.png

drop table if exists tableB;

create TEMPORARY table tableB(
select qq,group_concat(game separator"-") as game
from tableA group by qq);

select * from tableB;

第二问解答

mysql函数substring_index的用法
略复杂，先存着。。。

第八题

建表、查寻代码

解答第一问

解答第二问

select imp_date,is_new_state
    ,count(distinct qimei) as 领红包人数
    ,sum(add_money)/count(distinct qimei) as 平均领取金额
    ,count(report_time)/count(distinct qimei) as 平均领取次数
from(
    select a.imp_date,a.qimei,a.add_money,a.report_time,b.is_new
        ,(Case when b.is_new  = 1 then  '新用户'  when b.is_new = 0 then '老用户'  else '领取红包但未登陆'end) as is_new_state 
    from tmp_liujg_packed_based  a
    Left join  tmp_liujg_dau_based b 
    on a.imp_date = b.imp_date and a.qimei = b.qimei    
    where a.imp_date > '20190601')t
group by imp_date,is_new_state;

8第八题2.PNG

解答第三问

计算2019年3月以来的每个月，每个月按领红包取天数为1、2、3……30、31天区分，计算取每个月领取红包的用户数，人均领取金额，人均领取次数

select 
    left(imp_date,6) as 月份,
    count(distinct imp_date) as 每月有红包领取的天数,
    count(distinct qimei) as 每月领取红包的用户数,
    sum(add_money)/ count(distinct qimei) as 每月人均领取金额,
    count(report_time)/count(distinct qimei) as 每月人均领取次数
from tmp_liujg_packed_based
where imp_date>='20190301'
group by left(imp_date,6);

8第八题3.PNG

解答第四问

计算2019年3月以来，每个月领过红包用户和未领红包用户的数量，平均月活跃天数（即本月平均活跃多少天）

Select 
    left(cc.imp_date,6) 月份,
    cc.is_packet_user 红包用户,
    Count(distinct cc.qimei)   用户数量,
    -- Count(is_packet_user) #不理解啥意思？每月活跃天数,
    Count(is_packet_user)/Count(distinct cc.qimei)  月活跃天
from(
    Select a.imp_date, a.qimei,b.qimei hb_qimei,
        Case when b.qimei is not null then '红包用户' else  '非红包用户' end is_packet_user,
        Case when b.qimei is not null then b.qimei else a.qimei end is_qimei
    from tmp_liujg_dau_based a 
    Left join
     (select distinct  
        left(imp_date,6)  imp_date ,
        qimei 
    from tmp_liujg_packed_based  
    where imp_date >= '20190301')b
    On  left(a.imp_date,6) = b.imp_date and a.qimei = b.qimei)cc
Group by  left(cc.imp_date,6),cc.is_packet_user;

8第八题4.PNG

解答第五问

select  distinct
    left(a.imp_date,6) 月份,
    a.qimei,
    b.用户注册日期
from tmp_liujg_dau_based  a
left join(
    select qimei,min(imp_date) as 用户注册日期
    from tmp_liujg_dau_based
    where is_new=1 and  imp_date >= '20190301'
    group by qimei)b
on a.qimei=b.qimei
Where a.imp_date >='20190301'
order by 月份,qimei;

8第八题5.PNG

解答第六问

解答第七问

解答第八问

五级标题

列表第一项
列表第二项

有序列表第一项
有序列表第二项
标题
[图片上传失败...(image-5bb63b-1582723711145)]
斜体
粗体

引用段落

【SQL刷题】SQL语法学习与练习题
20200701效率优化【between and】查看一段时间范围内的数据，between and效率高于关系运...
sql4
Trips and Users 此题有困难，需要学习相关sql语法
MySQL 练习题
01第一天20180703 【SQL练习】经典SQL练习题 02第二天20180704 sql语句练习50题(My...
数据库学习线路图
SQL 语法教程 SQL 教程SQL 简介SQL 语法SQL Select选择SQL SELECT DISTINC...
SQL 语法大全
SQL 语法大全标签（空格分隔）： SQL DB2 SQL语法大全 SQL SELECT SQL SELECT ...
总结：SQL练习【SQL经典练习题】
刚刷完SQL练习【SQL经典练习题】，本篇文章将对我不牢固的知识做简单汇总。没对比就没标准，当练习超经典SQL练习...
sql刷题
262.Trips and Users 难题 case when 记得加end， between如果不是数字的话记...
SQL 刷题
https://www.nowcoder.com/ta/sqlhttps://leetcode-cn.com/pr...
刷题sql
t 175. Combine Two Tables t 176. Second Highest Salary t ...
sql 经典50题
50道SQL练习题及答案与详细分析网上流传较广的50道SQL训练，奋斗了不知道多久终于写完了。前18道题的难度依...

【SQL刷题】SQL语法学习与练习题

20200701效率优化

20200603

七周成为数据分析师课后题

大小厂面试题汇总

牛客网刷题

【202005】

【202004】

经典题目

语法学习

面试题练习

第二题

数据下载

编码格式转换

数据加载过程

查寻代码

第三题

行转列(图中左变右)

建表、查寻代码

第四题

建表、查寻代码

第五题

建表、查寻代码

第六题

建表、查寻代码

解答第一问

解答第二问

第七题

建表、查寻代码

第一问解答

第二问解答

第八题

建表、查寻代码

解答第一问

解答第二问

解答第三问

解答第四问

解答第五问

解答第六问

解答第七问

解答第八问

五级标题

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读