对SQL语句不熟悉欢迎查看我整理的笔记:[SQL] MySQL基础 + python交互
转载请注明:陈熹 chenx6542@foxmail.com (简书:半为花间酒)
若公众号内转载请联系公众号:早起Python
题目:
简单题 #196
![](https://img.haomeiwen.com/i22672581/10b9dbccb91f412f.png)
SQL架构:
Create table If Not Exists Person (Id int, Email varchar(255));
Truncate table Person;
insert into Person (Id, Email) values ('1', 'john@example.com');
insert into Person (Id, Email) values ('2', 'bob@example.com');
insert into Person (Id, Email) values ('3', 'john@example.com');
insert into Person (Id, Email) values ('4', 'bob@example.com');
insert into Person (Id, Email) values ('5', 'john@example.com');
题解:
第一种解法
—— 基于分组的子查询
根据题意,可以分组判断取出每个组的最小Id
,然后利用NOT IN
获取重复Id
将其删除即可
首先写一下子查询
SELECT MIN(Id) Id
FROM Person
GROUP BY Email
这里要注意一个很重要的细节,如果这题如LeetCode #182一样是查找重复邮箱,则以下两种写法都可以,两种写法的区别是有无创建临时表
SELECT *
FROM Person
WHERE Id NOT IN
(
SELECT temp.Id
FROM (
SELECT MIN(Id) Id
FROM Person
GROUP BY Email
) temp
);
# 或
SELECT *
FROM Person
WHERE Id NOT IN
(
SELECT MIN(Id) Id
FROM Person
GROUP BY Email
);
但,如果是本题的DELETE
用第二种无临时表的写法
DELETE
FROM Person
WHERE Id NOT IN
(
SELECT MIN(Id) Id
FROM Person
GROUP BY Email
);
会出现报错:You can't specify target table 'Person' for update in FROM clause
大意就是不能对一个表同时执行查询和更新操作,因此需要创建临时表
最终解法如下:
DELETE
FROM Person
WHERE Id NOT IN
(
SELECT temp.Id
FROM (
SELECT MIN(Id) Id
FROM Person
GROUP BY Email
) temp
);
第二种解法
—— 基于连表后删除
首先用自连接找出Email
相同且Id
不同的所有情况,其中表p1
的Id
均大于p2
,为了直观反映出变化这里先用SELECT
而不是DELETE
SELECT *
FROM Person p1
JOIN Person p2
ON p1.Email = p2.Email
AND p1.Id > p2.Id
-- 也可以用如下代码,效率偏低
SELECT *
FROM Person p1, Person p2
WHERE p1.Email = p2.Email
AND p1.Id > p2.Id
![](https://img.haomeiwen.com/i22672581/75daecc009b11421.png)
可以看到,表p1
包含了所有重复且非最小Id
的记录,此时只需要删除该表即可
故最终代码如下:
DELETE p1
FROM Person p1
JOIN Person p2
ON p1.Email = p2.Email
AND p1.Id > p2.Id
-- SELECT * FROM Person
![](https://img.haomeiwen.com/i22672581/4efbc100105027b5.png)
网友评论