[SQL]SQL Server或MySQL等数据库中如何使用SQL脚本查找出多列重复的数据呢?
1.6K 次浏览
在SQL Server或MySQL等这样的关系性数据库中,用一个字段很容易找到重复的数据记录,比如有以下的示例数据:
ID NAME EMAIL
1 John asd@asd.com
2 Sam asd@asd.com
3 Tom asd@asd.com
4 Bob bob@asd.com
5 Tom asd@asd.com
查询重复记录的SQL脚本语句如下:
SELECT email, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1
以上查询会返回John, Sam, Tom
,因为是以email
字段进行分组的,但现在需要找出name
和email
字段都相同的重复记录,应该如何编写SQL语句呢?
2 个回答
-
这里提供一个包含测试数据脚本的示例(SQL Server 2014):
declare @YourTable table (id int, name varchar(10), email varchar(50)) INSERT @YourTable VALUES (1,'John','John-email') INSERT @YourTable VALUES (2,'John','John-email') INSERT @YourTable VALUES (3,'fred','John-email') INSERT @YourTable VALUES (4,'fred','fred-email') INSERT @YourTable VALUES (5,'sam','sam-email') INSERT @YourTable VALUES (6,'sam','sam-email') SELECT name,email, COUNT(*) AS CountOf FROM @YourTable GROUP BY name,email HAVING COUNT(*)>1
输入结果:
name email CountOf ---------- ----------- ----------- John John-email 2 sam sam-email 2 (2 行受影响)
如果想查询并返回哪些是重复的数据项,则:
SELECT y.id,y.name,y.email FROM @YourTable y INNER JOIN (SELECT name,email, COUNT(*) AS CountOf FROM @YourTable GROUP BY name,email HAVING COUNT(*)>1 ) dt ON y.name=dt.name AND y.email=dt.email
返回结果:
id name email ----------- ---------- ------------ 1 John John-email 2 John John-email 5 sam sam-email 6 sam sam-email (4 行受影响)
如果要删除重复的数据项,则:
DELETE d FROM @YourTable d INNER JOIN (SELECT y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank FROM @YourTable y INNER JOIN (SELECT name,email, COUNT(*) AS CountOf FROM @YourTable GROUP BY name,email HAVING COUNT(*)>1 ) dt ON y.name=dt.name AND y.email=dt.email ) dt2 ON d.id=dt2.id WHERE dt2.RowRank!=1 SELECT * FROM @YourTable
输出结果:
id name email ----------- ---------- -------------- 1 John John-email 3 fred John-email 4 fred fred-email 5 sam sam-email (4 行受影响)