首页 / 问答 / [SQL]SQL Server或MySQL等数据库中如何使用SQL脚本查找出多列重复的数据呢?

[SQL]SQL Server或MySQL等数据库中如何使用SQL脚本查找出多列重复的数据呢?

0

在SQL Server或MySQL等这样的关系性数据库中,用一个字段很容易找到重复的数据记录,比如有以下的示例数据:

ID   NAME   EMAIL
1    John   asd@asd.com
2    Sam    asd@asd.com
3    Tom    asd@asd.com
4    Bob    bob@asd.com
5    Tom    asd@asd.com

查询重复记录的SQL脚本语句如下:

SELECT email, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1

以上查询会返回John, Sam, Tom,因为是以email字段进行分组的,但现在需要找出nameemail字段都相同的重复记录,应该如何编写SQL语句呢?

回复 [×]
提交评论
请输入评论内容

2 个回答

  • 0

    name字段也添加到分组GROUP BY中,如下:

    SELECT
        name, email, COUNT(*)
    FROM
        users
    GROUP BY
        name, email
    HAVING COUNT(*) > 1
    
    Rector的个人主页

    Rector

    2021-11-07 回答

    • 0

      这里提供一个包含测试数据脚本的示例(SQL Server 2014):

      declare @YourTable table (id int, name varchar(10), email varchar(50))
      
      INSERT @YourTable VALUES (1,'John','John-email')
      INSERT @YourTable VALUES (2,'John','John-email')
      INSERT @YourTable VALUES (3,'fred','John-email')
      INSERT @YourTable VALUES (4,'fred','fred-email')
      INSERT @YourTable VALUES (5,'sam','sam-email')
      INSERT @YourTable VALUES (6,'sam','sam-email')
      
      SELECT
          name,email, COUNT(*) AS CountOf
          FROM @YourTable
          GROUP BY name,email
          HAVING COUNT(*)>1
      

      输入结果:

      name       email       CountOf
      ---------- ----------- -----------
      John       John-email  2
      sam        sam-email   2
      
      (2 行受影响)
      

      如果想查询并返回哪些是重复的数据项,则:

      SELECT
          y.id,y.name,y.email
          FROM @YourTable y
              INNER JOIN (SELECT
                              name,email, COUNT(*) AS CountOf
                              FROM @YourTable
                              GROUP BY name,email
                              HAVING COUNT(*)>1
                          ) dt ON y.name=dt.name AND y.email=dt.email
      

      返回结果:

      id          name       email
      ----------- ---------- ------------
      1           John       John-email
      2           John       John-email
      5           sam        sam-email
      6           sam        sam-email
      
      (4 行受影响)
      

      如果要删除重复的数据项,则:

      DELETE d
          FROM @YourTable d
              INNER JOIN (SELECT
                              y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank
                              FROM @YourTable y
                                  INNER JOIN (SELECT
                                                  name,email, COUNT(*) AS CountOf
                                                  FROM @YourTable
                                                  GROUP BY name,email
                                                  HAVING COUNT(*)>1
                                              ) dt ON y.name=dt.name AND y.email=dt.email
                         ) dt2 ON d.id=dt2.id
              WHERE dt2.RowRank!=1
      SELECT * FROM @YourTable
      

      输出结果:

      id          name       email
      ----------- ---------- --------------
      1           John       John-email
      3           fred       John-email
      4           fred       fred-email
      5           sam        sam-email
      
      (4 行受影响)
      
      Rector的个人主页

      Rector

      2021-11-07 回答

      我来回答