使用一个字段很容易找到重复项:

SELECT email, COUNT(email) 
FROM users
GROUP BY email
HAVING COUNT(email) > 1

所以如果我们有一张桌子

ID   NAME   EMAIL
1    John   asd@asd.com
2    Sam    asd@asd.com
3    Tom    asd@asd.com
4    Bob    bob@asd.com
5    Tom    asd@asd.com

这个查询将告诉我们John、Sam、Tom和Tom,因为他们都有相同的电子邮件。

然而,我想要的是获得相同电子邮件和名称的副本。

也就是说,我想得到“汤姆”,“汤姆”。

我需要这个的原因是:我犯了一个错误,允许插入重复的名称和电子邮件值。现在我需要删除/更改重复项,所以我需要先找到它们。


当前回答

试试看:

                DECLARE @myTable TABLE
                (
                    id INT,
                    name VARCHAR(10),
                    email VARCHAR(50)
                );

                INSERT @myTable
                VALUES
                (1, 'John', 'John-email');
                INSERT @myTable
                VALUES
                (2, 'John', 'John-email');
                INSERT @myTable
                VALUES
                (3, 'fred', 'John-email');
                INSERT @myTable
                VALUES
                (4, 'fred', 'fred-email');
                INSERT @myTable
                VALUES
                (5, 'sam', 'sam-email');
                INSERT @myTable
                VALUES
                (6, 'sam', 'sam-email');


                WITH cte
                AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rowNum,
                           *
                    FROM @myTable)
                SELECT c1.id,
                       c1.name,
                       c1.email
                FROM cte AS c1
                WHERE 1 <
                (
                    SELECT COUNT(c2.rowNum)
                    FROM cte AS c2
                    WHERE c1.name = c2.name
                          AND c1.email = c2.email
                );

其他回答

如果您希望查看表中是否有重复的行,我使用以下查询:

create table my_table(id int, name varchar(100), email varchar(100));

insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (2, 'Aman', 'aman@rms.com');
insert into my_table values (3, 'Tom', 'tom@rms.com');
insert into my_table values (4, 'Raj', 'raj@rms.com');


Select COUNT(1) As Total_Rows from my_table 
Select Count(1) As Distinct_Rows from ( Select Distinct * from my_table) abc 

这个问题在上面的所有答案中都得到了很好的回答。但我想列出所有可能的方式,我们可以通过各种方式来做到这一点,这可能会让我们了解如何做到,寻求者可以选择最适合他/她的需求的解决方案,因为这是SQL开发人员遇到不同业务用例或在访谈中遇到的最常见的查询之一。

创建示例数据

我将仅从这个问题中设置一些示例数据开始。

Create table NewTable (id int, name varchar(10), email varchar(50))
INSERT  NewTable VALUES (1,'John','asd@asd.com')
INSERT  NewTable VALUES (2,'Sam','asd@asd.com')
INSERT  NewTable VALUES (3,'Tom','asd@asd.com')
INSERT  NewTable VALUES (4,'Bob','bob@asd.com')
INSERT  NewTable VALUES (5,'Tom','asd@asd.com')

1.使用groupby子句

SELECT
    name,email, COUNT(*) AS Occurence
    FROM NewTable
    GROUP BY name,email
    HAVING COUNT(*)>1

工作原理:

GROUP BY子句按中的值将行分组姓名和电子邮件栏。然后,COUNT()函数返回数字每个组的出现次数(姓名、电子邮件)。然后,HAVING子句保持仅重复组,这些组包含多个发生

2.使用CTE:

要返回每个重复行的整个行,请使用公共表表达式(CTE)将上述查询的结果与NewTable表连接:

WITH cte AS (
    SELECT
        name, 
        email, 
        COUNT(*) occurrences
    FROM NewTable
    GROUP BY 
        name, 
        email
    HAVING COUNT(*) > 1
)
SELECT 
    t1.Id,
    t1.name, 
    t1.email
FROM  NewTable t1
    INNER JOIN cte ON 
        cte.name = t1.name AND 
        cte.email = t1.email
ORDER BY 
    t1.name, 
    t1.email;

3.使用ROW_NUMBER()函数

WITH cte AS (
    SELECT 
        name, 
        email, 
        ROW_NUMBER() OVER (
            PARTITION BY name,email
            ORDER BY name,email) rownum
    FROM 
        NewTable t1
) 
SELECT 
  * 
FROM 
    cte 
WHERE 
    rownum > 1;

工作原理:

ROW_NUMBER()将NewTable表的行按名称和电子邮件列中的值分配到分区中。重复的行在名称和电子邮件列中具有重复的值,但行号不同外部查询删除每个组中的第一行。

好吧,现在我相信,你可以有正确的想法,如何找到重复,并应用逻辑在所有可能的场景中找到重复。谢谢

 SELECT name, email 
    FROM users
    WHERE email in
    (SELECT email FROM users
    GROUP BY email 
    HAVING COUNT(*)>1)

在使用Microsoft Access的情况下,此方法有效:

CREATE TABLE users (id int, name varchar(10), email varchar(50));

INSERT INTO users VALUES (1, 'John', 'asd@asd.com');
INSERT INTO users VALUES (2, 'Sam', 'asd@asd.com');
INSERT INTO users VALUES (3, 'Tom', 'asd@asd.com');
INSERT INTO users VALUES (4, 'Bob', 'bob@asd.com');
INSERT INTO users VALUES (5, 'Tom', 'asd@asd.com');

SELECT name, email, COUNT(*) AS CountOf
FROM users
GROUP BY name, email
HAVING COUNT(*)>1;

DELETE *
FROM users
WHERE id IN (
    SELECT u1.id 
    FROM users u1, users u2 
    WHERE u1.name = u2.name AND u1.email = u2.email AND u1.id > u2.id
);

感谢Tancrede Chazallet的删除代码。

另一种简单的方法是使用解析函数:

SELECT * from 

(SELECT name, email,

COUNT(name) OVER (PARTITION BY name, email) cnt 

FROM users)

WHERE cnt >1;