使用一个字段很容易找到重复项:
SELECT email, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1
所以如果我们有一张桌子
ID NAME EMAIL
1 John asd@asd.com
2 Sam asd@asd.com
3 Tom asd@asd.com
4 Bob bob@asd.com
5 Tom asd@asd.com
这个查询将告诉我们John、Sam、Tom和Tom,因为他们都有相同的电子邮件。
然而,我想要的是获得相同电子邮件和名称的副本。
也就是说,我想得到“汤姆”,“汤姆”。
我需要这个的原因是:我犯了一个错误,允许插入重复的名称和电子邮件值。现在我需要删除/更改重复项,所以我需要先找到它们。
从表中的重复记录检查。
select * from users s
where rowid < any
(select rowid from users k where s.name = k.name and s.email = k.email);
or
select * from users s
where rowid not in
(select max(rowid) from users k where s.name = k.name and s.email = k.email);
删除表中的重复记录。
delete from users s
where rowid < any
(select rowid from users k where s.name = k.name and s.email = k.email);
or
delete from users s
where rowid not in
(select max(rowid) from users k where s.name = k.name and s.email = k.email);
如果要查找重复的数据(通过一个或多个标准),请选择实际的行。
with MYCTE as (
SELECT DuplicateKey1
,DuplicateKey2 --optional
,count(*) X
FROM MyTable
group by DuplicateKey1, DuplicateKey2
having count(*) > 1
)
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
ON E.DuplicateKey1=cte.DuplicateKey1
AND E.DuplicateKey2=cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt
http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/
如果您希望查看表中是否有重复的行,我使用以下查询:
create table my_table(id int, name varchar(100), email varchar(100));
insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (2, 'Aman', 'aman@rms.com');
insert into my_table values (3, 'Tom', 'tom@rms.com');
insert into my_table values (4, 'Raj', 'raj@rms.com');
Select COUNT(1) As Total_Rows from my_table
Select Count(1) As Distinct_Rows from ( Select Distinct * from my_table) abc