如何删除没有唯一行id存在的重复行?

我的座位是

col1  col2 col3 col4 col5 col6 col7
john  1    1    1    1    1    1 
john  1    1    1    1    1    1
sally 2    2    2    2    2    2
sally 2    2    2    2    2    2

我想留下以下重复删除后:

john  1    1    1    1    1    1
sally 2    2    2    2    2    2

我尝试了一些查询,但我认为他们取决于有一个行id,因为我没有得到想要的结果。例如:

DELETE
FROM table
WHERE col1 IN (
    SELECT id
    FROM table
    GROUP BY id
    HAVING (COUNT(col1) > 1)
)

当前回答

您需要根据字段对重复的记录进行分组,然后保留其中一条记录并删除其余记录。 例如:

DELETE prg.Person WHERE Id IN (
SELECT dublicateRow.Id FROM
(
select MIN(Id) MinId, NationalCode
 from  prg.Person group by NationalCode  having count(NationalCode ) > 1
 ) GroupSelect
 JOIN  prg.Person dublicateRow ON dublicateRow.NationalCode = GroupSelect.NationalCode 
 WHERE dublicateRow.Id <> GroupSelect.MinId)

其他回答

我喜欢CTE和ROW_NUMBER,因为两者结合起来可以让我们看到哪些行被删除(或更新),因此只需更改DELETE FROM CTE…SELECT * FROM CTE:

WITH CTE AS(
   SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],
       RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1)
   FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1

DEMO(结果不同;我想这是由于你的打字错误。

COL1    COL2    COL3    COL4    COL5    COL6    COL7
john    1        1       1       1       1       1
sally   2        2       2       2       2       2

由于使用了PARTITION by col1,这个例子通过一个列col1来确定重复项。如果你想包含多个列,只需将它们添加到PARTITION BY:

ROW_NUMBER()OVER(PARTITION BY Col1, Col2, ... ORDER BY OrderColumn)

微软有一个关于如何删除重复文件的非常简洁的指南。查看http://support.microsoft.com/kb/139444

简而言之,当你只有几行要删除时,下面是删除重复项的最简单方法:

SET rowcount 1;
DELETE FROM t1 WHERE myprimarykey=1;

Myprimarykey是行标识符。

我将rowcount设置为1,因为我只有两行复制。如果我复制了3行,那么我就会将rowcount设置为2,这样它就会删除它看到的前两行,只在表t1中留下一行。

请简单地在SELECT命令后添加关键字DISTINCT, 例如:

SELECT DISTICNT ColumnOne, ColumnTwo, ColumnThree
 FROM YourTable
-- this query will keep only one instance of a duplicate record.
;WITH cte
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY col1, col2, col3-- based on what? --can be multiple columns
                                       ORDER BY ( SELECT 0)) RN
         FROM   Mytable)



delete  FROM cte
WHERE  RN > 1

删除所有重复项,但删除第一个重复项(具有最小ID)

应该同样适用于其他SQL服务器,如Postgres:

DELETE FROM table
WHERE id NOT IN (
   select min(id) from table
   group by col1, col2, col3, col4, col5, col6, col7
)