如何删除没有唯一行id存在的重复行?

我的座位是

col1  col2 col3 col4 col5 col6 col7
john  1    1    1    1    1    1 
john  1    1    1    1    1    1
sally 2    2    2    2    2    2
sally 2    2    2    2    2    2

我想留下以下重复删除后:

john  1    1    1    1    1    1
sally 2    2    2    2    2    2

我尝试了一些查询,但我认为他们取决于有一个行id,因为我没有得到想要的结果。例如:

DELETE
FROM table
WHERE col1 IN (
    SELECT id
    FROM table
    GROUP BY id
    HAVING (COUNT(col1) > 1)
)

当前回答

DELETE p1 FROM Person p1,
    Person p2
WHERE
    p1.Email = p2.Email AND p1.Id > p2.Id

其他回答

在mysql中有两个解决方案:

A)使用Delete JOIN语句删除重复的行

DELETE t1 FROM contacts t1
INNER JOIN contacts t2 
WHERE 
    t1.id < t2.id AND 
    t1.email = t2.email;

该查询两次引用联系人表,因此,它使用表别名t1和t2。

输出结果为:

1 查询确定,影响4行(0.10秒)

如果你想删除重复的行并保留最低的id,你可以使用下面的语句:

DELETE c1 FROM contacts c1
INNER JOIN contacts c2 
WHERE
    c1.id > c2.id AND 
    c1.email = c2.email;

   

B)使用中间表删除重复的行

下面是使用中间表删除重复行的步骤:

1。创建一个新表,其结构与要删除重复行的原始表相同。

2。将原始表中的不同行插入到直接表中。

3所示。将原始表中的不同行插入到直接表中。

 

步骤1。创建一个与原表结构相同的新表:

CREATE TABLE source_copy LIKE source;

步骤2。从原表中插入不同的行到新表中:

INSERT INTO source_copy
SELECT * FROM source
GROUP BY col; -- column that has duplicate values

步骤3。删除原始表并将直接表重命名为原始表

DROP TABLE source;
ALTER TABLE source_copy RENAME TO source;

来源:http://www.mysqltutorial.org/mysql-delete-duplicate-rows/

我更喜欢CTE从sql server表中删除重复的行

强烈推荐阅读本文::http://codaffection.com/sql-server-article/delete-duplicate-rows-in-sql-server/

保持原创性

WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY col1,col2,col3 ORDER BY col1,col2,col3) AS RN
FROM MyTable
)

DELETE FROM CTE WHERE RN<>1

不保留原创

WITH CTE AS
(SELECT *,R=RANK() OVER (ORDER BY col1,col2,col3)
FROM MyTable)
 
DELETE CTE
WHERE R IN (SELECT R FROM CTE GROUP BY R HAVING COUNT(*)>1)
DELETE FROM TBL1  WHERE ID  IN
(SELECT ID FROM TBL1  a WHERE ID!=
(select MAX(ID) from TBL1  where DUPVAL=a.DUPVAL 
group by DUPVAL
having count(DUPVAL)>1))

另一种在不丢失信息的情况下删除重复行的方法如下:

delete from dublicated_table t1 (nolock)
join (
    select t2.dublicated_field
    , min(len(t2.field_kept)) as min_field_kept
    from dublicated_table t2 (nolock)
    group by t2.dublicated_field having COUNT(*)>1
) t3 
on t1.dublicated_field=t3.dublicated_field 
    and len(t1.field_kept)=t3.min_field_kept

在尝试了上面建议的解决方案后,它适用于小型中型表。 我可以为非常大的表提出这个解决方案。因为它在迭代中运行。

Drop all dependency views of the LargeSourceTable you can find the dependecies by using sql managment studio, right click on the table and click "View Dependencies" Rename the table: sp_rename 'LargeSourceTable', 'LargeSourceTable_Temp'; GO Create the LargeSourceTable again, but now, add a primary key with all the columns that define the duplications add WITH (IGNORE_DUP_KEY = ON) For example: CREATE TABLE [dbo].[LargeSourceTable] ( ID int IDENTITY(1,1), [CreateDate] DATETIME CONSTRAINT [DF_LargeSourceTable_CreateDate] DEFAULT (getdate()) NOT NULL, [Column1] CHAR (36) NOT NULL, [Column2] NVARCHAR (100) NOT NULL, [Column3] CHAR (36) NOT NULL, PRIMARY KEY (Column1, Column2) WITH (IGNORE_DUP_KEY = ON) ); GO Create again the views that you dropped in the first place for the new created table Now, Run the following sql script, you will see the results in 1,000,000 rows per page, you can change the row number per page to see the results more often. Note, that I set the IDENTITY_INSERT on and off because one the columns contains auto incremental id, which I'm also copying

设置IDENTITY_INSERT LargeSourceTable ON 声明@PageNumber为INT, @RowspPage为INT 声明@TotalRows为INT 声明@dt varchar(19) SET @PageNumber = 0 SET @RowspPage = 1000000 select @TotalRows = count (*) from LargeSourceTable_TEMP

While ((@PageNumber - 1) * @RowspPage < @TotalRows )
Begin
    begin transaction tran_inner
        ; with cte as
        (
            SELECT * FROM LargeSourceTable_TEMP ORDER BY ID
            OFFSET ((@PageNumber) * @RowspPage) ROWS
            FETCH NEXT @RowspPage ROWS ONLY
        )

        INSERT INTO LargeSourceTable 
        (
             ID                     
            ,[CreateDate]       
            ,[Column1]   
            ,[Column2] 
            ,[Column3]       
        )       
        select 
             ID                     
            ,[CreateDate]       
            ,[Column1]   
            ,[Column2] 
            ,[Column3]       
        from cte

    commit transaction tran_inner

    PRINT 'Page: ' + convert(varchar(10), @PageNumber)
    PRINT 'Transfered: ' + convert(varchar(20), @PageNumber * @RowspPage)
    PRINT 'Of: ' + convert(varchar(20), @TotalRows)

    SELECT @dt = convert(varchar(19), getdate(), 121)
    RAISERROR('Inserted on: %s', 0, 1, @dt) WITH NOWAIT
    SET @PageNumber = @PageNumber + 1
End

SET IDENTITY_INSERT LargeSourceTable OFF