我需要从一个相当大的SQL Server表(即300,000+行)中删除重复的行。
当然,由于RowID标识字段的存在,这些行不会完全重复。
MyTable
RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null
我该怎么做呢?
我需要从一个相当大的SQL Server表(即300,000+行)中删除重复的行。
当然,由于RowID标识字段的存在,这些行不会完全重复。
MyTable
RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null
我该怎么做呢?
当前回答
DELETE
FROM MyTable
WHERE NOT EXISTS (
SELECT min(RowID)
FROM Mytable
WHERE (SELECT RowID
FROM Mytable
GROUP BY Col1, Col2, Col3
))
);
其他回答
alter table MyTable add sno int identity(1,1)
delete from MyTable where sno in
(
select sno from (
select *,
RANK() OVER ( PARTITION BY RowID,Col3 ORDER BY sno DESC )rank
From MyTable
)T
where rank>1
)
alter table MyTable
drop column sno
对于表结构
MyTable
RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null
删除重复项的查询:
DELETE t1
FROM MyTable t1
INNER JOIN MyTable t2
WHERE t1.RowID > t2.RowID
AND t1.Col1 = t2.Col1
AND t1.Col2=t2.Col2
AND t1.Col3=t2.Col3;
我假设RowID是一种自动递增,其余列有重复的值。
现在让我们看看elasticalsearch表,这个表有重复的行,Id是相同的uniq字段。我们知道如果某个id存在于某个组条件下,那么我们可以删除该组作用域之外的其他行。我的举止表明了这一标准。
很多情况下,这个线程是在类似的状态,我。只需根据删除重复(重复)行的情况更改目标组条件。
DELETE
FROM elasticalsearch
WHERE Id NOT IN
(SELECT min(Id)
FROM elasticalsearch
GROUP BY FirmId,FilterSearchString
)
干杯
我想这会很有帮助。这里,ROW_NUMBER() OVER(分区由res1。Title ORDER BY res1.Id)作为num来区分重复的行。
delete FROM
(SELECT res1.*,ROW_NUMBER() OVER(PARTITION BY res1.Title ORDER BY res1.Id)as num
FROM
(select * from [dbo].[tbl_countries])as res1
)as res2
WHERE res2.num > 1
我有一个表,需要保存不重复的行。 我不确定速度和效率。
DELETE FROM myTable WHERE RowID IN (
SELECT MIN(RowID) AS IDNo FROM myTable
GROUP BY Col1, Col2, Col3
HAVING COUNT(*) = 2 )