如何删除没有唯一行id存在的重复行?
我的座位是
col1 col2 col3 col4 col5 col6 col7
john 1 1 1 1 1 1
john 1 1 1 1 1 1
sally 2 2 2 2 2 2
sally 2 2 2 2 2 2
我想留下以下重复删除后:
john 1 1 1 1 1 1
sally 2 2 2 2 2 2
我尝试了一些查询,但我认为他们取决于有一个行id,因为我没有得到想要的结果。例如:
DELETE
FROM table
WHERE col1 IN (
SELECT id
FROM table
GROUP BY id
HAVING (COUNT(col1) > 1)
)
如果你有能力临时添加一个列到表中,这是一个适合我的解决方案:
ALTER TABLE dbo.DUPPEDTABLE ADD RowID INT NOT NULL IDENTITY(1,1)
然后使用MIN和GROUP BY的组合执行DELETE
DELETE b
FROM dbo.DUPPEDTABLE b
WHERE b.RowID NOT IN (
SELECT MIN(RowID) AS RowID
FROM dbo.DUPPEDTABLE a WITH (NOLOCK)
GROUP BY a.ITEM_NUMBER,
a.CHARACTERISTIC,
a.INTVALUE,
a.FLOATVALUE,
a.STRINGVALUE
);
验证DELETE执行正确:
SELECT a.ITEM_NUMBER,
a.CHARACTERISTIC,
a.INTVALUE,
a.FLOATVALUE,
a.STRINGVALUE, COUNT(*)--MIN(RowID) AS RowID
FROM dbo.DUPPEDTABLE a WITH (NOLOCK)
GROUP BY a.ITEM_NUMBER,
a.CHARACTERISTIC,
a.INTVALUE,
a.FLOATVALUE,
a.STRINGVALUE
ORDER BY COUNT(*) DESC
结果中不应有计数大于1的行。最后,删除rowid列:
ALTER TABLE dbo.DUPPEDTABLE DROP COLUMN RowID;
我喜欢CTE和ROW_NUMBER,因为两者结合起来可以让我们看到哪些行被删除(或更新),因此只需更改DELETE FROM CTE…SELECT * FROM CTE:
WITH CTE AS(
SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],
RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1)
FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1
DEMO(结果不同;我想这是由于你的打字错误。
COL1 COL2 COL3 COL4 COL5 COL6 COL7
john 1 1 1 1 1 1
sally 2 2 2 2 2 2
由于使用了PARTITION by col1,这个例子通过一个列col1来确定重复项。如果你想包含多个列,只需将它们添加到PARTITION BY:
ROW_NUMBER()OVER(PARTITION BY Col1, Col2, ... ORDER BY OrderColumn)
如果你有能力临时添加一个列到表中,这是一个适合我的解决方案:
ALTER TABLE dbo.DUPPEDTABLE ADD RowID INT NOT NULL IDENTITY(1,1)
然后使用MIN和GROUP BY的组合执行DELETE
DELETE b
FROM dbo.DUPPEDTABLE b
WHERE b.RowID NOT IN (
SELECT MIN(RowID) AS RowID
FROM dbo.DUPPEDTABLE a WITH (NOLOCK)
GROUP BY a.ITEM_NUMBER,
a.CHARACTERISTIC,
a.INTVALUE,
a.FLOATVALUE,
a.STRINGVALUE
);
验证DELETE执行正确:
SELECT a.ITEM_NUMBER,
a.CHARACTERISTIC,
a.INTVALUE,
a.FLOATVALUE,
a.STRINGVALUE, COUNT(*)--MIN(RowID) AS RowID
FROM dbo.DUPPEDTABLE a WITH (NOLOCK)
GROUP BY a.ITEM_NUMBER,
a.CHARACTERISTIC,
a.INTVALUE,
a.FLOATVALUE,
a.STRINGVALUE
ORDER BY COUNT(*) DESC
结果中不应有计数大于1的行。最后,删除rowid列:
ALTER TABLE dbo.DUPPEDTABLE DROP COLUMN RowID;