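I have a SQL Server database of organizations, and it contains many duplicate rows. I want to run a select statement that grabs all of the duplicated organizations and the number of dupes, but also returns the ids associated with each organization.
A statement like: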
SELECT orgName, COUNT(*) AS dupes
FROM organizations
GROUP BY orgName
HAVING (COUNT(*) > 1)
will return something like
orgName | dupes
ABC Corp | 7
Foo Federation | 5
Widget Company | 2
But I'd also like to grab their ids. Is there any way to do this? Maybe something like
orgName | dupeCount | id
ABC Corp | 1 | 34
ABC Corp | 2 | 5
...
Widget Company | 1 | 10
Widget Company | 2 | 2
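The reason is that there is also a separate table of users that link to these organizations, and I would like to unify them (that is, remove the dupes so the users all link to the same organization instead of to dupe orgs). I would like to do part of this manually so I don't screw anything up, but I still need a statement that returns the ids of all the dupe orgs so I can go through the list of users.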
You can run the following query to find the duplicates along with their ids:
SELECT
    o.id, o.orgName, d.intCount
FROM (
    SELECT orgName, COUNT(*) AS intCount
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) AS d
INNER JOIN organizations o ON o.orgName = d.orgName
If you want to return only the records that can be deleted (leaving one row for each orgName), you can use:
SELECT
    id, orgName
FROM (
    SELECT
        orgName, id,
        ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS intRow
    FROM organizations
) AS d
WHERE intRow != 1
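For the exact output shape asked for in the question (orgName | dupeCount | id), the same ROW_NUMBER() idea could be sketched roughly as follows. This is only a sketch under assumptions: it needs SQL Server 2005 or later for the windowed functions, it assumes the id and orgName columns used in the queries above, and the aliases dupeCount and intCount are illustrative names, not anything from the real schema.

```sql
-- Sketch: number the rows inside each group of duplicated names.
-- Assumes SQL Server 2005+ and an organizations(id, orgName) table.
SELECT
    orgName,
    -- 1, 2, 3, ... within each orgName, ordered by id
    ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS dupeCount,
    id
FROM (
    SELECT
        id, orgName,
        -- how many rows share this orgName
        COUNT(*) OVER (PARTITION BY orgName) AS intCount
    FROM organizations
) AS d
WHERE intCount > 1          -- keep only names that occur more than once
ORDER BY orgName, dupeCount;
```

Unlike the delete-candidate query, this one keeps every row of a duplicated name, including the one with dupeCount = 1.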
Edit: SQL Server 2000 doesn't have the ROW_NUMBER() function. Instead, you can use:
SELECT
    o.id, o.orgName, d.intCount
FROM (
    SELECT orgName, COUNT(*) AS intCount, MIN(id) AS minId
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) AS d
INNER JOIN organizations o ON o.orgName = d.orgName
WHERE d.minId != o.id
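Since the goal is to repoint users before deleting anything, it may help to list each duplicate id next to the id that would survive. The following is a minimal sketch along the lines of the MIN(id) query, so it should not need ROW_NUMBER(); the aliases dupeId and keepId are made up for illustration, and keeping the lowest id per name is an assumption, not a requirement.

```sql
-- Sketch: map every duplicate row to the row that would be kept.
-- Assumes an organizations(id, orgName) table; keeping MIN(id) is arbitrary.
SELECT
    o.id    AS dupeId,   -- row that could be deleted
    d.minId AS keepId,   -- row that users would be repointed to
    o.orgName
FROM (
    SELECT orgName, MIN(id) AS minId
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) AS d
INNER JOIN organizations o ON o.orgName = d.orgName
WHERE o.id <> d.minId
ORDER BY o.orgName, o.id;
```

With that list in hand, the users linked to each dupeId can be checked and moved to the matching keepId by hand before any organization row is removed.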