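I have a SQL Server database of organizations, and it contains many duplicate rows. I want to run a SELECT statement that grabs all of them along with the number of dupes, and that also returns the id associated with each organization.

A statement like: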

SELECT     orgName, COUNT(*) AS dupes  
FROM         organizations  
GROUP BY orgName  
HAVING      (COUNT(*) > 1)

will return something like

orgName        | dupes  
ABC Corp       | 7  
Foo Federation | 5  
Widget Company | 2 

But I'd also like to get their ids. Is there any way to do this? Maybe something like

orgName        | dupeCount | id  
ABC Corp       | 1         | 34  
ABC Corp       | 2         | 5  
...  
Widget Company | 1         | 10  
Widget Company | 2         | 2  

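The reason is that there is also a separate table of users that link to these organizations, and I would like to unify them (that is, remove the dupes so that the users link to the same organization instead of to dupe organizations). I would like to do that part manually so I don't screw anything up, but I still need a statement that returns the ids of all the dupe organizations so I can go through the list of users.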


Current answer

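select a.orgName, b.duplicate, a.id
from organizations a
inner join (
    -- names that occur more than once, with how often they occur
    SELECT orgName, COUNT(*) AS duplicate
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) b on a.orgName = b.orgName   -- join on the aliases actually declared (a, b)
-- b.duplicate must be grouped too, since it appears in the select list
group by a.orgName, b.duplicate, a.id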
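
As a possible alternative to the self-join above, a windowed COUNT(*) can attach the per-name count to every row in one pass. This is only a sketch: the table and column names come from the question, but the table variable and the sample rows are made up for illustration, and it assumes id is unique and a SQL Server version that supports multi-row VALUES (2008 or later).

```sql
-- Hypothetical sample data; only the names "organizations", "id" and "orgName"
-- come from the question. A table variable keeps the sketch self-contained.
DECLARE @organizations TABLE (id int PRIMARY KEY, orgName varchar(50));

INSERT INTO @organizations (id, orgName) VALUES
    (34, 'ABC Corp'),
    (5,  'ABC Corp'),
    (10, 'Widget Company'),
    (2,  'Widget Company'),
    (7,  'Unique Org');

-- COUNT(*) OVER (PARTITION BY orgName) counts the rows sharing a name
-- without collapsing them, so each id stays visible.
SELECT orgName, dupes, id
FROM (
    SELECT orgName,
           id,
           COUNT(*) OVER (PARTITION BY orgName) AS dupes
    FROM @organizations
) AS counted
WHERE dupes > 1          -- keep only names that occur more than once
ORDER BY orgName, id;
```

With these sample rows, 'Unique Org' is filtered out and each of the four remaining rows is returned with dupes = 2.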

Other answers

I think I know what you need. I had to combine several of the answers, and I think this gives the solution he wanted:

select o.id,o.orgName, oc.dupeCount, oc.id,oc.orgName
from organizations o
inner join (
    SELECT MAX(id) as id, orgName, COUNT(*) AS dupeCount
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) oc on o.orgName = oc.orgName

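With MAX(id) you get the id of the duplicate together with the id of the original, which is what he asked for:

id, org name, duplicate count (left out in this case)
duplicate id, duplicate org name, duplicate count (left out again because it does not help in this case)

The only sad thing is that you get it out in this form

id , name , dupid , name

Hope it still helps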

Suppose we have a table 'Student' with two columns:

student_id int
student_name varchar

Records:

+------------+---------------------+
| student_id | student_name        |
+------------+---------------------+
|        101 | usman               |
|        101 | usman               |
|        101 | usman               |
|        102 | usmanyaqoob         |
|        103 | muhammadusmanyaqoob |
|        103 | muhammadusmanyaqoob |
+------------+---------------------+

Now we want to see the duplicate records. Use this query (HAVING repeats count(*) rather than the alias c, because SQL Server does not accept a select-list alias in HAVING):

select student_name, student_id, count(*) c from student group by student_id, student_name having count(*) > 1;

+---------------------+------------+---+
| student_name        | student_id | c |
+---------------------+------------+---+
| usman               |        101 | 3 |
| muhammadusmanyaqoob |        103 | 2 |
+---------------------+------------+---+

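You can try this; it should work for your case:

WITH CTE AS
(
    -- ordering by id (rather than by the partition column orgName) makes it
    -- predictable which row in each group gets RN = 1
    SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id DESC)
    FROM organizations
)
SELECT * FROM CTE WHERE RN > 1
go
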
Select * from (Select orgName,id,
ROW_NUMBER() OVER(Partition By OrgName ORDER by id DESC) Rownum
From organizations )tbl Where Rownum>1

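So the records with Rownum > 1 are the duplicate records in the table. PARTITION BY first groups the records by OrgName and then numbers the rows within each group sequentially. The rows with Rownum > 1 are therefore the duplicates, and they can be deleted on that basis.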
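
The numbering described above can also produce the orgName / dupeCount / id layout from the question, plus the id that each duplicate would be merged into. This is a rough sketch under a few assumptions: the sample rows and the keepId column are invented for illustration, id is assumed unique, and "keep the lowest id per name" is only one possible rule for choosing the surviving row.

```sql
-- Hypothetical sample data; only "organizations", "id" and "orgName"
-- come from the question.
DECLARE @organizations TABLE (id int PRIMARY KEY, orgName varchar(50));

INSERT INTO @organizations (id, orgName) VALUES
    (34, 'ABC Corp'),
    (5,  'ABC Corp'),
    (10, 'Widget Company'),
    (2,  'Widget Company'),
    (7,  'Unique Org');

WITH numbered AS (
    SELECT id,
           orgName,
           -- position of the row inside its name group, lowest id first
           ROW_NUMBER() OVER (PARTITION BY orgName ORDER BY id) AS dupeCount,
           -- size of the name group, used to drop names that occur once
           COUNT(*)     OVER (PARTITION BY orgName)             AS total,
           -- assumed rule: the lowest id in the group is the row to keep
           MIN(id)      OVER (PARTITION BY orgName)             AS keepId
    FROM @organizations
)
SELECT orgName, dupeCount, id, keepId
FROM numbered
WHERE total > 1
ORDER BY orgName, dupeCount;
```

For the sample rows this lists ABC Corp with ids 5 and 34 (keepId 5) and Widget Company with ids 2 and 10 (keepId 2). Rows where id differs from keepId are the ones whose users would need to be re-pointed before the duplicates are removed.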

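-- count(*) combined with "group by orgname, id" returns 1 for every row when id
-- is unique, so the per-name count is taken with a window function instead
select orgname, count(*) over (partition by orgname) as dupes, id
from organizations
where orgname in (
    select orgname
    from organizations
    group by orgname
    having (count(*) > 1)
)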