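I have a table with a varchar column, and I want to find all the records that have duplicate values in this column. What query can I use to find the duplicates?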


Current answer

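I don't see any JOIN approaches here, even though joins are very useful for working with duplicates.

This approach returns the actual duplicated rows, not just the duplicated values.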

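-- Every row of my_table whose name also appears on at least one other row.
-- DISTINCT stops a row from repeating when its name occurs three or more times.
SELECT DISTINCT t1.*
FROM my_table AS t1
INNER JOIN my_table AS t2
    ON t1.name = t2.name AND t1.id <> t2.id
ORDER BY t1.name;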
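
The self-join idea can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are hypothetical; only my_table, id, and name come from the answer. The sketch uses INNER JOIN with DISTINCT so that each duplicated row is listed once.

```python
import sqlite3

# Hypothetical sample data; SQLite is an assumption standing in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO my_table (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "alice"), (4, "carol"), (5, "bob"), (6, "alice")],
)

# A row qualifies when some other row has the same name but a different id.
rows = conn.execute(
    """
    SELECT DISTINCT t1.id, t1.name
    FROM my_table AS t1
    INNER JOIN my_table AS t2
        ON t1.name = t2.name AND t1.id <> t2.id
    ORDER BY t1.name, t1.id
    """
).fetchall()

for row in rows:
    print(row)
```

Here "carol" is left out because no other row shares that name, while every "alice" and "bob" row is returned in full.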

Other answers

SELECT t.*,
       (SELECT COUNT(*) FROM city AS tt WHERE tt.name = t.name) AS count
FROM `city` AS t
WHERE (SELECT COUNT(*) FROM city AS tt WHERE tt.name = t.name) > 1
ORDER BY count DESC;

Replace city with your table name and name with your column name.

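If you want to remove rows that are duplicated across several fields, first concatenate those fields into a new key that identifies each distinct row, then use GROUP BY on that key to keep a single row per key: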

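-- Assumes '|' never occurs in f1 or f2; plain CONCAT(f1, f2) would treat ('ab','c') and ('a','bc') as the same key.
CREATE TEMPORARY TABLE tmp SELECT CONCAT_WS('|', f1, f2) AS cfs, t1.* FROM mytable AS t1;
CREATE INDEX x_tmp_cfs ON tmp (cfs);
CREATE TABLE unduptable SELECT f1, f2, ... FROM tmp GROUP BY cfs;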
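
As a rough illustration of the concatenated-key idea, here is a sketch using Python's built-in sqlite3 module instead of MySQL. The sample rows and the '|' separator are assumptions, SQLite's || operator stands in for a concatenation function, and the index step is skipped because the sample is tiny.

```python
import sqlite3

# Hypothetical sample data; mytable, f1 and f2 follow the answer's naming.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (f1 TEXT, f2 TEXT)")
conn.executemany(
    "INSERT INTO mytable (f1, f2) VALUES (?, ?)",
    [("ab", "c"), ("a", "bc"), ("ab", "c"), ("x", "y"), ("x", "y")],
)

# Build the key with a separator so ('ab', 'c') and ('a', 'bc') stay distinct.
# Assumption: '|' never appears in the data.
conn.execute(
    "CREATE TEMPORARY TABLE tmp AS "
    "SELECT f1 || '|' || f2 AS cfs, t1.* FROM mytable AS t1"
)

# One output row per key. MIN() is only there to satisfy strict GROUP BY rules;
# every row in a group carries the same f1 and f2.
conn.execute(
    "CREATE TABLE unduptable AS "
    "SELECT MIN(f1) AS f1, MIN(f2) AS f2 FROM tmp GROUP BY cfs"
)

deduped = conn.execute("SELECT f1, f2 FROM unduptable ORDER BY f1, f2").fetchall()
print(deduped)
```

Without the separator, "ab" + "c" and "a" + "bc" would both produce the key "abc" and one of those rows would be lost.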

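A very late contribution, in case it helps anyone down the line. I had a task to find matching pairs of transactions in a banking application (actually the two sides of account-to-account transfers), in order to identify the "from" and "to" of each inter-account transfer, so we ended up with this: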

SELECT 
    LEAST(primaryid, secondaryid) AS transactionid1,
    GREATEST(primaryid, secondaryid) AS transactionid2
FROM (
    SELECT table1.transactionid AS primaryid, 
        table2.transactionid AS secondaryid
    FROM financial_transactions table1
    INNER JOIN financial_transactions table2 
    ON table1.accountid = table2.accountid
    AND table1.transactionid <> table2.transactionid 
    AND table1.transactiondate = table2.transactiondate
    AND table1.sourceref = table2.destinationref
    AND table1.amount = (0 - table2.amount)
) AS DuplicateResultsTable
GROUP BY transactionid1, transactionid2
ORDER BY transactionid1;

The DuplicateResultsTable subquery returns rows of matching (i.e. duplicate) transactions, but it returns each pair twice, once in each order. The outer SELECT uses LEAST and GREATEST so that the two transaction IDs always appear in the same order in the results, which makes it safe to GROUP BY them and eliminate the mirrored matches. Both IDs are listed in the GROUP BY so that the query also runs with ONLY_FULL_GROUP_BY enabled. It ran through nearly a million records and identified 12,000+ matches in just under 2 seconds. Of course, transactionid is the primary index, which really helped.
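
The pair-normalization trick can be sketched on a toy data set. This uses Python's built-in sqlite3 module rather than MySQL, so SQLite's two-argument MIN and MAX stand in for LEAST and GREATEST, and DISTINCT stands in for the GROUP BY. The sample rows, and storing amounts as integer cents, are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financial_transactions ("
    "transactionid INTEGER PRIMARY KEY, accountid INTEGER, transactiondate TEXT, "
    "sourceref TEXT, destinationref TEXT, amount INTEGER)"
)
# Hypothetical rows: (1, 2) and (4, 5) are the two sides of a transfer, 3 is unmatched.
conn.executemany(
    "INSERT INTO financial_transactions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "2024-01-05", "A", "B", -10000),
        (2, 10, "2024-01-05", "B", "A", 10000),
        (3, 10, "2024-01-06", "A", "C", -2500),
        (4, 20, "2024-01-07", "D", "E", -700),
        (5, 20, "2024-01-07", "E", "D", 700),
    ],
)

# The join finds each pair twice, as (1, 2) and as (2, 1). Ordering the two ids
# with MIN/MAX makes both hits identical, so DISTINCT keeps one row per pair.
pairs = conn.execute(
    """
    SELECT DISTINCT
        MIN(t1.transactionid, t2.transactionid) AS transactionid1,
        MAX(t1.transactionid, t2.transactionid) AS transactionid2
    FROM financial_transactions AS t1
    INNER JOIN financial_transactions AS t2
        ON t1.accountid = t2.accountid
        AND t1.transactionid <> t2.transactionid
        AND t1.transactiondate = t2.transactiondate
        AND t1.sourceref = t2.destinationref
        AND t1.amount = -t2.amount
    ORDER BY transactionid1
    """
).fetchall()
print(pairs)
```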

The following code finds every product_id that is used more than once. You get only one record per product_id.

SELECT product_id FROM oc_product_reward GROUP BY product_id HAVING count( product_id ) >1

Code taken from: http://chandreshrana.blogspot.in/2014/12/find-duplicate-records-based-on-any.html
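
For a quick check of the GROUP BY ... HAVING pattern, here is a small sketch using Python's built-in sqlite3 module in place of MySQL. The sample values and the product_reward_id column are hypothetical, and the occurrences column is added only to show how often each value repeats.

```python
import sqlite3

# Hypothetical sample data; only oc_product_reward and product_id come from the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oc_product_reward ("
    "product_reward_id INTEGER PRIMARY KEY, product_id INTEGER)"
)
conn.executemany(
    "INSERT INTO oc_product_reward (product_id) VALUES (?)",
    [(30,), (41,), (30,), (52,), (41,), (30,)],
)

# One output row per product_id that occurs more than once.
repeated = conn.execute(
    """
    SELECT product_id, COUNT(*) AS occurrences
    FROM oc_product_reward
    GROUP BY product_id
    HAVING COUNT(*) > 1
    ORDER BY product_id
    """
).fetchall()
print(repeated)
```

The value 52 occurs only once, so it is filtered out by the HAVING clause.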
