我有一个varchar列的表,我想找到在这个列中有重复值的所有记录。我可以使用什么查询来查找重复项?
当前回答
为了获得所有包含复制的数据,我使用了以下方法:
SELECT * FROM TableName INNER JOIN(
SELECT DupliactedData FROM TableName GROUP BY DupliactedData HAVING COUNT(DupliactedData) > 1 order by DupliactedData)
temp ON TableName.DupliactedData = temp.DupliactedData;
TableName =您正在使用的表。
DupliactedData =您正在寻找的重复数据。
其他回答
我看到上面的结果和查询将工作良好,如果你需要检查单列值是重复的。比如电子邮件。
但如果你需要检查更多的列,并希望检查结果的组合,那么这个查询将正常工作:
SELECT COUNT(CONCAT(name,email)) AS tot,
name,
email
FROM users
GROUP BY CONCAT(name,email)
HAVING tot>1 (This query will SHOW the USER list which ARE greater THAN 1
AND also COUNT)
我更喜欢使用窗口函数(MySQL 8.0+)来查找副本,因为我可以看到整行:
WITH cte AS (
SELECT *
,COUNT(*) OVER(PARTITION BY col_name) AS num_of_duplicates_group
,ROW_NUMBER() OVER(PARTITION BY col_name ORDER BY col_name2) AS pos_in_group
FROM table
)
SELECT *
FROM cte
WHERE num_of_duplicates_group > 1;
DB小提琴演示
SELECT
t.*,
(SELECT COUNT(*) FROM city AS tt WHERE tt.name=t.name) AS count
FROM `city` AS t
WHERE
(SELECT count(*) FROM city AS tt WHERE tt.name=t.name) > 1 ORDER BY count DESC
假设您的表名为TableABC,您想要的列是Col, T1的主键是key。
SELECT a.Key, b.Key, a.Col
FROM TableABC a, TableABC b
WHERE a.Col = b.Col
AND a.Key <> b.Key
与上面的答案相比,这种方法的优点是它给出了Key。
如果要删除具有多个字段的重复行,首先将它们取消为唯一不同的行指定的新唯一键,然后使用group by命令删除具有相同新唯一键的重复行:
Create TEMPORARY table tmp select concat(f1,f2) as cfs,t1.* from mytable as t1;
Create index x_tmp_cfs on tmp(cfs);
Create table unduptable select f1,f2,... from tmp group by cfs;