我有一个表与以下字段:

id (Unique)
url (Unique)
title
company
site_id

现在,我需要删除具有相同标题、company和site_id的行。一种方法是使用下面的SQL和脚本(PHP):

SELECT title, site_id, location, id, count( * ) 
FROM jobs
GROUP BY site_id, company, title, location
HAVING count( * ) >1

运行此查询后,可以使用服务器端脚本删除重复项。

但是,我想知道这是否只能使用SQL查询。


当前回答

我有一个忘记在id行添加主键的表。虽然它在id上有auto_increment。但是有一天,一个东西在数据库中重放了mysql bin日志,插入了一些重复的行。

我删除重复的行

选择唯一的重复行并导出它们 选择T1。(select count(*) as c,id from table_name group by id) T2 on T1。id = T2。T2.c > 1 group by T1.id; 按id删除重复的行 从导出的数据中插入该行。 然后在id上添加主键

其他回答

我找到了一个简单的方法。(保持最新的)

DELETE t1 FROM table_name t1 INNER JOIN table_name t2 
WHERE t1.primary_id < t2.primary_id 
AND t1.check_duplicate_col_1 = t2.check_duplicate_col_1 
AND t1.check_duplicate_col_2 = t2.check_duplicate_col_2
...

一种简单易懂且不需要主键的解决方案:

add a new boolean column alter table mytable add tokeep boolean; add a constraint on the duplicated columns AND the new column alter table mytable add constraint preventdupe unique (mycol1, mycol2, tokeep); set the boolean column to true. This will succeed only on one of the duplicated rows because of the new constraint update ignore mytable set tokeep = true; delete rows that have not been marked as tokeep delete from mytable where tokeep is null; drop the added column alter table mytable drop tokeep;

我建议您保留您添加的约束,以便将来防止出现新的重复。

删除表中重复的记录。

delete from job s 
where rowid < any 
(select rowid from job k 
where s.site_id = k.site_id and 
s.title = k.title and 
s.company = k.company);

or

delete from job s 
where rowid not in 
(select max(rowid) from job k 
where s.site_id = k.site_id and
s.title = k.title and 
s.company = k.company);

MySQL对引用要删除的表有限制。你可以用一个临时表来解决这个问题,比如:

create temporary table tmpTable (id int);

insert  into tmpTable
        (id)
select  id
from    YourTable yt
where   exists
        (
        select  *
        from    YourTabe yt2
        where   yt2.title = yt.title
                and yt2.company = yt.company
                and yt2.site_id = yt.site_id
                and yt2.id > yt.id
        );

delete  
from    YourTable
where   ID in (select id from tmpTable);

以下是Kostanos在评论中的建议: 上面唯一缓慢的查询是DELETE,适用于数据库非常大的情况。这个查询可以更快:

DELETE FROM YourTable USING YourTable, tmpTable WHERE YourTable.id=tmpTable.id

我一直在访问这个页面,任何时候我谷歌“从mysql中删除重复”,但对于我的忽略解决方案不起作用,因为我有一个InnoDB mysql表

这段代码在任何时候都能更好地工作

CREATE TABLE tableToclean_temp LIKE tableToclean;
ALTER TABLE tableToclean_temp ADD UNIQUE INDEX (fontsinuse_id);
INSERT IGNORE INTO tableToclean_temp SELECT * FROM tableToclean;
DROP TABLE tableToclean;
RENAME TABLE tableToclean_temp TO tableToclean;

tableToclean =需要清理的表的名称

tableToclean_temp =创建和删除的临时表