我有一个表与以下字段:

id (Unique)
url (Unique)
title
company
site_id

现在,我需要删除具有相同标题、company和site_id的行。一种方法是使用下面的SQL和脚本(PHP):

SELECT title, site_id, location, id, count( * ) 
FROM jobs
GROUP BY site_id, company, title, location
HAVING count( * ) >1

运行此查询后,可以使用服务器端脚本删除重复项。

但是,我想知道这是否只能使用SQL查询。


当前回答

使用Delete JOIN语句删除重复的行:

DELETE t1 FROM table_name t1
JOIN table_name t2
WHERE
    t1.id < t2.id AND
    t1.title = t2.title AND t1.company = t2.company AND t1.site_id = t2.site_id;

其他回答

更快的方法是在临时表中插入不同的行。使用delete,我花了几个小时从一个800万行的表中删除重复项。使用insert和distinct,只花了13分钟。

CREATE TABLE tempTableName LIKE tableName;  
CREATE INDEX ix_all_id ON tableName(cellId,attributeId,entityRowId,value);  
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value) SELECT DISTINCT cellId,attributeId,entityRowId,value FROM tableName;  
TRUNCATE TABLE tableName;
INSERT INTO tableName SELECT * FROM tempTableName; 
DROP TABLE tempTableName;  

如果IGNORE语句在我的例子中不起作用,你可以使用下面的语句:

CREATE TABLE your_table_deduped LIKE your_table;


INSERT your_table_deduped
SELECT *
FROM your_table
GROUP BY index1_id,
         index2_id;

RENAME TABLE your_table TO your_table_with_dupes;

RENAME TABLE your_table_deduped TO your_table;

#OPTIONAL
ALTER TABLE `your_table` ADD UNIQUE `unique_index` (`index1_id`, `index2_id`);

#OPTIONAL
DROP TABLE your_table_with_dupes;

删除表中重复的记录。

delete from job s 
where rowid < any 
(select rowid from job k 
where s.site_id = k.site_id and 
s.title = k.title and 
s.company = k.company);

or

delete from job s 
where rowid not in 
(select max(rowid) from job k 
where s.site_id = k.site_id and
s.title = k.title and 
s.company = k.company);

这个解决方案将把重复的数据移到一个表中,唯一的数据移到另一个表中。

-- speed up creating uniques table if dealing with many rows
CREATE INDEX temp_idx ON jobs(site_id, company, title, location);

-- create the table with unique rows
INSERT jobs_uniques SELECT * FROM
    (
    SELECT * 
    FROM jobs
    GROUP BY site_id, company, title, location
    HAVING count(1) > 1
    UNION
    SELECT *
    FROM jobs
    GROUP BY site_id, company, title, location
    HAVING count(1) = 1
) x

-- create the table with duplicate rows
INSERT jobs_dupes 
SELECT * 
FROM jobs
WHERE id NOT IN
(SELECT id FROM jobs_uniques)

-- confirm the difference between uniques and dupes tables
SELECT COUNT(1)
AS jobs, 
(SELECT COUNT(1) FROM jobs_dupes) + (SELECT COUNT(1) FROM jobs_uniques)
AS sum
FROM jobs

我有一个忘记在id行添加主键的表。虽然它在id上有auto_increment。但是有一天,一个东西在数据库中重放了mysql bin日志,插入了一些重复的行。

我删除重复的行

选择唯一的重复行并导出它们 选择T1。(select count(*) as c,id from table_name group by id) T2 on T1。id = T2。T2.c > 1 group by T1.id; 按id删除重复的行 从导出的数据中插入该行。 然后在id上添加主键