我有一个表与以下字段:

id (Unique)
url (Unique)
title
company
site_id

现在,我需要删除具有相同标题、company和site_id的行。一种方法是使用下面的SQL和脚本(PHP):

SELECT title, site_id, location, id, count( * ) 
FROM jobs
GROUP BY site_id, company, title, location
HAVING count( * ) >1

运行此查询后,可以使用服务器端脚本删除重复项。

但是,我想知道这是否只能使用SQL查询。


当前回答

如果您试图删除其中一个副本并保留另一个副本,那么这是完美的。注意,如果没有子查询,您将得到#1093错误。

DELETE FROM table_name
WHERE id IN (
    SELECT * FROM (SELECT n.id FROM table_name n
    WHERE n.column2 != "value"
    GROUP BY n.column HAVING COUNT(n.column ) > 1) x
)

其他回答

我想更具体地说明我删除了哪些记录,下面是我的解决方案:

delete
from jobs c1
where not c1.location = 'Paris'
and  c1.site_id > 64218
and exists 
(  
select * from jobs c2 
where c2.site_id = c1.site_id
and   c2.company = c1.company
and   c2.location = c1.location
and   c2.title = c1.title
and   c2.site_id > 63412
and   c2.site_id < 64219
)

你可以很容易地从这个代码中删除重复的记录。

$qry = mysql_query("SELECT * from cities");
while($qry_row = mysql_fetch_array($qry))
{
$qry2 = mysql_query("SELECT * from cities2 where city = '".$qry_row['city']."'");

if(mysql_num_rows($qry2) > 1){
    while($row = mysql_fetch_array($qry2)){
        $city_arry[] = $row;

        }

    $total = sizeof($city_arry) - 1;
        for($i=1; $i<=$total; $i++){


            mysql_query( "delete from cities2 where town_id = '".$city_arry[$i][0]."'");

            }
    }
    //exit;
}

这将删除标题、公司和网站值相同的重复行。最后一个重复项将被保留,其余重复项将被删除(如果你想保留第一个重复项并删除其他重复项,将id的比较项更改为大于t1。Id > t2.id)

DELETE t1 FROM tablename t1
INNER JOIN tablename t2 
WHERE 
    t1.id < t2.id AND
    t1.title = t2.title AND
    t1.company=t2.company AND
    t1.site_ID=t2.site_ID;

如果你有一个有大量记录的大表,那么上述解决方案将不起作用或花费太多时间。然后我们有一个不同的解

-- Create temporary table

CREATE TABLE temp_table LIKE table1;

-- Add constraint
ALTER TABLE temp_table ADD UNIQUE(title, company,site_id);

-- Copy data
INSERT IGNORE INTO temp_table SELECT * FROM table1;

-- Rename and drop
RENAME TABLE table1 TO old_table1, temp_table TO table1;
DROP TABLE old_table1;

简单和快速的所有情况:

CREATE TEMPORARY TABLE IF NOT EXISTS _temp_duplicates AS (SELECT dub.id FROM table_with_duplications dub GROUP BY dub.field_must_be_uniq_1, dub.field_must_be_uniq_2 HAVING COUNT(*)  > 1);

DELETE FROM table_with_duplications WHERE id IN (SELECT id FROM _temp_duplicates);