我有一个表与以下字段:

id (Unique)
url (Unique)
title
company
site_id

现在,我需要删除具有相同标题、company和site_id的行。一种方法是使用下面的SQL和脚本(PHP):

SELECT title, site_id, location, id, count( * ) 
FROM jobs
GROUP BY site_id, company, title, location
HAVING count( * ) >1

运行此查询后,可以使用服务器端脚本删除重复项。

但是,我想知道这是否只能使用SQL查询。


当前回答

我有这个查询片段的SQLServer,但我认为它可以用在其他DBMS与小的变化:

DELETE
FROM Table
WHERE Table.idTable IN  (  
    SELECT MAX(idTable)
    FROM idTable
    GROUP BY field1, field2, field3
    HAVING COUNT(*) > 1)

我忘了告诉您,这个查询不会删除重复行中id最低的行。如果这对你有用,试试这个查询:

DELETE
FROM jobs
WHERE jobs.id IN  (  
    SELECT MAX(id)
    FROM jobs
    GROUP BY site_id, company, title, location
    HAVING COUNT(*) > 1)

其他回答

简单和快速的所有情况:

CREATE TEMPORARY TABLE IF NOT EXISTS _temp_duplicates AS (SELECT dub.id FROM table_with_duplications dub GROUP BY dub.field_must_be_uniq_1, dub.field_must_be_uniq_2 HAVING COUNT(*)  > 1);

DELETE FROM table_with_duplications WHERE id IN (SELECT id FROM _temp_duplicates);

如果不想更改列属性,那么可以使用下面的查询。

因为你有一个列,它有唯一的id(例如,auto_increment列),你可以用它来删除重复项:

DELETE `a`
FROM
    `jobs` AS `a`,
    `jobs` AS `b`
WHERE
    -- IMPORTANT: Ensures one version remains
    -- Change "ID" to your unique column's name
    `a`.`ID` < `b`.`ID`

    -- Any duplicates you want to check for
    AND (`a`.`title` = `b`.`title` OR `a`.`title` IS NULL AND `b`.`title` IS NULL)
    AND (`a`.`company` = `b`.`company` OR `a`.`company` IS NULL AND `b`.`company` IS NULL)
    AND (`a`.`site_id` = `b`.`site_id` OR `a`.`site_id` IS NULL AND `b`.`site_id` IS NULL);

在MySQL中,你可以用空安全的相等操作符(又名“宇宙飞船操作符”)来简化它:

DELETE `a`
FROM
    `jobs` AS `a`,
    `jobs` AS `b`
WHERE
    -- IMPORTANT: Ensures one version remains
    -- Change "ID" to your unique column's name
    `a`.`ID` < `b`.`ID`

    -- Any duplicates you want to check for
    AND `a`.`title` <=> `b`.`title`
    AND `a`.`company` <=> `b`.`company`
    AND `a`.`site_id` <=> `b`.`site_id`;

这将删除标题、公司和网站值相同的重复行。最后一个重复项将被保留,其余重复项将被删除(如果你想保留第一个重复项并删除其他重复项,将id的比较项更改为大于t1。Id > t2.id)

DELETE t1 FROM tablename t1
INNER JOIN tablename t2 
WHERE 
    t1.id < t2.id AND
    t1.title = t2.title AND
    t1.company=t2.company AND
    t1.site_ID=t2.site_ID;

你可以很容易地从这个代码中删除重复的记录。

$qry = mysql_query("SELECT * from cities");
while($qry_row = mysql_fetch_array($qry))
{
$qry2 = mysql_query("SELECT * from cities2 where city = '".$qry_row['city']."'");

if(mysql_num_rows($qry2) > 1){
    while($row = mysql_fetch_array($qry2)){
        $city_arry[] = $row;

        }

    $total = sizeof($city_arry) - 1;
        for($i=1; $i<=$total; $i++){


            mysql_query( "delete from cities2 where town_id = '".$city_arry[$i][0]."'");

            }
    }
    //exit;
}

我想更具体地说明我删除了哪些记录,下面是我的解决方案:

delete
from jobs c1
where not c1.location = 'Paris'
and  c1.site_id > 64218
and exists 
(  
select * from jobs c2 
where c2.site_id = c1.site_id
and   c2.company = c1.company
and   c2.location = c1.location
and   c2.title = c1.title
and   c2.site_id > 63412
and   c2.site_id < 64219
)