我有以下UPSERT在PostgreSQL 9.5:

INSERT INTO chats ("user", "contact", "name") 
           VALUES ($1, $2, $3), 
                  ($2, $1, NULL) 
ON CONFLICT("user", "contact") DO NOTHING
RETURNING id;

如果没有冲突,则返回如下内容:

----------
    | id |
----------
  1 | 50 |
----------
  2 | 51 |
----------

但是如果有冲突,它不会返回任何行:

----------
    | id |
----------

如果没有冲突,我想返回新的id列,或者返回冲突列的现有id列。 这能做到吗?如果有,怎么做?


当前回答

我也遇到了同样的问题,我使用“做更新”而不是“什么都不做”来解决它,尽管我没有什么可更新的。在你的情况下,它会是这样的:

INSERT INTO chats ("user", "contact", "name") 
       VALUES ($1, $2, $3), 
              ($2, $1, NULL) 
ON CONFLICT("user", "contact") 
DO UPDATE SET 
    name=EXCLUDED.name 
RETURNING id;

这个查询将返回所有的行,无论它们是刚刚插入的还是之前已经存在的。

其他回答

WITH e AS(
    INSERT INTO chats ("user", "contact", "name") 
           VALUES ($1, $2, $3), 
                  ($2, $1, NULL) 
    ON CONFLICT("user", "contact") DO NOTHING
    RETURNING id
)
SELECT * FROM e
UNION
    SELECT id FROM chats WHERE user=$1, contact=$2;

使用ON冲突DO NOTHING的主要目的是避免抛出错误,但它不会导致行返回。因此,我们需要另一个SELECT来获取现有的id。

在这个SQL中,如果它在冲突上失败,它将不返回任何东西,然后第二个SELECT将获得现有的行;如果插入成功,那么将会有两条相同的记录,然后我们需要UNION来合并结果。

对于单个项的插入,我可能会在返回id时使用一个coalesce:

WITH new_chats AS (
    INSERT INTO chats ("user", "contact", "name")
    VALUES ($1, $2, $3)
    ON CONFLICT("user", "contact") DO NOTHING
    RETURNING id
) SELECT COALESCE(
    (SELECT id FROM new_chats),
    (SELECT id FROM chats WHERE user = $1 AND contact = $2)
);

对于多个项的插入,你可以把这些值放在一个临时WITH中,并在以后引用它们:

WITH chats_values("user", "contact", "name") AS (
    VALUES ($1, $2, $3),
           ($4, $5, $6)
), new_chats AS (
    INSERT INTO chats ("user", "contact", "name")
    SELECT * FROM chat_values
    ON CONFLICT("user", "contact") DO NOTHING
    RETURNING id
) SELECT id
    FROM new_chats
   UNION
  SELECT chats.id
    FROM chats, chats_values
   WHERE chats.user = chats_values.user
     AND chats.contact = chats_values.contact;

注意:根据Erwin的评论,在您的应用程序将尝试'upsert'相同的数据并发(两个worker尝试同时插入<unique_field> = 1),并且这样的数据还不存在于表中,您应该在运行'upsert'之前更改事务的隔离级别:

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

在这种特定情况下,两个事务中的一个将被中止。如果这种情况在您的应用程序中经常发生,您可能希望只执行2个单独的查询,否则,处理错误并重新执行查询会更容易、更快。

根据Erwin上面的回答(顺便说一句,这是一个很棒的答案,如果没有它,我永远不会走到这里!),这就是我最终的结果。它解决了两个额外的潜在问题——通过对输入集执行不同的选择,它允许重复(否则会抛出错误),并且它确保返回的id与输入集完全匹配,包括相同的顺序和允许重复。

Additionally, and one part that was important for me, it significantly reduces the number of unnecessary sequence advancements using the new_rows CTE to only try inserting the ones that aren't already in there. Considering the possibility of concurrent writes, it will still hit some conflicts in that reduced set, but the later steps will take care of that. In most cases, sequence gaps aren't a big deal, but when you're doing billions of upserts, with a high percentage of conflicts, it can make the difference between using an int or a bigint for the ID.

尽管它又大又丑,但性能非常好。我用数百万个upserts、高并发性和大量冲突对它进行了广泛测试。坚如磐石。

我将其打包为一个函数,但如果这不是您想要的,那么应该很容易看到如何将其转换为纯SQL。我还将示例数据更改为简单的内容。

CREATE TABLE foo
(
  bar varchar PRIMARY KEY,
  id  serial
);
CREATE TYPE ids_type AS (id integer);
CREATE TYPE bars_type AS (bar varchar);

CREATE OR REPLACE FUNCTION upsert_foobars(_vals bars_type[])
  RETURNS SETOF ids_type AS
$$
BEGIN
  RETURN QUERY
    WITH
      all_rows AS (
        SELECT bar, ordinality
        FROM UNNEST(_vals) WITH ORDINALITY
      ),
      dist_rows AS (
        SELECT DISTINCT bar
        FROM all_rows
      ),
      new_rows AS (
        SELECT d.bar
        FROM dist_rows d
             LEFT JOIN foo f USING (bar)
        WHERE f.bar IS NULL
      ),
      ins AS (
        INSERT INTO foo (bar)
          SELECT bar
          FROM new_rows
          ORDER BY bar
          ON CONFLICT DO NOTHING
          RETURNING bar, id
      ),
      sel AS (
        SELECT bar, id
        FROM ins
        UNION ALL
        SELECT f.bar, f.id
        FROM dist_rows
             JOIN foo f USING (bar)
      ),
      ups AS (
        INSERT INTO foo AS f (bar)
          SELECT d.bar
          FROM dist_rows d
               LEFT JOIN sel s USING (bar)
          WHERE s.bar IS NULL
          ORDER BY bar
          ON CONFLICT ON CONSTRAINT foo_pkey DO UPDATE
            SET bar = f.bar
          RETURNING bar, id
      ),
      fin AS (
        SELECT bar, id
        FROM sel
        UNION ALL
        TABLE ups
      )
    SELECT f.id
    FROM all_rows a
         JOIN fin f USING (bar)
    ORDER BY a.ordinality;
END
$$ LANGUAGE plpgsql;

我修改了Erwin Brandstetter的惊人答案,它不会增加序列,也不会写锁任何行。我对PostgreSQL比较陌生,所以如果你看到这个方法的任何缺点,请随时告诉我:

WITH input_rows(usr, contact, name) AS (
   VALUES
      (text 'foo1', text 'bar1', text 'bob1')  -- type casts in first row
    , ('foo2', 'bar2', 'bob2')
    -- more?
   )
, new_rows AS (
   SELECT 
     c.usr
     , c.contact
     , c.name
     , r.id IS NOT NULL as row_exists
   FROM input_rows AS r
   LEFT JOIN chats AS c ON r.usr=c.usr AND r.contact=c.contact
   )
INSERT INTO chats (usr, contact, name)
SELECT usr, contact, name
FROM new_rows
WHERE NOT row_exists
RETURNING id, usr, contact, name

这假设表聊天对列(usr, contact)有唯一的约束。

更新:添加了来自spatar的建议修订(如下)。谢谢!

根据Revinand的评论,还有一个更新:

WITH input_rows(usr, contact, name) AS (
   VALUES
      (text 'foo1', text 'bar1', text 'bob1')  -- type casts in first row
    , ('foo2', 'bar2', 'bob2')
    -- more?
   )
, new_rows AS (
   INSERT INTO chats (usr, contact, name)
   SELECT 
     c.usr
     , c.contact
     , c.name
   FROM input_rows AS r
   LEFT JOIN chats AS c ON r.usr=c.usr AND r.contact=c.contact
   WHERE r.id IS NULL
   RETURNING id, usr, contact, name
   )
SELECT id, usr, contact, name, 'new' as row_type
FROM new_rows
UNION ALL
SELECT id, usr, contact, name, 'update' as row_type
FROM input_rows AS ir
INNER JOIN chats AS c ON ir.usr=c.usr AND ir.contact=c.contact

我没有测试上面的内容,但是如果您发现新插入的行被多次返回,那么您可以将UNION ALL更改为只是UNION,或者(更好),完全删除第一个查询。

Upsert是INSERT查询的扩展,在约束冲突的情况下可以用两种不同的行为定义:DO NOTHING或DO UPDATE。

INSERT INTO upsert_table VALUES (2, 6, 'upserted')
   ON CONFLICT DO NOTHING RETURNING *;

 id | sub_id | status
----+--------+--------
 (0 rows)

还要注意,RETURNING没有返回任何东西,因为没有插入任何元组。现在使用DO UPDATE,可以在与之冲突的元组上执行操作。首先请注意,定义一个约束是很重要的,它将用于定义存在冲突。

INSERT INTO upsert_table VALUES (2, 2, 'inserted')
   ON CONFLICT ON CONSTRAINT upsert_table_sub_id_key
   DO UPDATE SET status = 'upserted' RETURNING *;

 id | sub_id |  status
----+--------+----------
  2 |      2 | upserted
(1 row)