I want to run this query:

SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM purchases
WHERE purchases.product_id = 1
ORDER BY purchases.purchased_at DESC

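But I get this error:

SELECT DISTINCT ON expressions must match initial ORDER BY expressions

Adding address_id as the first ORDER BY expression silences the error, but I really don't want to sort by address_id. Is it possible to do this without ordering by address_id?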


Current answer

A subquery can solve it:

SELECT *
FROM  (
    SELECT DISTINCT ON (address_id) *
    FROM   purchases
    WHERE  product_id = 1
    ) p
ORDER  BY purchased_at DESC;

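The leading expressions in ORDER BY must agree with the columns in DISTINCT ON, so you can't order by different columns in the same SELECT.

Only add an ORDER BY inside the subquery if you want to pick a particular row from each set: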

SELECT *
FROM  (
    SELECT DISTINCT ON (address_id) *
    FROM   purchases
    WHERE  product_id = 1
    ORDER  BY address_id, purchased_at DESC  -- get "latest" row per address_id
    ) p
ORDER  BY purchased_at DESC;
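
To make the two steps concrete, here is a minimal Python sketch of what the nested query computes. The sample rows and the helper name latest_per_address are assumptions for illustration, not part of the original answer. The inner step keeps the latest row per address_id, and the outer step sorts the surviving rows by purchased_at alone.

```python
from datetime import date

# Hypothetical sample rows standing in for the purchases table.
purchases = [
    {"id": 1, "address_id": 10, "product_id": 1, "purchased_at": date(2023, 1, 5)},
    {"id": 2, "address_id": 10, "product_id": 1, "purchased_at": date(2023, 2, 1)},
    {"id": 3, "address_id": 20, "product_id": 1, "purchased_at": date(2023, 3, 1)},
    {"id": 4, "address_id": 30, "product_id": 2, "purchased_at": date(2023, 4, 1)},
    {"id": 5, "address_id": 30, "product_id": 1, "purchased_at": date(2023, 1, 1)},
]


def latest_per_address(rows, product_id):
    """Emulate DISTINCT ON (address_id) ... ORDER BY address_id, purchased_at DESC."""
    latest = {}
    for row in rows:
        if row["product_id"] != product_id:
            continue  # WHERE product_id = ...
        best = latest.get(row["address_id"])
        if best is None or row["purchased_at"] > best["purchased_at"]:
            latest[row["address_id"]] = row
    return list(latest.values())


# Outer query: ORDER BY purchased_at DESC, independent of address_id.
result = sorted(latest_per_address(purchases, 1),
                key=lambda r: r["purchased_at"], reverse=True)
result_ids = [r["id"] for r in result]
```

With this data the final order follows purchased_at, not address_id, which is the point of moving the sort to the outer query.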

If purchased_at can be NULL, use DESC NULLS LAST, and match your index for best performance. See:

Sort by column ASC, but NULL values first?
Why does ORDER BY NULLS LAST affect the query plan on a primary key?
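
As a rough illustration of what DESC NULLS LAST changes, the Python sketch below sorts hypothetical purchased_at values with None standing in for NULL. In PostgreSQL a plain DESC sort puts NULLs first, so a row without a timestamp would be picked as the "latest" one. NULLS LAST pushes such rows to the end. The data and function names are assumptions for illustration.

```python
from datetime import date

# Hypothetical purchased_at values for one address_id; None plays the role of NULL.
timestamps = [date(2023, 1, 5), None, date(2023, 3, 1)]


def sort_desc_nulls_last(values):
    """Order like ORDER BY ... DESC NULLS LAST: newest first, None at the end."""
    present = sorted((v for v in values if v is not None), reverse=True)
    missing = [v for v in values if v is None]
    return present + missing


def sort_desc_nulls_first(values):
    """Order like PostgreSQL's default for DESC, which sorts NULLs first."""
    present = sorted((v for v in values if v is not None), reverse=True)
    missing = [v for v in values if v is None]
    return missing + present


nulls_last = sort_desc_nulls_last(timestamps)
nulls_first = sort_desc_nulls_first(timestamps)
```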

Related, with more explanation:

Select first row in each GROUP BY group?
Sort by column ASC, but NULL values first?

Other answers

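You can also do this with a GROUP BY clause, as long as you only need address_id and the latest purchased_at (every other selected column would have to appear in GROUP BY or inside an aggregate):

SELECT address_id, max(purchased_at) AS latest_purchased_at
FROM   purchases
WHERE  product_id = 1
GROUP  BY address_id
ORDER  BY latest_purchased_at DESC;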
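
A runnable sketch of the GROUP BY approach, using Python's built-in sqlite3 module and made-up sample rows. The table contents and the latest_purchased_at alias are assumptions for illustration, and SQLite stands in for PostgreSQL here. It groups by address_id alone and takes max(purchased_at), so each address appears once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, address_id INTEGER, "
    "product_id INTEGER, purchased_at TEXT)"
)
# Hypothetical rows; ISO date strings sort correctly as text.
conn.executemany(
    "INSERT INTO purchases (address_id, product_id, purchased_at) VALUES (?, ?, ?)",
    [
        (10, 1, "2023-01-05"),
        (10, 1, "2023-02-01"),
        (20, 1, "2023-03-01"),
        (30, 2, "2023-04-01"),
        (30, 1, "2023-01-01"),
    ],
)

# One row per address_id, newest purchase first.
rows = conn.execute(
    """
    SELECT address_id, max(purchased_at) AS latest_purchased_at
    FROM   purchases
    WHERE  product_id = 1
    GROUP  BY address_id
    ORDER  BY latest_purchased_at DESC
    """
).fetchall()
conn.close()
```

The trade-off is that only the grouped column and aggregates come back. To return the whole row, the DISTINCT ON or row_number() variants are a better fit.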

It can also be solved with the following query, in addition to the other answers:

WITH purchase_data AS (
        SELECT address_id, purchased_at, product_id,
                row_number() OVER (PARTITION BY address_id ORDER BY purchased_at DESC) AS row_number
        FROM purchases
        WHERE product_id = 1)
SELECT address_id, purchased_at, product_id
FROM purchase_data where row_number = 1
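
The row_number() approach can be tried without a PostgreSQL server. The sketch below runs the same idea through Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first release with window functions). The sample rows, the rn alias, and the final ORDER BY are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, address_id INTEGER, "
    "product_id INTEGER, purchased_at TEXT)"
)
# Hypothetical rows; ISO date strings sort correctly as text.
conn.executemany(
    "INSERT INTO purchases (address_id, product_id, purchased_at) VALUES (?, ?, ?)",
    [
        (10, 1, "2023-01-05"),
        (10, 1, "2023-02-01"),
        (20, 1, "2023-03-01"),
        (30, 2, "2023-04-01"),
        (30, 1, "2023-01-01"),
    ],
)

# Number the rows of each address newest-first, keep row 1, then sort globally.
rows = conn.execute(
    """
    WITH purchase_data AS (
        SELECT address_id, purchased_at, product_id,
               row_number() OVER (PARTITION BY address_id
                                  ORDER BY purchased_at DESC) AS rn
        FROM   purchases
        WHERE  product_id = 1
    )
    SELECT address_id, purchased_at, product_id
    FROM   purchase_data
    WHERE  rn = 1
    ORDER  BY purchased_at DESC
    """
).fetchall()
conn.close()
```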

For anyone using Flask-SQLAlchemy, this worked for me:

from app import db
from app.models import Purchases
from sqlalchemy.orm import aliased
from sqlalchemy import desc

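# DISTINCT ON (address_id) in a subquery, mapped back onto the model
stmt = Purchases.query.distinct(Purchases.address_id).subquery('purchases')
alias = aliased(Purchases, stmt)
distinct = db.session.query(alias)
# order_by() returns a new query, so keep the result
distinct = distinct.order_by(desc(alias.purchased_at))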

Window functions can solve this in one pass:

SELECT DISTINCT ON (address_id) 
   LAST_VALUE(purchases.address_id) OVER wnd AS address_id
FROM "purchases"
WHERE "purchases"."product_id" = 1
WINDOW wnd AS (
   PARTITION BY address_id ORDER BY purchases.purchased_at DESC
   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
