I want to run this query:
SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
FROM purchases
WHERE purchases.product_id = 1
ORDER BY purchases.purchased_at DESC
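But I get this error:
ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
Adding address_id as the first ORDER BY expression silences the error, but I really don't want to sort by address_id. Is it possible to do this without ordering by address_id?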
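The documentation says:
DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. [...] Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. [...] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).
Official documentation
So you have to add address_id to the ORDER BY.
Alternatively, if you want the full row containing the most recent purchase for each address_id, with the result sorted by purchased_at, then you are solving a greatest-N-per-group problem, which can be handled with the following approaches.
A general solution that should work in most DBMSs: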
SELECT t1.*
FROM purchases t1
JOIN (
    SELECT address_id, max(purchased_at) AS max_purchased_at
    FROM purchases
    WHERE product_id = 1
    GROUP BY address_id
) t2
    ON t1.address_id = t2.address_id AND t1.purchased_at = t2.max_purchased_at
ORDER BY t1.purchased_at DESC
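One caveat with the join above: if two rows for the same address_id share the same maximum purchased_at, both come back. A possible variant, offered only as a sketch, uses the row_number() window function to keep exactly one row per address. It assumes a unique column named id as a tie-breaker; that column is an assumption and does not appear in the question.

```sql
-- Sketch only: "id" is an assumed unique column used as a tie-breaker;
-- it is not part of the schema shown in the question.
SELECT *
FROM (
    SELECT p.*,
           row_number() OVER (
               PARTITION BY address_id              -- one ranking per address
               ORDER BY purchased_at DESC, id DESC  -- latest purchase first
           ) AS rn
    FROM purchases p
    WHERE product_id = 1
) ranked
WHERE rn = 1                  -- keep only the top-ranked row per address
ORDER BY purchased_at DESC;
```

Note that the result carries the extra rn column; list the columns explicitly in the outer SELECT if that is unwanted.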
A more PostgreSQL-oriented solution, based on @hkf's answer:
SELECT *
FROM (
    SELECT DISTINCT ON (address_id) *
    FROM purchases
    WHERE product_id = 1
    ORDER BY address_id, purchased_at DESC
) t
ORDER BY purchased_at DESC
Problem clarified, extended and solved at: Selecting rows ordered by some column and distinct on another
A subquery can solve it:
SELECT *
FROM (
    SELECT DISTINCT ON (address_id) *
    FROM purchases
    WHERE product_id = 1
) p
ORDER BY purchased_at DESC;
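Leading expressions in ORDER BY have to agree with the columns in DISTINCT ON, so you can't order by different columns in the same SELECT.
Only add an ORDER BY inside the subquery if you want to pick a particular row from each set: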
SELECT *
FROM (
    SELECT DISTINCT ON (address_id) *
    FROM purchases
    WHERE product_id = 1
    ORDER BY address_id, purchased_at DESC  -- get "latest" row per address_id
) p
ORDER BY purchased_at DESC;
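If purchased_at can be NULL, use DESC NULLS LAST, and match your index for best performance. See:
Sort by column ASC, but NULL values first?
Why does ORDER BY NULLS LAST affect the query plan on a primary key?
Related, with more explanation:
Select first row in each GROUP BY group?
Sort by column ASC, but NULL values first?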
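The NULLS LAST advice above could look roughly like this. It is a sketch, not a tuned recommendation: the index name is made up for illustration, and whether the planner actually uses the index depends on your data and PostgreSQL version.

```sql
-- Hypothetical index name. The column order and sort direction mirror the
-- inner ORDER BY so that the planner can potentially read rows pre-sorted.
CREATE INDEX purchases_product_addr_purchased_idx
    ON purchases (product_id, address_id, purchased_at DESC NULLS LAST);

SELECT *
FROM (
    SELECT DISTINCT ON (address_id) *
    FROM purchases
    WHERE product_id = 1
    ORDER BY address_id, purchased_at DESC NULLS LAST  -- rows with NULL sort last per address
) p
ORDER BY purchased_at DESC NULLS LAST;
```

Without NULLS LAST, a descending sort in PostgreSQL puts NULL values first, so a row with a NULL purchased_at would be picked as the "latest" one for its address.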