正如标题所示,我想选择用GROUP BY分组的每组行中的第一行。

具体来说,如果我有一个如下所示的采购表:

SELECT * FROM purchases;

我的输出:

id customer total
1 Joe 5
2 Sally 3
3 Joe 2
4 Sally 1

我想查询每个客户的最大购买量(总购买量)。类似于:

SELECT FIRST(id), customer, FIRST(total)
FROM  purchases
GROUP BY customer
ORDER BY total DESC;

预期输出:

FIRST(id) customer FIRST(total)
1 Joe 5
2 Sally 3

当前回答

对PostgreSQL、U-SQL、IBM DB2和Google BigQuery SQL使用ARRAY_AGG函数:

SELECT customer, (ARRAY_AGG(id ORDER BY total DESC))[1], MAX(total)
FROM purchases
GROUP BY customer

其他回答

对PostgreSQL、U-SQL、IBM DB2和Google BigQuery SQL使用ARRAY_AGG函数:

SELECT customer, (ARRAY_AGG(id ORDER BY total DESC))[1], MAX(total)
FROM purchases
GROUP BY customer

这是我们如何通过使用windows函数实现的:

    create table purchases (id int4, customer varchar(10), total integer);
    insert into purchases values (1, 'Joe', 5);
    insert into purchases values (2, 'Sally', 3);
    insert into purchases values (3, 'Joe', 2);
    insert into purchases values (4, 'Sally', 1);
    
    select ID, CUSTOMER, TOTAL from (
    select ID, CUSTOMER, TOTAL,
    row_number () over (partition by CUSTOMER order by TOTAL desc) RN
    from purchases) A where RN = 1;

这可以通过MAX FUNCTION on total和GROUP by id和customer轻松实现。

SELECT id, customer, MAX(total) FROM  purchases GROUP BY id, customer
ORDER BY total DESC;

非常快速的解决方案

SELECT a.* 
FROM
    purchases a 
    JOIN ( 
        SELECT customer, min( id ) as id 
        FROM purchases 
        GROUP BY customer 
    ) b USING ( id );

如果表是按id索引的,则速度非常快:

create index purchases_id on purchases (id);

在PostgreSQL中,另一种可能是将first_value窗口函数与SELECT DISTINCT结合使用:

select distinct customer_id,
                first_value(row(id, total)) over(partition by customer_id order by total desc, id)
from            purchases;

我创建了一个组合(id,total),因此两个值都由同一个聚合返回。当然,您可以始终应用first_value()两次。