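Suppose I have a table of customers and a table of purchases. Each purchase belongs to one customer. I want to get a list of all customers along with their last purchase in a single SELECT statement. What is the best practice? Any advice on building indexes?

Please use these table/column names in your answer:

customer: id, name
purchase: id, customer_id, item_id, date

In more complicated situations, would it be beneficial (performance-wise) to denormalize the database by putting the last purchase into the customer table?

If the (purchase) id is guaranteed to be sorted by date, can the statement be simplified by using something like LIMIT 1?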


Current answer

Tables:

Customer => id, name
Purchase => id, customer_id, item_id, date

Query:

SELECT C.id, C.name, P.id, P.date
  FROM customer AS C
  LEFT JOIN purchase AS P ON 
    (
      P.customer_id = C.id 
      AND P.id IN (
        SELECT MAX(PP.id) FROM purchase AS PP GROUP BY PP.customer_id
      )
    )

You can also add conditions to the subquery.
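As a rough illustration of adding a condition to the subquery, here is a minimal sketch that runs the query above against an in-memory SQLite database from Python. The sample rows and the date cutoff are made up for the example, and it assumes (as the query itself does) that a higher purchase id means a later purchase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER,
        date TEXT  -- ISO-8601 strings, so string comparison follows date order
    );
    -- Made-up sample data: Carol has no purchases.
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO purchase VALUES
        (1, 1, 10, '2023-01-05'),
        (2, 1, 11, '2023-03-01'),
        (3, 2, 12, '2023-02-14');
""")

# Same query as above, with an extra (hypothetical) condition in the
# subquery: only purchases made before 2023-03-01 are considered.
rows = conn.execute("""
    SELECT C.id, C.name, P.id, P.date
      FROM customer AS C
      LEFT JOIN purchase AS P ON
        (
          P.customer_id = C.id
          AND P.id IN (
            SELECT MAX(PP.id) FROM purchase AS PP
             WHERE PP.date < '2023-03-01'
             GROUP BY PP.customer_id
          )
        )
     ORDER BY C.id
""").fetchall()

for row in rows:
    print(row)
```

With this data, Alice's latest purchase before the cutoff is purchase 1, and Carol still appears with NULLs because of the LEFT JOIN.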

Other answers

Tested on SQLite:

SELECT c.*, p.*, max(p.date)
FROM customer c
LEFT OUTER JOIN purchase p
ON c.id = p.customer_id
GROUP BY c.id

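The max() aggregate function makes sure that the latest purchase is selected from each group (this assumes the date column is stored in a format for which max() returns the latest value, which is normally the case). If you need to handle purchases made on the same date, you can use max(p.date, p.id). Be aware, though, that SQLite treats max() with more than one argument as a scalar function rather than an aggregate, so verify the result on your own data.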

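In terms of indexes, I would use an index on purchase with (customer_id, date, [any other purchase columns you want to return in your select]).

The LEFT OUTER JOIN (as opposed to INNER JOIN) makes sure that customers who have never made a purchase are also included.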
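To see this behaviour in practice, here is a small sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration. It relies on SQLite's documented handling of bare columns next to a single max() aggregate, which other databases generally do not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER,
        date TEXT  -- ISO-8601 strings, so max() picks the latest date
    );
    -- Made-up sample data: Carol has never bought anything.
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO purchase VALUES
        (1, 1, 10, '2023-01-05'),
        (2, 1, 11, '2023-03-01'),
        (3, 2, 12, '2023-02-14');
""")

# In SQLite, the bare columns p.id and p.item_id are taken from the row
# that holds the maximum p.date within each group.
rows = conn.execute("""
    SELECT c.id, c.name, p.id, p.item_id, max(p.date)
    FROM customer c
    LEFT OUTER JOIN purchase p
    ON c.id = p.customer_id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Carol shows up with NULL purchase columns, which is the effect of the LEFT OUTER JOIN described above.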

On SQL Server you could use:

SELECT *
FROM customer c
INNER JOIN purchase p on c.id = p.customer_id
WHERE p.id = (
    SELECT TOP 1 p2.id
    FROM purchase p2
    WHERE p.customer_id = p2.customer_id
    ORDER BY date DESC
)

SQL Server Fiddle: http://sqlfiddle.com/#!18/262fd/2

On MySQL you could use:

SELECT c.name, date
FROM customer c
INNER JOIN purchase p on c.id = p.customer_id
WHERE p.id = (
    SELECT p2.id
    FROM purchase p2
    WHERE p.customer_id = p2.customer_id
    ORDER BY date DESC
    LIMIT 1
)

MySQL Fiddle: http://sqlfiddle.com/#!9/202613/7

Please try this:

SELECT
    c.id,
    c.name,
    -- price is not among the columns listed in the question; it is assumed here
    (SELECT pi.price FROM purchase pi WHERE pi.id = MAX(p.id)) AS [LastPurchasePrice]
FROM customer c
INNER JOIN purchase p ON c.id = p.customer_id
GROUP BY c.id, c.name;

Leaving the code aside for a moment, the logic/algorithm is as follows:

1. Go to the transaction table, which holds multiple records for the same client.

2. Select each clientID together with the latest date of that client's activity, using GROUP BY clientID and max(transactionDate):

    select clientID, max(transactionDate) as latestDate
    from transaction
    group by clientID

3. Inner join the transaction table with the outcome of step 2. You then have the full records of the transaction table, restricted to each client's latest record:

    select *
    from transaction t
    inner join (
        select clientID, max(transactionDate) as latestDate
        from transaction
        group by clientID
    ) d
    on t.clientID = d.clientID and t.transactionDate = d.latestDate

4. You can join the result of step 3 to any other table you need to get different results.
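The steps above can be sketched as follows with Python's sqlite3 module. This is only an illustration: it uses the question's purchase table and column names instead of transaction/clientID, and the sample rows are made up. Bob's two purchases deliberately share the same date to show what joining on the maximum date does with ties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        item_id INTEGER,
        date TEXT  -- ISO-8601 strings, so max() picks the latest date
    );
    -- Made-up sample data: customer 2 has two purchases on the same date.
    INSERT INTO purchase VALUES
        (1, 1, 10, '2023-01-05'),
        (2, 1, 11, '2023-03-01'),
        (3, 2, 12, '2023-02-14'),
        (4, 2, 13, '2023-02-14');
""")

# Step 2 is the derived table d; step 3 joins it back to the full table.
rows = conn.execute("""
    SELECT p.customer_id, p.id, p.date
    FROM purchase p
    INNER JOIN (
        SELECT customer_id, max(date) AS latest_date
        FROM purchase
        GROUP BY customer_id
    ) d
    ON p.customer_id = d.customer_id AND p.date = d.latest_date
    ORDER BY p.customer_id, p.id
""").fetchall()

for row in rows:
    print(row)
```

Because the join matches on the date alone, customer 2 comes back twice, once per tied purchase. If exactly one row per customer is required, an extra tie-breaker (for example on id) is needed.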

This is an example of the greatest-n-per-group problem that comes up regularly on StackOverflow.

Here's how I usually recommend solving it:

SELECT c.*, p1.*
FROM customer c
JOIN purchase p1 ON (c.id = p1.customer_id)
LEFT OUTER JOIN purchase p2 ON (c.id = p2.customer_id AND 
    (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id)))
WHERE p2.id IS NULL;

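Explanation: given a row p1, there should be no row p2 with the same customer and a later date (or, in the case of ties, a later id). When we find that to be true, p1 is the most recent purchase for that customer.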
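To make the explanation concrete, here is a minimal sketch that runs the same join in an in-memory SQLite database from Python. The sample rows are invented, and Bob has two purchases on the same date so that the id tie-breaker comes into play.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER,
        date TEXT  -- ISO-8601 strings, so < compares them chronologically
    );
    -- Made-up sample data: Bob has a tie on date, Carol has no purchases.
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO purchase VALUES
        (1, 1, 10, '2023-01-05'),
        (2, 1, 11, '2023-03-01'),
        (3, 2, 12, '2023-02-14'),
        (4, 2, 13, '2023-02-14');
""")

# p1 is kept only when no "later" row p2 exists for the same customer.
rows = conn.execute("""
    SELECT c.id, c.name, p1.id, p1.date
    FROM customer c
    JOIN purchase p1 ON (c.id = p1.customer_id)
    LEFT OUTER JOIN purchase p2 ON (c.id = p2.customer_id AND
        (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id)))
    WHERE p2.id IS NULL
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Each customer with purchases appears exactly once, and Bob's tie is resolved in favour of the higher id. Carol is absent because the first join is an inner join; switching it to a LEFT OUTER JOIN would be one way to keep customers without purchases.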

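Regarding indexes, I'd create a compound index on purchase over the columns (customer_id, date, id). That may allow the outer join to be done using a covering index. Be sure to test on your platform, because optimization is implementation-dependent. Use the features of your RDBMS to analyze the optimization plan, e.g. EXPLAIN on MySQL.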
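As a sketch of how one might check this, the following creates the compound index in an in-memory SQLite database and prints the plan using SQLite's EXPLAIN QUERY PLAN (the SQLite counterpart of MySQL's EXPLAIN). The index name is made up, and the plan text depends on the SQLite version and on the data, so treat the output as something to inspect rather than a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER,
        date TEXT
    );
    -- Compound index over (customer_id, date, id); the name is hypothetical.
    CREATE INDEX idx_purchase_customer_date_id
        ON purchase (customer_id, date, id);
""")

# Ask SQLite how it would execute the greatest-n-per-group join.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT c.*, p1.*
    FROM customer c
    JOIN purchase p1 ON (c.id = p1.customer_id)
    LEFT OUTER JOIN purchase p2 ON (c.id = p2.customer_id AND
        (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id)))
    WHERE p2.id IS NULL
""").fetchall()

# The human-readable description is the last column of each plan row.
for row in plan:
    print(row[-1])
```

If the plan mentions the index for the p2 lookup, the outer join is being served from the index; if not, try again with realistic data volumes before drawing conclusions.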


Some people use subqueries instead of the solution I show above, but I find that my solution makes it easier to resolve ties.