USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN(43659,43664);
我读过那个条款,我不明白为什么我需要它。
Over函数是做什么的?Partitioning By有什么作用?
为什么我不能写Group By SalesOrderID查询?
OVER子句的强大之处在于,无论是否使用GROUP BY,都可以在不同的范围内进行聚合(“窗口”)
示例:获取每个SalesOrderID的计数和所有计数
SELECT
SalesOrderID, ProductID, OrderQty
,COUNT(OrderQty) AS 'Count'
,COUNT(*) OVER () AS 'CountAll'
FROM Sales.SalesOrderDetail
WHERE
SalesOrderID IN(43659,43664)
GROUP BY
SalesOrderID, ProductID, OrderQty
获得不同的计数,没有GROUP BY
SELECT
SalesOrderID, ProductID, OrderQty
,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'
,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct',
,COUNT(*) OVER () AS 'CountAllAgain'
FROM Sales.SalesOrderDetail
WHERE
SalesOrderID IN(43659,43664)
让我用一个例子来解释,你就能明白它是如何工作的。
假设你有下面的表DIM_EQUIPMENT:
VIN MAKE MODEL YEAR COLOR
-----------------------------------------
1234ASDF Ford Taurus 2008 White
1234JKLM Chevy Truck 2005 Green
5678ASDF Ford Mustang 2008 Yellow
在SQL下面运行
SELECT VIN,
MAKE,
MODEL,
YEAR,
COLOR ,
COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
FROM DIM_EQUIPMENT
结果如下所示
VIN MAKE MODEL YEAR COLOR COUNT2
----------------------------------------------
1234JKLM Chevy Truck 2005 Green 1
5678ASDF Ford Mustang 2008 Yellow 2
1234ASDF Ford Taurus 2008 White 2
看看发生了什么。
你能够计数没有组通过年和匹配ROW。
另一种有趣的方法是获得相同的结果,如果下面使用WITH子句,WITH作为内联视图,可以简化查询,特别是复杂的查询,但这里不是这样,因为我只是试图展示用法
WITH EQ AS
( SELECT YEAR AS YEAR2, COUNT(*) AS COUNT2 FROM DIM_EQUIPMENT GROUP BY YEAR
)
SELECT VIN,
MAKE,
MODEL,
YEAR,
COLOR,
COUNT2
FROM DIM_EQUIPMENT,
EQ
WHERE EQ.YEAR2=DIM_EQUIPMENT.YEAR;
如果您只想通过SalesOrderID分组,那么您将无法在SELECT子句中包括ProductID和OrderQty列。
PARTITION BY子句可以分解聚合函数。一个明显而有用的例子是,如果你想为订单中的订单行生成行号:
SELECT
O.order_id,
O.order_date,
ROW_NUMBER() OVER(PARTITION BY O.order_id) AS line_item_no,
OL.product_id
FROM
Orders O
INNER JOIN Order_Lines OL ON OL.order_id = O.order_id
(我的语法可能有点错误)
然后你会得到类似这样的东西:
order_id order_date line_item_no product_id
-------- ---------- ------------ ----------
1 2011-05-02 1 5
1 2011-05-02 2 4
1 2011-05-02 3 7
2 2011-05-12 1 8
2 2011-05-12 2 1
当OVER子句与PARTITION BY结合使用时,表示前面的函数调用必须通过计算查询返回的行进行分析。可以把它看作内联的GROUP BY语句。
OVER (PARTITION BY SalesOrderID)表示对于SUM, AVG等…函数,返回值OVER查询返回记录的子集,并将该子集由外键SalesOrderID进行分区。
因此,我们将对每个UNIQUE SalesOrderID的每个OrderQty记录求和,并且该列名将被称为'Total'。
这是一种比使用多个内联视图查找相同信息更有效的方法。您可以将此查询放在一个内联视图中,然后在Total上进行筛选。
SELECT ...,
FROM (your query) inlineview
WHERE Total < 200
您可以使用GROUP BY SalesOrderID。区别在于,使用GROUP BY,您只能获得GROUP BY中不包括的列的聚合值。
相反,使用带窗口的聚合函数而不是GROUP BY,可以检索聚合值和非聚合值。也就是说,尽管您在示例查询中没有这样做,但您可以在相同的salesorderid组上检索单个OrderQty值及其总和、计数、平均值等。
下面是一个实际的例子,说明了窗口聚合的优点。假设您需要计算每个值占总数的百分比。如果没有窗口聚合,你必须首先派生一个聚合值列表,然后将其连接回原始行集,即如下所示:
SELECT
orig.[Partition],
orig.Value,
orig.Value * 100.0 / agg.TotalValue AS ValuePercent
FROM OriginalRowset orig
INNER JOIN (
SELECT
[Partition],
SUM(Value) AS TotalValue
FROM OriginalRowset
GROUP BY [Partition]
) agg ON orig.[Partition] = agg.[Partition]
现在看看你如何对一个带窗口的聚合做同样的事情:
SELECT
[Partition],
Value,
Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
FROM OriginalRowset orig
更简单,更干净,不是吗?
OVER子句的强大之处在于,无论是否使用GROUP BY,都可以在不同的范围内进行聚合(“窗口”)
示例:获取每个SalesOrderID的计数和所有计数
SELECT
SalesOrderID, ProductID, OrderQty
,COUNT(OrderQty) AS 'Count'
,COUNT(*) OVER () AS 'CountAll'
FROM Sales.SalesOrderDetail
WHERE
SalesOrderID IN(43659,43664)
GROUP BY
SalesOrderID, ProductID, OrderQty
获得不同的计数,没有GROUP BY
SELECT
SalesOrderID, ProductID, OrderQty
,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'
,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct',
,COUNT(*) OVER () AS 'CountAllAgain'
FROM Sales.SalesOrderDetail
WHERE
SalesOrderID IN(43659,43664)