USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
    ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
    ,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
    ,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
    ,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail 
WHERE SalesOrderID IN(43659,43664);

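I have read about that clause and I don't understand why I need it. What does the OVER function do? What does PARTITION BY do? Why can't I just write the query with GROUP BY SalesOrderID?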


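Current answer

The power of the OVER clause is that you can aggregate over different ranges ("windows") in the same query, whether or not you use GROUP BY.

Example: get the count per group together with the count of all rows in the result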

SELECT
    SalesOrderID, ProductID, OrderQty
    ,COUNT(OrderQty) AS 'Count'
    ,COUNT(*) OVER () AS 'CountAll'
FROM Sales.SalesOrderDetail 
WHERE
     SalesOrderID IN(43659,43664)
GROUP BY
     SalesOrderID, ProductID, OrderQty

Get different counts, without GROUP BY

SELECT
    SalesOrderID, ProductID, OrderQty
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'
    ,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct'
    ,COUNT(*) OVER () AS 'CountAllAgain'
FROM Sales.SalesOrderDetail 
WHERE
     SalesOrderID IN(43659,43664)
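
To try the second query without an AdventureWorks install, here is a minimal sketch. It assumes SQLite (3.25 or later, through Python's sqlite3 module) as a stand-in for SQL Server, and the table contents are made-up rows, not the real AdventureWorks data. Each COUNT uses its own window, so one row can carry a per-order count, a per-product count, and an overall count at once.

```python
import sqlite3

# Assumption: a simplified, hypothetical copy of Sales.SalesOrderDetail with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(43659, 776, 1), (43659, 777, 3), (43659, 778, 1), (43664, 776, 4), (43664, 772, 1)],
)

# Three different windows in one SELECT, and no GROUP BY anywhere.
count_rows = conn.execute("""
    SELECT SalesOrderID, ProductID, OrderQty,
           COUNT(OrderQty) OVER (PARTITION BY SalesOrderID) AS CountQtyPerOrder,
           COUNT(OrderQty) OVER (PARTITION BY ProductID)    AS CountQtyPerProduct,
           COUNT(*)        OVER ()                          AS CountAllAgain
    FROM SalesOrderDetail
    WHERE SalesOrderID IN (43659, 43664)
    ORDER BY SalesOrderID, ProductID
""").fetchall()

for row in count_rows:
    print(row)
```

Every input row is still present in the output; only the three count columns differ in which rows they look at.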

Other answers

Let me explain with an example, and you will see how it works.

Suppose you have the following table, DIM_EQUIPMENT:

VIN         MAKE    MODEL   YEAR    COLOR
-----------------------------------------
1234ASDF    Ford    Taurus  2008    White
1234JKLM    Chevy   Truck   2005    Green
5678ASDF    Ford    Mustang 2008    Yellow

Run the SQL below:

SELECT VIN,
  MAKE,
  MODEL,
  YEAR,
  COLOR ,
  COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
FROM DIM_EQUIPMENT

The result looks like this:

VIN         MAKE    MODEL   YEAR    COLOR     COUNT2
----------------------------------------------------
1234JKLM    Chevy   Truck   2005    Green     1
5678ASDF    Ford    Mustang 2008    Yellow    2
1234ASDF    Ford    Taurus  2008    White     2

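Look at what happened.

You were able to count by YEAR without a GROUP BY, and the count is attached to each matching row.

Another interesting way to get the same result is to use a WITH clause, as below. WITH works as an inline view and can simplify a query, especially a complex one. That is not the case here, since I am only trying to show the usage.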

 WITH EQ AS
  ( SELECT YEAR AS YEAR2, COUNT(*) AS COUNT2 FROM DIM_EQUIPMENT GROUP BY YEAR
  )
SELECT VIN,
  MAKE,
  MODEL,
  YEAR,
  COLOR,
  COUNT2
FROM DIM_EQUIPMENT,
  EQ
WHERE EQ.YEAR2=DIM_EQUIPMENT.YEAR;
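
As a quick check that the two forms agree, here is a sketch that loads the three DIM_EQUIPMENT rows from above and runs both queries. It assumes SQLite (3.25 or later, through Python's sqlite3 module) in place of the original database, and it writes the join with an explicit INNER JOIN rather than the comma syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DIM_EQUIPMENT (VIN TEXT, MAKE TEXT, MODEL TEXT, YEAR INT, COLOR TEXT)"
)
conn.executemany(
    "INSERT INTO DIM_EQUIPMENT VALUES (?, ?, ?, ?, ?)",
    [
        ("1234ASDF", "Ford", "Taurus", 2008, "White"),
        ("1234JKLM", "Chevy", "Truck", 2005, "Green"),
        ("5678ASDF", "Ford", "Mustang", 2008, "Yellow"),
    ],
)

# Windowed form: the per-year count is computed alongside each row.
windowed_counts = conn.execute("""
    SELECT VIN, YEAR, COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
    FROM DIM_EQUIPMENT
    ORDER BY VIN
""").fetchall()

# CTE form: aggregate once per year, then join the counts back to the rows.
joined_counts = conn.execute("""
    WITH EQ AS (
        SELECT YEAR AS YEAR2, COUNT(*) AS COUNT2 FROM DIM_EQUIPMENT GROUP BY YEAR
    )
    SELECT VIN, YEAR, COUNT2
    FROM DIM_EQUIPMENT
    INNER JOIN EQ ON EQ.YEAR2 = DIM_EQUIPMENT.YEAR
    ORDER BY VIN
""").fetchall()

print(windowed_counts == joined_counts)
```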

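If you only wanted to GROUP BY SalesOrderID, you would not be able to include the ProductID and OrderQty columns in the SELECT clause.

The PARTITION BY clause lets you break up your aggregate functions. One obvious and useful example is generating line numbers for the order lines on an order: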

SELECT
    O.order_id,
    O.order_date,
    ROW_NUMBER() OVER(PARTITION BY O.order_id) AS line_item_no,
    OL.product_id
FROM
    Orders O
INNER JOIN Order_Lines OL ON OL.order_id = O.order_id

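(My syntax may be slightly off. In SQL Server, ROW_NUMBER() also needs an ORDER BY inside the OVER clause to define the numbering order.)

Then you would get something like this: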

order_id    order_date    line_item_no    product_id
--------    ----------    ------------    ----------
    1       2011-05-02         1              5
    1       2011-05-02         2              4
    1       2011-05-02         3              7
    2       2011-05-12         1              8
    2       2011-05-12         2              1
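
A runnable sketch of the same idea, with an explicit ORDER BY inside OVER so the numbering is deterministic. It assumes SQLite (3.25 or later, through Python's sqlite3 module) rather than SQL Server. The line_id column is hypothetical: it is not in the tables above and exists only to give the order lines a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (order_id INT, order_date TEXT);
    -- line_id is a hypothetical column, added only to define the numbering order
    CREATE TABLE Order_Lines (line_id INT, order_id INT, product_id INT);
    INSERT INTO Orders VALUES (1, '2011-05-02'), (2, '2011-05-12');
    INSERT INTO Order_Lines VALUES
        (10, 1, 5), (11, 1, 4), (12, 1, 7), (13, 2, 8), (14, 2, 1);
""")

# The numbering restarts at 1 for every order_id partition.
numbered_lines = conn.execute("""
    SELECT O.order_id,
           O.order_date,
           ROW_NUMBER() OVER (PARTITION BY O.order_id ORDER BY OL.line_id) AS line_item_no,
           OL.product_id
    FROM Orders O
    INNER JOIN Order_Lines OL ON OL.order_id = O.order_id
    ORDER BY O.order_id, line_item_no
""").fetchall()

for row in numbered_lines:
    print(row)
```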

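The OVER clause, when combined with PARTITION BY, states that the preceding function call must be evaluated analytically over the rows returned by the query. Think of it as an inline GROUP BY statement.

OVER (PARTITION BY SalesOrderID) states that for the SUM, AVG, etc. functions, the value is returned OVER a subset of the records returned by the query, and that subset is PARTITIONed BY the foreign key SalesOrderID.

So we SUM every OrderQty record for EACH UNIQUE SalesOrderID, and that column will be named 'Total'.

This is a much more efficient approach than using multiple inline views to find the same information. You can put this query inside an inline view and then filter on Total.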

SELECT ...
FROM (your query) inlineview
WHERE Total < 200
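
The inline view is needed because a window function cannot be referenced directly in the WHERE clause of the query that computes it. A minimal sketch of the pattern, assuming SQLite (3.25 or later, through Python's sqlite3 module) as a stand-in for SQL Server and a hypothetical SalesOrderDetail table with made-up quantities:

```python
import sqlite3

# Assumption: simplified table and made-up rows, chosen so one order totals over 200.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(43659, 776, 100), (43659, 777, 150), (43664, 772, 20), (43664, 776, 30)],
)

# The inner query computes Total per order; the outer query filters on it.
small_orders = conn.execute("""
    SELECT SalesOrderID, ProductID, OrderQty, Total
    FROM (
        SELECT SalesOrderID, ProductID, OrderQty,
               SUM(OrderQty) OVER (PARTITION BY SalesOrderID) AS Total
        FROM SalesOrderDetail
    ) AS inlineview
    WHERE Total < 200
    ORDER BY ProductID
""").fetchall()

for row in small_orders:
    print(row)
```

Only the detail rows of order 43664 (total 50) survive the filter; order 43659 totals 250 and is dropped as a whole.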

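You can use GROUP BY SalesOrderID. The difference is that with GROUP BY you can only have aggregated values for the columns that are not included in the GROUP BY.

In contrast, with windowed aggregate functions instead of GROUP BY, you can retrieve both aggregated and non-aggregated values. That is, you can retrieve individual OrderQty values together with their sum, count, average, and so on over the group of rows that share the same SalesOrderID, which is what your example query does.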
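
The difference is easiest to see side by side. This sketch assumes SQLite (3.25 or later, through Python's sqlite3 module) in place of SQL Server, and a hypothetical SalesOrderDetail table with made-up rows. GROUP BY collapses each order to one row, while the windowed SUM keeps every detail row and repeats the order total on each.

```python
import sqlite3

# Assumption: simplified table and made-up rows, not the real AdventureWorks data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(43659, 776, 1), (43659, 777, 3), (43664, 772, 1), (43664, 776, 4)],
)

# GROUP BY: one row per order, detail columns are no longer available.
grouped_totals = conn.execute("""
    SELECT SalesOrderID, SUM(OrderQty) AS Total
    FROM SalesOrderDetail
    GROUP BY SalesOrderID
    ORDER BY SalesOrderID
""").fetchall()

# Window: all four detail rows remain, each carrying its order's total.
windowed_totals = conn.execute("""
    SELECT SalesOrderID, ProductID, OrderQty,
           SUM(OrderQty) OVER (PARTITION BY SalesOrderID) AS Total
    FROM SalesOrderDetail
    ORDER BY SalesOrderID, ProductID
""").fetchall()

print(grouped_totals)
print(windowed_totals)
```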

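Here is a practical example of why windowed aggregates are useful. Suppose you need to calculate what percentage of a total each value represents. Without windowed aggregates, you would first have to derive a list of aggregated values and then join it back to the original rowset, like this: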

SELECT
  orig.[Partition],
  orig.Value,
  orig.Value * 100.0 / agg.TotalValue AS ValuePercent
FROM OriginalRowset orig
  INNER JOIN (
    SELECT
      [Partition],
      SUM(Value) AS TotalValue
    FROM OriginalRowset
    GROUP BY [Partition]
  ) agg ON orig.[Partition] = agg.[Partition]

Now look at how you can do the same thing with a windowed aggregate:

SELECT
  [Partition],
  Value,
  Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
FROM OriginalRowset orig

Much simpler and cleaner, isn't it?
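
To confirm that both percentage queries return the same rows, here is a small sketch. It assumes SQLite (3.25 or later, through Python's sqlite3 module) rather than SQL Server, and the OriginalRowset contents are made-up sample values.

```python
import sqlite3

# Assumption: OriginalRowset filled with hypothetical sample values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OriginalRowset ([Partition] TEXT, Value INT)")
conn.executemany(
    "INSERT INTO OriginalRowset VALUES (?, ?)",
    [("A", 10), ("A", 30), ("B", 50)],
)

# Join-back form: aggregate per partition, then join to the detail rows.
percent_joined = conn.execute("""
    SELECT orig.[Partition], orig.Value,
           orig.Value * 100.0 / agg.TotalValue AS ValuePercent
    FROM OriginalRowset orig
    INNER JOIN (
        SELECT [Partition], SUM(Value) AS TotalValue
        FROM OriginalRowset
        GROUP BY [Partition]
    ) agg ON orig.[Partition] = agg.[Partition]
    ORDER BY orig.[Partition], orig.Value
""").fetchall()

# Windowed form: the partition total is computed in place.
percent_windowed = conn.execute("""
    SELECT [Partition], Value,
           Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
    FROM OriginalRowset
    ORDER BY [Partition], Value
""").fetchall()

print(percent_windowed)
```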
