根据MSDN, Median在Transact-SQL中不能作为聚合函数使用。但是,我想知道是否可以创建此功能(使用create Aggregate函数、用户定义函数或其他方法)。

最好的方法(如果可能的话)是什么——允许在聚合查询中计算中值(假设是数值数据类型)?


当前回答

以下解决方案在这些假设下有效:

无重复值 没有取消

代码:

IF OBJECT_ID('dbo.R', 'U') IS NOT NULL
  DROP TABLE dbo.R

CREATE TABLE R (
    A FLOAT NOT NULL);

INSERT INTO R VALUES (1);
INSERT INTO R VALUES (2);
INSERT INTO R VALUES (3);
INSERT INTO R VALUES (4);
INSERT INTO R VALUES (5);
INSERT INTO R VALUES (6);

-- Returns Median(R)
select SUM(A) / CAST(COUNT(A) AS FLOAT)
from R R1 
where ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) + 1 = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A) + 1) ; 

其他回答

犹斯丁上面的例子很好。但是主键的需求应该非常清楚地说明。我曾在野外见过没有密钥的代码,结果很糟糕。

我对Percentile_Cont的抱怨是它不会从数据集中给你一个实际的值。 要从数据集中获得一个实际值的“中值”,请使用Percentile_Disc。

SELECT SalesOrderID, OrderQty,
    PERCENTILE_DISC(0.5) 
        WITHIN GROUP (ORDER BY OrderQty)
        OVER (PARTITION BY SalesOrderID) AS MedianCont
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY SalesOrderID DESC

通常情况下,我们不仅需要为整个表计算Median,还需要为与某个ID相关的聚合计算Median。换句话说,计算表中每个ID的中位数,其中每个ID有许多记录。(基于@gdoron编辑的解决方案:性能良好,适用于许多SQL)

SELECT our_id, AVG(1.0 * our_val) as Median
FROM
( SELECT our_id, our_val, 
  COUNT(*) OVER (PARTITION BY our_id) AS cnt,
  ROW_NUMBER() OVER (PARTITION BY our_id ORDER BY our_val) AS rnk
  FROM our_table
) AS x
WHERE rnk IN ((cnt + 1)/2, (cnt + 2)/2) GROUP BY our_id;

希望能有所帮助。

中找到

这是查找属性中值的最简单方法。

Select round(S.salary,4) median from employee S 
where (select count(salary) from station 
where salary < S.salary ) = (select count(salary) from station
where salary > S.salary)

我只是在寻找一个基于集的中位数的解决方案时偶然发现了这一页。在研究了一些解决方案之后,我想到了以下几点。希望是有用的。

DECLARE @test TABLE(
    i int identity(1,1),
    id int,
    score float
)

INSERT INTO @test (id,score) VALUES (1,10)
INSERT INTO @test (id,score) VALUES (1,11)
INSERT INTO @test (id,score) VALUES (1,15)
INSERT INTO @test (id,score) VALUES (1,19)
INSERT INTO @test (id,score) VALUES (1,20)

INSERT INTO @test (id,score) VALUES (2,20)
INSERT INTO @test (id,score) VALUES (2,21)
INSERT INTO @test (id,score) VALUES (2,25)
INSERT INTO @test (id,score) VALUES (2,29)
INSERT INTO @test (id,score) VALUES (2,30)

INSERT INTO @test (id,score) VALUES (3,20)
INSERT INTO @test (id,score) VALUES (3,21)
INSERT INTO @test (id,score) VALUES (3,25)
INSERT INTO @test (id,score) VALUES (3,29)

DECLARE @counts TABLE(
    id int,
    cnt int
)

INSERT INTO @counts (
    id,
    cnt
)
SELECT
    id,
    COUNT(*)
FROM
    @test
GROUP BY
    id

SELECT
    drv.id,
    drv.start,
    AVG(t.score)
FROM
    (
        SELECT
            MIN(t.i)-1 AS start,
            t.id
        FROM
            @test t
        GROUP BY
            t.id
    ) drv
    INNER JOIN @test t ON drv.id = t.id
    INNER JOIN @counts c ON t.id = c.id
WHERE
    t.i = ((c.cnt+1)/2)+drv.start
    OR (
        t.i = (((c.cnt+1)%2) * ((c.cnt+2)/2))+drv.start
        AND ((c.cnt+1)%2) * ((c.cnt+2)/2) <> 0
    )
GROUP BY
    drv.id,
    drv.start

从员工表中得到工资的中位数

with cte as (select salary, ROW_NUMBER() over (order by salary asc) as num from employees)

select avg(salary) from cte where num in ((select (count(*)+1)/2 from employees), (select (count(*)+2)/2 from employees));