根据MSDN, Median在Transact-SQL中不能作为聚合函数使用。但是,我想知道是否可以创建此功能(使用create Aggregate函数、用户定义函数或其他方法)。

最好的方法(如果可能的话)是什么——允许在聚合查询中计算中值(假设是数值数据类型)?


当前回答

这段代码有点长,但很容易理解

medii是有列val的表,它有数据集, Smedi是一个cte,它将列idx作为行号,val作为medi表中的'val',该表是升序排序的。 这是基本的数学,如果行号是奇数,那么它的中值来自smedi。 当它是偶数时,它是中间两个值的平均值。

with smedi(idx,vals) as(
                select ROW_NUMBER() over(order by val),val from medi
                )
select (case
            when (select count(*) from medi)%2!=0 then (select vals from smedi where (((select count(*) from medi)/2))=idx)
            else (select avg(vals) from smedi where idx in ((select count(*)/2 from medi),(select (count(*)/2)+1 from medi)))
            end)

其他回答

以下解决方案在这些假设下有效:

无重复值 没有取消

代码:

IF OBJECT_ID('dbo.R', 'U') IS NOT NULL
  DROP TABLE dbo.R

CREATE TABLE R (
    A FLOAT NOT NULL);

INSERT INTO R VALUES (1);
INSERT INTO R VALUES (2);
INSERT INTO R VALUES (3);
INSERT INTO R VALUES (4);
INSERT INTO R VALUES (5);
INSERT INTO R VALUES (6);

-- Returns Median(R)
select SUM(A) / CAST(COUNT(A) AS FLOAT)
from R R1 
where ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) + 1 = 
      (select count(A) from R R2 where R1.A < R2.A)) OR
      ((select count(A) from R R2 where R1.A > R2.A) = 
      (select count(A) from R R2 where R1.A < R2.A) + 1) ; 

我想自己想出一个解决办法,但我的大脑绊倒了。我觉得很管用,但别让我早上解释。: P

DECLARE @table AS TABLE
(
    Number int not null
);

insert into @table select 2;
insert into @table select 4;
insert into @table select 9;
insert into @table select 15;
insert into @table select 22;
insert into @table select 26;
insert into @table select 37;
insert into @table select 49;

DECLARE @Count AS INT
SELECT @Count = COUNT(*) FROM @table;

WITH MyResults(RowNo, Number) AS
(
    SELECT RowNo, Number FROM
        (SELECT ROW_NUMBER() OVER (ORDER BY Number) AS RowNo, Number FROM @table) AS Foo
)
SELECT AVG(Number) FROM MyResults WHERE RowNo = (@Count+1)/2 OR RowNo = ((@Count+1)%2) * ((@Count+2)/2)

在我的解决方案表中是一个只有分数列的学生表,我正在计算分数的中位数,这个解决方案是基于SQL server 2019的

with total_c as ( --Total_c CTE counts total number of rows in a table
    select count(*) as n from student
),
even as ( --Even CTE extract two middle rows if the number of rows are even
    select marks from student 
    order by marks 
    offset (select n from total_c)/2 -1 rows
    fetch next 2 rows only
),
odd as ( --Odd CTE extract middle row if the number of rows are odd
    select marks from student 
    order by marks 
    offset (select n + 1 from total_c)/2 -1 rows
    fetch next 1 rows only
    )
--Case statement helps to select odd or even CTE based on number of rows
select                                                        
case when n%2 = 0 then (select avg(cast(marks as float)) from even)
    else (select marks from odd)
end as med_marks
from total_c

对于大规模数据集,您可以尝试以下GIST:

https://gist.github.com/chrisknoll/1b38761ce8c5016ec5b2

它通过聚合您在集合中找到的不同值(例如年龄或出生年份等)来工作,并使用SQL窗口函数来定位您在查询中指定的任何百分比位置。

犹斯丁上面的例子很好。但是主键的需求应该非常清楚地说明。我曾在野外见过没有密钥的代码,结果很糟糕。

我对Percentile_Cont的抱怨是它不会从数据集中给你一个实际的值。 要从数据集中获得一个实际值的“中值”,请使用Percentile_Disc。

SELECT SalesOrderID, OrderQty,
    PERCENTILE_DISC(0.5) 
        WITHIN GROUP (ORDER BY OrderQty)
        OVER (PARTITION BY SalesOrderID) AS MedianCont
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN (43670, 43669, 43667, 43663)
ORDER BY SalesOrderID DESC