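According to MSDN, Median is not available as an aggregate function in Transact-SQL. However, I would like to know whether it is possible to create this functionality (using the CREATE AGGREGATE function, a user-defined function, or some other method).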

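What would be the best way (if possible) to do this, so that a median value can be calculated in an aggregate query (assuming a numeric data type)?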


Current answer

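MS SQL Server 2012 (and later) has the PERCENTILE_DISC function, which computes a specific percentile over sorted values. PERCENTILE_DISC(0.5) computes the median (for an even number of rows it returns the lower of the two middle values): https://msdn.microsoft.com/en-us/library/hh231327.aspx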
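
A minimal sketch of how this might look per group, assuming a hypothetical table dbo.Sales with columns Region and Amount (these names are illustrative and not from the original answer). In SQL Server, PERCENTILE_DISC and PERCENTILE_CONT are analytic functions that require an OVER clause, so DISTINCT is used here to collapse the result to one row per group:

```sql
-- Hypothetical table: dbo.Sales(Region, Amount); the names are assumptions.
SELECT DISTINCT
    Region,
    -- Returns an actual value from the set (the lower middle value for even counts)
    PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Amount)
        OVER (PARTITION BY Region) AS MedianDisc,
    -- Interpolates between the two middle values for even counts
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Amount)
        OVER (PARTITION BY Region) AS MedianCont
FROM dbo.Sales;
```

If the interpolated median is what you need for even row counts, PERCENTILE_CONT is probably the better fit of the two.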

Other answers

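This is the simplest answer I could come up with. It worked well with my data. If you want to exclude certain values, add a WHERE clause to the inner select (and the same clause to the COUNT subquery, so the row count matches).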

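-- Take the lower half of the sorted rows (rounded up), then pick the largest of them.
-- For an even row count this returns the lower of the two middle values.
SELECT TOP 1
    ValueField AS MedianValue
FROM
    (SELECT TOP (SELECT (COUNT(1) + 1) / 2 FROM tTABLE)
        ValueField
    FROM
        tTABLE
    ORDER BY
        ValueField) A
ORDER BY
    ValueField DESC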

In my solution the table is a student table with only a marks column, and I am computing the median of the marks. This solution is based on SQL Server 2019.

with total_c as ( -- total_c CTE counts the total number of rows in the table
    select count(*) as n from student
),
even as ( -- even CTE extracts the two middle rows when the row count is even
    select marks from student 
    order by marks 
    offset (select n from total_c)/2 -1 rows
    fetch next 2 rows only
),
odd as ( -- odd CTE extracts the single middle row when the row count is odd
    select marks from student 
    order by marks 
    offset (select n + 1 from total_c)/2 -1 rows
    fetch next 1 rows only
    )
-- The CASE expression picks the odd or even CTE based on the row count
select                                                        
case when n%2 = 0 then (select avg(cast(marks as float)) from even)
    else (select marks from odd)
end as med_marks
from total_c

Try the logic below to find the median:

Consider a table containing the following numbers: 1, 1, 2, 3, 4, 5

The median is 2.5

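with tempa as
(
    -- Number the rows in sorted order and attach the total row count to each row
    select num, count(num) over () as cnt,
        row_number() over (order by num) as rnum
    from temp
),
tempb as
(
    -- Odd count: the single middle position
    select (cnt + 1) / 2 as ref_value
    from tempa where cnt % 2 <> 0
    union all
    -- Even count: the two middle positions
    select cnt / 2 from tempa where cnt % 2 = 0
    union all
    select cnt / 2 + 1
    from tempa where cnt % 2 = 0
)
-- Average the middle value(s); the cast avoids integer averaging in T-SQL
select avg(cast(num as float)) from tempa
where rnum in (select ref_value from tempb);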
    

For large-scale datasets, you can try the following Gist:

https://gist.github.com/chrisknoll/1b38761ce8c5016ec5b2

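It works by aggregating the distinct values found in your set (such as ages or years of birth) and uses SQL window functions to locate whichever percentile position you specify in the query.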
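
The general idea can be sketched roughly as follows. This is not the code from the Gist, only an illustration of the same technique, assuming a hypothetical table dbo.Person with an integer Age column: count the rows per distinct value, accumulate those counts in sorted order, and take the first value whose running count reaches the requested percentile position.

```sql
-- Hypothetical table: dbo.Person(Age); the names are assumptions.
WITH value_counts AS (
    -- One row per distinct value, with the number of times it occurs
    SELECT Age, COUNT(*) AS cnt
    FROM dbo.Person
    GROUP BY Age
),
running AS (
    -- Running total of occurrences in sorted order, plus the overall total
    SELECT
        Age,
        SUM(cnt) OVER (ORDER BY Age ROWS UNBOUNDED PRECEDING) AS running_cnt,
        SUM(cnt) OVER () AS total_cnt
    FROM value_counts
)
-- Smallest value whose running count covers the 50th percentile position
SELECT MIN(Age) AS MedianAge
FROM running
WHERE running_cnt >= 0.5 * total_cnt;
```

Because the window functions run over the distinct values instead of every row, this tends to pay off when the column has few distinct values relative to the row count. Change 0.5 to another fraction to get a different percentile. Like PERCENTILE_DISC, this returns the lower of the two middle values for an even row count.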