根据MSDN, Median在Transact-SQL中不能作为聚合函数使用。但是,我想知道是否可以创建此功能(使用create Aggregate函数、用户定义函数或其他方法)。

最好的方法(如果可能的话)是什么——允许在聚合查询中计算中值(假设是数值数据类型)?


当前回答

通常情况下,我们不仅需要为整个表计算Median,还需要为与某个ID相关的聚合计算Median。换句话说,计算表中每个ID的中位数,其中每个ID有许多记录。(基于@gdoron编辑的解决方案:性能良好,适用于许多SQL)

SELECT our_id, AVG(1.0 * our_val) as Median
FROM
( SELECT our_id, our_val, 
  COUNT(*) OVER (PARTITION BY our_id) AS cnt,
  ROW_NUMBER() OVER (PARTITION BY our_id ORDER BY our_val) AS rnk
  FROM our_table
) AS x
WHERE rnk IN ((cnt + 1)/2, (cnt + 2)/2) GROUP BY our_id;

希望能有所帮助。

其他回答

以下是我的解决方案:

with tempa as

 (

    select value,row_number() over (order by value) as Rn,/* Assigning a 
                                                           row_number */
           count(value) over () as Cnt /*Taking total count of the values */
    from numbers
    where value is not null /* Excluding the null values */
 ),

tempb as

  (

    /* Since we don't know whether the number of rows is odd or even, we shall 
     consider both the scenarios */

    select round(cnt/2) as Ref from tempa where mod(cnt,2)=1
    union all
    select round(cnt/2) a Ref from tempa where mod(cnt,2)=0
     union all
    select round(cnt/2) + 1 as Ref from tempa where mod(cnt,2)=0
   )
  select avg(value) as Median_Value

  from tempa where rn in

    ( select Ref from tempb);

我尝试了几种替代方案,但由于我的数据记录有重复的值,ROW_NUMBER版本似乎不是我的选择。这里是我使用的查询(NTILE版本):

SELECT distinct
   CustomerId,
   (
       MAX(CASE WHEN Percent50_Asc=1 THEN TotalDue END) OVER (PARTITION BY CustomerId)  +
       MIN(CASE WHEN Percent50_desc=1 THEN TotalDue END) OVER (PARTITION BY CustomerId) 
   )/2 MEDIAN
FROM
(
   SELECT
      CustomerId,
      TotalDue,
     NTILE(2) OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue ASC) AS Percent50_Asc,
     NTILE(2) OVER (
         PARTITION BY CustomerId
         ORDER BY TotalDue DESC) AS Percent50_desc
   FROM Sales.SalesOrderHeader SOH
) x
ORDER BY CustomerId;

在我的解决方案表中是一个只有分数列的学生表,我正在计算分数的中位数,这个解决方案是基于SQL server 2019的

with total_c as ( --Total_c CTE counts total number of rows in a table
    select count(*) as n from student
),
even as ( --Even CTE extract two middle rows if the number of rows are even
    select marks from student 
    order by marks 
    offset (select n from total_c)/2 -1 rows
    fetch next 2 rows only
),
odd as ( --Odd CTE extract middle row if the number of rows are odd
    select marks from student 
    order by marks 
    offset (select n + 1 from total_c)/2 -1 rows
    fetch next 1 rows only
    )
--Case statement helps to select odd or even CTE based on number of rows
select                                                        
case when n%2 = 0 then (select avg(cast(marks as float)) from even)
    else (select marks from odd)
end as med_marks
from total_c

这段代码有点长,但很容易理解

medii是有列val的表,它有数据集, Smedi是一个cte,它将列idx作为行号,val作为medi表中的'val',该表是升序排序的。 这是基本的数学,如果行号是奇数,那么它的中值来自smedi。 当它是偶数时,它是中间两个值的平均值。

with smedi(idx,vals) as(
                select ROW_NUMBER() over(order by val),val from medi
                )
select (case
            when (select count(*) from medi)%2!=0 then (select vals from smedi where (((select count(*) from medi)/2))=idx)
            else (select avg(vals) from smedi where idx in ((select count(*)/2 from medi),(select (count(*)/2)+1 from medi)))
            end)

使用一条语句——一种方法是使用ROW_NUMBER(), COUNT()窗口函数并过滤子查询。下面是薪资中位数:

 SELECT AVG(e_salary) 
 FROM                                                             
    (SELECT 
      ROW_NUMBER() OVER(ORDER BY e_salary) as row_no, 
      e_salary,
      (COUNT(*) OVER()+1)*0.5 AS row_half
     FROM Employee) t
 WHERE row_no IN (FLOOR(row_half),CEILING(row_half))

我在网上看到过类似的解决方案,使用地板和天花板,但尝试使用单一的语句。(编辑)