根据MSDN, Median在Transact-SQL中不能作为聚合函数使用。但是,我想知道是否可以创建此功能(使用create Aggregate函数、用户定义函数或其他方法)。

最好的方法(如果可能的话)是什么——允许在聚合查询中计算中值(假设是数值数据类型)?


当前回答

试试下面的逻辑来找出中位数:

考虑一个包含以下数字的表格: 1、1、2、3、4、5所示

中位数是2.5

with tempa as 
(
    select num,count(num) over() as Cnt,
        row_number() over (order by num) as Rnum
    from temp),
tempb as
    (
        select round(cnt/2) as ref_value
        from tempa where mod(cnt,2)<>0
        union all
        select round(cnt/2) from tempa where mod(cnt,2)=0
        union all
        select round(cnt/2+1)
        from tempa where mod(cnt,2)=0
    )
select avg(num) from tempa
where rnum in (select * from tempb);
    

其他回答

我想自己想出一个解决办法,但我的大脑绊倒了。我觉得很管用,但别让我早上解释。: P

DECLARE @table AS TABLE
(
    Number int not null
);

insert into @table select 2;
insert into @table select 4;
insert into @table select 9;
insert into @table select 15;
insert into @table select 22;
insert into @table select 26;
insert into @table select 37;
insert into @table select 49;

DECLARE @Count AS INT
SELECT @Count = COUNT(*) FROM @table;

WITH MyResults(RowNo, Number) AS
(
    SELECT RowNo, Number FROM
        (SELECT ROW_NUMBER() OVER (ORDER BY Number) AS RowNo, Number FROM @table) AS Foo
)
SELECT AVG(Number) FROM MyResults WHERE RowNo = (@Count+1)/2 OR RowNo = ((@Count+1)%2) * ((@Count+2)/2)
DECLARE @Obs int
DECLARE @RowAsc table
(
ID      INT IDENTITY,
Observation  FLOAT
)
INSERT INTO @RowAsc
SELECT Observations FROM MyTable
ORDER BY 1 
SELECT @Obs=COUNT(*)/2 FROM @RowAsc
SELECT Observation AS Median FROM @RowAsc WHERE ID=@Obs

如果你使用的是SQL 2005或更好的版本,这是一个很好的,简单的中位数计算表中的单列:

SELECT
(
 (SELECT MAX(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score) AS BottomHalf)
 +
 (SELECT MIN(Score) FROM
   (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score DESC) AS TopHalf)
) / 2 AS Median

对于连续变量/测量'table1'中的'col1'

select col1  
from
    (select top 50 percent col1, 
    ROW_NUMBER() OVER(ORDER BY col1 ASC) AS Rowa,
    ROW_NUMBER() OVER(ORDER BY col1 DESC) AS Rowd
    from table1 ) tmp
where tmp.Rowa = tmp.Rowd

虽然Justin grant的解决方案看起来很可靠,但我发现当您在给定的分区键中有许多重复值时,ASC重复值的行号最终会不按顺序排列,因此它们不能正确对齐。

以下是我的研究结果的一个片段:

KEY VALUE ROWA ROWD  

13  2     22   182
13  1     6    183
13  1     7    184
13  1     8    185
13  1     9    186
13  1     10   187
13  1     11   188
13  1     12   189
13  0     1    190
13  0     2    191
13  0     3    192
13  0     4    193
13  0     5    194

我使用Justin的代码作为这个解决方案的基础。尽管考虑到使用多个派生表效率不高,但它确实解决了我遇到的行排序问题。任何改进都会受到欢迎,因为我在T-SQL方面不是那么有经验。

SELECT PKEY, cast(AVG(VALUE)as decimal(5,2)) as MEDIANVALUE
FROM
(
  SELECT PKEY,VALUE,ROWA,ROWD,
  'FLAG' = (CASE WHEN ROWA IN (ROWD,ROWD-1,ROWD+1) THEN 1 ELSE 0 END)
  FROM
  (
    SELECT
    PKEY,
    cast(VALUE as decimal(5,2)) as VALUE,
    ROWA,
    ROW_NUMBER() OVER (PARTITION BY PKEY ORDER BY ROWA DESC) as ROWD 

    FROM
    (
      SELECT
      PKEY, 
      VALUE,
      ROW_NUMBER() OVER (PARTITION BY PKEY ORDER BY VALUE ASC,PKEY ASC ) as ROWA 
      FROM [MTEST]
    )T1
  )T2
)T3
WHERE FLAG = '1'
GROUP BY PKEY
ORDER BY PKEY