用MySQL计算中位数最简单(希望不会太慢)的方法是什么?我已经使用AVG(x)来寻找平均值,但我很难找到一个简单的方法来计算中位数。现在,我将所有的行返回到PHP,进行排序,然后选择中间的行,但是肯定有一些简单的方法可以在一个MySQL查询中完成它。
示例数据:
id | val
--------
1 4
2 7
3 2
4 2
5 9
6 8
7 3
对val排序得到2 2 3 4 7 8 9,因此中位数应该是4,而SELECT AVG(val) == 5。
根据魔术贴的答案,对于那些必须根据另一个参数分组的东西做中位数的人来说
SELECT grp_field, t1。val FROM (
SELECT grp_field, @rownum:=IF(@s = grp_field, @rownum + 1,0) AS row_number,
@s:=IF(@s = grp_field, @s, grp_field) AS sec, d.val
FROM data d, (SELECT @rownum:=0, @s:=0
ORDER BY grp_field, d.val
)作为t1 JOIN (
SELECT grp_field, count(*)为total_rows
数据d
GROUP BY grp_field
)为t2
在t1。Grp_field = t2.grp_field
在t1.row_number =地板(total_rows / 2) + 1;
SELECT
SUBSTRING_INDEX(
SUBSTRING_INDEX(
GROUP_CONCAT(field ORDER BY field),
',',
((
ROUND(
LENGTH(GROUP_CONCAT(field)) -
LENGTH(
REPLACE(
GROUP_CONCAT(field),
',',
''
)
)
) / 2) + 1
)),
',',
-1
)
FROM
table
上面的方法似乎对我有用。
你也可以选择在存储过程中这样做:
DROP PROCEDURE IF EXISTS median;
DELIMITER //
CREATE PROCEDURE median (table_name VARCHAR(255), column_name VARCHAR(255), where_clause VARCHAR(255))
BEGIN
-- Set default parameters
IF where_clause IS NULL OR where_clause = '' THEN
SET where_clause = 1;
END IF;
-- Prepare statement
SET @sql = CONCAT(
"SELECT AVG(middle_values) AS 'median' FROM (
SELECT t1.", column_name, " AS 'middle_values' FROM
(
SELECT @row:=@row+1 as `row`, x.", column_name, "
FROM ", table_name," AS x, (SELECT @row:=0) AS r
WHERE ", where_clause, " ORDER BY x.", column_name, "
) AS t1,
(
SELECT COUNT(*) as 'count'
FROM ", table_name, " x
WHERE ", where_clause, "
) AS t2
-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.
WHERE t1.row >= t2.count/2
AND t1.row <= ((t2.count/2)+1)) AS t3
");
-- Execute statement
PREPARE stmt FROM @sql;
EXECUTE stmt;
END//
DELIMITER ;
-- Sample usage:
-- median(table_name, column_name, where_condition);
CALL median('products', 'price', NULL);
不幸的是,无论是TheJacobTaylor还是velcrow的答案都不会返回当前版本MySQL的准确结果。
从上面来看,魔术贴的答案是接近的,但它不能正确计算具有偶数行数的结果集。中位数定义为1)奇数集上的中间数,或2)偶数集上两个中间数的平均值。
所以,这里是魔术贴的解决方案修补处理奇数和偶数集:
SELECT AVG(middle_values) AS 'median' FROM (
SELECT t1.median_column AS 'middle_values' FROM
(
SELECT @row:=@row+1 as `row`, x.median_column
FROM median_table AS x, (SELECT @row:=0) AS r
WHERE 1
-- put some where clause here
ORDER BY x.median_column
) AS t1,
(
SELECT COUNT(*) as 'count'
FROM median_table x
WHERE 1
-- put same where clause here
) AS t2
-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.
WHERE t1.row >= t2.count/2 and t1.row <= ((t2.count/2) +1)) AS t3;
要使用它,请遵循以下3个简单步骤:
将上面代码中的“median_table”(出现2次)替换为您的表名
将“median_column”(3次)替换为您希望为其查找中位数的列名
如果你有一个WHERE条件,用WHERE条件替换“WHERE 1”(2次)
在阅读了所有之前的内容后,它们与我的实际需求不匹配,所以我实现了自己的一个不需要任何过程或复杂的语句,只是我GROUP_CONCAT所有来自我想要获得MEDIAN的列的值,并应用COUNT DIV BY 2,我从列表中间提取值,就像下面的查询一样:
(POS是我想要获得其中位数的列的名称)
(query) SELECT
SUBSTRING_INDEX (
SUBSTRING_INDEX (
GROUP_CONCAT(pos ORDER BY CAST(pos AS SIGNED INTEGER) desc SEPARATOR ';')
, ';', COUNT(*)/2 )
, ';', -1 ) AS `pos_med`
FROM table_name
GROUP BY any_criterial
我希望这能对一些人有用,就像这个网站上的许多其他评论对我一样。