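How do you find the median of a list in Python? The list can be of any size, and the numbers are not guaranteed to be in any particular order.
If the list contains an even number of elements, the function should return the average of the middle two.
Here are some examples (sorted for display purposes):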
median([1]) == 1
median([1, 1]) == 1
median([1, 1, 2, 4]) == 1.5
median([0, 2, 5, 6, 8, 9, 9]) == 6
median([0, 0, 0, 0, 4, 4, 6, 8]) == 2
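Current answer
If you need a faster average-case running time, you can try the quickselect algorithm. Quickselect has average (and best) case performance of O(n), although on a bad day it can end up at O(n²).
Here's an implementation with a randomly chosen pivot: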
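import random

def select_nth(n, items):
    # Return the element that would sit at index n if items were sorted
    pivot = random.choice(items)

    lesser = [item for item in items if item < pivot]
    if len(lesser) > n:
        # The target lies among the elements smaller than the pivot
        return select_nth(n, lesser)
    n -= len(lesser)

    numequal = items.count(pivot)
    if numequal > n:
        # The target is the pivot itself (or one of its duplicates)
        return pivot
    n -= numequal

    # Otherwise the target lies among the elements larger than the pivot
    greater = [item for item in items if item > pivot]
    return select_nth(n, greater)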
You can trivially turn this into a function that finds the median:
def median(items):
    if len(items) % 2:
        return select_nth(len(items)//2, items)
    else:
        left = select_nth((len(items)-1) // 2, items)
        right = select_nth((len(items)+1) // 2, items)
        return (left + right) / 2
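This is very unoptimized, but even an optimized version is unlikely to outperform Tim Sort (CPython's built-in sort), because that is really fast. I've tried before and I lost.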
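To check that claim on your own machine, here is a rough timing sketch. It repeats the quickselect code from above next to a plain sort-based median. The names `median_quickselect` and `median_sorted`, the data size and the run count are arbitrary choices for illustration, and the timings will vary with hardware and Python version.

```python
import random
import timeit

def select_nth(n, items):
    # Same quickselect as above: element at index n of the sorted order
    pivot = random.choice(items)
    lesser = [item for item in items if item < pivot]
    if len(lesser) > n:
        return select_nth(n, lesser)
    n -= len(lesser)
    numequal = items.count(pivot)
    if numequal > n:
        return pivot
    n -= numequal
    greater = [item for item in items if item > pivot]
    return select_nth(n, greater)

def median_quickselect(items):
    if len(items) % 2:
        return select_nth(len(items) // 2, items)
    left = select_nth((len(items) - 1) // 2, items)
    right = select_nth((len(items) + 1) // 2, items)
    return (left + right) / 2

def median_sorted(items):
    # Baseline: sort a copy and pick the middle element(s)
    s = sorted(items)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

# Illustrative test data: 50,000 random integers (size chosen arbitrarily)
data = [random.randrange(1_000_000) for _ in range(50_000)]

# Both approaches should agree before we bother timing them
assert median_quickselect(data) == median_sorted(data)

for fn in (median_quickselect, median_sorted):
    seconds = timeit.timeit(lambda: fn(data), number=3)
    print(f"{fn.__name__}: {seconds:.3f} s for 3 runs")
```

The assertion matters more than the numbers: it confirms the two functions compute the same median, so whatever difference the timings show is purely about speed.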
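Other answers
The sorted() function is very helpful here. Use it to order the list, then simply return the middle value (or the average of the two middle values if the list contains an even number of elements).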
def median(lst):
    sortedLst = sorted(lst)
    lstLen = len(lst)
    index = (lstLen - 1) // 2

    if (lstLen % 2):
        return sortedLst[index]
    else:
        return (sortedLst[index] + sortedLst[index + 1])/2.0
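As a quick sanity check, the sketch below runs the sorted()-based function on shuffled copies of the examples from the question and compares the result with the standard library's statistics.median. The `examples` list and the shuffling step are added here purely for illustration.

```python
import random
import statistics

def median(lst):
    sortedLst = sorted(lst)
    lstLen = len(lst)
    index = (lstLen - 1) // 2
    if lstLen % 2:
        return sortedLst[index]
    else:
        return (sortedLst[index] + sortedLst[index + 1]) / 2.0

# (input, expected median) pairs taken from the question
examples = [
    ([1], 1),
    ([1, 1], 1),
    ([1, 1, 2, 4], 1.5),
    ([0, 2, 5, 6, 8, 9, 9], 6),
    ([0, 0, 0, 0, 4, 4, 6, 8], 2),
]

for values, expected in examples:
    # Shuffle a copy so the function cannot rely on the input being ordered
    shuffled = random.sample(values, len(values))
    assert median(shuffled) == expected
    assert median(shuffled) == statistics.median(shuffled)

print("all examples pass")
```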
Here's the tedious way to find the median without using a built-in median or sort function:
def median(*arg):
    # arg is a tuple and cannot be sorted in place, so sort a copy
    ordered = order(arg)
    numArg = len(ordered)
    half = numArg // 2
    if numArg % 2 == 0:
        # Even count: average the two middle values
        return (ordered[half-1] + ordered[half]) / 2
    else:
        return ordered[half]

def order(tup):
    ordered = list(tup)
    # Keep making passes until a pass performs no swaps (bubble sort)
    while test(ordered):
        pass
    return ordered

def test(ordered):
    swapped = False
    for i in range(len(ordered)-1):
        if ordered[i] > ordered[i+1]:
            ordered[i], ordered[i+1] = ordered[i+1], ordered[i]
            swapped = True  # run the loop again if you had to switch values
    return swapped
A more general approach that covers the median (and any other percentile) is:
def get_percentile(data, percentile):
    # Get the number of observations
    cnt = len(data)
    # Sort the list
    data = sorted(data)
    # Determine the (possibly fractional) split point
    i = (cnt - 1) * percentile
    # Split it into the index below and the fractional part
    lower = int(i)
    diff = i - lower
    # Clamp the upper index so percentile=1 (or a one-element list) stays in range
    upper = min(lower + 1, cnt - 1)
    # Return the weighted average of the values below and above the split point
    return data[lower] * (1 - diff) + data[upper] * diff

# Data
data = [1, 2, 3, 4, 5]
# For the median
print(get_percentile(data=data, percentile=.50))
# > 3.0
print(get_percentile(data=data, percentile=.75))
# > 4.0
# Note the weighted average when the split point is not a whole number
print(get_percentile(data=data, percentile=.51))
# > approximately 3.04
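For comparison, the standard library offers statistics.quantiles (Python 3.8 and later). As far as I can tell, its method="inclusive" option places cut points at (n - 1)·p with linear interpolation, the same scheme as the function above, so it makes a handy cross-check. A minimal sketch:

```python
import statistics

data = [1, 2, 3, 4, 5]

# n=4 asks for the three cut points that split the data into quartiles;
# the middle one is the median
quartiles = statistics.quantiles(data, n=4, method="inclusive")
print(quartiles)  # → [2.0, 3.0, 4.0]
```

Note that statistics.quantiles only returns evenly spaced cut points, so an arbitrary value such as the 51st percentile still needs a hand-written function (or n=100 and indexing into the result).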
A median function:
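def median(midlist):
    midlist.sort()  # note: this sorts the caller's list in place
    lens = len(midlist)
    if lens % 2 != 0:
        midl = lens // 2  # integer division, so the result is a valid index in Python 3
        res = midlist[midl]
    else:
        odd = (lens // 2) - 1
        ev = lens // 2
        res = float(midlist[odd] + midlist[ev]) / float(2)
    return res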
(Works with python-2.x):
def median(lst):
    n = len(lst)
    s = sorted(lst)
    return (s[n//2-1]/2.0+s[n//2]/2.0, s[n//2])[n % 2] if n else None
>>> median([-5, -5, -3, -4, 0, -1])
-3.5
numpy.median():
>>> from numpy import median
>>> median([1, -4, -1, -1, 1, -3])
-1.0
For python-3.x, use statistics.median:
>>> from statistics import median
>>> median([5, 2, 3, 8, 9, -2])
4.0
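If averaging the two middle elements is not what you want for even-length input (for example, when the result must be an actual member of the list), the same statistics module also provides median_low and median_high. A short sketch using one of the question's examples:

```python
from statistics import median, median_low, median_high

data = [1, 1, 2, 4]

print(median(data))       # 1.5  (average of the two middle values)
print(median_low(data))   # 1    (smaller of the two middle values)
print(median_high(data))  # 2    (larger of the two middle values)
```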