How do I find the average of a list in Python?

[1, 2, 3, 4]  ⟶  2.5

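Current answer

For Python 3.8+, use statistics.fmean for numerical stability with floats. (Fast.)

For Python 3.4+, use statistics.mean for numerical stability with floats. (Slower.)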

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import statistics
statistics.mean(xs)  # = 20.11111111111111
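To illustrate what numerical stability means here, the following is a minimal sketch. The list ys is made up for the illustration and is not from the original answer; it is chosen so that a plain sum loses a small value to rounding. The sketch assumes Python 3.8+ because it calls statistics.fmean.

```python
import statistics

# Hypothetical data chosen to provoke cancellation: the 1.0 is lost
# when it is added to 1e100 in ordinary floating-point arithmetic.
ys = [1e100, 1.0, -1e100]

naive = sum(ys) / len(ys)     # 0.0, because the 1.0 was rounded away
exact = statistics.mean(ys)   # sums exactly, then rounds once at the end
fast = statistics.fmean(ys)   # accurate float summation (Python 3.8+)

print(naive, exact, fast)
```

Both statistics functions return a value close to 1/3 here, while the naive version returns 0.0. For everyday data such as the xs list the three approaches agree.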

For older versions of Python 3, use

sum(xs) / len(xs)

For Python 2, convert len to a float to get float division:

sum(xs) / float(len(xs))
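One thing the snippets above do not cover is an empty list: sum(xs) / len(xs) raises ZeroDivisionError, and statistics.mean raises statistics.StatisticsError. A minimal sketch of one way to handle that case follows, assuming Python 3. The helper name mean_or_default and its default parameter are hypothetical and are not part of any library.

```python
def mean_or_default(values, default=None):
    """Return the arithmetic mean of values, or default if values is empty.

    Hypothetical helper for illustration only.
    """
    values = list(values)  # accept any iterable, including generators
    if not values:
        return default
    return sum(values) / len(values)


print(mean_or_default([1, 2, 3, 4]))  # 2.5
print(mean_or_default([]))            # None
```

Whether returning a default is better than letting the exception propagate depends on the caller; the sketch only shows the first option.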

Other answers

If you are using Python >= 3.4, there is a statistics library:

https://docs.python.org/3/library/statistics.html

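You can use its mean function like this. Suppose you have a list of numbers and want to find their mean:

# avoid naming the variable "list", which would shadow the built-in type
numbers = [11, 13, 12, 15, 17]
import statistics as s
s.mean(numbers)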

It has other functions too, such as stdev, variance, mode, harmonic_mean, and median, which are also very useful.
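As a rough illustration of those other functions, here is a small sketch using the same list. The separate list passed to mode is made up so that one value clearly repeats, and harmonic_mean assumes Python 3.6 or newer.

```python
import statistics

numbers = [11, 13, 12, 15, 17]

median = statistics.median(numbers)             # middle value of the sorted data
variance = statistics.variance(numbers)         # sample variance
stdev = statistics.stdev(numbers)               # sample standard deviation
harmonic = statistics.harmonic_mean(numbers)    # harmonic mean (Python 3.6+)
most_common = statistics.mode([1, 2, 2, 3])     # most frequent value

print(median, variance, stdev, harmonic, most_common)
```

Note that variance and stdev compute the sample versions; pvariance and pstdev are the population counterparts.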

If you want more than just the mean (also known as the average), you can look at scipy stats:

from scipy import stats
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(stats.describe(l))

# DescribeResult(nobs=9, minmax=(2, 78), mean=20.11111111111111, 
# variance=572.3611111111111, skewness=1.7791785448425341, 
# kurtosis=1.9422716419666397)

Or use the pandas Series.mean method:

pd.Series(sequence).mean()

Demo:

>>> import pandas as pd
>>> l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
>>> pd.Series(l).mean()
20.11111111111111
>>> 

From the docs:

Series.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

And here are the docs for it:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.mean.html

The whole documentation:

https://pandas.pydata.org/pandas-docs/stable/10min.html

Edit:

I added two more ways to get the average of a list: statistics.fmean, which requires Python 3.8+, and math.fsum. Here is the comparison I ran:

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd
import math

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()


def mean7():
    return statistics.fmean(l)


def mean8():
    return math.fsum(l) / len(l)


for func in [mean1, mean2, mean3, mean4, mean5, mean6, mean7, mean8 ]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))

Here are the results I got:

mean1 took:  0.09751558300000002
mean2 took:  0.005496791999999973
mean3 took:  0.07754683299999998
mean4 took:  0.055743208000000044
mean5 took:  0.018134082999999968
mean6 took:  0.6663848750000001
mean7 took:  0.004305374999999945
mean8 took:  0.003203333000000086

Interesting! On this 10-element list, math.fsum(l) / len(l) looks like the fastest way, followed by statistics.fmean(l), and only then sum(l) / len(l). Nice!
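The timings above come from a 10-element list, so the ranking may not hold for larger inputs. The following is a minimal sketch for rerunning the three fastest candidates at several sizes. The function name time_means, the chosen sizes, and the repeat count are all assumptions made for this illustration, and no results are shown because they depend on the machine and the Python version.

```python
import math
import statistics
import timeit


def time_means(size, repeats=50):
    """Time three ways of averaging list(range(size)); returns seconds per method."""
    data = list(range(size))
    candidates = {
        "sum/len": lambda: sum(data) / len(data),
        "fsum/len": lambda: math.fsum(data) / len(data),
        "fmean": lambda: statistics.fmean(data),  # Python 3.8+
    }
    return {
        name: timeit.timeit(fn, number=repeats)
        for name, fn in candidates.items()
    }


for size in (10, 1_000, 100_000):
    print(size, time_means(size))
```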

Thanks to Asclepius for showing me these two other ways!


Old answer:

In terms of efficiency and speed, these are the results I got from testing the other answers:

# test mean calculation

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()



for func in [mean1, mean2, mean3, mean4, mean5, mean6]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))

The results:

mean1 took:  0.17030245899968577
mean2 took:  0.002183011999932205
mean3 took:  0.09744236000005913
mean4 took:  0.07070840100004716
mean5 took:  0.022754742999950395
mean6 took:  1.6689282460001778

So the clear winner is: sum(l) / len(l)