Create a list with initial capacity in Python

Code like this comes up often:

l = []
while foo:
    # baz
    l.append(bar)
    # qux

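This is really slow if you are about to append thousands of elements to the list, because the list has to be resized repeatedly to fit the new elements.

In Java, you can create an ArrayList with an initial capacity. If you have some idea how big your list will be, this is a lot more efficient.

I understand that code like this can often be refactored into a list comprehension. If the for/while loop is very complicated, though, that is not feasible. Is there an equivalent for us Python programmers?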

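Current answer

Python lists have no built-in pre-allocation. If you really need to make a list, and need to avoid the overhead of appending (and you should verify that you do), you can do this: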

l = [None] * 1000 # Make a list of 1000 None's
for i in range(1000):
    # baz
    l[i] = bar
    # qux

Perhaps you could avoid the list altogether by using a generator instead:

def my_things():
    while foo:
        #baz
        yield bar
        #qux

for thing in my_things():
    # do something with thing
    pass

This way, the list is never stored in memory all at once; the items are merely generated as needed.
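To make the generator idea concrete, here is a minimal sketch with a loop whose length is not known in advance. The `collatz` function is a hypothetical example chosen for illustration; it is not part of the original answer.

```python
def collatz(n):
    """Yield the Collatz sequence starting at n, one value at a time."""
    while n != 1:
        yield n
        # Halve even numbers; map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    yield 1

# Consume lazily: only one value is alive at a time.
peak = max(collatz(27))

# Materialize only if a real list is needed afterwards.
steps = list(collatz(6))

print(peak)
print(steps)
```

If the consumer only needs to iterate once, the `list(...)` call can be dropped entirely, and no resizing happens at all.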

2008-11-22 21:07:18

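Other answers

For some applications, a dictionary may be what you are looking for. For example, in the find_totients function below, I found it more convenient to use a dictionary since I didn't have a zero index.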

import math

def totient(n):
    totient = 0

    if n == 1:
        totient = 1
    else:
        for i in range(1, n):
            if math.gcd(i, n) == 1:
                totient += 1
    return totient

def find_totients(max):
    totients = dict()
    for i in range(1,max+1):
        totients[i] = totient(i)

    print('Totients:')
    for i in range(1,max+1):
        print(i,totients[i])

This problem could also be solved with a pre-allocated list:

def find_totients(max):
    totients = [None] * (max + 1)  # index 0 is an unused placeholder
    for i in range(1,max+1):
        totients[i] = totient(i)

    print('Totients:')
    for i in range(1,max+1):
        print(i,totients[i])

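I feel that this is less elegant and more prone to bugs: I'm storing None values that could throw an exception if I accidentally use them wrongly, and I have to think about edge cases that the mapping lets me avoid.

It's true that the dictionary won't be as efficient, but as others have commented, small differences in speed are not always worth significant maintenance hazards.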
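The placeholder hazard described above can be shown in a few lines. This is only a sketch: `limit` is a hypothetical bound, and `totient` is rewritten compactly here so the example is self-contained.

```python
import math

def totient(n):
    """Count the integers in 1..n that are coprime to n."""
    return sum(1 for i in range(1, n + 1) if math.gcd(i, n) == 1)

limit = 6  # hypothetical upper bound, for illustration only

# Dictionary version: only keys 1..limit exist.
as_dict = {i: totient(i) for i in range(1, limit + 1)}

# Pre-allocated list version: index 0 keeps its None placeholder.
as_list = [None] * (limit + 1)
for i in range(1, limit + 1):
    as_list[i] = totient(i)

print(sum(as_dict.values()))  # works on the dictionary

try:
    print(sum(as_list))       # trips over the None at index 0
except TypeError as exc:
    print('list version failed:', exc)
```

The list version works fine as long as every consumer remembers to skip index 0, which is exactly the kind of edge case the mapping avoids.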

2016-10-27 16:33:12

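Pre-allocation does become a concern in Python if you are working with NumPy, which has more C-like arrays. In that case, pre-allocation is about the shape of the data and the default value.

Consider NumPy if you are doing numerical computation on very large lists and want performance.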
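As a rough sketch of what "shape and default value" means in practice, the snippet below pre-allocates three arrays with `np.zeros`, `np.full`, and `np.empty`. The shape `(3, 4)` and the fill value `-1` are arbitrary choices for illustration.

```python
import numpy as np

rows, cols = 3, 4  # hypothetical shape, for illustration only

zeros = np.zeros((rows, cols))                      # float64, every element 0.0
filled = np.full((rows, cols), -1, dtype=np.int64)  # every element set to -1
scratch = np.empty((rows, cols))                    # uninitialized, contents arbitrary

# Fill the pre-allocated array in place, one row at a time.
for r in range(rows):
    zeros[r, :] = r

print(zeros.shape, zeros.dtype)
print(filled[0])
```

`np.empty` skips initialization, so it should only be used when every element is guaranteed to be overwritten before it is read.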

2014-07-30 02:22:33

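Python's list doesn't support pre-allocation. NumPy does let you pre-allocate memory, but in practice it doesn't seem to be worth it if your goal is to speed up the program.

The test below simply writes an integer into the list. In a real application you would likely do more complicated things per iteration, which further reduces the importance of the memory allocation.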

import timeit
import numpy as np

def list_append(size=1_000_000):
    result = []
    for i in range(size):
        result.append(i)
    return result

def list_prealloc(size=1_000_000):
    result = [None] * size
    for i in range(size):
        result[i] = i
    return result

def numpy_prealloc(size=1_000_000):
    result = np.empty(size, np.int32)
    for i in range(size):
        result[i] = i
    return result

setup = 'from __main__ import list_append, list_prealloc, numpy_prealloc'
print(timeit.timeit('list_append()', setup=setup, number=10))     # 0.79
print(timeit.timeit('list_prealloc()', setup=setup, number=10))   # 0.62
print(timeit.timeit('numpy_prealloc()', setup=setup, number=10))  # 0.73
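One likely reason the gap between appending and pre-allocating is so small is that CPython over-allocates when a list grows, so most appends do not trigger a resize. The sketch below counts how often the allocated size changes; it assumes CPython, where `sys.getsizeof` reflects a list's allocated capacity, and `count_resizes` is a hypothetical helper name.

```python
import sys

def count_resizes(n):
    """Count how often the list's allocated size changes over n appends."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The underlying array was reallocated with extra headroom.
            resizes += 1
            last_size = size
    return resizes

print(count_resizes(10_000))
```

The printed count should be far smaller than the number of appends, which is why appending is amortized constant time.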

2022-09-02 14:07:34

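I ran S.Lott's code and got the same 10% performance increase from pre-allocating. I tried Ned Batchelder's idea using a generator and saw the generator perform better than doAllocate. For my project the 10% improvement matters, so thanks to everyone, as this helped a lot.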

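import functools
import time

def print_timing(func):
    # Stand-in for the print_timing decorator, which this answer does not show:
    # it prints each call's wall-clock time in milliseconds.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        print('%s took %0.3fms' % (func.__name__, elapsed_ms))
        return result
    return wrapper

def doAppend(size=10000):
    # Grow the list one element at a time.
    result = []
    for i in range(size):
        message = "some unique object %d" % ( i, )
        result.append(message)
    return result

def doAllocate(size=10000):
    # Pre-allocate the list, then assign by index.
    result = size*[None]
    for i in range(size):
        message = "some unique object %d" % ( i, )
        result[i] = message
    return result

def doGen(size=10000):
    # Build the list from a generator expression.
    return list("some unique object %d" % ( i, ) for i in range(size))

size = 1000
@print_timing
def testAppend():
    for i in range(size):
        doAppend()

@print_timing
def testAlloc():
    for i in range(size):
        doAllocate()

@print_timing
def testGen():
    for i in range(size):
        doGen()


testAppend()
testAlloc()
testGen()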

Output

testAppend took 14440.000ms
testAlloc took 13580.000ms
testGen took 13430.000ms

2009-10-21 19:09:38
