如何将任意长度的列表拆分为大小相等的块?


请参阅如果数据结果将直接用于循环,并且不需要存储,则如何以块形式遍历列表。

对于字符串输入的同一问题,请参见每n个字符拆分字符串?。相同的技术通常适用,但也有一些变化。


当前回答

itertools模块中的配方提供了两种方法来实现这一点,具体取决于您希望如何处理最终的奇数大小的批次(保留它、用填充值填充它、忽略它或引发异常):

from itertools import islice, izip_longest

def batched(iterable, n):
    "Batch data into lists of length n. The last batch may be shorter."
    # batched('ABCDEFG', 3) --> ABC DEF G
    it = iter(iterable)
    while True:
        batch = list(islice(it, n))
        if not batch:
            return
        yield batch

def grouper(iterable, n, *, incomplete='fill', fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, fillvalue='x') --> ABC DEF Gxx
    # grouper('ABCDEFG', 3, incomplete='strict') --> ABC DEF ValueError
    # grouper('ABCDEFG', 3, incomplete='ignore') --> ABC DEF
    args = [iter(iterable)] * n
    if incomplete == 'fill':
        return zip_longest(*args, fillvalue=fillvalue)
    if incomplete == 'strict':
        return zip(*args, strict=True)
    if incomplete == 'ignore':
        return zip(*args)
    else:
        raise ValueError('Expected fill, strict, or ignore')

其他回答

这是一个生成大小均匀的块的生成器:

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]

对于Python 2,使用xrange代替range:

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in xrange(0, len(lst), n):
        yield lst[i:i + n]

下面是一行理解列表。不过,上面的方法更可取,因为使用命名函数使代码更容易理解。对于Python 3:

[lst[i:i + n] for i in range(0, len(lst), n)]

对于Python 2:

[lst[i:i + n] for i in xrange(0, len(lst), n)]
[AA[i:i+SS] for i in range(len(AA))[::SS]]

其中AA是数组,SS是块大小。例如:

>>> AA=range(10,21);SS=3
>>> [AA[i:i+SS] for i in range(len(AA))[::SS]]
[[10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20]]
# or [range(10, 13), range(13, 16), range(16, 19), range(19, 21)] in py3

要扩展py3中的范围,请执行以下操作

(py3) >>> [list(AA[i:i+SS]) for i in range(len(AA))[::SS]]
[[10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20]]

此时,我认为我们需要强制性的匿名递归函数。

Y = lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args)))
chunks = Y(lambda f: lambda n: [n[0][:n[1]]] + f((n[0][n[1]:], n[1])) if len(n[0]) > 0 else [])

在这一点上,我认为我们需要一个递归生成器,以防万一。。。

在python 2中:

def chunks(li, n):
    if li == []:
        return
    yield li[:n]
    for e in chunks(li[n:], n):
        yield e

在python 3中:

def chunks(li, n):
    if li == []:
        return
    yield li[:n]
    yield from chunks(li[n:], n)

此外,在大规模外星人入侵的情况下,装饰递归生成器可能会变得很方便:

def dec(gen):
    def new_gen(li, n):
        for e in gen(li, n):
            if e == []:
                return
            yield e
    return new_gen

@dec
def chunks(li, n):
    yield li[:n]
    for e in chunks(li[n:], n):
        yield e

这个问题让我想起Raku(以前的Perl6).comb(n)方法。它将字符串分成n个大小的块。(还有更多,但我会省略细节。)

在Python3中实现一个类似的函数作为lambda表达式非常简单:

comb = lambda s,n: (s[i:i+n] for i in range(0,len(s),n))

然后你可以这样称呼它:

some_list = list(range(0, 20))  # creates a list of 20 elements
generator = comb(some_list, 4)  # creates a generator that will generate lists of 4 elements
for sublist in generator:
    print(sublist)  # prints a sublist of four elements, as it's generated

当然,您不必将生成器分配给变量;你可以直接这样循环:

for sublist in comb(some_list, 4):
    print(sublist)  # prints a sublist of four elements, as it's generated

另外,此comb()函数还对字符串进行操作:

list( comb('catdogant', 3) )  # returns ['cat', 'dog', 'ant']