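How do I split a list into equally-sized chunks?

How do I split a list of arbitrary length into equally-sized chunks?

See also "How to iterate over a list in chunks" if the result will be used directly in a loop and does not need to be stored.

For the same question with a string input, see "Split string every nth character?". The same techniques generally apply, though there are some variations.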

Current answer

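User @tzot's solution zip_longest(*[iter(lst)]*n, fillvalue=padvalue) is very elegant, but if the length of lst is not divisible by n, it pads the last sublist so that its length matches the others. If that is not desirable, use zip() to build the full chunks in the same round-robin fashion, then append the remaining elements of lst (the ones that cannot form a "complete" sublist) to the output.

Example output: ABCDEFG, 3 -> ABC DEF G.

One-liner version (Python >= 3.8):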
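To make the padding difference concrete, here is a small sketch that runs both behaviours on the example above. The helper names pad_chunks and plain_chunks are made up for this illustration and do not come from the answer.

```python
from itertools import zip_longest

def pad_chunks(lst, n, padvalue=None):
    # n references to ONE iterator: zip_longest pulls n items per tuple
    # and fills the final, short tuple with padvalue.
    return [list(t) for t in zip_longest(*[iter(lst)] * n, fillvalue=padvalue)]

def plain_chunks(lst, n):
    # zip() silently drops the incomplete tail, so slice it off and append it.
    full = [list(t) for t in zip(*[iter(lst)] * n)]
    rest = lst[len(lst) // n * n:]
    return full + ([rest] if rest else [])

letters = list("ABCDEFG")
print(pad_chunks(letters, 3, "_"))  # [['A', 'B', 'C'], ['D', 'E', 'F'], ['G', '_', '_']]
print(plain_chunks(letters, 3))     # [['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]
```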

list(map(list, zip(*[iter(lst)]*n))) + ([rest] if (rest:=lst[len(lst)//n*n : ]) else [])

As a function:

def chunkify(lst, chunk_size):
    nested = list(map(list, zip(*[iter(lst)]*chunk_size)))
    rest = lst[len(lst)//chunk_size*chunk_size: ]
    if rest:
        nested.append(rest)
    return nested

As a generator (although each batch is a tuple):

def chunkify(lst, chunk_size):
    for tup in zip(*[iter(lst)]*chunk_size):
        yield tup
    rest = tuple(lst[len(lst)//chunk_size*chunk_size: ])
    if rest:
        yield rest

It is faster than some of the most popular answers here that produce the same output:

my_list, n = list(range(1_000_000)), 12

%timeit list(chunks(my_list, n))                                         # @Ned_Batchelder
# 36.4 ms ± 1.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [my_list[i:i+n] for i in range(0, len(my_list), n)]              # @Ned_Batchelder
# 34.6 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit it = iter(my_list); list(iter(lambda: list(islice(it, n)), []))  # @senderle
# 60.6 ms ± 5.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit list(mit.chunked(my_list, n))                                    # @pylang
# 59.4 ms ± 4.92 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit chunkify(my_list, n)
# 25.8 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

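Also, as of Python 3.12, this functionality is implemented as itertools.batched (at the time of writing it is only a recipe in the itertools documentation), so this answer will likely become obsolete with Python 3.12.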
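For reference, a minimal sketch of how itertools.batched might be used, with a fallback for older interpreters. The fallback is modelled on the recipe in the itertools documentation and assumes Python 3.8 or later for the walrus operator; it is not the standard library's actual implementation.

```python
import sys
from itertools import islice

if sys.version_info >= (3, 12):
    from itertools import batched
else:
    def batched(iterable, n):
        # Fallback sketch: yield tuples of up to n items until the input is exhausted.
        if n < 1:
            raise ValueError("n must be at least one")
        it = iter(iterable)
        while batch := tuple(islice(it, n)):
            yield batch

print(list(batched("ABCDEFG", 3)))  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```

Like the generator above, this yields tuples and leaves the last batch short instead of padding it.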

2022-07-13 03:38:10

Other answers

I'm surprised nobody has thought of using the two-argument form of iter:

from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())

Demo:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
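To unpack why this works: iter(callable, sentinel) keeps calling callable and stops the first time the return value equals sentinel. The sketch below calls the lambda by hand to show the values it produces; the names source, source2 and take_two are made up for illustration.

```python
from itertools import islice

source = iter(range(5))
# Each call consumes up to 2 items from the shared iterator.
take_two = lambda: tuple(islice(source, 2))

calls = [take_two() for _ in range(4)]
print(calls)  # [(0, 1), (2, 3), (4,), ()]

# iter(callable, sentinel) does the same calls but stops at the first ().
source2 = iter(range(5))
chunks = list(iter(lambda: tuple(islice(source2, 2)), ()))
print(chunks)  # [(0, 1), (2, 3), (4,)]
```

Once the underlying iterator is exhausted, islice yields nothing, the tuple is empty, and that empty tuple matches the sentinel.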

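This works with any iterable and produces its output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn't pad; if you want padding, a simple variation on the above will suffice: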

from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)

Demo:

>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

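Like the izip_longest-based solutions (zip_longest in Python 3), the above always pads. As far as I know, there is no one- or two-line itertools recipe for a function that pads optionally. Combining the two approaches above comes pretty close: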

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    if padval is _no_padding:  # identity check: the sentinel is a unique object
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)

Demo:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

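I believe this is the shortest chunker proposed that offers optional padding.

As Tomasz Gandor observed, the two padding chunkers stop unexpectedly if they encounter a long run of pad values, because a chunk made up entirely of pad values equals the sentinel. Here is a final variation that works around that problem in a reasonable way: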

_no_padding = object()
def chunk(it, size, padval=_no_padding):
    it = iter(it)
    chunker = iter(lambda: tuple(islice(it, size)), ())
    if padval is _no_padding:  # identity check: the sentinel is a unique object
        yield from chunker
    else:
        for ch in chunker:
            yield ch if len(ch) == size else ch + (padval,) * (size - len(ch))

Demo:

>>> list(chunk([1, 2, (), (), 5], 2))
[(1, 2), ((), ()), (5,)]
>>> list(chunk([1, 2, None, None, 5], 2, None))
[(1, 2), (None, None), (5, None)]
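The early-stop problem can be reproduced with a few lines. This sketch restates the earlier padded variant as chunk_pad and compares it with a simplified version of the final approach; chunk_fixed is a name invented here, and it always pads instead of making padding optional.

```python
from itertools import chain, islice, repeat

def chunk_pad(it, size, padval=None):
    # Earlier padded variant: stops as soon as a chunk equals (padval,) * size.
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)

def chunk_fixed(it, size, padval=None):
    # Chunk WITHOUT padding first (the sentinel is the empty tuple),
    # then pad any short chunk afterwards.
    it = iter(it)
    for ch in iter(lambda: tuple(islice(it, size)), ()):
        yield ch + (padval,) * (size - len(ch))

data = [1, 2, None, None, 5]
print(list(chunk_pad(data, 2)))    # [(1, 2)]  -> the 5 is lost
print(list(chunk_fixed(data, 2)))  # [(1, 2), (None, None), (5, None)]
```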

2014-02-26 15:02:00

If you know the list size:

def SplitList(mylist, chunk_size):
    return [mylist[offs:offs+chunk_size] for offs in range(0, len(mylist), chunk_size)]

If you don't (an iterator):

def IterChunks(sequence, chunk_size):
    res = []
    for item in sequence:
        res.append(item)
        if len(res) >= chunk_size:
            yield res
            res = []
    if res:
        yield res  # yield the last, incomplete, portion

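In the latter case, it can be rephrased in a more elegant way if you can be sure that the sequence always contains a whole number of chunks of the given size (i.e. there is no incomplete last chunk).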
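The answer does not show that more elegant form. One guess at what it could look like is the zip-over-one-iterator idiom sketched below; iter_whole_chunks is a hypothetical name, and the function silently drops leftover items, so it is only safe under the whole-number-of-chunks assumption.

```python
def iter_whole_chunks(sequence, chunk_size):
    # chunk_size references to ONE iterator: zip pulls chunk_size items per tuple.
    # Leftover items that do not fill a whole chunk are dropped.
    return zip(*[iter(sequence)] * chunk_size)

print(list(iter_whole_chunks(range(6), 3)))  # [(0, 1, 2), (3, 4, 5)]
print(list(iter_whole_chunks(range(7), 3)))  # [(0, 1, 2), (3, 4, 5)]  -> 6 is dropped
```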

2008-11-23 12:40:39

Since I had to do something like this, here is my solution, given a generator and a batch size:

def pop_n_elems_from_generator(g, n):
    elems = []
    try:
        # Python 3: range and next(g) replace xrange and g.next()
        for idx in range(0, n):
            elems.append(next(g))
        return elems
    except StopIteration:
        return elems
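The answer shows only the helper, not the loop that drives it. A possible way to use it is sketched below for Python 3; pop_n_elems and batches are names made up for this example, and islice stands in for the explicit try/except because it stops quietly when the generator runs out.

```python
from itertools import islice

def pop_n_elems(g, n):
    # Take up to n items from iterator g; returns a shorter (or empty) list at the end.
    return list(islice(g, n))

def batches(iterable, n):
    g = iter(iterable)
    while True:
        elems = pop_n_elems(g, n)
        if not elems:  # an empty batch means the source is exhausted
            return
        yield elems

print(list(batches((x * x for x in range(7)), 3)))  # [[0, 1, 4], [9, 16, 25], [36]]
```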

2015-10-16 22:09:29

I know this is kind of old, but nobody has mentioned numpy.array_split yet:

import numpy as np

lst = range(50)
np.array_split(lst, 5)

Result:

[array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
 array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29]),
 array([30, 31, 32, 33, 34, 35, 36, 37, 38, 39]),
 array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49])]
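Note that the second argument here is the number of sections, not the chunk size. The NumPy documentation states that for length l and n sections, array_split returns l % n sub-arrays of size l//n + 1 and the rest of size l//n. The pure-Python sketch below mirrors that size rule so the behaviour can be checked without NumPy; split_into_sections is a hypothetical helper, not a NumPy function.

```python
def split_into_sections(lst, sections):
    # Mirrors the documented size rule of numpy.array_split:
    # the first (len % sections) parts get one extra element.
    base, extra = divmod(len(lst), sections)
    out, start = [], 0
    for i in range(sections):
        size = base + 1 if i < extra else base
        out.append(lst[start:start + size])
        start += size
    return out

print([len(s) for s in split_into_sections(list(range(50)), 5)])  # [10, 10, 10, 10, 10]
print([len(s) for s in split_into_sections(list(range(7)), 3)])   # [3, 2, 2]
```

So the parts are as even as possible, which differs from fixed-size chunking, where only the last chunk may be short.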

2013-06-05 08:54:26

I don't think I saw this option, so just adding another one :):

def chunks(iterable, chunk_size):
  i = 0
  while i < len(iterable):
    yield iterable[i:i+chunk_size]
    i += chunk_size

2017-11-03 12:38:56
