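How do I split a list into equally-sized chunks?

How do I split a list of arbitrary length into equally-sized chunks?

See also How to iterate over a list in chunks if the result will be used directly in a loop and does not need to be stored.

For the same question with a string input, see Split string every nth character?. The same techniques generally apply, though there are some variations.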

Current answer

Lazy loading version

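import pprint
pprint.pprint(list(chunks(range(10, 75), 10)))
[range(10, 20),
 range(20, 30),
 range(30, 40),
 range(40, 50),
 range(50, 60),
 range(60, 70),
 range(70, 75)]

Compare this implementation's result with the example usage result of the accepted answer.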

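Many of the functions above assume that the length of the whole iterable is known up front, or is at least cheap to calculate.

For some streamed objects, that would mean loading the full data into memory first (e.g. downloading the whole file) just to get the length.

If you don't know the full size yet, you can use this code instead: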

def chunks(iterable, size):
    """
    Yield successive chunks from iterable, being `size` long.

    https://stackoverflow.com/a/55776536/3423324
    :param iterable: The object you want to split into pieces.
    :param size: The size each of the resulting pieces should have.
    """
    i = 0
    while True:
        sliced = iterable[i:i + size]
        if len(sliced) == 0:
            # to suppress stuff like `range(max, max)`.
            break
        # end if
        yield sliced
        if len(sliced) < size:
            # our slice is not the full length, so we must have passed the end of the iterator
            break
        # end if
        i += size  # so we start the next chunk at the right place.
    # end while
# end def

This works because slicing returns fewer elements, or none at all, once you go past the end of the iterable:

"abc"[0:2] == 'ab'
"abc"[2:4] == 'c'
"abc"[4:6] == ''

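We take the result of the slice and check the length of the chunk it produced. If it is shorter than expected, we know we can end the iteration.

This way the iterator is not evaluated until it is actually accessed.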
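
As a rough illustration of that laziness (a sketch added here, not part of the original answer; `chunks_by_slicing` is a hypothetical condensed rename of the function above), each chunk is only sliced when the generator is advanced. Note that the approach assumes the input supports slicing and `len()` on the slice, so it works for lists, strings and ranges but not for plain generators:

```python
def chunks_by_slicing(seq, size):
    """Condensed variant of the slicing approach; `seq` must support slicing."""
    i = 0
    while True:
        sliced = seq[i:i + size]
        if len(sliced) == 0:
            break  # nothing left to yield
        yield sliced
        if len(sliced) < size:
            break  # a short chunk means the end was reached
        i += size


gen = chunks_by_slicing("abcdefg", 3)
first = next(gen)  # only the first slice has been taken at this point
rest = list(gen)   # the remaining slices are taken here

range_chunks = list(chunks_by_slicing(range(10, 75), 10))
```

Here `first` is 'abc' and `rest` is ['def', 'g']; slicing a `range` gives back smaller `range` objects rather than lists.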

2019-04-20 18:28:54

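Other answers

You can use numpy's array_split function, e.g. np.array_split(np.array(data), 20), to split the data into 20 chunks of nearly equal size.

To make sure the chunks are exactly equal in size, use np.split, which raises an error if the array cannot be divided evenly.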
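
Note that this answer splits into a fixed number of chunks rather than chunks of a fixed size. To make the "nearly equal" part concrete without requiring numpy, here is a minimal pure-Python sketch of the same idea. The helper name `split_into_n` is hypothetical, and this is not numpy's implementation; it only follows the documented size rule that the first `len(data) % n` parts get one extra element:

```python
def split_into_n(data, n):
    """Split `data` into `n` consecutive parts whose lengths differ by at most one."""
    base, extra = divmod(len(data), n)
    parts = []
    start = 0
    for k in range(n):
        # the first `extra` parts receive one additional element
        stop = start + base + (1 if k < extra else 0)
        parts.append(data[start:stop])
        start = stop
    return parts


parts = split_into_n(list(range(10)), 3)
# parts == [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```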

2016-11-20 04:32:29

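User @tzot's solution zip_longest(*[iter(lst)]*n, fillvalue=padvalue) is very elegant, but if the length of lst is not divisible by n, it pads the last sublist so that its length matches the others. If that is not desirable, simply use zip() to produce the same round-robin grouping and append the remaining elements of lst (those that cannot make a "full" sublist) to the output.

Example output for ABCDEFG, 3 -> ABC DEF G.

One-liner (Python >= 3.8):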
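
To make the difference concrete, here is a small sketch (the variable names are illustrative only) comparing the two calls on that input. `zip_longest` pads the last group, while `zip` silently drops the incomplete one, which is why the leftover elements have to be appended separately:

```python
from itertools import zip_longest

lst = list("ABCDEFG")
n = 3

# The same iterator object is repeated n times, so each output tuple
# pulls n consecutive items from lst.
padded = list(zip_longest(*[iter(lst)] * n, fillvalue=None))
truncated = list(zip(*[iter(lst)] * n))

# Elements that did not fit into a full group of n.
leftover = lst[len(lst) // n * n:]

# padded    == [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', None, None)]
# truncated == [('A', 'B', 'C'), ('D', 'E', 'F')]
# leftover  == ['G']
```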

list(map(list, zip(*[iter(lst)]*n))) + ([rest] if (rest:=lst[len(lst)//n*n : ]) else [])

As a function:

def chunkify(lst, chunk_size):
    nested = list(map(list, zip(*[iter(lst)]*chunk_size)))
    rest = lst[len(lst)//chunk_size*chunk_size: ]
    if rest:
        nested.append(rest)
    return nested

As a generator (although each batch is a tuple):

def chunkify(lst, chunk_size):
    for tup in zip(*[iter(lst)]*chunk_size):
        yield tup
    rest = tuple(lst[len(lst)//chunk_size*chunk_size: ])
    if rest:
        yield rest

It is faster than some of the most popular answers here that produce the same output:

my_list, n = list(range(1_000_000)), 12

%timeit list(chunks(my_list, n))                                         # @Ned_Batchelder
# 36.4 ms ± 1.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [my_list[i:i+n] for i in range(0, len(my_list), n)]              # @Ned_Batchelder
# 34.6 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit it = iter(my_list); list(iter(lambda: list(islice(it, n)), []))  # @senderle
# 60.6 ms ± 5.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit list(mit.chunked(my_list, n))                                    # @pylang
# 59.4 ms ± 4.92 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit chunkify(my_list, n)
# 25.8 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

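Also, starting with Python 3.12, this functionality will be available as the batched function in the itertools module (currently it is only a recipe), so this answer will most likely become obsolete with Python 3.12.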
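
A minimal sketch of what using it could look like. The fallback branch is an assumption added so the snippet also runs on older interpreters; it is modelled on the itertools recipe and is not the standard library implementation:

```python
from itertools import islice

try:
    from itertools import batched  # available from Python 3.12
except ImportError:
    def batched(iterable, n):
        """Fallback modelled on the itertools recipe; yields tuples of up to n items."""
        if n < 1:
            raise ValueError("n must be at least one")
        it = iter(iterable)
        while batch := tuple(islice(it, n)):
            yield batch


result = list(batched("ABCDEFG", 3))
# result == [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```

As with the generator above, each batch is a tuple and the last one may be shorter.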

2022-07-13 03:38:10

Here is a generator that works on arbitrary iterables:

import itertools

def split_seq(iterable, size):
    it = iter(iterable)
    item = list(itertools.islice(it, size))
    while item:
        yield item
        item = list(itertools.islice(it, size))

Example:

>>> import pprint
>>> pprint.pprint(list(split_seq(range(75), 10)))
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]
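
Because this version only calls iter() and islice(), it should also handle inputs that have no length and cannot be sliced. A short sketch with a generator expression as input (the `squares` name is just for illustration):

```python
import itertools


def split_seq(iterable, size):
    it = iter(iterable)
    item = list(itertools.islice(it, size))
    while item:
        yield item
        item = list(itertools.islice(it, size))


squares = (x * x for x in range(7))  # a generator: no len(), no slicing
result = list(split_seq(squares, 3))
# result == [[0, 1, 4], [9, 16, 25], [36]]
```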

2008-11-23 12:41:37

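I realise this question is old (I stumbled over it on Google), but surely something like the following is far simpler and clearer than any of the complex suggestions, and it only uses slicing: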

def chunker(iterable, chunksize):
    for i,c in enumerate(iterable[::chunksize]):
        yield iterable[i*chunksize:(i+1)*chunksize]

>>> for chunk in chunker(range(0,100), 10):
...     print(list(chunk))
... 
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
... etc ...

2012-08-27 22:58:05

Here is an idea using itertools.groupby:

def chunks(l, n):
    c = itertools.count()
    return (it for _, it in itertools.groupby(l, lambda x: next(c)//n))

This returns a generator. If you need a list of lists, just replace the last line with

    return [list(it) for _, it in itertools.groupby(l, lambda x: next(c)//n)]

Example returning a list of lists:

>>> chunks('abcdefghij', 4)
[['a', 'b', 'c', 'd'], ['e', 'f', 'g', 'h'], ['i', 'j']]

(So yes, this suffers from the "runt problem", which may or may not be an issue in a given situation.)
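
One caveat with the generator variant, based on how itertools.groupby is documented rather than on anything stated in this answer: the groups share the underlying iterator, so each group should be consumed before the next one is requested. A small sketch of the safe pattern:

```python
import itertools


def chunks(l, n):
    c = itertools.count()
    return (it for _, it in itertools.groupby(l, lambda x: next(c) // n))


# Safe: each group is fully consumed before the next one is requested.
joined = ["".join(group) for group in chunks("abcdefghij", 4)]
# joined == ['abcd', 'efgh', 'ij']
```

Collecting the groups first (for example with list(chunks(...))) and reading them afterwards is not reliable, because advancing groupby invalidates the earlier groups.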

2017-03-08 17:03:46
