如何将任意长度的列表拆分为大小相等的块?


请参阅如果数据结果将直接用于循环,并且不需要存储,则如何以块形式遍历列表。

对于字符串输入的同一问题,请参见每n个字符拆分字符串?。相同的技术通常适用,但也有一些变化。


当前回答

例如,如果块大小为3,则可以执行以下操作:

zip(*[iterable[i::3] for i in range(3)]) 

来源:http://code.activestate.com/recipes/303060-group-a-list-into-sequential-n-tuples/

当我的区块大小是固定的数字时,我会使用这个,我可以键入,例如“3”,并且永远不会改变。

其他回答

我很惊讶没有人想到使用iter的双参数形式:

from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())

演示:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]

这适用于任何可迭代的对象,并延迟生成输出。它返回元组而不是迭代器,但我认为它还是有一定的优雅。它也不会垫;如果您需要填充,上面的一个简单变体就足够了:

from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)

演示:

>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

与基于izip_longest的解决方案一样,上面的解决方案也始终适用。据我所知,对于可选pad的函数,没有单行或双线itertools配方。通过结合以上两种方法,这一方法非常接近:

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    if padval == _no_padding:
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)

演示:

>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]

我相信这是提议的提供可选填充的最短的分块器。

正如Tomasz Gandor所观察到的,如果两个填充块遇到一长串填充值,它们会意外停止。以下是以合理方式解决该问题的最后一个变体:

_no_padding = object()
def chunk(it, size, padval=_no_padding):
    it = iter(it)
    chunker = iter(lambda: tuple(islice(it, size)), ())
    if padval == _no_padding:
        yield from chunker
    else:
        for ch in chunker:
            yield ch if len(ch) == size else ch + (padval,) * (size - len(ch))

演示:

>>> list(chunk([1, 2, (), (), 5], 2))
[(1, 2), ((), ()), (5,)]
>>> list(chunk([1, 2, None, None, 5], 2, None))
[(1, 2), (None, None), (5, None)]

上面的答案(由koffein给出)有一个小问题:列表总是被分割成相等数量的分割,而不是每个分区的项目数相等。这是我的版本。“//chs+1”考虑到项目的数量可能不能完全除以分区大小,因此最后一个分区将仅被部分填充。

# Given 'l' is your list

chs = 12 # Your chunksize
partitioned = [ l[i*chs:(i*chs)+chs] for i in range((len(l) // chs)+1) ]

我不喜欢按块大小拆分元素的想法,例如,脚本可以将101到3个块划分为[50,50,1]。为了我的需要,我需要按比例分配,保持秩序不变。首先我写了自己的剧本,效果很好,而且很简单。但我后来看到了这个答案,剧本比我的好,我想是这样的。这是我的脚本:

def proportional_dividing(N, n):
    """
    N - length of array (bigger number)
    n - number of chunks (smaller number)
    output - arr, containing N numbers, diveded roundly to n chunks
    """
    arr = []
    if N == 0:
        return arr
    elif n == 0:
        arr.append(N)
        return arr
    r = N // n
    for i in range(n-1):
        arr.append(r)
    arr.append(N-r*(n-1))

    last_n = arr[-1]
    # last number always will be r <= last_n < 2*r
    # when last_n == r it's ok, but when last_n > r ...
    if last_n > r:
        # ... and if difference too big (bigger than 1), then
        if abs(r-last_n) > 1:
            #[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 7] # N=29, n=12
            # we need to give unnecessary numbers to first elements back
            diff = last_n - r
            for k in range(diff):
                arr[k] += 1
            arr[-1] = r
            # and we receive [3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2]
    return arr

def split_items(items, chunks):
    arr = proportional_dividing(len(items), chunks)
    splitted = []
    for chunk_size in arr:
        splitted.append(items[:chunk_size])
        items = items[chunk_size:]
    print(splitted)
    return splitted

items = [1,2,3,4,5,6,7,8,9,10,11]
chunks = 3
split_items(items, chunks)
split_items(['a','b','c','d','e','f','g','h','i','g','k','l', 'm'], 3)
split_items(['a','b','c','d','e','f','g','h','i','g','k','l', 'm', 'n'], 3)
split_items(range(100), 4)
split_items(range(99), 4)
split_items(range(101), 4)

和输出:

[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11]]
[['a', 'b', 'c', 'd'], ['e', 'f', 'g', 'h'], ['i', 'g', 'k', 'l', 'm']]
[['a', 'b', 'c', 'd', 'e'], ['f', 'g', 'h', 'i', 'g'], ['k', 'l', 'm', 'n']]
[range(0, 25), range(25, 50), range(50, 75), range(75, 100)]
[range(0, 25), range(25, 50), range(50, 75), range(75, 99)]
[range(0, 25), range(25, 50), range(50, 75), range(75, 101)]

senderle答案的一个线性版本:

from itertools import islice
from functools import partial

seq = [1,2,3,4,5,6,7]
size = 3
result = list(iter(partial(lambda it: tuple(islice(it, size)), iter(seq)), ()))
assert result == [(1, 2, 3), (4, 5, 6), (7,)]

假设列表是第一个

import math

# length of the list len(lst) is ln
# size of a chunk is size

for num in range ( math.ceil(ln/size) ):
    start, end = num*size, min((num+1)*size, ln)
    print(lst[start:end])