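I have a Python script that takes a list of integers as input, and I need to process them four at a time. Unfortunately, I don't control the input, or I'd have it passed in as a list of four-element tuples. Currently, I iterate over it like this: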

for i in range(0, len(ints), 4):
    # dummy op for example code
    foo += ints[i] * ints[i + 1] + ints[i + 2] * ints[i + 3]

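It looks a lot like "C-think", though, which makes me suspect there is a more Pythonic way to handle this. The list is discarded after iterating, so it doesn't need to be preserved. Perhaps something like this would be better?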

while ints:
    foo += ints[0] * ints[1] + ints[2] * ints[3]
    ints[0:4] = []

It still doesn't quite feel right, though. :-/

Related question: How do you split a list into evenly sized chunks in Python?


Current answer

from itertools import izip_longest  # renamed to zip_longest on Python 3

def chunker(iterable, chunksize, filler):
    return izip_longest(*[iter(iterable)]*chunksize, fillvalue=filler)
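
As a rough sketch of how this could be applied to the loop from the question, assuming Python 3 (where the function is called zip_longest) and a made-up sample list. The filler of 0 is my choice here so that a short final group contributes nothing to the sum; it is not part of the original answer.

```python
from itertools import zip_longest  # named izip_longest on Python 2


def chunker(iterable, chunksize, filler):
    # The same iterator is repeated chunksize times, so each output tuple
    # consumes chunksize consecutive items from the input.
    return zip_longest(*[iter(iterable)] * chunksize, fillvalue=filler)


ints = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical sample input
foo = 0
for a, b, c, d in chunker(ints, 4, 0):
    foo += a * b + c * d
print(foo)  # 100
```

Unpacking into a, b, c, d only works because every tuple is padded to exactly four items.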

Other answers

def chunker(iterable, n):
    """Yield iterable in chunk sizes.

    >>> chunks = chunker('ABCDEF', n=4)
    >>> next(chunks)
    ['A', 'B', 'C', 'D']
    >>> next(chunks)
    ['E', 'F']
    """
    it = iter(iterable)
    while True:
        chunk = []
        for i in range(n):
            try:
                chunk.append(next(it))
            except StopIteration:
                # Input exhausted: emit the partial chunk, if any, and stop.
                # (Raising StopIteration here is a RuntimeError on Python 3.7+.)
                if chunk:
                    yield chunk
                return
        yield chunk
if __name__ == '__main__':
    import doctest

    doctest.testmod()

chunk_size = 4
for i in range(0, len(ints), chunk_size):
    chunk = ints[i:i+chunk_size]
    # process chunk of size <= chunk_size
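
A small sketch of the same slicing idea wrapped in a generator, in case it helps. The function name slices and the sample list are my own, not from the answer. Note that it relies on len() and slicing, so it only works for sequences, not arbitrary iterators.

```python
def slices(seq, chunk_size):
    """Yield successive slices of seq; the last one may be shorter."""
    for i in range(0, len(seq), chunk_size):
        yield seq[i:i + chunk_size]


ints = [1, 2, 3, 4, 5, 6]  # hypothetical sample input
chunks = list(slices(ints, 4))
print(chunks)  # [[1, 2, 3, 4], [5, 6]]
```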

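The ideal solution to this problem works with iterators (not just sequences). It should also be fast.

This is the solution provided by the itertools documentation: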

import itertools

def grouper(n, iterable, fillvalue=None):
    # grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    # itertools.zip_longest on Python 3
    return itertools.izip_longest(*args, fillvalue=fillvalue)

Using ipython's %timeit on my MacBook Air, I get 47.5 us per loop.
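
For anyone wondering why the recipe groups items at all, here is a minimal illustration, assuming Python 3 (zip_longest instead of izip_longest). The key point is that the list holds n references to one iterator rather than n independent iterators.

```python
from itertools import zip_longest  # named izip_longest on Python 2

it = iter('ABCDEFG')
args = [it] * 3  # three references to the same iterator, not three copies
assert args[0] is args[1] is args[2]

# zip_longest asks each argument in turn for one item, so every output tuple
# ends up holding three consecutive items from the single underlying iterator.
groups = [''.join(g) for g in zip_longest(*args, fillvalue='x')]
print(groups)  # ['ABC', 'DEF', 'Gxx']
```

The last group is padded with the fill value, which is what makes every group the same length.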

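However, this really doesn't work for me, because the results are padded so that every group has the same size. A solution without padding is slightly more complicated. The most naive solution might be: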

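def grouper(size, iterable):
    i = iter(iterable)
    while True:
        out = []
        try:
            for _ in range(size):
                out.append(next(i))  # next(i) works on Python 2.6+ and 3
        except StopIteration:
            # Input exhausted: yield the partial group only if it is non-empty.
            if out:
                yield out
            break

        yield out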

Simple, but pretty slow: 693 us per loop.

The best solution I could come up with uses islice for the inner loop:

def grouper(size, iterable):
    it = iter(iterable)
    while True:
        group = tuple(itertools.islice(it, None, size))
        if not group:
            break
        yield group

For the same dataset, I get 305 us per loop.
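
As a side note, not part of the original timings: Python 3.12 added itertools.batched, which produces unpadded tuples in much the same way. The sketch below uses it when available and otherwise falls back to an islice loop like the one above. I have not benchmarked it against the numbers in this answer.

```python
import itertools

try:
    from itertools import batched  # available since Python 3.12
except ImportError:
    def batched(iterable, n):
        # Fallback for older versions, using the same islice loop.
        it = iter(iterable)
        while True:
            group = tuple(itertools.islice(it, n))
            if not group:
                return
            yield group


groups = list(batched(range(10), 4))
print(groups)  # [(0, 1, 2, 3), (4, 5, 6, 7), (8, 9)]
```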

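Since I couldn't get a pure solution to run any faster, I offer the following solution with one important caveat: if your input data contains instances of fillvalue, you may get a wrong answer.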

def grouper(n, iterable, fillvalue=None):
    #"grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    # itertools.zip_longest on Python 3
    for x in itertools.izip_longest(*args, fillvalue=fillvalue):
        if x[-1] is fillvalue:
            yield tuple(v for v in x if v is not fillvalue)
        else:
            yield x

I really don't like this answer, but it is significantly faster: 124 us per loop.
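
One possible way around the fillvalue caveat, offered as a sketch rather than as the approach timed above: use a private sentinel object as the fill value, so that no real input item can be identical to it. The name _MISSING is hypothetical, and the code assumes Python 3.

```python
from itertools import zip_longest  # named izip_longest on Python 2

_MISSING = object()  # hypothetical private sentinel; never identical to real data


def grouper(n, iterable):
    args = [iter(iterable)] * n
    for group in zip_longest(*args, fillvalue=_MISSING):
        if group[-1] is _MISSING:
            # Only the final group can be padded; drop the padding.
            yield tuple(v for v in group if v is not _MISSING)
        else:
            yield group


print(list(grouper(3, [None, 1, None, 2, None])))
# [(None, 1, None), (2, None)]
```

Because the check uses identity (is), values such as None in the input are kept.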

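At first, I designed it to split a string into substrings in order to parse a string containing hex. Today I turned it into a more complex, but still simple, generator.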

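import itertools

def chunker(iterable, size, reductor, condition):
    it = iter(iterable)
    def chunk_generator():
        # islice stops cleanly when `it` runs out; a generator expression
        # calling next(it) raises RuntimeError on Python 3.7+ instead.
        return itertools.islice(it, size)
    chunk = reductor(chunk_generator())
    while condition(chunk):
        yield chunk
        chunk = reductor(chunk_generator())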

Parameters:

Obvious ones

iterable is any iterable, iterator, or generator that contains, generates, or iterates over the input data.

size is, of course, the size of the chunks you want to get.

More interesting

reductor is a callable that receives an iterator over the contents of one chunk. I'd expect it to return a sequence or a string, but I don't demand that. You can pass, for example, list, tuple, set, frozenset, or anything fancier. I'd pass this function, which returns a string (provided that iterable contains, generates, or iterates over strings):

def concatenate(iterable):
    return ''.join(iterable)

Note that reductor can terminate the generator by raising an exception.

condition is a callable that receives whatever reductor returned. It decides either to approve and yield the chunk (by returning anything that evaluates to True) or to decline it and finish the generator's work (by returning anything that evaluates to False, or by raising an exception).

When the number of elements in iterable is not divisible by size, reductor will receive fewer than size elements once iterable is exhausted. Let's call these the last elements. I came up with two functions to pass as condition:

lambda x: x (the last elements will be yielded)
lambda x: len(x) == <size> (the last elements will be rejected; replace <size> with a number equal to size)
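
To make the two condition choices concrete, here is a small usage sketch under some assumptions: the input string is made up, and the chunker shown uses itertools.islice to pull each chunk, because on Python 3.7+ a generator expression that calls next(it) raises RuntimeError when the input runs out.

```python
import itertools


def chunker(iterable, size, reductor, condition):
    it = iter(iterable)
    chunk = reductor(itertools.islice(it, size))
    while condition(chunk):
        yield chunk
        chunk = reductor(itertools.islice(it, size))


def concatenate(iterable):
    return ''.join(iterable)


hex_string = 'deadbeefa'  # hypothetical input with one trailing character

# Keep the short final chunk.
keep_last = list(chunker(hex_string, 2, concatenate, lambda x: x))
# Reject any chunk that is not exactly two characters long.
full_only = list(chunker(hex_string, 2, concatenate, lambda x: len(x) == 2))
values = [int(pair, 16) for pair in full_only]

print(keep_last)  # ['de', 'ad', 'be', 'ef', 'a']
print(full_only)  # ['de', 'ad', 'be', 'ef']
print(values)     # [222, 173, 190, 239]
```

In both cases the generator stops once reductor returns an empty string, since an empty string is falsy and has length 0.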