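I have a Python script that takes a list of integers as input, and I need to process them four at a time. Unfortunately, I don't control the input, or I'd have it passed in as a list of four-element tuples. Currently, I'm iterating over it this way: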
for i in range(0, len(ints), 4):
    # dummy op for example code
    foo += ints[i] * ints[i + 1] + ints[i + 2] * ints[i + 3]
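It looks a lot like "C-think", though, which makes me suspect there's a more pythonic way of dealing with this situation. The list is discarded after iterating, so it needn't be preserved. Perhaps something like this would be better?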
while ints:
    foo += ints[0] * ints[1] + ints[2] * ints[3]
    ints[0:4] = []
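Still doesn't quite "feel" right, though. :-/
Related question: How do you split a list into evenly sized chunks in Python?
At first, I designed it to split strings into substrings, in order to parse a string containing hex.
Today I turned it into a more complex, but still simple, generator.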
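from itertools import islice

def chunker(iterable, size, reductor, condition):
    it = iter(iterable)
    def chunk_generator():
        # islice stops cleanly when `it` runs out. Calling next(it) inside a
        # generator expression raises RuntimeError on Python 3.7+ (PEP 479).
        return islice(it, size)
    chunk = reductor(chunk_generator())
    while condition(chunk):
        yield chunk
        chunk = reductor(chunk_generator())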
Parameters:
Obvious ones
iterable is any iterable / iterator / generator containing / generating / iterating over the input data,
size is, of course, the size of the chunks you want to get,
More interesting ones
reductor is a callable that receives an iterator over the contents of a chunk.
I'd expect it to return a sequence or a string, but I don't demand that.
You can pass, for example, list, tuple, set, frozenset,
or anything fancier as this argument. I'd pass this function, which returns a string
(provided that iterable contains / generates / iterates over strings):
def concatenate(iterable):
    return ''.join(iterable)
Note that reductor can close the generator by raising an exception.
condition is a callable that receives whatever reductor returned.
It decides whether to approve and yield it (by returning anything that evaluates to True),
or to decline it and finish the generator's work (by returning anything else, or by raising an exception).
When the number of elements in iterable is not divisible by size, then once the iterable is exhausted, reductor will receive an iterator that yields fewer elements than size.
Let's call these the last elements.
I came up with two functions to pass as this argument:
lambda x: x - the last elements will be yielded.
lambda x: len(x) == <size> - the last elements will be rejected.
Replace <size> with a number equal to size.
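As a rough illustration of how the two condition callables differ, here is a small sketch. It restates chunker using itertools.islice, which is an assumption on my part so that the example runs on Python 3.7+; hex_string is a made-up input.

```python
from itertools import islice

def chunker(iterable, size, reductor, condition):
    it = iter(iterable)
    chunk = reductor(islice(it, size))
    while condition(chunk):
        yield chunk
        chunk = reductor(islice(it, size))

def concatenate(iterable):
    return ''.join(iterable)

hex_string = 'deadbeef1'  # hypothetical input, 9 characters

# Keep the short trailing chunk: any non-empty string is truthy,
# so iteration only stops at the first empty chunk.
keep_last = list(chunker(hex_string, 2, concatenate, lambda x: x))

# Reject the short trailing chunk: only full-size chunks pass.
drop_last = list(chunker(hex_string, 2, concatenate, lambda x: len(x) == 2))

print(keep_last)
print(drop_last)
```

With lambda x: x the generator relies on an empty chunk being falsy, so a reductor whose empty result is truthy would never terminate.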
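I hope that by producing an iterator over the list I am not simply copying a slice of it. For a list, each batch is a generator expression that indexes into the original list lazily, instead of a slice that copies up to 1000 entries at a time, which is less efficient. Any other input is sliced directly, so it must support len() and [start:end] slicing; a plain generator supports neither.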
def iter_group(iterable, batch_size: int):
    length = len(iterable)
    start = -batch_size
    end = 0
    while end < length:
        start += batch_size
        end += batch_size
        if isinstance(iterable, list):
            # Index lazily instead of copying a slice. The upper bound is
            # min(length, end); length - 1 would drop the final item.
            yield (iterable[i] for i in range(start, min(length, end)))
        else:
            yield iterable[start:end]
Usage:
items = list(range(1, 1251))
for item_group in iter_group(items, 1000):
    for item in item_group:
        print(item)
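As far as I know, plain generators support neither len() nor [start:end] slicing, so iter_group only accepts sequences. For arbitrary iterables, one possible variant takes batches with itertools.islice. This is a sketch rather than the author's code, and iter_batches is a made-up name.

```python
from itertools import islice

def iter_batches(iterable, batch_size):
    """Yield lists of up to batch_size items from any iterable (hypothetical helper)."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # the underlying iterator is exhausted
        yield batch

squares = (n * n for n in range(7))  # a generator: no len(), no slicing
batches = list(iter_batches(squares, 3))
print(batches)
```

Each batch is materialized as a list here, which trades some memory for simplicity; the final batch may be shorter than batch_size.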
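I never want my chunks padded, so that requirement is essential. I find that the ability to work on any iterable is also a requirement. Given that, I decided to extend the accepted answer, https://stackoverflow.com/a/434411/1074659.
Performance takes a slight hit in this approach when padding is not wanted, because the padded values have to be compared and filtered out. However, for large chunk sizes, this utility is very performant.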
#!/usr/bin/env python3
from itertools import zip_longest

_UNDEFINED = object()

def chunker(iterable, chunksize, fillvalue=_UNDEFINED):
    """
    Collect data into chunks and optionally pad it.
    Performance worsens as `chunksize` approaches 1.
    Inspired by:
        https://docs.python.org/3/library/itertools.html#itertools-recipes
    """
    args = [iter(iterable)] * chunksize
    chunks = zip_longest(*args, fillvalue=fillvalue)
    yield from (
        filter(lambda val: val is not _UNDEFINED, chunk)
        if chunk[-1] is _UNDEFINED
        else chunk
        for chunk in chunks
    ) if fillvalue is _UNDEFINED else chunks
Here is a chunker without imports that supports generators:
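def chunks(seq, size):
    it = iter(seq)
    while True:
        ret = []
        try:
            for _ in range(size):
                ret.append(next(it))
        except StopIteration:
            # Input exhausted: drop the incomplete trailing chunk. A plain
            # return is required; raising StopIteration inside a generator
            # is a RuntimeError on Python 3.7+ (PEP 479).
            return
        yield tuple(ret)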
Usage example:
>>> def foo():
...     i = 0
...     while True:
...         i += 1
...         yield i
...
>>> c = chunks(foo(), 3)
>>> next(c)
(1, 2, 3)
>>> next(c)
(4, 5, 6)
>>> list(chunks('abcdefg', 2))
[('a', 'b'), ('c', 'd'), ('e', 'f')]
About the solution given by J.F. Sebastian:
def chunker(iterable, chunksize):
    return zip(*[iter(iterable)]*chunksize)
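It's clever, but it has one disadvantage: it always returns tuples. How do you get strings instead?
Of course, you can call ''.join(...) on each chunk that chunker(...) yields, but the temporary tuple is constructed anyway.
You can get rid of the temporary tuple by writing your own zip, like this: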
class IteratorExhausted(Exception):
    pass

def translate_StopIteration(iterable, to=IteratorExhausted):
    for i in iterable:
        yield i
    raise to  # StopIteration would get ignored because this is generator,
              # but custom exception can leave the generator.

def custom_zip(*iterables, reductor=tuple):
    iterators = tuple(map(translate_StopIteration, iterables))
    while True:
        try:
            yield reductor(next(i) for i in iterators)
        except IteratorExhausted:  # when any of iterators get exhausted.
            break
Then:
def chunker(data, size, reductor=tuple):
    return custom_zip(*[iter(data)]*size, reductor=reductor)
Example usage:
>>> for i in chunker('12345', 2):
...     print(repr(i))
...
('1', '2')
('3', '4')
>>> for i in chunker('12345', 2, ''.join):
...     print(repr(i))
...
'12'
'34'
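Applied to the original question, the shared-iterator idiom from J.F. Sebastian's chunker could look like the sketch below. The sample ints list is made up. Note that zip silently drops a trailing group of fewer than four items, which may or may not be what you want.

```python
def chunker(iterable, chunksize):
    # The same iterator object is repeated chunksize times, so zip pulls
    # from it in turn and each tuple holds chunksize consecutive items.
    return zip(*[iter(iterable)] * chunksize)

ints = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical input
foo = 0
for a, b, c, d in chunker(ints, 4):
    # same dummy op as in the question
    foo += a * b + c * d

print(foo)  # 100
```

Unpacking each chunk into four names avoids the index arithmetic of the range(0, len(ints), 4) loop.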