如何将任意长度的列表拆分为大小相等的块?
请参阅如果数据结果将直接用于循环,并且不需要存储,则如何以块形式遍历列表。
对于字符串输入的同一问题,请参见每n个字符拆分字符串?。相同的技术通常适用,但也有一些变化。
如何将任意长度的列表拆分为大小相等的块?
请参阅如果数据结果将直接用于循环,并且不需要存储,则如何以块形式遍历列表。
对于字符串输入的同一问题,请参见每n个字符拆分字符串?。相同的技术通常适用,但也有一些变化。
当前回答
我非常喜欢tzot和J.F.Sebastian提出的Python文档版本,但它有两个缺点:
它不是很明确我通常不希望在最后一个块中有填充值
我在代码中经常使用这个:
from itertools import islice
def chunks(n, iterable):
iterable = iter(iterable)
while True:
yield tuple(islice(iterable, n)) or iterable.next()
更新:一个懒块版本:
from itertools import chain, islice
def chunks(n, iterable):
iterable = iter(iterable)
while True:
yield chain([next(iterable)], islice(iterable, n-1))
其他回答
def split_seq(seq, num_pieces):
start = 0
for i in xrange(num_pieces):
stop = start + len(seq[i::num_pieces])
yield seq[start:stop]
start = stop
用法:
seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for seq in split_seq(seq, 3):
print seq
任何可迭代的通用分块器,使用户可以选择如何在结尾处处理部分分块。
在Python 3上测试。
分块.py
from enum import Enum
class PartialChunkOptions(Enum):
INCLUDE = 0
EXCLUDE = 1
PAD = 2
ERROR = 3
class PartialChunkException(Exception):
pass
def chunker(iterable, n, on_partial=PartialChunkOptions.INCLUDE, pad=None):
"""
A chunker yielding n-element lists from an iterable, with various options
about what to do about a partial chunk at the end.
on_partial=PartialChunkOptions.INCLUDE (the default):
include the partial chunk as a short (<n) element list
on_partial=PartialChunkOptions.EXCLUDE
do not include the partial chunk
on_partial=PartialChunkOptions.PAD
pad to an n-element list
(also pass pad=<pad_value>, default None)
on_partial=PartialChunkOptions.ERROR
raise a RuntimeError if a partial chunk is encountered
"""
on_partial = PartialChunkOptions(on_partial)
iterator = iter(iterable)
while True:
vals = []
for i in range(n):
try:
vals.append(next(iterator))
except StopIteration:
if vals:
if on_partial == PartialChunkOptions.INCLUDE:
yield vals
elif on_partial == PartialChunkOptions.EXCLUDE:
pass
elif on_partial == PartialChunkOptions.PAD:
yield vals + [pad] * (n - len(vals))
elif on_partial == PartialChunkOptions.ERROR:
raise PartialChunkException
return
return
yield vals
测试.py
import chunker
chunk_size = 3
for it in (range(100, 107),
range(100, 109)):
print("\nITERABLE TO CHUNK: {}".format(it))
print("CHUNK SIZE: {}".format(chunk_size))
for option in chunker.PartialChunkOptions.__members__.values():
print("\noption {} used".format(option))
try:
for chunk in chunker.chunker(it, chunk_size, on_partial=option):
print(chunk)
except chunker.PartialChunkException:
print("PartialChunkException was raised")
print("")
test.py的输出
ITERABLE TO CHUNK: range(100, 107)
CHUNK SIZE: 3
option PartialChunkOptions.INCLUDE used
[100, 101, 102]
[103, 104, 105]
[106]
option PartialChunkOptions.EXCLUDE used
[100, 101, 102]
[103, 104, 105]
option PartialChunkOptions.PAD used
[100, 101, 102]
[103, 104, 105]
[106, None, None]
option PartialChunkOptions.ERROR used
[100, 101, 102]
[103, 104, 105]
PartialChunkException was raised
ITERABLE TO CHUNK: range(100, 109)
CHUNK SIZE: 3
option PartialChunkOptions.INCLUDE used
[100, 101, 102]
[103, 104, 105]
[106, 107, 108]
option PartialChunkOptions.EXCLUDE used
[100, 101, 102]
[103, 104, 105]
[106, 107, 108]
option PartialChunkOptions.PAD used
[100, 101, 102]
[103, 104, 105]
[106, 107, 108]
option PartialChunkOptions.ERROR used
[100, 101, 102]
[103, 104, 105]
[106, 107, 108]
呵呵,单行版本
In [48]: chunk = lambda ulist, step: map(lambda i: ulist[i:i+step], xrange(0, len(ulist), step))
In [49]: chunk(range(1,100), 10)
Out[49]:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
[31, 32, 33, 34, 35, 36, 37, 38, 39, 40],
[41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
[51, 52, 53, 54, 55, 56, 57, 58, 59, 60],
[61, 62, 63, 64, 65, 66, 67, 68, 69, 70],
[71, 72, 73, 74, 75, 76, 77, 78, 79, 80],
[81, 82, 83, 84, 85, 86, 87, 88, 89, 90],
[91, 92, 93, 94, 95, 96, 97, 98, 99]]
与任何可迭代的内部数据是生成器对象(不是列表)一个衬垫
In [259]: get_in_chunks = lambda itr,n: ( (v for _,v in g) for _,g in itertools.groupby(enumerate(itr),lambda (ind,_): ind/n)) In [260]: list(list(x) for x in get_in_chunks(range(30),7)) Out[260]: [[0, 1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13], [14, 15, 16, 17, 18, 19, 20], [21, 22, 23, 24, 25, 26, 27], [28, 29]]
由于我必须这样做,下面是我的解决方案,给出了一个生成器和一个批量大小:
def pop_n_elems_from_generator(g, n):
elems = []
try:
for idx in xrange(0, n):
elems.append(g.next())
return elems
except StopIteration:
return elems