如何将任意长度的列表拆分为大小相等的块?
请参阅如果数据结果将直接用于循环,并且不需要存储,则如何以块形式遍历列表。
对于字符串输入的同一问题,请参见每n个字符拆分字符串?。相同的技术通常适用,但也有一些变化。
如何将任意长度的列表拆分为大小相等的块?
请参阅如果数据结果将直接用于循环,并且不需要存储,则如何以块形式遍历列表。
对于字符串输入的同一问题,请参见每n个字符拆分字符串?。相同的技术通常适用,但也有一些变化。
当前回答
另一个更明确的版本。
def chunkList(initialList, chunkSize):
"""
This function chunks a list into sub lists
that have a length equals to chunkSize.
Example:
lst = [3, 4, 9, 7, 1, 1, 2, 3]
print(chunkList(lst, 3))
returns
[[3, 4, 9], [7, 1, 1], [2, 3]]
"""
finalList = []
for i in range(0, len(initialList), chunkSize):
finalList.append(initialList[i:i+chunkSize])
return finalList
其他回答
不调用len(),这对大型列表很有用:
def splitter(l, n):
i = 0
chunk = l[:n]
while chunk:
yield chunk
i += n
chunk = l[i:i+n]
这是可迭代的:
def isplitter(l, n):
l = iter(l)
chunk = list(islice(l, n))
while chunk:
yield chunk
chunk = list(islice(l, n))
上述产品的功能风味:
def isplitter2(l, n):
return takewhile(bool,
(tuple(islice(start, n))
for start in repeat(iter(l))))
OR:
def chunks_gen_sentinel(n, seq):
continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
return iter(imap(tuple, continuous_slices).next,())
OR:
def chunks_gen_filter(n, seq):
continuous_slices = imap(islice, repeat(iter(seq)), repeat(0), repeat(n))
return takewhile(bool,imap(tuple, continuous_slices))
参见本参考
>>> orange = range(1, 1001)
>>> otuples = list( zip(*[iter(orange)]*10))
>>> print(otuples)
[(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ... (991, 992, 993, 994, 995, 996, 997, 998, 999, 1000)]
>>> olist = [list(i) for i in otuples]
>>> print(olist)
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], ..., [991, 992, 993, 994, 995, 996, 997, 998, 999, 1000]]
>>>
蟒蛇3
def chunk(input, size):
return map(None, *([iter(input)] * size))
我很好奇不同方法的性能,这里是:
在Python 3.5.1上测试
import time
batch_size = 7
arr_len = 298937
#---------slice-------------
print("\r\nslice")
start = time.time()
arr = [i for i in range(0, arr_len)]
while True:
if not arr:
break
tmp = arr[0:batch_size]
arr = arr[batch_size:-1]
print(time.time() - start)
#-----------index-----------
print("\r\nindex")
arr = [i for i in range(0, arr_len)]
start = time.time()
for i in range(0, round(len(arr) / batch_size + 1)):
tmp = arr[batch_size * i : batch_size * (i + 1)]
print(time.time() - start)
#----------batches 1------------
def batch(iterable, n=1):
l = len(iterable)
for ndx in range(0, l, n):
yield iterable[ndx:min(ndx + n, l)]
print("\r\nbatches 1")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
tmp = x
print(time.time() - start)
#----------batches 2------------
from itertools import islice, chain
def batch(iterable, size):
sourceiter = iter(iterable)
while True:
batchiter = islice(sourceiter, size)
yield chain([next(batchiter)], batchiter)
print("\r\nbatches 2")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in batch(arr, batch_size):
tmp = x
print(time.time() - start)
#---------chunks-------------
def chunks(l, n):
"""Yield successive n-sized chunks from l."""
for i in range(0, len(l), n):
yield l[i:i + n]
print("\r\nchunks")
arr = [i for i in range(0, arr_len)]
start = time.time()
for x in chunks(arr, batch_size):
tmp = x
print(time.time() - start)
#-----------grouper-----------
from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)
def grouper(iterable, n, padvalue=None):
"grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)
arr = [i for i in range(0, arr_len)]
print("\r\ngrouper")
start = time.time()
for x in grouper(arr, batch_size):
tmp = x
print(time.time() - start)
结果:
slice
31.18285083770752
index
0.02184295654296875
batches 1
0.03503894805908203
batches 2
0.22681021690368652
chunks
0.019841909408569336
grouper
0.006506919860839844
我非常喜欢tzot和J.F.Sebastian提出的Python文档版本,但它有两个缺点:
它不是很明确我通常不希望在最后一个块中有填充值
我在代码中经常使用这个:
from itertools import islice
def chunks(n, iterable):
iterable = iter(iterable)
while True:
yield tuple(islice(iterable, n)) or iterable.next()
更新:一个懒块版本:
from itertools import chain, islice
def chunks(n, iterable):
iterable = iter(iterable)
while True:
yield chain([next(iterable)], islice(iterable, n-1))