我需要一个滚动窗口(又名滑动窗口)可迭代的序列/迭代器/生成器。(默认的Python迭代可以被认为是一种特殊情况,其中窗口长度为1。)我目前正在使用以下代码。我怎样才能做得更优雅和/或更有效?
def rolling_window(seq, window_size):
it = iter(seq)
win = [it.next() for cnt in xrange(window_size)] # First window
yield win
for e in it: # Subsequent windows
win[:-1] = win[1:]
win[-1] = e
yield win
if __name__=="__main__":
for w in rolling_window(xrange(6), 3):
print w
"""Example output:
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
"""
对于window_size == 2的特定情况(即,在序列中迭代相邻的重叠对),请参见如何从列表中迭代重叠(当前,下一个)值对?
我使用下面的代码作为一个简单的滑动窗口,它使用生成器来大幅提高可读性。根据我的经验,到目前为止,它的速度足以用于生物信息学序列分析。
我把它包括在这里是因为我还没有看到这种方法被使用过。同样,我对它的比较性能没有任何评价。
def slidingWindow(sequence,winSize,step=1):
"""Returns a generator that will iterate through
the defined chunks of input sequence. Input sequence
must be sliceable."""
# Verify the inputs
if not ((type(winSize) == type(0)) and (type(step) == type(0))):
raise Exception("**ERROR** type(winSize) and type(step) must be int.")
if step > winSize:
raise Exception("**ERROR** step must not be larger than winSize.")
if winSize > len(sequence):
raise Exception("**ERROR** winSize must not be larger than sequence length.")
# Pre-compute number of chunks to emit
numOfChunks = ((len(sequence)-winSize)/step)+1
# Do the work
for i in range(0,numOfChunks*step,step):
yield sequence[i:i+winSize]
多个迭代器!
def window(seq, size, step=1):
# initialize iterators
iters = [iter(seq) for i in range(size)]
# stagger iterators (without yielding)
[next(iters[i]) for j in range(size) for i in range(-1, -j-1, -1)]
while(True):
yield [next(i) for i in iters]
# next line does nothing for step = 1 (skips iterations for step > 1)
[next(i) for i in iters for j in range(step-1)]
next(it)在序列结束时引发StopIteration,出于一些我无法理解的很酷的原因,yield语句在这里除外它,函数返回,忽略没有形成完整窗口的剩余值。
无论如何,这是目前为止行数最少的解决方案,它唯一的要求是seq实现__iter__或__getitem__,并且除了@dansalmo的解决方案之外,不依赖于itertools或集合:)
>>> n, m = 6, 3
>>> k = n - m+1
>>> print ('{}\n'*(k)).format(*[range(i, i+m) for i in xrange(k)])
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
深度学习中滑动窗口数据的优化函数
def SlidingWindow(X, window_length, stride):
indexer = np.arange(window_length)[None, :] + stride*np.arange(int(len(X)/stride)-window_length+4)[:, None]
return X.take(indexer)
应用于多维数组
import numpy as np
def SlidingWindow(X, window_length, stride1):
stride= X.shape[1]*stride1
window_length = window_length*X.shape[1]
indexer = np.arange(window_length)[None, :] + stride1*np.arange(int(len(X)/stride1)-window_length-1)[:, None]
return X.take(indexer)
让我们让它变懒!
from itertools import islice, tee
def window(iterable, size):
iterators = tee(iterable, size)
iterators = [islice(iterator, i, None) for i, iterator in enumerate(iterators)]
yield from zip(*iterators)
list(window(range(5), 3))
# [(0, 1, 2), (1, 2, 3), (2, 3, 4)]