I have code like this:
good = [x for x in mylist if x in goodvals]
bad = [x for x in mylist if x not in goodvals]
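The goal is to split the contents of mylist into two other lists, based on whether or not each item meets a condition.
How can I do this more elegantly? Can I avoid doing two separate iterations over mylist? Can I improve performance by doing so?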
Current answer
A generator-based version, if you can put up with one or maybe two reversals of the original list.
A setup...
import random
random.seed(1234)
a = list(range(10))
random.shuffle(a)
a
[2, 8, 3, 5, 6, 4, 9, 0, 1, 7]
And a split...
(list((a.pop(j) for j, y in [(len(a)-i-1, x) for i,x in enumerate(a[::-1])] if y%2 == 0))[::-1], a)
([2, 8, 6, 4, 0], [3, 5, 9, 1, 7])
First, a list of tuples pairing each location with its element is built in reverse order. In a generator wrapped around that list, each element is tested against the predicate (here, a test for even), and if the test is True the element is popped using the previously computed location. We are working backwards along the list, so popping elements out does not change positions closer to the beginning of the list. A wrapping list() evaluates the generator, and a final reverse [::-1] puts the elements back in the right order. The original list "a" now contains only the elements for which the predicate is False.
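The same back-to-front idea can also be written as an explicit loop, which may be easier to follow than the one-liner. This is only a sketch: the name pop_matching is made up for illustration and is not part of the answer above.

```python
def pop_matching(items, predicate):
    """Remove the elements that satisfy predicate from items (in place)
    and return them in their original order."""
    removed = []
    # Walk the indices from the end, so a pop never shifts an index
    # that has not been visited yet.
    for j in range(len(items) - 1, -1, -1):
        if predicate(items[j]):
            removed.append(items.pop(j))
    removed.reverse()  # the pops were collected back to front
    return removed

a = [2, 8, 3, 5, 6, 4, 9, 0, 1, 7]
evens = pop_matching(a, lambda y: y % 2 == 0)
print(evens, a)  # [2, 8, 6, 4, 0] [3, 5, 9, 1, 7]
```

Keep in mind that each pop from the middle of a list is O(n), so this approach can be quadratic on long lists.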
Other answers
Sometimes you don't need the other half of the list. For example:
import sys
trustedPeople = sys.argv[1].split(',')
newName = sys.argv[2]
# filter() is lazy in Python 3, so it replaces Python 2's itertools.ifilter
myFriends = filter(lambda x: x.startswith('Shi'), trustedPeople)
print('%s is %smy friend.' % (newName, 'not ' if newName not in myFriends else ''))
If you don't mind using an external library, there are two I know of that implement this operation natively:
>>> files = [ ('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi')]
>>> IMAGE_TYPES = ('.jpg','.jpeg','.gif','.bmp','.png')
iteration_utilities.partition:
>>> from iteration_utilities import partition
>>> notimages, images = partition(files, lambda x: x[2].lower() in IMAGE_TYPES)
>>> notimages
[('file2.avi', 999, '.avi')]
>>> images
[('file1.jpg', 33, '.jpg')]
more_itertools.partition:
>>> from more_itertools import partition
>>> notimages, images = partition(lambda x: x[2].lower() in IMAGE_TYPES, files)
>>> list(notimages)  # returns a generator so you need to explicitly convert to list.
[('file2.avi', 999, '.avi')]
>>> list(images)
[('file1.jpg', 33, '.jpg')]
Inspired by @gnibbler's great (but terse!) answer, we can apply the same approach to map items into multiple partitions:
from collections import defaultdict
def splitter(l, mapper):
    """Split an iterable into multiple partitions generated by a callable mapper."""
    results = defaultdict(list)
    for x in l:
        results[mapper(x)] += [x]
    return results
splitter can then be used as follows:
>>> l = [1, 2, 3, 4, 2, 3, 4, 5, 6, 4, 3, 2, 3]
>>> split = splitter(l, lambda x: x % 2 == 0) # partition l into odds and evens
>>> sorted(split.items())
[(False, [1, 3, 3, 5, 3, 3]), (True, [2, 4, 2, 4, 6, 4, 2])]
This also works for more than two partitions with a more complicated mapping (and on iterators too):
>>> import math
>>> l = range(1, 23)
>>> split = splitter(l, lambda x: int(math.log10(x) * 5))
>>> sorted(split.items())
[(0, [1]),
(1, [2]),
(2, [3]),
(3, [4, 5, 6]),
(4, [7, 8, 9]),
(5, [10, 11, 12, 13, 14, 15]),
(6, [16, 17, 18, 19, 20, 21, 22])]
Or using a dictionary for the mapping:
>>> map = {'A': 1, 'X': 2, 'B': 3, 'Y': 1, 'C': 2, 'Z': 3}
>>> l = ['A', 'B', 'C', 'C', 'X', 'Y', 'Z', 'A', 'Z']
>>> split = splitter(l, map.get)
>>> sorted(split.items())
[(1, ['A', 'Y', 'A']), (2, ['C', 'C', 'X']), (3, ['B', 'Z', 'Z'])]
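As a further illustration, the same function could group file tuples like the ones in the question by their extension. This is a sketch only: splitter is repeated (using append) so the snippet runs on its own, and the third file entry is made-up sample data.

```python
from collections import defaultdict

def splitter(l, mapper):
    """Split an iterable into multiple partitions generated by a callable mapper."""
    results = defaultdict(list)
    for x in l:
        results[mapper(x)].append(x)
    return results

# Sample data; the third entry is hypothetical and only there to show case folding.
files = [('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi'), ('file3.JPG', 12, '.JPG')]

# Use the lower-cased extension as the partition key.
by_ext = splitter(files, lambda f: f[2].lower())
print(sorted(by_ext))        # ['.avi', '.jpg']
print(len(by_ext['.jpg']))   # 2
```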
from itertools import filterfalse, tee
def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)
Check this: it is the partition recipe from the itertools documentation.
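For reference, here is how that recipe might be called, following the example in its own comment. The helper is_odd is an assumed name for illustration; the recipe is repeated so the snippet runs on its own.

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    """Return (entries where pred is false, entries where pred is true)."""
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

def is_odd(n):
    # Hypothetical predicate used only for this example.
    return n % 2 == 1

evens, odds = partition(is_odd, range(10))
# Both results are lazy iterators, so convert them to lists to inspect them.
evens, odds = list(evens), list(odds)
print(evens)  # [0, 2, 4, 6, 8]
print(odds)   # [1, 3, 5, 7, 9]
```

Note that the false entries come first in the returned pair, and that in this form the predicate runs twice per element, once in filterfalse and once in filter.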
Here is the lazy iterator approach:
from itertools import tee
def split_on_condition(seq, condition):
    l1, l2 = tee((condition(item), item) for item in seq)
    return (i for p, i in l1 if p), (i for p, i in l2 if not p)
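It evaluates the condition once per item and returns two generators: the first yields the values from the sequence for which the condition is true, the other yields those for which it is false.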
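The once-per-item claim can be checked with a predicate that records its calls. This is a small sketch; is_even and the calls list are names invented for the demonstration, and split_on_condition is repeated so the snippet runs on its own.

```python
from itertools import tee

def split_on_condition(seq, condition):
    l1, l2 = tee((condition(item), item) for item in seq)
    return (i for p, i in l1 if p), (i for p, i in l2 if not p)

calls = []

def is_even(n):
    calls.append(n)  # record every evaluation of the predicate
    return n % 2 == 0

evens, odds = split_on_condition(range(6), is_even)
calls_before = len(calls)  # still 0: nothing is evaluated until we iterate

evens_list = list(evens)
odds_list = list(odds)
print(evens_list, odds_list, len(calls))  # [0, 2, 4] [1, 3, 5] 6
```

One trade-off: tee has to buffer whatever one generator has consumed and the other has not, so exhausting one side first keeps all items in memory until the other side catches up.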
Because it's lazy, you can use it on any iterator, even an infinite one:
from itertools import count, islice
def is_prime(n):
    return n > 1 and all(n % i for i in range(2, n))
primes, not_primes = split_on_condition(count(), is_prime)
print("First 10 primes", list(islice(primes, 10)))
print("First 10 non-primes", list(islice(not_primes, 10)))
Usually, though, the non-lazy, list-returning approach is better:
def split_on_condition(seq, condition):
    a, b = [], []
    for item in seq:
        (a if condition(item) else b).append(item)
    return a, b
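Applied to the original question, this might look like the following sketch. The contents of mylist and goodvals are made-up sample data, and the function is repeated so the snippet runs on its own.

```python
def split_on_condition(seq, condition):
    a, b = [], []
    for item in seq:
        (a if condition(item) else b).append(item)
    return a, b

mylist = ['a', 'x', 'b', 'y', 'a']  # made-up sample data
goodvals = {'a', 'b'}               # a set keeps each membership test O(1) on average

# A single pass over mylist fills both lists.
good, bad = split_on_condition(mylist, lambda x: x in goodvals)
print(good)  # ['a', 'b', 'a']
print(bad)   # ['x', 'y']
```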
Edit: For your more specific use case of splitting items into different lists by some key, here is a generic function that does that:
DROP_VALUE = lambda _:_
def split_by_key(seq, resultmapping, keyfunc, default=DROP_VALUE):
    """Split a sequence into lists based on a key function.
    seq - input sequence
    resultmapping - a dictionary that maps from target lists to keys that go to that list
    keyfunc - function to calculate the key of an input value
    default - the target where items that don't have a corresponding key go, by default they are dropped
    """
    result_lists = dict((key, []) for key in resultmapping)
    appenders = dict((key, result_lists[target].append) for target, keys in resultmapping.items() for key in keys)
    if default is not DROP_VALUE:
        result_lists.setdefault(default, [])
        default_action = result_lists[default].append
    else:
        default_action = DROP_VALUE
    for item in seq:
        appenders.get(keyfunc(item), default_action)(item)
    return result_lists
Usage:
def file_extension(f):
    return f[2].lower()
split_files = split_by_key(files, {'images': IMAGE_TYPES}, keyfunc=file_extension, default='anims')
print(split_files['images'])
print(split_files['anims'])