我有这样的代码:

good = [x for x in mylist if x in goodvals]
bad = [x for x in mylist if x not in goodvals]

目标是根据mylist的内容是否满足条件,将它们拆分为另外两个列表。

我怎样才能做得更优雅呢?我能避免在mylist上做两个单独的迭代吗?我可以通过这样做来提高性能吗?


当前回答

有时候你并不需要清单上的另一半。 例如:

import sys
from itertools import ifilter

trustedPeople = sys.argv[1].split(',')
newName = sys.argv[2]

myFriends = ifilter(lambda x: x.startswith('Shi'), trustedPeople)

print '%s is %smy friend.' % (newName, newName not in myFriends 'not ' or '')

其他回答

Good = [x for x in mylist if x in goodvals] Bad = [x for x in mylist if x not in goodvals] 我怎样才能做得更优雅呢?

代码已经非常优雅了。

使用集合可能会有轻微的性能改进,但差异是微不足道的。基于集合的方法也会丢弃重复项,并且不会保留元素的顺序。我发现列表理解也更容易阅读。

事实上,我们甚至可以更简单地使用for循环:

good, bad = [], []

for x in mylist:
    if x in goodvals:
        good.append(f)
    else:
        bad.append(f)

这种方法可以更容易地添加额外的逻辑。例如,代码很容易被修改为丢弃None值:

good, bad = [], []

for x in mylist:
    if x is None:
        continue
    if x in goodvals:
        good.append(f)
    else:
        bad.append(f)
good.append(x) if x in goodvals else bad.append(x)

来自@dansalmo的这个优雅简洁的回答被埋没在评论中,所以我只是把它作为一个答案转发到这里,这样它就能得到应有的重视,尤其是对新读者来说。

完整的例子:

good, bad = [], []
for x in my_list:
    good.append(x) if x in goodvals else bad.append(x)

之前的答案似乎并不能满足我所有的四种强迫症:

尽可能的懒惰, 只对原始Iterable求值一次 每个项只计算谓词一次 提供良好的类型注释(适用于python 3.7)

我的解决方案并不漂亮,我不认为我可以推荐使用它,但它是:

def iter_split_on_predicate(predicate: Callable[[T], bool], iterable: Iterable[T]) -> Tuple[Iterator[T], Iterator[T]]:
    deque_predicate_true = deque()
    deque_predicate_false = deque()
    
    # define a generator function to consume the input iterable
    # the Predicate is evaluated once per item, added to the appropriate deque, and the predicate result it yielded 
    def shared_generator(definitely_an_iterator):
        for item in definitely_an_iterator:
            print("Evaluate predicate.")
            if predicate(item):
                deque_predicate_true.appendleft(item)
                yield True
            else:
                deque_predicate_false.appendleft(item)
                yield False
    
    # consume input iterable only once,
    # converting to an iterator with the iter() function if necessary. Probably this conversion is unnecessary
    shared_gen = shared_generator(
        iterable if isinstance(iterable, collections.abc.Iterator) else iter(iterable)
    )
    
    # define a generator function for each predicate outcome and queue
    def iter_for(predicate_value, hold_queue):
        def consume_shared_generator_until_hold_queue_contains_something():
            if not hold_queue:
                try:
                    while next(shared_gen) != predicate_value:
                        pass
                except:
                    pass
        
        consume_shared_generator_until_hold_queue_contains_something()
        while hold_queue:
            print("Yield where predicate is "+str(predicate_value))
            yield hold_queue.pop()
            consume_shared_generator_until_hold_queue_contains_something()
    
    # return a tuple of two generators  
    return iter_for(predicate_value=True, hold_queue=deque_predicate_true), iter_for(predicate_value=False, hold_queue=deque_predicate_false)

用下面的测试,我们从print语句中得到如下输出:

t,f = iter_split_on_predicate(lambda item:item>=10,[1,2,3,10,11,12,4,5,6,13,14,15])
print(list(zip(t,f)))
# Evaluate predicate.
# Evaluate predicate.
# Evaluate predicate.
# Evaluate predicate.
# Yield where predicate is True
# Yield where predicate is False
# Evaluate predicate.
# Yield where predicate is True
# Yield where predicate is False
# Evaluate predicate.
# Yield where predicate is True
# Yield where predicate is False
# Evaluate predicate.
# Evaluate predicate.
# Evaluate predicate.
# Evaluate predicate.
# Yield where predicate is True
# Yield where predicate is False
# Evaluate predicate.
# Yield where predicate is True
# Yield where predicate is False
# Evaluate predicate.
# Yield where predicate is True
# Yield where predicate is False
# [(10, 1), (11, 2), (12, 3), (13, 4), (14, 5), (15, 6)]

如果你不介意使用一个外部库,有两个我知道本机实现这个操作:

>>> files = [ ('file1.jpg', 33, '.jpg'), ('file2.avi', 999, '.avi')]
>>> IMAGE_TYPES = ('.jpg','.jpeg','.gif','.bmp','.png')

iteration_utilities.partition: >>> from iteration_utilities import partition >>> notimages, images = partition(files, lambda x: x[2].lower() in IMAGE_TYPES) >>> notimages [('file2.avi', 999, '.avi')] >>> images [('file1.jpg', 33, '.jpg')] more_itertools.partition >>> from more_itertools import partition >>> notimages, images = partition(lambda x: x[2].lower() in IMAGE_TYPES, files) >>> list(notimages) # returns a generator so you need to explicitly convert to list. [('file2.avi', 999, '.avi')] >>> list(images) [('file1.jpg', 33, '.jpg')]

第一步(pre-OP-edit):使用集合:

mylist = [1,2,3,4,5,6,7]
goodvals = [1,3,7,8,9]

myset = set(mylist)
goodset = set(goodvals)

print list(myset.intersection(goodset))  # [1, 3, 7]
print list(myset.difference(goodset))    # [2, 4, 5, 6]

这对可读性(IMHO)和性能都有好处。

第二步(post-OP-edit):

创建一个好的扩展列表:

IMAGE_TYPES = set(['.jpg','.jpeg','.gif','.bmp','.png'])

这将提高性能。否则,你现在的情况在我看来还不错。