是否有理由更喜欢使用map()而不是列表理解,反之亦然?它们中的任何一个通常比另一个更有效或被认为更python化吗?


当前回答

这里有一个可能的例子:

map(lambda op1,op2: op1*op2, list1, list2)

对比:

[op1*op2 for op1,op2 in zip(list1,list2)]

我猜,如果坚持使用列表推导式而不是映射,那么zip()是一种不幸的、不必要的开销。如果有人能肯定或否定地澄清这一点,那就太好了。

其他回答

Python 2:你应该使用map和filter而不是列表推导式。

一个客观的原因是,即使它们不是“Pythonic”,你也应该喜欢它们: 它们需要函数/lambdas作为参数,这引入了一个新的作用域。

我不止一次被这个问题困扰过:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

但如果我说:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

那一切都会好起来的。

你可以说我在相同的作用域中使用相同的变量名是愚蠢的。

我不是。代码本来是好的——两个x不在同一个作用域内。 直到我将内部块移动到代码的不同部分后,问题才出现(即:问题发生在维护期间,而不是开发期间),而且我没有预料到。

是的,如果你从来没有犯过这个错误,那么列表推导式会更优雅。 但从个人经验(以及看到其他人犯同样的错误)来看,我已经见过很多次这样的情况,所以我认为当这些错误渗透到代码中时,不值得你经历这种痛苦。

结论:

使用映射和过滤器。它们可以防止微妙的、难以诊断的范围相关错误。

注:

不要忘记考虑使用imap和filter(在itertools中),如果它们适合你的情况!

情况下

Common case: Almost always, you will want to use a list comprehension in python because it will be more obvious what you're doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.) It will even be more obvious what you're doing to python programmers, since list comprehensions are the de-facto standard in python for iteration; they are expected. Less-common case: However if you already have a function defined, it is often reasonable to use map, though it is considered 'unpythonic'. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)... but the readability of your code highly depends on your comments anyway. Almost never: You may want to use the map function as a pure abstract function while doing functional programming, where you're mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can't generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum), which is a one-liner that is very roughly equivalent to:

def sumEach(myLists):
    return [sum(_) for _ in myLists]

Just using a for-loop: You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) -- at least efficient in terms of memory (not necessarily in terms of time, where I'd expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).

“蟒蛇主义”

我不喜欢“pythonic”这个词,因为我觉得pythonic在我眼里并不总是优雅的。然而,map和filter以及类似的函数(比如非常有用的itertools模块)在风格上可能被认为是非python的。

懒惰

在效率方面,像大多数函数式编程结构一样,MAP可以是LAZY,实际上在python中是LAZY。这意味着你可以这样做(在python3中),你的计算机将不会耗尽内存,丢失所有未保存的数据:

>>> map(str, range(10**100))
<map object at 0x2201d50>

试着用一个列表理解来做这个:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

请注意,列表推导式本身也是惰性的,但python选择将它们实现为非惰性的。尽管如此,python确实支持生成器表达式形式的惰性列表推导,如下所示:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

你基本上可以想到[…]将生成器表达式传递给列表构造函数,如list(x for x in range(5))。

简单的例子

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

列表推导式是非惰性的,因此可能需要更多内存(除非使用生成器推导式)。方括号[…]通常会让事情变得很明显,尤其是当括号乱七八糟的时候。另一方面,有时您最终会变得冗长,例如在....中键入[x for x只要保持迭代器变量短小,如果不缩进代码,列表推导式通常会更清晰。但是你总是可以缩进你的代码。

print(
    {x:x**2 for x in (-y for y in range(5))}
)

或者分手:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

python3的效率比较

地图现在是惰性的:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

Therefore if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. Usually this will usually outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right "in order", because python generator expressions can only be evaluated the order x[0], x[1], x[2], ....

然而,假设我们有一个预先创建的函数f,我们想要映射,我们忽略了map的惰性,立即强制用list(…)求值。我们得到了一些非常有趣的结果:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

In results are in the form AAA/BBB/CCC where A was performed with on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed with a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient that generator expressions (again assuming that all values are evaluated/used).

重要的是要意识到这些测试假设一个非常简单的函数(恒等函数);不过,这也没关系,因为如果函数很复杂,那么与程序中的其他因素相比,性能开销就可以忽略不计了。(用f= x:x+x等简单的东西进行测试可能仍然很有趣)

如果你擅长阅读python程序集,你可以使用dis模块来查看幕后是否真的发生了什么:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

似乎用……更好。语法比list(…)遗憾的是,map类对分解来说有点不透明,但我们可以通过速度测试来实现。

所以从Python 3开始,map()是一个迭代器,你需要记住你需要什么:一个迭代器或列表对象。

正如@AlexMartelli已经提到的,只有在不使用lambda函数的情况下,map()才比列表理解更快。

我会给你们看一些时间比较。

Python 3.5.2和CPythonI已经使用了Jupiter笔记本电脑,特别是%timeit内置的魔法命令 测量:s == 1000 ms == 1000 * 1000µs = 1000 * 1000 * 1000 ns

设置:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

内置函数:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda函数:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

还有类似生成器表达式的东西,参见PEP-0289。所以我认为把它添加到比较中是有用的

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

你需要列表对象:

如果是自定义函数,则使用列表推导式;如果是内置函数,则使用list(map())

你不需要列表对象,你只需要一个可迭代对象:

总是使用map()!

这里有一个可能的例子:

map(lambda op1,op2: op1*op2, list1, list2)

对比:

[op1*op2 for op1,op2 in zip(list1,list2)]

我猜,如果坚持使用列表推导式而不是映射,那么zip()是一种不幸的、不必要的开销。如果有人能肯定或否定地澄清这一点,那就太好了。

我的用例:

def sum_items(*args):
    return sum(args)


list_a = [1, 2, 3]
list_b = [1, 2, 3]

list_of_sums = list(map(sum_items,
                        list_a, list_b))
>>> [3, 6, 9]

comprehension = [sum(items) for items in iter(zip(list_a, list_b))]

我发现自己开始使用更多的map,我认为map可能比comp慢,因为传递和返回参数,这就是我找到这篇文章的原因。

我相信使用map可以更有可读性和灵活性,特别是当我需要构造列表的值时。

如果你用地图的话,你读的时候就明白了。

def pair_list_items(*args):
    return args

packed_list = list(map(pair_list_items,
                       lista, *listb, listc.....listn))

再加上灵活性奖励。 谢谢你其他的答案,再加上绩效奖金。