列表推导式与映射式

情况下

Common case: Almost always, you will want to use a list comprehension in python because it will be more obvious what you're doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.) It will even be more obvious what you're doing to python programmers, since list comprehensions are the de-facto standard in python for iteration; they are expected. Less-common case: However if you already have a function defined, it is often reasonable to use map, though it is considered 'unpythonic'. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)... but the readability of your code highly depends on your comments anyway. Almost never: You may want to use the map function as a pure abstract function while doing functional programming, where you're mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can't generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum), which is a one-liner that is very roughly equivalent to:

def sumEach(myLists):
    return [sum(_) for _ in myLists]

Just using a for-loop: You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) -- at least efficient in terms of memory (not necessarily in terms of time, where I'd expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).

“蟒蛇主义”

我不喜欢“pythonic”这个词，因为我觉得pythonic在我眼里并不总是优雅的。然而，map和filter以及类似的函数(比如非常有用的itertools模块)在风格上可能被认为是非python的。

懒惰

在效率方面，像大多数函数式编程结构一样，MAP可以是LAZY，实际上在python中是LAZY。这意味着你可以这样做(在python3中)，你的计算机将不会耗尽内存，丢失所有未保存的数据:

>>> map(str, range(10**100))
<map object at 0x2201d50>

试着用一个列表理解来做这个:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

请注意，列表推导式本身也是惰性的，但python选择将它们实现为非惰性的。尽管如此，python确实支持生成器表达式形式的惰性列表推导，如下所示:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

你基本上可以想到[…]将生成器表达式传递给列表构造函数，如list(x for x in range(5))。

简单的例子

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

列表推导式是非惰性的，因此可能需要更多内存(除非使用生成器推导式)。方括号[…]通常会让事情变得很明显，尤其是当括号乱七八糟的时候。另一方面，有时您最终会变得冗长，例如在....中键入[x for x只要保持迭代器变量短小，如果不缩进代码，列表推导式通常会更清晰。但是你总是可以缩进你的代码。

print(
    {x:x**2 for x in (-y for y in range(5))}
)

或者分手:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

python3的效率比较

地图现在是惰性的:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

Therefore if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. Usually this will usually outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right "in order", because python generator expressions can only be evaluated the order x[0], x[1], x[2], ....

然而，假设我们有一个预先创建的函数f，我们想要映射，我们忽略了map的惰性，立即强制用list(…)求值。我们得到了一些非常有趣的结果:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

In results are in the form AAA/BBB/CCC where A was performed with on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed with a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient that generator expressions (again assuming that all values are evaluated/used).

重要的是要意识到这些测试假设一个非常简单的函数(恒等函数);不过，这也没关系，因为如果函数很复杂，那么与程序中的其他因素相比，性能开销就可以忽略不计了。(用f= x:x+x等简单的东西进行测试可能仍然很有趣)

如果你擅长阅读python程序集，你可以使用dis模块来查看幕后是否真的发生了什么:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE

似乎用……更好。语法比list(…)遗憾的是，map类对分解来说有点不透明，但我们可以通过速度测试来实现。

2011-06-20 05:41:27

我发现列表推导式通常比映射式更能表达我想要做的事情——它们都能完成，但前者节省了试图理解复杂lambda表达式的精神负担。

在某个地方也有一个采访(我不能马上找到)，Guido列出lambdas和函数函数是他最后悔接受Python的东西，所以你可以认为它们是非Python的。

2009-08-07 23:59:05

我尝试了@alex-martelli的代码，但发现了一些差异

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop

python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

即使对于非常大的范围，Map也需要相同的时间，而从我的代码中可以明显看出，使用列表理解需要花费大量时间。所以除了被认为是“非python的”，我没有遇到任何与map使用有关的性能问题。

2019-12-16 04:35:00

性能测量

图片来源:Experfy

你可以自己看看在列表理解和映射函数之间哪个更好。

(与map函数相比，列表理解处理100万条记录所需的时间更少。)

2021-01-14 16:44:23

Python 2:你应该使用map和filter而不是列表推导式。

一个客观的原因是，即使它们不是“Pythonic”，你也应该喜欢它们: 它们需要函数/lambdas作为参数，这引入了一个新的作用域。

我不止一次被这个问题困扰过:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

但如果我说:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

那一切都会好起来的。

你可以说我在相同的作用域中使用相同的变量名是愚蠢的。

我不是。代码本来是好的——两个x不在同一个作用域内。直到我将内部块移动到代码的不同部分后，问题才出现(即:问题发生在维护期间，而不是开发期间)，而且我没有预料到。

是的，如果你从来没有犯过这个错误，那么列表推导式会更优雅。但从个人经验(以及看到其他人犯同样的错误)来看，我已经见过很多次这样的情况，所以我认为当这些错误渗透到代码中时，不值得你经历这种痛苦。

结论:

使用映射和过滤器。它们可以防止微妙的、难以诊断的范围相关错误。

注:

不要忘记考虑使用imap和filter(在itertools中)，如果它们适合你的情况!

2012-11-20 22:28:55

情况下

Common case: Almost always, you will want to use a list comprehension in python because it will be more obvious what you're doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.) It will even be more obvious what you're doing to python programmers, since list comprehensions are the de-facto standard in python for iteration; they are expected. Less-common case: However if you already have a function defined, it is often reasonable to use map, though it is considered 'unpythonic'. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)... but the readability of your code highly depends on your comments anyway. Almost never: You may want to use the map function as a pure abstract function while doing functional programming, where you're mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can't generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum), which is a one-liner that is very roughly equivalent to:

def sumEach(myLists):
    return [sum(_) for _ in myLists]

Just using a for-loop: You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) -- at least efficient in terms of memory (not necessarily in terms of time, where I'd expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).

“蟒蛇主义”

我不喜欢“pythonic”这个词，因为我觉得pythonic在我眼里并不总是优雅的。然而，map和filter以及类似的函数(比如非常有用的itertools模块)在风格上可能被认为是非python的。

懒惰

在效率方面，像大多数函数式编程结构一样，MAP可以是LAZY，实际上在python中是LAZY。这意味着你可以这样做(在python3中)，你的计算机将不会耗尽内存，丢失所有未保存的数据:

>>> map(str, range(10**100))
<map object at 0x2201d50>

试着用一个列表理解来做这个:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

请注意，列表推导式本身也是惰性的，但python选择将它们实现为非惰性的。尽管如此，python确实支持生成器表达式形式的惰性列表推导，如下所示:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

你基本上可以想到[…]将生成器表达式传递给列表构造函数，如list(x for x in range(5))。

简单的例子

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})

列表推导式是非惰性的，因此可能需要更多内存(除非使用生成器推导式)。方括号[…]通常会让事情变得很明显，尤其是当括号乱七八糟的时候。另一方面，有时您最终会变得冗长，例如在....中键入[x for x只要保持迭代器变量短小，如果不缩进代码，列表推导式通常会更清晰。但是你总是可以缩进你的代码。

print(
    {x:x**2 for x in (-y for y in range(5))}
)

或者分手:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

python3的效率比较

地图现在是惰性的:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

Therefore if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. Usually this will usually outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right "in order", because python generator expressions can only be evaluated the order x[0], x[1], x[2], ....

然而，假设我们有一个预先创建的函数f，我们想要映射，我们忽略了map的惰性，立即强制用list(…)求值。我们得到了一些非常有趣的结果:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

In results are in the form AAA/BBB/CCC where A was performed with on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed with a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient that generator expressions (again assuming that all values are evaluated/used).

重要的是要意识到这些测试假设一个非常简单的函数(恒等函数);不过，这也没关系，因为如果函数很复杂，那么与程序中的其他因素相比，性能开销就可以忽略不计了。(用f= x:x+x等简单的东西进行测试可能仍然很有趣)

如果你擅长阅读python程序集，你可以使用dis模块来查看幕后是否真的发生了什么:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE

似乎用……更好。语法比list(…)遗憾的是，map类对分解来说有点不透明，但我们可以通过速度测试来实现。

2011-06-20 05:41:27

列表推导式与映射式

推荐文章

最新文章

标签