在python中联接列表的列表

是一个简短的语法连接列表列表到一个单一的列表(或迭代器)在python?

例如，我有一个列表，如下所示，我想迭代a,b和c。

x = [["a","b"], ["c"]]

我能想到的最好的是如下。

result = []
[ result.extend(el) for el in x] 

for el in result:
  print el

当前回答

I had a similar problem when I had to create a dictionary that contained the elements of an array and their count. The answer is relevant because, I flatten a list of lists, get the elements I need and then do a group and count. I used Python's map function to produce a tuple of element and it's count and groupby over the array. Note that the groupby takes the array element itself as the keyfunc. As a relatively new Python coder, I find it to me more easier to comprehend, while being Pythonic as well.

在我讨论代码之前，这里有一个我必须首先平化的数据示例:

{ "_id" : ObjectId("4fe3a90783157d765d000011"), "status" : [ "opencalais" ],
  "content_length" : 688, "open_calais_extract" : { "entities" : [
  {"type" :"Person","name" : "Iman Samdura","rel_score" : 0.223 }, 
  {"type" : "Company",  "name" : "Associated Press",    "rel_score" : 0.321 },          
  {"type" : "Country",  "name" : "Indonesia",   "rel_score" : 0.321 }, ... ]},
  "title" : "Indonesia Police Arrest Bali Bomb Planner", "time" : "06:42  ET",         
  "filename" : "021121bn.01", "month" : "November", "utctime" : 1037836800,
  "date" : "November 21, 2002", "news_type" : "bn", "day" : "21" }

是来自Mongo的查询结果。下面的代码将这样的列表集合平铺开来。

def flatten_list(items):
  return sorted([entity['name'] for entity in [entities for sublist in  
   [item['open_calais_extract']['entities'] for item in items] 
   for entities in sublist])

首先，我将提取所有的“实体”集合，然后对于每个实体集合，遍历字典并提取name属性。

2012-07-31 21:11:24

其他回答

如果你只深入一层，一个嵌套的理解也可以:

>>> x = [["a","b"], ["c"]]
>>> [inner
...     for outer in x
...         for inner in outer]
['a', 'b', 'c']

在一行上，它变成:

>>> [j for i in x for j in i]
['a', 'b', 'c']

2009-04-04 08:21:14

对于一级扁平化，如果你关心速度，在我尝试过的所有条件下，这比之前的任何答案都快。(也就是说，如果您需要结果作为列表。如果你只需要在运行中迭代它，那么链的例子可能更好。)它的工作原理是预先分配一个最终大小的列表，并按片复制部分(这是一种比任何迭代器方法都低级别的块复制):

def join(a):
    """Joins a sequence of sequences into a single sequence.  (One-level flattening.)
    E.g., join([(1,2,3), [4, 5], [6, (7, 8, 9), 10]]) = [1,2,3,4,5,6,(7,8,9),10]
    This is very efficient, especially when the subsequences are long.
    """
    n = sum([len(b) for b in a])
    l = [None]*n
    i = 0
    for b in a:
        j = i+len(b)
        l[i:j] = b
        i = j
    return l

带注释的排序时间列表:

[(0.5391559600830078, 'flatten4b'), # join() above. 
(0.5400412082672119, 'flatten4c'), # Same, with sum(len(b) for b in a) 
(0.5419249534606934, 'flatten4a'), # Similar, using zip() 
(0.7351131439208984, 'flatten1b'), # list(itertools.chain.from_iterable(a)) 
(0.7472689151763916, 'flatten1'), # list(itertools.chain(*a)) 
(1.5468521118164062, 'flatten3'), # [i for j in a for i in j] 
(26.696547985076904, 'flatten2')] # sum(a, [])

2012-01-29 02:51:41

这就是所谓的扁平化，有很多实现。

这个怎么样，尽管它只适用于1级深嵌套:

>>> x = [["a","b"], ["c"]]
>>> for el in sum(x, []):
...     print el
...
a
b
c

从这些链接中，显然最完整的-fast-elegant-etc实现如下:

def flatten(l, ltypes=(list, tuple)):
    ltype = type(l)
    l = list(l)
    i = 0
    while i < len(l):
        while isinstance(l[i], ltypes):
            if not l[i]:
                l.pop(i)
                i -= 1
                break
            else:
                l[i:i + 1] = l[i]
        i += 1
    return ltype(l)

2009-04-04 04:04:21

在我讨论代码之前，这里有一个我必须首先平化的数据示例:

{ "_id" : ObjectId("4fe3a90783157d765d000011"), "status" : [ "opencalais" ],
  "content_length" : 688, "open_calais_extract" : { "entities" : [
  {"type" :"Person","name" : "Iman Samdura","rel_score" : 0.223 }, 
  {"type" : "Company",  "name" : "Associated Press",    "rel_score" : 0.321 },          
  {"type" : "Country",  "name" : "Indonesia",   "rel_score" : 0.321 }, ... ]},
  "title" : "Indonesia Police Arrest Bali Bomb Planner", "time" : "06:42  ET",         
  "filename" : "021121bn.01", "month" : "November", "utctime" : 1037836800,
  "date" : "November 21, 2002", "news_type" : "bn", "day" : "21" }

是来自Mongo的查询结果。下面的代码将这样的列表集合平铺开来。

def flatten_list(items):
  return sorted([entity['name'] for entity in [entities for sublist in  
   [item['open_calais_extract']['entities'] for item in items] 
   for entities in sublist])

首先，我将提取所有的“实体”集合，然后对于每个实体集合，遍历字典并提取name属性。

2012-07-31 21:11:24

性能比较:

import itertools
import timeit
big_list = [[0]*1000 for i in range(1000)]
timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100)
timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100)
timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100)
timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100)
[100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)]

生产:

>>> import itertools
>>> import timeit
>>> big_list = [[0]*1000 for i in range(1000)]
>>> timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100)
[3.016212113769325, 3.0148865239060227, 3.0126415732791028]
>>> timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100)
[3.019953987082083, 3.528754223385439, 3.02181439266457]
>>> timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100)
[1.812084445152557, 1.7702404451095965, 1.7722977998725362]
>>> timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100)
[5.409658160700605, 5.477502077679354, 5.444318360412744]
>>> [100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)]
[399.27587954973444, 400.9240571138051, 403.7521153804846]

这是在Windows XP 32位的Python 2.7.1上，但上面评论中的@temoto得到from_iterable比map+extend更快，所以它相当依赖于平台和输入。

不要使用sum(big_list， [])

2014-01-10 00:48:17

在python中联接列表的列表

推荐文章

最新文章

标签