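I have a list of 2-item tuples and I would like to convert them into 2 lists, where the first contains the first item of each tuple and the second contains the second item.
For example:
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
# and I want it to become...
result = (['a', 'b', 'c', 'd'], [1, 2, 3, 4])
Is there a built-in function that does this?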
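Current answer
Because it returns tuples (and can use a lot of memory), the zip(*zipped) trick seems more clever than useful to me.
Here is a function that gives you the inverse of zip.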
def unzip(zipped):
    """Inverse of built-in zip function.
    Args:
        zipped: a list of tuples
    Returns:
        a tuple of lists
    Example:
        a = [1, 2, 3]
        b = [4, 5, 6]
        zipped = list(zip(a, b))
        assert zipped == [(1, 4), (2, 5), (3, 6)]
        unzipped = unzip(zipped)
        assert unzipped == ([1, 2, 3], [4, 5, 6])
    """
    unzipped = ()
    if len(zipped) == 0:
        return unzipped
    dim = len(zipped[0])
    for i in range(dim):
        unzipped = unzipped + ([tup[i] for tup in zipped], )
    return unzipped
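Note that unzip above calls len() and indexes into its argument, so it needs a real list and walks it once per column. If the input might be a generator or a zip object, a single-pass variant could look like the sketch below. The name unzip_single_pass is made up for illustration, and it assumes every row has the same length as the first one.

```python
def unzip_single_pass(rows):
    """Hypothetical single-pass variant of unzip for any iterable of rows."""
    columns = None
    for row in rows:
        if columns is None:
            # One empty list per position in the first row.
            columns = tuple([] for _ in row)
        for column, value in zip(columns, row):
            column.append(value)
    # Mirror unzip: an empty input gives an empty tuple.
    return columns if columns is not None else ()


from_zip = unzip_single_pass(zip([1, 2, 3], [4, 5, 6]))
from_generator = unzip_single_pass((n, n * n) for n in range(3))
```

Rows longer than the first one are silently truncated by the inner zip, so this is only a sketch for well-formed input.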
Other answers
To summarize:
# data
a = ('a', 'b', 'c', 'd')
b = (1, 2, 3, 4)
# forward
zipped = list(zip(a, b)) # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
# reverse
a_, b_ = zip(*zipped)
# verify
assert a == a_
assert b == b_
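One edge case worth keeping in mind: with an empty input, zip(*zipped) yields nothing, so unpacking into two names raises ValueError. A minimal sketch of a guard is below; split_pairs is a hypothetical helper name, and falling back to two empty lists is an assumption about what the caller wants.

```python
def split_pairs(pairs):
    """Hypothetical helper: return two lists, even when `pairs` is empty."""
    columns = tuple(map(list, zip(*pairs)))
    # zip(*[]) yields nothing, so fall back to two empty lists.
    return columns if columns else ([], [])


# Plain unpacking fails on empty input.
try:
    a_, b_ = zip(*[])
except ValueError as error:
    message = str(error)

letters, numbers = split_pairs([('a', 1), ('b', 2)])
no_letters, no_numbers = split_pairs([])
```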
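None of the previous answers efficiently provide the required output, which is a tuple of lists rather than a list of tuples. For the former, you can use tuple with map. Here is the difference: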
res1 = list(zip(*original)) # [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
res2 = tuple(map(list, zip(*original))) # (['a', 'b', 'c', 'd'], [1, 2, 3, 4])
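In addition, most of the previous solutions assume Python 2.7, where zip returns a list rather than an iterator.
For Python 3.x, you need to pass the result to a function such as list or tuple to exhaust the iterator. If you want a memory-efficient iterator instead, you can omit the outer list and tuple calls in the respective solutions.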
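To illustrate the Python 3 behavior, here is a small sketch showing that the object returned by zip can only be consumed once. Note also that the star unpacking reads all of original up front; only the production of the output tuples is lazy.

```python
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

lazy = zip(*original)      # a zip iterator in Python 3, not a list
first_pass = list(lazy)    # consumes the iterator
second_pass = list(lazy)   # nothing left to yield
```

So if the result is needed more than once, it should be materialized with list or tuple first.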
To get a tuple of lists, as in the question:
>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple([list(tup) for tup in zip(*original)])
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])
To unpack the two lists into separate variables:
list1, list2 = [list(tup) for tup in zip(*original)]
Naive approach
def transpose_finite_iterable(iterable):
    return zip(*iterable) # `itertools.izip` for Python 2 users
This works fine for a finite iterable (e.g. sequences like list/tuple/str) of (potentially infinite) iterables, which can be illustrated like
| |a_00| |a_10| ... |a_n0| |
| |a_01| |a_11| ... |a_n1| |
| |... | |... | ... |... | |
| |a_0i| |a_1i| ... |a_ni| |
| |... | |... | ... |... | |
where
n in ℕ, and a_ij corresponds to the j-th element of the i-th iterable,
and after applying transpose_finite_iterable we get
| |a_00| |a_01| ... |a_0i| ... |
| |a_10| |a_11| ... |a_1i| ... |
| |... | |... | ... |... | ... |
| |a_n0| |a_n1| ... |a_ni| ... |
Python example of such a case, where a_ij == j, n == 2
>>> from itertools import count
>>> iterable = [count(), count()]
>>> result = transpose_finite_iterable(iterable)
>>> next(result)
(0, 0)
>>> next(result)
(1, 1)
But we can't use transpose_finite_iterable again to return to the structure of the original iterable, because result is an infinite iterable of finite iterables (tuples in our case):
>>> transpose_finite_iterable(result)
... hangs ...
Traceback (most recent call last):
File "...", line 1, in ...
File "...", line 2, in transpose_finite_iterable
MemoryError
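So how can we deal with this case?
... and here comes the deque
If we take a look at the docs of the itertools.tee function, there is a Python recipe that, with some modification, can help in our case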
from collections import deque

def transpose_finite_iterables(iterable):
    iterator = iter(iterable)
    try:
        first_elements = next(iterator)
    except StopIteration:
        return ()
    # one queue per coordinate, seeded with the first element's values
    queues = [deque([element])
              for element in first_elements]
    def coordinate(queue):
        while True:
            if not queue:
                try:
                    elements = next(iterator)
                except StopIteration:
                    return
                # distribute the new element's values across all queues
                for sub_queue, element in zip(queues, elements):
                    sub_queue.append(element)
            yield queue.popleft()
    return tuple(map(coordinate, queues))
Let's check
>>> from itertools import count
>>> iterable = [count(), count()]
>>> result = transpose_finite_iterables(transpose_finite_iterable(iterable))
>>> result
(<generator object transpose_finite_iterables.<locals>.coordinate at ...>, <generator object transpose_finite_iterables.<locals>.coordinate at ...>)
>>> next(result[0])
0
>>> next(result[0])
1
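To make the buffering visible, here is a compact, self-contained sketch of the same deque idea on a finite input. The name lazy_unzip is made up for illustration and is not the function from this answer. It assumes all rows have the same length.

```python
from collections import deque
from itertools import count, islice


def lazy_unzip(rows):
    """Hypothetical compact variant: split an iterable of rows into lazy columns."""
    iterator = iter(rows)
    try:
        first = next(iterator)
    except StopIteration:
        return ()
    queues = [deque([value]) for value in first]

    def column(queue):
        while True:
            if not queue:
                try:
                    row = next(iterator)
                except StopIteration:
                    return
                # Every column gets its share of the row, so values the other
                # columns have not consumed yet wait in their queues.
                for other, value in zip(queues, row):
                    other.append(value)
            yield queue.popleft()

    return tuple(column(queue) for queue in queues)


letters, numbers = lazy_unzip(zip("abcd", count()))
first_numbers = list(islice(numbers, 3))  # pulls three rows from the source
first_letters = list(islice(letters, 2))  # served from the buffered values
remaining_letters = list(letters)         # drains the buffer, then the source
```

The trade-off is memory: if one column is consumed much faster than the others, the queues of the slower columns grow accordingly.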
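Synthesis
Now we can define a general function for working with iterables of iterables, one of which is finite and the other potentially infinite, using the functools.singledispatch decorator like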
from collections import (abc,
                         deque)
from functools import singledispatch

@singledispatch
def transpose(object_):
    """
    Transposes given object.
    """
    # report the type of the argument, not the `type` built-in itself
    raise TypeError('Unsupported object type: {type}.'
                    .format(type=type(object_)))

@transpose.register(abc.Iterable)
def transpose_finite_iterables(object_):
    """
    Transposes given iterable of finite iterables.
    """
    iterator = iter(object_)
    try:
        first_elements = next(iterator)
    except StopIteration:
        return ()
    queues = [deque([element])
              for element in first_elements]
    def coordinate(queue):
        while True:
            if not queue:
                try:
                    elements = next(iterator)
                except StopIteration:
                    return
                for sub_queue, element in zip(queues, elements):
                    sub_queue.append(element)
            yield queue.popleft()
    return tuple(map(coordinate, queues))

def transpose_finite_iterable(object_):
    """
    Transposes given finite iterable of iterables.
    """
    yield from zip(*object_)

try:
    transpose.register(abc.Collection, transpose_finite_iterable)
except AttributeError:
    # Python3.5-
    transpose.register(abc.Mapping, transpose_finite_iterable)
    transpose.register(abc.Sequence, transpose_finite_iterable)
    transpose.register(abc.Set, transpose_finite_iterable)
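which can be considered its own inverse (mathematicians call this kind of function an "involution") in the class of binary operators over finite non-empty iterables.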
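The involution remark can be checked with plain zip on finite data. This is a minimal sketch that assumes the rows are non-empty and all of the same length. With ragged rows zip truncates to the shortest one, so the round trip loses data.

```python
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
# Transposing twice gives back the original rows.
round_trip = list(zip(*zip(*original)))

# Rows of unequal length are truncated to the shortest row.
ragged = [(1, 2, 3), (4, 5)]
ragged_round_trip = list(zip(*zip(*ragged)))
```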
As a bonus of single dispatch, we can handle numpy arrays like
import numpy as np
...
transpose.register(np.ndarray, np.transpose)
and then use it like
>>> array = np.arange(4).reshape((2,2))
>>> array
array([[0, 1],
       [2, 3]])
>>> transpose(array)
array([[0, 2],
       [1, 3]])
Note
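Since transpose returns iterators, if someone wants a tuple of lists like in the OP, this can be done additionally with the map built-in function like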
>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple(map(list, transpose(original)))
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])
Advertisement
I've added a generalized solution to the lz package from version 0.5.0, which can be used like
>>> from lz.transposition import transpose
>>> list(map(tuple, transpose(zip(range(10), range(10, 20)))))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)]
P.S.
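There is no solution (at least no obvious one) for handling a potentially infinite iterable of potentially infinite iterables, but this case is less common.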