2024-10-29 07:00:04

计算列表差值

在Python中,计算两个列表之间的差值的最佳方法是什么?

例子

A = [1,2,3,4]
B = [2,5]

A - B = [1,3,4]
B - A = [5]

当前回答

如果你不关心项目的顺序或重复,请使用set。使用列表推导式:

>>> def diff(first, second):
        second = set(second)
        return [item for item in first if item not in second]

>>> diff(A, B)
[1, 3, 4]
>>> diff(B, A)
[5]
>>> 

其他回答

一个衬套:

diff = lambda l1,l2: [x for x in l1 if x not in l2]
diff(A,B)
diff(B,A)

Or:

diff = lambda l1,l2: filter(lambda x: x not in l2, l1)
diff(A,B)
diff(B,A)

添加一个答案来处理我们想要严格区别重复的情况,也就是说,在第一个列表中有我们想要保留在结果中的重复。例如,得到,

[1, 1, 1, 2] - [1, 1] --> [1, 2]

我们可以用一个额外的计数器来得到一个优雅的差分函数。

from collections import Counter

def diff(first, second):
    secondCntr = Counter(second)
    second = set(second)
    res = []
    for i in first:
        if i not in second:
            res.append(i)
        elif i in secondCntr:
            if secondCntr[i] > 0:
                secondCntr[i] -= 1
            else:
                res.append(i)        
    return res

如果你不关心项目的顺序或重复,请使用set。使用列表推导式:

>>> def diff(first, second):
        second = set(second)
        return [item for item in first if item not in second]

>>> diff(A, B)
[1, 3, 4]
>>> diff(B, A)
[5]
>>> 
A = [1,2,3,4]
B = [2,5]

#A - B
x = list(set(A) - set(B))
#B - A 
y = list(set(B) - set(A))

print x
print y 

当查看in -operator的TimeComplexity时,在最坏的情况下它与O(n)一起工作。即使是集合。

因此,当比较两个数组时,最好情况下的TimeComplexity为O(n),最坏情况下为O(n²)。

另一种(但不幸的是更复杂)解决方案,在最好和最坏的情况下都适用于O(n):

# Compares the difference of list a and b
# uses a callback function to compare items
def diff(a, b, callback):
  a_missing_in_b = []
  ai = 0
  bi = 0

  a = sorted(a, callback)
  b = sorted(b, callback)

  while (ai < len(a)) and (bi < len(b)):

    cmp = callback(a[ai], b[bi])
    if cmp < 0:
      a_missing_in_b.append(a[ai])
      ai += 1
    elif cmp > 0:
      # Item b is missing in a
      bi += 1
    else:
      # a and b intersecting on this item
      ai += 1
      bi += 1

  # if a and b are not of same length, we need to add the remaining items
  for ai in xrange(ai, len(a)):
    a_missing_in_b.append(a[ai])


  return a_missing_in_b

e.g.

>>> a=[1,2,3]
>>> b=[2,4,6]
>>> diff(a, b, cmp)
[1, 3]