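I recently noticed that when I convert a list to a set, the order of the elements changes and they appear to be sorted by character.
Consider this example: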
x=[1,2,20,6,210]
print(x)
# [1, 2, 20, 6, 210]  # same as the initial order
set(x)
# set([1, 2, 20, 210, 6])  # the output of set(x) looks sorted
My questions are:
Why does this happen?
How can I perform set operations (especially set difference) without losing the initial order?
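It's interesting that people always use a "real-world problem" to poke fun at a definition from theoretical science.
If a set had an order, you would first need to settle the following questions.
If your list has duplicate elements, what should the order be when you turn it into a set? What is the order if we take the union of two sets? What is the order if we intersect two sets that hold the same elements in different orders?
Also, a set is much faster when searching for a particular key, which is very useful for set operations (and that is why you need a set rather than a list).
If you really care about the index, just keep the data as a list. If you still want to perform set operations on the elements of many lists, the simplest way is to build a dictionary for each list whose keys are the elements of the set and whose values are lists holding every index of that key in the original list.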
def indx_dic(l):
    # Map each element to the list of indices where it occurs in l.
    dic = {}
    for i in range(len(l)):
        if l[i] in dic:
            dic[l[i]].append(i)
        else:
            dic[l[i]] = [i]
    return dic
a = [1,2,3,4,5,1,3,2]
set_a = set(a)
dic_a = indx_dic(a)
print(dic_a)
# {1: [0, 5], 2: [1, 7], 3: [2, 6], 4: [3], 5: [4]}
print(set_a)
# {1, 2, 3, 4, 5}
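One way the index dictionary could be put to work for the original question is sketched below: take an ordinary set difference on the keys, then sort the survivors by the first index recorded for each key. The helper name `difference_by_first_index` is made up for this illustration, and the sketch assumes the elements are hashable.

```python
def indx_dic(items):
    """Map each element to the list of positions where it occurs."""
    dic = {}
    for i, item in enumerate(items):
        dic.setdefault(item, []).append(i)
    return dic


def difference_by_first_index(a, b):
    """Hypothetical helper: elements of a not in b, ordered by first appearance in a."""
    dic_a = indx_dic(a)
    remaining = set(dic_a) - set(b)  # ordinary, unordered set difference
    # Recover the original order by sorting on each key's first index in a.
    return sorted(remaining, key=lambda k: dic_a[k][0])


a = [1, 2, 20, 6, 210, 2]
b = [6, 20, 1]
result = difference_by_first_index(a, b)
print(result)  # [2, 210]
```

Note that the result contains each element once, because the dictionary keys are already deduplicated.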
A set is an unordered data structure, so it does not preserve the insertion order.
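A small sketch of what "unordered" means in practice: two sets built from the same elements compare equal no matter what order the elements arrived in, and duplicates collapse. The lists `x` and `y` are just sample data for this illustration.

```python
x = [1, 2, 20, 6, 210]
y = [210, 6, 20, 2, 1, 1]  # same elements, different order, one duplicate

# Equality only looks at membership, never at position.
print(set(x) == set(y))  # True

# The duplicate 1 in y is stored only once.
print(len(set(y)))  # 5

# Lists, by contrast, compare position by position.
print(x == y)  # False
```

Whatever order you see when printing a set is an implementation detail of the hash table and should not be relied on.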
This depends on your requirements. If you have a normal list and want to remove some set of elements while preserving the order of the list, you can do this with a list comprehension:
>>> a = [1, 2, 20, 6, 210]
>>> b = set([6, 20, 1])
>>> [x for x in a if x not in b]
[2, 210]
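If you need this in more than one place, the comprehension can be wrapped in a small function. This is only a sketch; the name `ordered_difference` and its parameters are my own and not part of any library.

```python
def ordered_difference(items, excluded):
    """Return the elements of items that are not in excluded, keeping list order."""
    excluded = set(excluded)  # convert once so each membership test is O(1) on average
    return [x for x in items if x not in excluded]


a = [1, 2, 20, 6, 210]
print(ordered_difference(a, [6, 20, 1]))      # [2, 210]
print(ordered_difference([3, 1, 3, 2], [1]))  # [3, 3, 2]
```

Unlike a real set difference, duplicates in the first list are kept, as the second call shows.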
If you need a data structure that supports both fast membership tests and preservation of insertion order, you can use the keys of a Python dictionary, which starting from Python 3.7 is guaranteed to preserve the insertion order:
>>> a = dict.fromkeys([1, 2, 20, 6, 210])
>>> b = dict.fromkeys([6, 20, 1])
>>> dict.fromkeys(x for x in a if x not in b)
{2: None, 210: None}
b doesn't really need to be ordered here – you could use a set as well. Note that a.keys() - b.keys() returns the set difference as a set, so it won't preserve the insertion order.
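The same idea extends to the other set operations. The following is a rough sketch, assuming Python 3.7 or later so that plain dicts keep insertion order; the variable names and the extra element 99 are only for illustration.

```python
a = dict.fromkeys([1, 2, 20, 6, 210])
b = dict.fromkeys([6, 20, 1, 99])

# Ordered union: keys of a first, then any keys of b not seen yet.
union = list(dict.fromkeys([*a, *b]))

# Ordered intersection: walk a in order, keep what also appears in b.
intersection = [x for x in a if x in b]

# The key-view operator returns a plain set, so the order is not preserved.
unordered = a.keys() - b.keys()

print(union)                     # [1, 2, 20, 6, 210, 99]
print(intersection)              # [1, 20, 6]
print(type(unordered).__name__)  # set
```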
In older versions of Python, you can use collections.OrderedDict instead:
>>> import collections
>>> a = collections.OrderedDict.fromkeys([1, 2, 20, 6, 210])
>>> b = collections.OrderedDict.fromkeys([6, 20, 1])
>>> collections.OrderedDict.fromkeys(x for x in a if x not in b)
OrderedDict([(2, None), (210, None)])