冻结集就是冻结集。 冻结列表可以是元组。 冷冻字典会是什么?一个不可变的、可哈希的字典。
我猜它可能是collections.namedtuple之类的东西,但那更像是一个冻结的字典(一个半冻结的字典)。不是吗?
一个“frozendict”应该是一个冻结的字典,它应该有键,值,get等,并支持in, for等。
更新: 在这里:https://www.python.org/dev/peps/pep-0603
冻结集就是冻结集。 冻结列表可以是元组。 冷冻字典会是什么?一个不可变的、可哈希的字典。
我猜它可能是collections.namedtuple之类的东西,但那更像是一个冻结的字典(一个半冻结的字典)。不是吗?
一个“frozendict”应该是一个冻结的字典,它应该有键,值,get等,并支持in, for等。
更新: 在这里:https://www.python.org/dev/peps/pep-0603
假设字典的键和值本身是不可变的(例如字符串),那么:
>>> d
{'forever': 'atones', 'minks': 'cards', 'overhands': 'warranted',
'hardhearted': 'tartly', 'gradations': 'snorkeled'}
>>> t = tuple((k, d[k]) for k in sorted(d.keys()))
>>> hash(t)
1524953596
Python没有内置的frozendict类型。事实证明,这并不经常有用(尽管它仍然可能比frozenset更有用)。
需要这种类型的最常见原因是在记忆函数调用带有未知参数的函数时。存储dict(其中值是可哈希的)的可哈希等价对象的最常见解决方案是类似tuple(sorted(kwargs.items()))的东西。
这取决于排序是不是有点疯狂。Python不能肯定地保证排序会产生合理的结果。(但它不能承诺太多其他东西,所以不要太担心。)
你可以很容易地做一些类似字典的包装。它可能看起来像
import collections
class FrozenDict(collections.Mapping):
"""Don't forget the docstrings!!"""
def __init__(self, *args, **kwargs):
self._d = dict(*args, **kwargs)
self._hash = None
def __iter__(self):
return iter(self._d)
def __len__(self):
return len(self._d)
def __getitem__(self, key):
return self._d[key]
def __hash__(self):
# It would have been simpler and maybe more obvious to
# use hash(tuple(sorted(self._d.iteritems()))) from this discussion
# so far, but this solution is O(n). I don't know what kind of
# n we are going to run into, but sometimes it's hard to resist the
# urge to optimize when it will gain improved algorithmic performance.
if self._hash is None:
hash_ = 0
for pair in self.items():
hash_ ^= hash(pair)
self._hash = hash_
return self._hash
它应该工作得很好:
>>> x = FrozenDict(a=1, b=2)
>>> y = FrozenDict(a=1, b=2)
>>> x is y
False
>>> x == y
True
>>> x == {'a': 1, 'b': 2}
True
>>> d = {x: 'foo'}
>>> d[y]
'foo'
每次写这样的函数时,我都会想到frozendict:
def do_something(blah, optional_dict_parm=None):
if optional_dict_parm is None:
optional_dict_parm = {}
这是我一直在用的代码。我子类化了frozenset。这样做的优点如下。
这是一个真正的不可变对象。不依赖未来用户和开发人员的良好行为。 在常规字典和固定字典之间来回转换很容易。frozdict (orig_dict)—>冻结字典。字典(frozen_dict)—>常规字典。
2015年1月21日更新:我在2014年发布的原始代码使用for循环来查找匹配的键。这实在是太慢了。现在我已经组合了一个实现,它利用了frozenset的散列特性。键-值对存储在特殊的容器中,其中__hash__和__eq__函数仅基于键。这段代码也经过了正式的单元测试,这与我2014年8月在这里发布的代码不同。
MIT-style许可证。
if 3 / 2 == 1:
version = 2
elif 3 / 2 == 1.5:
version = 3
def col(i):
''' For binding named attributes to spots inside subclasses of tuple.'''
g = tuple.__getitem__
@property
def _col(self):
return g(self,i)
return _col
class Item(tuple):
''' Designed for storing key-value pairs inside
a FrozenDict, which itself is a subclass of frozenset.
The __hash__ is overloaded to return the hash of only the key.
__eq__ is overloaded so that normally it only checks whether the Item's
key is equal to the other object, HOWEVER, if the other object itself
is an instance of Item, it checks BOTH the key and value for equality.
WARNING: Do not use this class for any purpose other than to contain
key value pairs inside FrozenDict!!!!
The __eq__ operator is overloaded in such a way that it violates a
fundamental property of mathematics. That property, which says that
a == b and b == c implies a == c, does not hold for this object.
Here's a demonstration:
[in] >>> x = Item(('a',4))
[in] >>> y = Item(('a',5))
[in] >>> hash('a')
[out] >>> 194817700
[in] >>> hash(x)
[out] >>> 194817700
[in] >>> hash(y)
[out] >>> 194817700
[in] >>> 'a' == x
[out] >>> True
[in] >>> 'a' == y
[out] >>> True
[in] >>> x == y
[out] >>> False
'''
__slots__ = ()
key, value = col(0), col(1)
def __hash__(self):
return hash(self.key)
def __eq__(self, other):
if isinstance(other, Item):
return tuple.__eq__(self, other)
return self.key == other
def __ne__(self, other):
return not self.__eq__(other)
def __str__(self):
return '%r: %r' % self
def __repr__(self):
return 'Item((%r, %r))' % self
class FrozenDict(frozenset):
''' Behaves in most ways like a regular dictionary, except that it's immutable.
It differs from other implementations because it doesn't subclass "dict".
Instead it subclasses "frozenset" which guarantees immutability.
FrozenDict instances are created with the same arguments used to initialize
regular dictionaries, and has all the same methods.
[in] >>> f = FrozenDict(x=3,y=4,z=5)
[in] >>> f['x']
[out] >>> 3
[in] >>> f['a'] = 0
[out] >>> TypeError: 'FrozenDict' object does not support item assignment
FrozenDict can accept un-hashable values, but FrozenDict is only hashable if its values are hashable.
[in] >>> f = FrozenDict(x=3,y=4,z=5)
[in] >>> hash(f)
[out] >>> 646626455
[in] >>> g = FrozenDict(x=3,y=4,z=[])
[in] >>> hash(g)
[out] >>> TypeError: unhashable type: 'list'
FrozenDict interacts with dictionary objects as though it were a dict itself.
[in] >>> original = dict(x=3,y=4,z=5)
[in] >>> frozen = FrozenDict(x=3,y=4,z=5)
[in] >>> original == frozen
[out] >>> True
FrozenDict supports bi-directional conversions with regular dictionaries.
[in] >>> original = {'x': 3, 'y': 4, 'z': 5}
[in] >>> FrozenDict(original)
[out] >>> FrozenDict({'x': 3, 'y': 4, 'z': 5})
[in] >>> dict(FrozenDict(original))
[out] >>> {'x': 3, 'y': 4, 'z': 5} '''
__slots__ = ()
def __new__(cls, orig={}, **kw):
if kw:
d = dict(orig, **kw)
items = map(Item, d.items())
else:
try:
items = map(Item, orig.items())
except AttributeError:
items = map(Item, orig)
return frozenset.__new__(cls, items)
def __repr__(self):
cls = self.__class__.__name__
items = frozenset.__iter__(self)
_repr = ', '.join(map(str,items))
return '%s({%s})' % (cls, _repr)
def __getitem__(self, key):
if key not in self:
raise KeyError(key)
diff = self.difference
item = diff(diff({key}))
key, value = set(item).pop()
return value
def get(self, key, default=None):
if key not in self:
return default
return self[key]
def __iter__(self):
items = frozenset.__iter__(self)
return map(lambda i: i.key, items)
def keys(self):
items = frozenset.__iter__(self)
return map(lambda i: i.key, items)
def values(self):
items = frozenset.__iter__(self)
return map(lambda i: i.value, items)
def items(self):
items = frozenset.__iter__(self)
return map(tuple, items)
def copy(self):
cls = self.__class__
items = frozenset.copy(self)
dupl = frozenset.__new__(cls, items)
return dupl
@classmethod
def fromkeys(cls, keys, value):
d = dict.fromkeys(keys,value)
return cls(d)
def __hash__(self):
kv = tuple.__hash__
items = frozenset.__iter__(self)
return hash(frozenset(map(kv, items)))
def __eq__(self, other):
if not isinstance(other, FrozenDict):
try:
other = FrozenDict(other)
except Exception:
return False
return frozenset.__eq__(self, other)
def __ne__(self, other):
return not self.__eq__(other)
if version == 2:
#Here are the Python2 modifications
class Python2(FrozenDict):
def __iter__(self):
items = frozenset.__iter__(self)
for i in items:
yield i.key
def iterkeys(self):
items = frozenset.__iter__(self)
for i in items:
yield i.key
def itervalues(self):
items = frozenset.__iter__(self)
for i in items:
yield i.value
def iteritems(self):
items = frozenset.__iter__(self)
for i in items:
yield (i.key, i.value)
def has_key(self, key):
return key in self
def viewkeys(self):
return dict(self).viewkeys()
def viewvalues(self):
return dict(self).viewvalues()
def viewitems(self):
return dict(self).viewitems()
#If this is Python2, rebuild the class
#from scratch rather than use a subclass
py3 = FrozenDict.__dict__
py3 = {k: py3[k] for k in py3}
py2 = {}
py2.update(py3)
dct = Python2.__dict__
py2.update({k: dct[k] for k in dct})
FrozenDict = type('FrozenDict', (frozenset,), py2)
是的,这是我的第二个答案,但这是一个完全不同的方法。第一个实现是用纯python实现的。这是Cython的。如果您知道如何使用和编译Cython模块,这与使用常规字典一样快。大约0.04到0.06微秒来检索一个值。
这是frozen_dict。pyx文件
import cython
from collections import Mapping
cdef class dict_wrapper:
cdef object d
cdef int h
def __init__(self, *args, **kw):
self.d = dict(*args, **kw)
self.h = -1
def __len__(self):
return len(self.d)
def __iter__(self):
return iter(self.d)
def __getitem__(self, key):
return self.d[key]
def __hash__(self):
if self.h == -1:
self.h = hash(frozenset(self.d.iteritems()))
return self.h
class FrozenDict(dict_wrapper, Mapping):
def __repr__(self):
c = type(self).__name__
r = ', '.join('%r: %r' % (k,self[k]) for k in self)
return '%s({%s})' % (c, r)
__all__ = ['FrozenDict']
这是setup。py文件
from distutils.core import setup
from Cython.Build import cythonize
setup(
ext_modules = cythonize('frozen_dict.pyx')
)
如果安装了Cython,请将上述两个文件保存到同一目录中。在命令行中移动到该目录。
python setup.py build_ext --inplace
python setup.py install
你应该做完了。
奇怪的是,尽管我们有很少有用的frozenset,但仍然没有冻结映射。这个想法在PEP 416中被否决了——添加一个frozendict内置类型。这个想法可能会在后面的Python版本中重新讨论,请参阅PEP 603—向集合添加frozenmap类型。
所以Python 2的解决方案是:
def foo(config={'a': 1}):
...
这似乎仍是常态:
def foo(config=None):
if config is None:
config = {'a': 1} # default config
...
在Python 3中,你可以这样做:
from types import MappingProxyType
default_config = {'a': 1}
DEFAULTS = MappingProxyType(default_config)
def foo(config=DEFAULTS):
...
现在,默认配置可以动态更新,但在您希望它不可变的地方保持不变,只需传递代理即可。
因此,default_config中的更改将按预期更新默认值,但不能写入映射代理对象本身。
诚然,它与“不可变的、可哈希的字典”不是一回事,但它可能是frozendict某些用例的体面替代品。
namedtuple的主要缺点是需要在使用之前指定它,因此对于单一用例来说不太方便。
然而,有一种实用的变通方法可以用来处理许多此类情况。让我们假设你想有一个不可变的等价物如下dict:
MY_CONSTANT = {
'something': 123,
'something_else': 456
}
可以这样模拟:
from collections import namedtuple
MY_CONSTANT = namedtuple('MyConstant', 'something something_else')(123, 456)
甚至还可以编写一个辅助函数来实现自动化:
def freeze_dict(data):
from collections import namedtuple
keys = sorted(data.keys())
frozen_type = namedtuple(''.join(keys), keys)
return frozen_type(**data)
a = {'foo':'bar', 'x':'y'}
fa = freeze_dict(data)
assert a['foo'] == fa.foo
当然,这只适用于平面字典,但实现递归版本应该不会太难。
你可以使用utispie包装的冷冻液:
>>> from utilspie.collectionsutils import frozendict
>>> my_dict = frozendict({1: 3, 4: 5})
>>> my_dict # object of `frozendict` type
frozendict({1: 3, 4: 5})
# Hashable
>>> {my_dict: 4}
{frozendict({1: 3, 4: 5}): 4}
# Immutable
>>> my_dict[1] = 5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mquadri/workspace/utilspie/utilspie/collectionsutils/collections_utils.py", line 44, in __setitem__
self.__setitem__.__name__, type(self).__name__))
AttributeError: You can not call '__setitem__()' for 'frozendict' object
根据文件:
frozendict(dict_obj):接受dict类型的obj并返回一个可哈希且不可变的dict
子类化dict类型
我在野外(github)看到了这种模式,想提一下:
class FrozenDict(dict):
def __init__(self, *args, **kwargs):
self._hash = None
super(FrozenDict, self).__init__(*args, **kwargs)
def __hash__(self):
if self._hash is None:
self._hash = hash(tuple(sorted(self.items()))) # iteritems() on py2
return self._hash
def _immutable(self, *args, **kws):
raise TypeError('cannot change object - object is immutable')
# makes (deep)copy alot more efficient
def __copy__(self):
return self
def __deepcopy__(self, memo=None):
if memo is not None:
memo[id(self)] = self
return self
__setitem__ = _immutable
__delitem__ = _immutable
pop = _immutable
popitem = _immutable
clear = _immutable
update = _immutable
setdefault = _immutable
使用示例:
d1 = FrozenDict({'a': 1, 'b': 2})
d2 = FrozenDict({'a': 1, 'b': 2})
d1.keys()
assert isinstance(d1, dict)
assert len(set([d1, d2])) == 1 # hashable
Pros
支持get()、keys()、items() (py2上的iteritems())以及dict中的所有功能,而无需显式地实现它们 使用内部dict,这意味着性能(dict在CPython中是用c编写的) 优雅简单,没有黑魔法 isinstance(my_frozen_dict, dict)返回True -尽管python鼓励使用isinstance()对许多包进行duck-typing,但这可以节省许多调整和自定义
Cons
任何子类都可以覆盖它或在内部访问它(你不能真正100%保护python中的某些东西,你应该相信你的用户并提供良好的文档)。 如果你关心速度,你可能想让__hash__更快一点。
在没有本地语言支持的情况下,您可以自己动手,也可以使用现有的解决方案。幸运的是,Python使得扩展它们的基本实现变得非常简单。
class frozen_dict(dict):
def __setitem__(self, key, value):
raise Exception('Frozen dictionaries cannot be mutated')
frozen_dict = frozen_dict({'foo': 'FOO' })
print(frozen['foo']) # FOO
frozen['foo'] = 'NEWFOO' # Exception: Frozen dictionaries cannot be mutated
# OR
from types import MappingProxyType
frozen_dict = MappingProxyType({'foo': 'FOO'})
print(frozen_dict['foo']) # FOO
frozen_dict['foo'] = 'NEWFOO' # TypeError: 'mappingproxy' object does not support item assignment
安装frozendict
pip install frozendict
使用它!
from frozendict import frozendict
def smth(param = frozendict({})):
pass
没有fronzedict,但你可以使用MappingProxyType,它被添加到Python 3.3的标准库中:
>>> from types import MappingProxyType
>>> foo = MappingProxyType({'a': 1})
>>> foo
mappingproxy({'a': 1})
>>> foo['a'] = 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'mappingproxy' object does not support item assignment
>>> foo
mappingproxy({'a': 1})
我需要在某一时刻访问某个固定键用于某种全局常量类型的东西我确定了这样的东西:
class MyFrozenDict:
def __getitem__(self, key):
if key == 'mykey1':
return 0
if key == 'mykey2':
return "another value"
raise KeyError(key)
像这样使用它
a = MyFrozenDict()
print(a['mykey1'])
警告:对于大多数用例,我不建议这样做,因为它需要进行一些相当严重的权衡。
Freeze实现了可哈希的、类型提示的冻结集合(dict、list和set),并将递归地冻结你给他们的数据(如果可能的话)。
pip install frz
用法:
from freeze import FDict
a_mutable_dict = {
"list": [1, 2],
"set": {3, 4},
}
a_frozen_dict = FDict(a_mutable_dict)
print(repr(a_frozen_dict))
# FDict: {'list': FList: (1, 2), 'set': FSet: {3, 4}}