Hashing a dictionary?

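For caching purposes, I need to generate a cache key from GET arguments that are present in a dict.

Currently I'm using sha1(repr(sorted(my_dict.items()))) (sha1() is a convenience method that uses hashlib internally), but I'm curious whether there is a better way.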
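For reference, the approach described in the question can be sketched roughly as follows. The helper body and the name cache_key are assumptions for illustration; the question only says that sha1() wraps hashlib.

```python
import hashlib

def sha1(text):
    """Hypothetical convenience helper: hex SHA-1 digest of a text string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def cache_key(my_dict):
    # Sorting the items makes the key independent of insertion order.
    return sha1(repr(sorted(my_dict.items())))

key = cache_key({"page": "2", "q": "python"})
```

This works for flat dicts whose keys can be compared with each other and whose values have a stable repr().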

Current answer

You can use the maps library to do this. Specifically, maps.FrozenMap:

import maps
fm = maps.FrozenMap(my_dict)
hash(fm)

To install maps, just run:

pip install maps

It handles the nested dict case too:

import maps
fm = maps.FrozenMap.recurse(my_dict)
hash(fm)

Disclaimer: I am the author of the maps library.

2018-11-08 19:08:59

Other answers

To preserve key order, instead of hash(str(dictionary)) or hash(json.dumps(dictionary)) I would prefer a quick-and-dirty solution:

from pprint import pformat
h = hash(pformat(dictionary))

It works even for types such as DateTime that are not JSON serializable.
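One caveat for cache keys: in Python 3 the built-in hash() of a string is salted per process unless PYTHONHASHSEED is set, so hash(pformat(...)) can differ between runs. A minimal sketch that keeps the pformat idea but uses hashlib for a digest that is stable across processes (the name pformat_digest is hypothetical):

```python
import hashlib
from datetime import datetime
from pprint import pformat

def pformat_digest(dictionary):
    # pformat sorts dict keys by default, so insertion order does not matter.
    text = pformat(dictionary)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

a = {"when": datetime(2015, 1, 30), "user": "alice"}
b = {"user": "alice", "when": datetime(2015, 1, 30)}
digest_a = pformat_digest(a)
digest_b = pformat_digest(b)
```

The result still depends on each value having a deterministic repr().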

2015-01-30 00:45:17

One way to approach the problem is to make a tuple of the dictionary's items:

hash(tuple(my_dict.items()))
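Note that a tuple of items depends on insertion order, so two equal dictionaries can produce different tuples. A small sketch of an order-insensitive variant using frozenset, assuming all values are hashable:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 2, "a": 1}  # same content, different insertion order

# items() follows insertion order (Python 3.7+), so the tuples differ.
tuples_equal = tuple(d1.items()) == tuple(d2.items())

# A frozenset ignores order, so equal dicts hash equally.
h1 = hash(frozenset(d1.items()))
h2 = hash(frozenset(d2.items()))
```

Both variants raise TypeError if any value is itself unhashable, such as a nested dict or a list.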

2020-03-19 21:48:45

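Updating from a 2013 reply...

None of the above answers seem reliable to me. The reason is the use of items(). As far as I know, this comes out in a machine-dependent order. (Since Python 3.7, items() follows insertion order, which can still differ between two equal dictionaries.)

How about this instead?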

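import hashlib

def dict_hash(the_dict, *ignore):
    """Return a SHA-1 hex digest of the dict's sorted items, skipping keys in ignore."""
    if ignore:  # Sometimes you don't care about some items
        interesting = the_dict.copy()
        for item in ignore:
            interesting.pop(item, None)  # drop the key if it is present
        the_dict = interesting
    # hashlib needs bytes in Python 3, so encode the text before hashing
    text = '%s' % sorted(the_dict.items())
    return hashlib.sha1(text.encode('utf-8')).hexdigest()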

2013-03-04 18:10:36

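EDIT: If all your keys are strings, then before continuing to read this answer, please see Jack O'Connor's significantly simpler (and faster) solution (which also works for nested dictionaries).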
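The referenced answer is not reproduced here. As a hedged illustration of what a simpler approach for string-keyed dictionaries can look like, here is a sketch that serializes to JSON with sorted keys and hashes the bytes; the name json_dict_digest and the choice of SHA-256 are assumptions, not taken from that answer.

```python
import hashlib
import json

def json_dict_digest(d):
    # sort_keys=True gives a canonical key order at every nesting level.
    encoded = json.dumps(d, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

x = {"a": {"x": 1, "y": [1, 2]}, "b": "text"}
y = {"b": "text", "a": {"y": [1, 2], "x": 1}}
digest_x = json_dict_digest(x)
digest_y = json_dict_digest(y)
```

This only works when every value is JSON serializable, which is why the rest of this answer takes a different route.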

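Though an answer has been accepted, the title of the question is "Hashing a python dictionary", and the answer is incomplete as regards that title. (As regards the body of the question, the answer is complete.)

Nested Dictionaries

If you search Stack Overflow for how to hash a dictionary, you might stumble upon this aptly titled question and leave unsatisfied if you are trying to hash multiply nested dictionaries. The answer above won't work in that case, and you'll have to implement some sort of recursive mechanism to retrieve the hash.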
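To make the problem concrete, here is a minimal sketch (assuming Python 3.7+, where dicts keep insertion order) showing how flat approaches break down once a value is itself a dictionary:

```python
d1 = {"a": {"x": 1, "y": 2}}
d2 = {"a": {"y": 2, "x": 1}}  # equal to d1, inner keys inserted in another order

same_dict = d1 == d2

# Sorting only orders the outer items; the inner dict's repr keeps insertion order.
same_repr = repr(sorted(d1.items())) == repr(sorted(d2.items()))

# Hashing the items directly fails because the inner dict is unhashable.
try:
    hash(frozenset(d1.items()))
    hashable = True
except TypeError:
    hashable = False
```

The two dictionaries compare equal, yet their sorted reprs differ and their items cannot be hashed directly.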

Here is one such mechanism:

import copy

def make_hash(o):
  """
  Makes a hash from a dictionary, list, tuple or set to any level, that contains
  only other hashable types (including any lists, tuples, sets, and
  dictionaries).
  """
  # Sequences and sets: recurse into each element, return a tuple of results
  if isinstance(o, (set, tuple, list)):
    return tuple([make_hash(e) for e in o])
  # Anything that is not a dict is assumed to be hashable already
  elif not isinstance(o, dict):
    return hash(o)

  # Dicts: replace every value with its hash, then hash the items
  new_o = copy.deepcopy(o)
  for k, v in new_o.items():
    new_o[k] = make_hash(v)

  return hash(tuple(frozenset(sorted(new_o.items()))))

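Bonus: Hashing Objects and Classes

The hash() function works great when you hash classes or instances. However, here is one issue I found with hash() as regards objects: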

class Foo(object): pass
foo = Foo()
print (hash(foo)) # 1209812346789
foo.a = 1
print (hash(foo)) # 1209812346789

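The hash is the same, even after I've altered foo. This is because the identity of foo hasn't changed, so the hash is the same. If you want foo to hash differently depending on its current definition, the solution is to hash whatever is actually changing. In this case, that is the __dict__ attribute: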

class Foo(object): pass
foo = Foo()
print (make_hash(foo.__dict__)) # 1209812346789
foo.a = 1
print (make_hash(foo.__dict__)) # -78956430974785

Alas, when you attempt to do the same thing with the class itself:

print (make_hash(Foo.__dict__)) # TypeError: unhashable type: 'mappingproxy'

The class __dict__ attribute is not a normal dictionary:

print (type(Foo.__dict__)) # <class 'mappingproxy'>
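In Python 3 this read-only view type is exposed as types.MappingProxyType (Python 2 called it dictproxy). A short check, assuming Python 3, that also shows how to get an ordinary dict copy:

```python
import types

class Foo(object):
    pass

# The class namespace is wrapped in a read-only proxy, not a plain dict.
is_proxy = isinstance(Foo.__dict__, types.MappingProxyType)
is_dict = isinstance(Foo.__dict__, dict)

# dict() makes an ordinary, mutable copy of the proxy's contents.
plain = dict(Foo.__dict__)
```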

Here is a mechanism similar to the previous one that handles classes appropriately:

import copy

# Type of a class __dict__ (mappingproxy in Python 3)
DictProxyType = type(object.__dict__)

def make_hash(o):
  """
  Makes a hash from a dictionary, list, tuple or set to any level, that
  contains only other hashable types (including any lists, tuples, sets, and
  dictionaries). In the case where other kinds of objects (like classes) need
  to be hashed, pass in a collection of object attributes that are pertinent.
  For example, a class can be hashed in this fashion:

    make_hash([cls.__dict__, cls.__name__])

  A function can be hashed like so:

    make_hash([fn.__dict__, fn.__code__])
  """
  # Class __dict__: copy into a plain dict, skipping names that start with "__"
  if type(o) == DictProxyType:
    o2 = {}
    for k, v in o.items():
      if not k.startswith("__"):
        o2[k] = v
    o = o2

  # Sequences and sets: recurse into each element, return a tuple of results
  if isinstance(o, (set, tuple, list)):
    return tuple([make_hash(e) for e in o])
  # Anything that is not a dict is assumed to be hashable already
  elif not isinstance(o, dict):
    return hash(o)

  # Dicts: replace every value with its hash, then hash the items
  new_o = copy.deepcopy(o)
  for k, v in new_o.items():
    new_o[k] = make_hash(v)

  return hash(tuple(frozenset(sorted(new_o.items()))))

You can use this to return a tuple of hashes with however many elements you'd like:

# -7666086133114527897
print (make_hash(func.__code__))

# (-7666086133114527897, 3527539)
print (make_hash([func.__code__, func.__dict__]))

# (-7666086133114527897, 3527539, -509551383349783210)
print (make_hash([func.__code__, func.__dict__, func.__name__]))

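NOTE: all of the above code assumes Python 3.x. It was not tested in earlier versions, although I assume make_hash() will work in, say, 2.7.2. As for making the examples work there, I do know that

func.__code__

should be replaced with

func.func_code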

2012-01-03 15:05:37

Here is a clearer solution.

def freeze(o):
  if isinstance(o,dict):
    return frozenset({ k:freeze(v) for k,v in o.items()}.items())

  if isinstance(o,list):
    return tuple([freeze(v) for v in o])

  return o


def make_hash(o):
    """
    makes a hash out of anything that contains only list,dict and hashable types including string and numeric types
    """
    return hash(freeze(o))

2014-02-06 21:13:38
