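In .NET, the GetHashCode method is used in many places throughout the .NET base class libraries. Implementing it properly is especially important for finding items quickly in a collection or when determining equality.
Is there a standard algorithm or best practice for implementing GetHashCode for my custom classes so that I don't degrade performance?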
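Current answer
If we have no more than 8 properties (hopefully), here is another alternative.
ValueTuple is a struct and appears to have a solid GetHashCode implementation.
That means we could simply do this: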
// Yay, no allocations and no custom implementations!
public override int GetHashCode() => (this.PropA, this.PropB).GetHashCode();
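As a minimal sketch of how this could fit into a complete type, assume a hypothetical LabeledPoint class (the type and member names are illustrative only, not from the answer). Equals compares the same tuple that GetHashCode hashes, so the two methods stay consistent:

```csharp
using System;

// Hypothetical example type; the names are illustrative only.
public sealed class LabeledPoint : IEquatable<LabeledPoint>
{
    public int X { get; }
    public int Y { get; }
    public string Label { get; }

    public LabeledPoint(int x, int y, string label)
    {
        X = x;
        Y = y;
        Label = label;
    }

    // Compare exactly the members that feed GetHashCode.
    public bool Equals(LabeledPoint other) =>
        !(other is null) && (X, Y, Label).Equals((other.X, other.Y, other.Label));

    public override bool Equals(object obj) => Equals(obj as LabeledPoint);

    // Delegates to ValueTuple<int, int, string>.GetHashCode(); no heap allocation.
    public override int GetHashCode() => (X, Y, Label).GetHashCode();
}
```

Two instances built from the same values should then compare equal and return the same hash code within one process.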
Let's take a look at .NET Core's current implementation of ValueTuple's GetHashCode.
This is from ValueTuple:
internal static int CombineHashCodes(int h1, int h2)
{
    return HashHelpers.Combine(HashHelpers.Combine(HashHelpers.RandomSeed, h1), h2);
}

internal static int CombineHashCodes(int h1, int h2, int h3)
{
    return HashHelpers.Combine(CombineHashCodes(h1, h2), h3);
}
And this is from HashHelpers:
public static readonly int RandomSeed = Guid.NewGuid().GetHashCode();

public static int Combine(int h1, int h2)
{
    unchecked
    {
        // RyuJIT optimizes this to use the ROL instruction
        // Related GitHub pull request: dotnet/coreclr#1830
        uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27);
        return ((int)rol5 + h1) ^ h2;
    }
}
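In English:
Rotate h1 left (circular shift) by 5 positions. Add the result and h1 together. XOR the result with h2. Start by performing this operation on { static random seed, h1 }. For each further item, perform the operation on the previous result and the next item (e.g. h2).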
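The steps above can be sketched as a small standalone helper. This is only an illustration under stated assumptions: RolFiveHash, Seed and CombineAll are hypothetical names, not framework APIs, and the params array in CombineAll allocates, which the framework's fixed-arity overloads avoid.

```csharp
using System;

// A local copy of the ROL-5 combining step, for illustration only.
// RolFiveHash and its members are hypothetical names, not framework APIs.
public static class RolFiveHash
{
    // Fixed for the lifetime of the process, different between runs,
    // mirroring the RandomSeed field quoted above.
    private static readonly int Seed = Guid.NewGuid().GetHashCode();

    public static int Combine(int h1, int h2)
    {
        unchecked
        {
            uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27); // rotate h1 left by 5 bits
            return ((int)rol5 + h1) ^ h2;                    // add h1, then XOR in h2
        }
    }

    // Folds any number of hash codes, starting from the seed.
    public static int CombineAll(params int[] hashes)
    {
        int result = Seed;
        foreach (int h in hashes)
        {
            result = Combine(result, h);
        }
        return result;
    }
}
```

For example, Combine(1, 2) rotates 1 to 32, adds 1 to get 33, and XORs with 2 to give 35, while Combine(2, 1) gives 67, so the operation is sensitive to order.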
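It would be nice to know more about the properties of this ROL-5 hash code algorithm.
Regrettably, deferring to ValueTuple for our own GetHashCode may not be as fast as we would like. A comment in a related discussion illustrates that calling HashHelpers.Combine directly is more performant. On the other hand, that method is internal, so we would have to copy the code, sacrificing much of what we gained here. We would also be responsible for remembering to combine with the random seed first. I don't know what the consequences are if we skip that step.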
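Other answers
I usually go with something like the implementation given in Josh Bloch's excellent Effective Java. It is fast and creates a pretty good hash that is unlikely to cause collisions. Pick two different prime numbers, e.g. 17 and 23, and do the following: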
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        // Suitable nullity checks etc, of course :)
        hash = hash * 23 + field1.GetHashCode();
        hash = hash * 23 + field2.GetHashCode();
        hash = hash * 23 + field3.GetHashCode();
        return hash;
    }
}
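As noted in the comments, you may find it is better to pick a large prime to multiply by instead. Apparently 486187639 is good... and although most examples I have seen with small numbers tend to use primes, there are at least some similar algorithms where non-prime numbers are often used. In the not-quite-FNV example later, for instance, I have used numbers that apparently work well, but the initial value is not a prime. (The multiplication constant is prime, though. I don't know quite how important that is.)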
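The "suitable nullity checks" mentioned in the code comment could look roughly like this. It is only a sketch: Person, Name and Age are hypothetical names, and treating a null field as hash 0 is one common convention rather than a requirement.

```csharp
using System;

// Hypothetical example type; the names are illustrative only.
public sealed class Person : IEquatable<Person>
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }

    public bool Equals(Person other) =>
        !(other is null) && string.Equals(Name, other.Name) && Age == other.Age;

    public override bool Equals(object obj) => Equals(obj as Person);

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            // A null reference contributes 0 instead of throwing.
            hash = hash * 23 + (Name?.GetHashCode() ?? 0);
            hash = hash * 23 + Age.GetHashCode();
            return hash;
        }
    }
}
```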
This is better than the common practice of XORing hash codes, for two main reasons. Suppose we have a type with two int fields:
XorHash(x, x) == XorHash(y, y) == 0 for all x, y
XorHash(x, y) == XorHash(y, x) for all x, y
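Both properties are easy to check with a small sketch. HashComparison and its method names are hypothetical, and the two int arguments stand in for the field hash codes (an int's hash code is its own value):

```csharp
// Hypothetical helper for comparing the two combining strategies.
public static class HashComparison
{
    // Plain XOR of the two field hashes.
    public static int XorHash(int a, int b) => a ^ b;

    // The multiply-and-add scheme with 17 and 23 from the code above.
    public static int MultiplyAddHash(int a, int b)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + a;
            hash = hash * 23 + b;
            return hash;
        }
    }
}
```

XorHash(5, 5) and XorHash(9, 9) are both 0, and XorHash(1, 2) equals XorHash(2, 1), whereas MultiplyAddHash gives 9113, 9209, 9018 and 9040 for the same four inputs. The multiply-and-add scheme is not collision-free either; it simply avoids these two systematic patterns.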
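By the way, the earlier algorithm is the one currently used by the C# compiler for anonymous types.
This page gives quite a few options. I think for most cases the above is "good enough", and it is very easy to remember and get right. The FNV alternative is similarly simple, but uses different constants and XOR instead of ADD as the combining operation. It looks something like the code below, but the normal FNV algorithm operates on individual bytes, so this would need to be modified to perform one iteration per byte instead of one per 32-bit hash value. FNV is also designed for variable-length data, whereas here we always use it for the same number of field values. Comments on this answer suggest that the code here does not actually work as well (in the sample case tested) as the addition approach above.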
// Note: Not quite FNV!
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = (int) 2166136261;
        // Suitable nullity checks etc, of course :)
        hash = (hash * 16777619) ^ field1.GetHashCode();
        hash = (hash * 16777619) ^ field2.GetHashCode();
        hash = (hash * 16777619) ^ field3.GetHashCode();
        return hash;
    }
}
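For comparison, a byte-at-a-time variant might look like the sketch below. It uses the same two constants and the same multiply-then-XOR order as the code above (the FNV-1 order), but performs one iteration per byte. The Fnv1 class and its method names are hypothetical, and feeding field hashes through it least significant byte first is an assumption made for illustration.

```csharp
// Hypothetical helper; a sketch of byte-wise 32-bit FNV-1.
public static class Fnv1
{
    private const uint OffsetBasis = 2166136261;
    private const uint Prime = 16777619;

    // Hashes raw bytes: multiply by the prime, then XOR in each byte.
    public static uint Hash(byte[] data)
    {
        unchecked
        {
            uint hash = OffsetBasis;
            foreach (byte b in data)
            {
                hash = (hash * Prime) ^ b;
            }
            return hash;
        }
    }

    // Feeds the four bytes of each 32-bit field hash through the same loop,
    // least significant byte first.
    public static int Combine(params int[] fieldHashes)
    {
        unchecked
        {
            uint hash = OffsetBasis;
            foreach (int h in fieldHashes)
            {
                uint value = (uint)h;
                for (int i = 0; i < 4; i++)
                {
                    hash = (hash * Prime) ^ (value & 0xFF);
                    value >>= 8;
                }
            }
            return (int)hash;
        }
    }
}
```

As a sanity check, hashing the single byte 0x61 (ASCII "a") yields 0x050C5D7E. Whether the extra iterations are worth it for a fixed handful of fields is a separate question that this sketch does not answer.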
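One thing to be aware of: ideally, you should prevent your equality-sensitive (and thus hash-code-sensitive) state from changing after adding the object to a collection that depends on the hash code.
As per the documentation:
You can override GetHashCode for immutable reference types. In general, for mutable reference types, you should override GetHashCode only if: you can compute the hash code from fields that are not mutable; or you can ensure that the hash code of a mutable object does not change while the object is contained in a collection that relies on its hash code.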
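A short sketch of what goes wrong otherwise, using hypothetical MutableKey and MutationDemo types: once the key's hash-relevant state changes, the set can no longer find the very object it contains.

```csharp
using System.Collections.Generic;

// Hypothetical type whose hash code depends on mutable state.
public sealed class MutableKey
{
    public int Value { get; set; }

    public override bool Equals(object obj) =>
        obj is MutableKey other && other.Value == Value;

    public override int GetHashCode() => Value;
}

public static class MutationDemo
{
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Value = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Value = 2;            // changes the hash code while the key is in the set
        return set.Contains(key); // the set still has the key filed under its old hash code
    }
}
```

FoundAfterMutation() returns false even though the object is still in the set, which is the failure mode the documentation warns about.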
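The link to the FNV article is broken, but here is a copy at the Internet Archive: Eternally Confuzzled - The Art of Hashing
Here is my helper class using Jon Skeet's implementation.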
public static class HashCode
{
    public const int Start = 17;

    public static int Hash<T>(this int hash, T obj)
    {
        var h = EqualityComparer<T>.Default.GetHashCode(obj);
        return unchecked((hash * 31) + h);
    }
}
Usage:
public override int GetHashCode()
{
    return HashCode.Start
        .Hash(_field1)
        .Hash(_field2)
        .Hash(_field3);
}
If you want to avoid writing an extension method for System.Int32:
public readonly struct HashCode
{
    private readonly int _value;

    public HashCode(int value) => _value = value;

    public static HashCode Start { get; } = new HashCode(17);

    public static implicit operator int(HashCode hash) => hash._value;

    public HashCode Hash<T>(T obj)
    {
        var h = EqualityComparer<T>.Default.GetHashCode(obj);
        return unchecked(new HashCode((_value * 31) + h));
    }

    public override int GetHashCode() => _value;
}
It still avoids any heap allocation and is used in exactly the same way:
public override int GetHashCode()
{
    // This time `HashCode.Start` is not an `Int32`, it's a `HashCode` instance.
    // And the result is implicitly converted to `Int32`.
    return HashCode.Start
        .Hash(_field1)
        .Hash(_field2)
        .Hash(_field3);
}
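Edit (May 2018): the EqualityComparer<T>.Default getter is now a JIT intrinsic; the pull request is mentioned by Stephen Toub in this blog post.
This is a static helper class that implements Josh Bloch's approach. It provides explicit overloads to "prevent" boxing, and also implements the hash specifically for long primitives.
You can pass a string comparison that matches your Equals implementation.
Because the Hash output is always an int, you can simply chain Hash calls.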
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;
namespace Sc.Util.System
{
/// <summary>
/// Static methods that allow easy implementation of hashCode. Example usage:
/// <code>
/// public override int GetHashCode()
/// => HashCodeHelper.Seed
/// .Hash(primitiveField)
/// .Hash(objectField)
/// .Hash(iEnumerableField);
/// </code>
/// </summary>
public static class HashCodeHelper
{
/// <summary>
/// An initial value for a hashCode, to which is added contributions from fields.
/// Using a non-zero value decreases collisions of hashCode values.
/// </summary>
public const int Seed = 23;
private const int oddPrimeNumber = 37;
/// <summary>
/// Rotates the seed against a prime number.
/// </summary>
/// <param name="aSeed">The hash's first term.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static int rotateFirstTerm(int aSeed)
{
unchecked {
return HashCodeHelper.oddPrimeNumber * aSeed;
}
}
/// <summary>
/// Contributes a boolean to the developing HashCode seed.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aBoolean">The value to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, bool aBoolean)
{
unchecked {
return HashCodeHelper.rotateFirstTerm(aSeed)
+ (aBoolean
? 1
: 0);
}
}
/// <summary>
/// Contributes a char to the developing HashCode seed.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aChar">The value to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, char aChar)
{
unchecked {
return HashCodeHelper.rotateFirstTerm(aSeed)
+ aChar;
}
}
/// <summary>
/// Contributes an int to the developing HashCode seed.
/// Note that byte and short are handled by this method, through implicit conversion.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aInt">The value to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, int aInt)
{
unchecked {
return HashCodeHelper.rotateFirstTerm(aSeed)
+ aInt;
}
}
/// <summary>
/// Contributes a long to the developing HashCode seed.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aLong">The value to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, long aLong)
{
unchecked {
return HashCodeHelper.rotateFirstTerm(aSeed)
+ (int)(aLong ^ (aLong >> 32));
}
}
/// <summary>
/// Contributes a float to the developing HashCode seed.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aFloat">The value to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, float aFloat)
{
unchecked {
return HashCodeHelper.rotateFirstTerm(aSeed)
+ aFloat.GetHashCode(); // defined for every float, including NaN and infinities
}
}
/// <summary>
/// Contributes a double to the developing HashCode seed.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aDouble">The value to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, double aDouble)
=> aSeed.Hash(aDouble.GetHashCode()); // defined for every double, including NaN and infinities
/// <summary>
/// Contributes a string to the developing HashCode seed.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aString">The value to contribute.</param>
/// <param name="stringComparison">Optional comparison that creates the hash.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(
this int aSeed,
string aString,
StringComparison stringComparison = StringComparison.Ordinal)
{
if (aString == null)
return aSeed.Hash(0);
// Fold the string's hash into the seed so that earlier contributions are kept.
switch (stringComparison) {
case StringComparison.CurrentCulture :
return aSeed.Hash(StringComparer.CurrentCulture.GetHashCode(aString));
case StringComparison.CurrentCultureIgnoreCase :
return aSeed.Hash(StringComparer.CurrentCultureIgnoreCase.GetHashCode(aString));
case StringComparison.InvariantCulture :
return aSeed.Hash(StringComparer.InvariantCulture.GetHashCode(aString));
case StringComparison.InvariantCultureIgnoreCase :
return aSeed.Hash(StringComparer.InvariantCultureIgnoreCase.GetHashCode(aString));
case StringComparison.OrdinalIgnoreCase :
return aSeed.Hash(StringComparer.OrdinalIgnoreCase.GetHashCode(aString));
default :
return aSeed.Hash(StringComparer.Ordinal.GetHashCode(aString));
}
}
/// <summary>
/// Contributes a possibly-null array to the developing HashCode seed.
/// Each element may be a primitive, a reference, or a possibly-null array.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aArray">CAN be null.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, IEnumerable aArray)
{
if (aArray == null)
return aSeed.Hash(0);
int countPlusOne = 1; // So it differs from null
foreach (object item in aArray) {
++countPlusOne;
if (item is IEnumerable arrayItem) {
if (!object.ReferenceEquals(aArray, arrayItem))
aSeed = aSeed.Hash(arrayItem); // recursive call!
} else
aSeed = aSeed.Hash(item);
}
return aSeed.Hash(countPlusOne);
}
/// <summary>
/// Contributes a possibly-null array to the developing HashCode seed.
/// You must provide the hash function for each element.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aArray">CAN be null.</param>
/// <param name="hashElement">Required: yields the hash for each element
/// in <paramref name="aArray"/>.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash<T>(this int aSeed, IEnumerable<T> aArray, Func<T, int> hashElement)
{
if (aArray == null)
return aSeed.Hash(0);
int countPlusOne = 1; // So it differs from null
foreach (T item in aArray) {
++countPlusOne;
aSeed = aSeed.Hash(hashElement(item));
}
return aSeed.Hash(countPlusOne);
}
/// <summary>
/// Contributes a possibly-null object to the developing HashCode seed.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="aObject">CAN be null.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int Hash(this int aSeed, object aObject)
{
switch (aObject) {
case null :
return aSeed.Hash(0);
case bool b :
return aSeed.Hash(b);
case char c :
return aSeed.Hash(c);
case int i :
return aSeed.Hash(i);
case long l :
return aSeed.Hash(l);
case float f :
return aSeed.Hash(f);
case double d :
return aSeed.Hash(d);
case string s :
return aSeed.Hash(s);
case IEnumerable iEnumerable :
return aSeed.Hash(iEnumerable);
}
return aSeed.Hash(aObject.GetHashCode());
}
/// <summary>
/// This utility method uses reflection to iterate all specified properties that are readable
/// on the given object, excluding any property names given in the params arguments, and
/// generates a hashcode.
/// </summary>
/// <param name="aSeed">The developing hash code, or the seed: if you have no seed, use
/// the <see cref="Seed"/>.</param>
/// <param name="aObject">CAN be null.</param>
/// <param name="propertySelector"><see cref="BindingFlags"/> to select the properties to hash.</param>
/// <param name="ignorePropertyNames">Optional.</param>
/// <returns>A hash from the properties contributed to <c>aSeed</c>.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int HashAllProperties(
this int aSeed,
object aObject,
BindingFlags propertySelector
= BindingFlags.Instance
| BindingFlags.Public
| BindingFlags.GetProperty,
params string[] ignorePropertyNames)
{
if (aObject == null)
return aSeed.Hash(0);
if ((ignorePropertyNames != null)
&& (ignorePropertyNames.Length != 0)) {
foreach (PropertyInfo propertyInfo in aObject.GetType()
.GetProperties(propertySelector)) {
if (!propertyInfo.CanRead
|| (Array.IndexOf(ignorePropertyNames, propertyInfo.Name) >= 0))
continue;
aSeed = aSeed.Hash(propertyInfo.GetValue(aObject));
}
} else {
foreach (PropertyInfo propertyInfo in aObject.GetType()
.GetProperties(propertySelector)) {
if (propertyInfo.CanRead)
aSeed = aSeed.Hash(propertyInfo.GetValue(aObject));
}
}
return aSeed;
}
/// <summary>
/// NOTICE: this method is provided to contribute a <see cref="KeyValuePair{TKey,TValue}"/> to
/// the developing HashCode seed; by hashing the key and the value independently. HOWEVER,
/// this method has a different name since it will not be automatically invoked by
/// <see cref="Hash(int,object)"/>, <see cref="Hash(int,IEnumerable)"/>,
/// or <see cref="HashAllProperties"/> --- you MUST NOT mix this method with those unless
/// you are sure that no KeyValuePair instances will be passed to those methods; or otherwise
/// the generated hash code will not be consistent. This method itself ALSO will not invoke
/// this method on the Key or Value here if that itself is a KeyValuePair.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="keyValuePair">The value to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int HashKeyAndValue<TKey, TValue>(this int aSeed, KeyValuePair<TKey, TValue> keyValuePair)
=> aSeed.Hash(keyValuePair.Key)
.Hash(keyValuePair.Value);
/// <summary>
/// NOTICE: this method is provided to contribute a collection of <see cref="KeyValuePair{TKey,TValue}"/>
/// to the developing HashCode seed; by hashing the key and the value independently. HOWEVER,
/// this method has a different name since it will not be automatically invoked by
/// <see cref="Hash(int,object)"/>, <see cref="Hash(int,IEnumerable)"/>,
/// or <see cref="HashAllProperties"/> --- you MUST NOT mix this method with those unless
/// you are sure that no KeyValuePair instances will be passed to those methods; or otherwise
/// the generated hash code will not be consistent. This method itself ALSO will not invoke
/// this method on a Key or Value here if that itself is a KeyValuePair or an Enumerable of
/// KeyValuePair.
/// </summary>
/// <param name="aSeed">The developing HashCode value or seed.</param>
/// <param name="keyValuePairs">The values to contribute.</param>
/// <returns>The new hash code.</returns>
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int HashKeysAndValues<TKey, TValue>(
this int aSeed,
IEnumerable<KeyValuePair<TKey, TValue>> keyValuePairs)
{
if (keyValuePairs == null)
return aSeed.Hash(null);
foreach (KeyValuePair<TKey, TValue> keyValuePair in keyValuePairs) {
aSeed = aSeed.HashKeyAndValue(keyValuePair);
}
return aSeed;
}
}
}
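Much of my work is done with database connectivity, which means that my classes all have a unique identifier from the database. I always use the ID from the database to generate the hash code.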
// Unique ID from database
private int _id;
...
{
return _id.GetHashCode();
}
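Spelled out as a complete type, the idea might look like the sketch below. Customer, Id and Name are hypothetical names, and the sketch assumes every instance already carries its database ID (objects that have not been saved yet would need separate handling).

```csharp
using System;

// Hypothetical entity; the names are illustrative only.
public sealed class Customer : IEquatable<Customer>
{
    // Unique ID from database; never changes after construction.
    public int Id { get; }

    // Mutable, but deliberately excluded from equality and hashing.
    public string Name { get; set; }

    public Customer(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public bool Equals(Customer other) => !(other is null) && other.Id == Id;

    public override bool Equals(object obj) => Equals(obj as Customer);

    public override int GetHashCode() => Id.GetHashCode();
}
```

Because only the immutable Id takes part in Equals and GetHashCode, changing Name does not disturb an instance that is already stored in a hash-based collection.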
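I'd like to add my latest findings to this thread, which I come back to so often.
My current Visual Studio / project setup provides a refactoring that automatically converts tuples to structs. It generates a GetHashCode function like this: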
public override int GetHashCode()
{
    int hashCode = -2088324004;
    hashCode = hashCode * -1521134295 + AuftragGesperrt.GetHashCode();
    hashCode = hashCode * -1521134295 + Auftrag_gesperrt_von.GetHashCode();
    hashCode = hashCode * -1521134295 + Auftrag_gesperrt_am.GetHashCode();
    return hashCode;
}
Edit: to clarify, AuftragGesperrt, Auftrag_gesperrt_von and Auftrag_gesperrt_am are properties. If the Microsoft developers use this function, it is probably not too bad a solution.