在.NET中,GetHashCode方法在整个.NET基类库的许多地方都使用。正确执行它对于在集合中或确定相等时快速查找项目尤为重要。
对于如何为自定义类实现GetHashCode,是否有标准算法或最佳实践,以便不会降低性能?
在.NET中,GetHashCode方法在整个.NET基类库的许多地方都使用。正确执行它对于在集合中或确定相等时快速查找项目尤为重要。
对于如何为自定义类实现GetHashCode,是否有标准算法或最佳实践,以便不会降低性能?
当前回答
ValueTuple-C#7更新
正如@cactuaroid在评论中提到的,可以使用值元组。这节省了一些击键,更重要的是纯粹在堆栈上执行(无垃圾):
(PropA, PropB, PropC, PropD).GetHashCode();
(注意:使用匿名类型的原始技术似乎在堆上创建了一个对象,即垃圾,因为匿名类型被实现为类,尽管编译器可能会对此进行优化。对这些选项进行基准测试会很有趣,但元组选项应该更优。)
匿名类型(原始答案)
Microsoft已经提供了一个很好的通用HashCode生成器:只需将属性/字段值复制到匿名类型并对其进行哈希:
new { PropA, PropB, PropC, PropD }.GetHashCode();
这适用于任何数量的财产。它不使用拳击。它只是使用了框架中已经实现的匿名类型的算法。
其他回答
与夜编码器的解决方案非常相似,只是如果你想提高素数更容易。
PS:这是你嘴里吐出一点东西的时候之一,因为你知道这可以用9个默认值重构成一个方法,但它会更慢,所以你闭上眼睛,试着忘掉它。
/// <summary>
/// Try not to look at the source code. It works. Just rely on it.
/// </summary>
public static class HashHelper
{
private const int PrimeOne = 17;
private const int PrimeTwo = 23;
public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9, T10>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9, T10 arg10)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
hash = hash * PrimeTwo + arg4.GetHashCode();
hash = hash * PrimeTwo + arg5.GetHashCode();
hash = hash * PrimeTwo + arg6.GetHashCode();
hash = hash * PrimeTwo + arg7.GetHashCode();
hash = hash * PrimeTwo + arg8.GetHashCode();
hash = hash * PrimeTwo + arg9.GetHashCode();
hash = hash * PrimeTwo + arg10.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
hash = hash * PrimeTwo + arg4.GetHashCode();
hash = hash * PrimeTwo + arg5.GetHashCode();
hash = hash * PrimeTwo + arg6.GetHashCode();
hash = hash * PrimeTwo + arg7.GetHashCode();
hash = hash * PrimeTwo + arg8.GetHashCode();
hash = hash * PrimeTwo + arg9.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
hash = hash * PrimeTwo + arg4.GetHashCode();
hash = hash * PrimeTwo + arg5.GetHashCode();
hash = hash * PrimeTwo + arg6.GetHashCode();
hash = hash * PrimeTwo + arg7.GetHashCode();
hash = hash * PrimeTwo + arg8.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
hash = hash * PrimeTwo + arg4.GetHashCode();
hash = hash * PrimeTwo + arg5.GetHashCode();
hash = hash * PrimeTwo + arg6.GetHashCode();
hash = hash * PrimeTwo + arg7.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2, T3, T4, T5, T6>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
hash = hash * PrimeTwo + arg4.GetHashCode();
hash = hash * PrimeTwo + arg5.GetHashCode();
hash = hash * PrimeTwo + arg6.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2, T3, T4, T5>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
hash = hash * PrimeTwo + arg4.GetHashCode();
hash = hash * PrimeTwo + arg5.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
hash = hash * PrimeTwo + arg4.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
hash = hash * PrimeTwo + arg3.GetHashCode();
return hash;
}
}
public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2)
{
unchecked
{
int hash = PrimeOne;
hash = hash * PrimeTwo + arg1.GetHashCode();
hash = hash * PrimeTwo + arg2.GetHashCode();
return hash;
}
}
}
ValueTuple-C#7更新
正如@cactuaroid在评论中提到的,可以使用值元组。这节省了一些击键,更重要的是纯粹在堆栈上执行(无垃圾):
(PropA, PropB, PropC, PropD).GetHashCode();
(注意:使用匿名类型的原始技术似乎在堆上创建了一个对象,即垃圾,因为匿名类型被实现为类,尽管编译器可能会对此进行优化。对这些选项进行基准测试会很有趣,但元组选项应该更优。)
匿名类型(原始答案)
Microsoft已经提供了一个很好的通用HashCode生成器:只需将属性/字段值复制到匿名类型并对其进行哈希:
new { PropA, PropB, PropC, PropD }.GetHashCode();
这适用于任何数量的财产。它不使用拳击。它只是使用了框架中已经实现的匿名类型的算法。
微软引领了几种哈希方法。。。
//for classes that contain a single int value
return this.value;
//for classes that contain multiple int value
return x ^ y;
//for classes that contain single number bigger than int
return ((int)value ^ (int)(value >> 32));
//for classes that contain class instance fields which inherit from object
return obj1.GetHashCode();
//for classes that contain multiple class instance fields which inherit from object
return obj1.GetHashCode() ^ obj2.GetHashCode() ^ obj3.GetHashCode();
我可以猜测,对于多个大整数,您可以使用这个:
int a=((int)value1 ^ (int)(value1 >> 32));
int b=((int)value2 ^ (int)(value2 >> 32));
int c=((int)value3 ^ (int)(value3 >> 32));
return a ^ b ^ c;
对于多类型也是如此:首先使用GetHashCode()将所有类型转换为int然后int值将被xor'ed,结果是您的哈希值。
对于那些使用哈希作为ID(我的意思是一个唯一的值)的人来说,哈希自然被限制在数字个数,我认为哈希算法是5个字节,至少是MD5。
您可以将多个值转换为哈希值,其中一些值是相同的,因此不要将其用作标识符。(也许有一天我会使用你的组件)
我在Helper库中有一个Hashing类,用于此目的。
/// <summary>
/// This is a simple hashing function from Robert Sedgwicks Hashing in C book.
/// Also, some simple optimizations to the algorithm in order to speed up
/// its hashing process have been added. from: www.partow.net
/// </summary>
/// <param name="input">array of objects, parameters combination that you need
/// to get a unique hash code for them</param>
/// <returns>Hash code</returns>
public static int RSHash(params object[] input)
{
const int b = 378551;
int a = 63689;
int hash = 0;
// If it overflows then just wrap around
unchecked
{
for (int i = 0; i < input.Length; i++)
{
if (input[i] != null)
{
hash = hash * a + input[i].GetHashCode();
a = a * b;
}
}
}
return hash;
}
然后,只需将其用作:
public override int GetHashCode()
{
return Hashing.RSHash(_field1, _field2, _field3);
}
我没有评估它的表现,所以欢迎任何反馈。
我的大部分工作都是通过数据库连接完成的,这意味着我的类都具有来自数据库的唯一标识符。我总是使用数据库中的ID来生成哈希代码。
// Unique ID from database
private int _id;
...
{
return _id.GetHashCode();
}