What is the significance of the load factor in HashMap?

The documentation explains it pretty well:

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets. As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
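The last sentence of the quote can be turned into a small sizing helper. The following is only a sketch: the class name `PresizedMapSketch`, the helper `capacityFor`, and the figure of 1000 expected entries are assumptions made for illustration, not part of the HashMap API.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizedMapSketch {
    // Hypothetical helper: smallest int strictly greater than expectedEntries / loadFactor,
    // which is the condition the documentation gives for never rehashing.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expectedEntries = 1000;  // assumed workload size
        float loadFactor = 0.75f;    // HashMap's documented default

        int initialCapacity = capacityFor(expectedEntries, loadFactor);
        Map<Integer, String> map = new HashMap<>(initialCapacity, loadFactor);

        // Fill the map up to the expected size; per the documentation,
        // no rehash should be needed along the way.
        for (int i = 0; i < expectedEntries; i++) {
            map.put(i, "value-" + i);
        }

        System.out.println("initial capacity requested: " + initialCapacity);
        System.out.println("entries stored: " + map.size());
    }
}
```

Whether the extra memory is worth it depends on how well the entry count is known in advance.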

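As with all performance optimizations, it is a good idea to avoid optimizing prematurely (i.e. without hard data on where the bottlenecks are).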

2012-06-05 17:17:13

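From the documentation:

The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased

It really depends on your particular requirements; there is no "rule of thumb" for choosing an initial load factor.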

2012-06-05 17:16:50

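HashMap has a default initial capacity of 16 and a default load factor of 0.75f (i.e. 75% of the current capacity). The load factor determines how full the HashMap may get before its capacity is doubled.

For example, the product of capacity and load factor is 16 * 0.75 = 12. This means that once the number of entries exceeds 12, i.e. when the 13th key-value pair is stored, the capacity becomes 32.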
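The arithmetic can be checked with a small simulation. This sketch models only the rule quoted from the documentation (resize when the number of entries exceeds load factor times capacity); it does not inspect HashMap's internals, and the names `ResizeThresholdSketch`, `threshold`, and `capacityAfter` are made up for the example.

```java
final class ResizeThresholdSketch {
    // Number of entries the table may hold before the next insertion triggers a resize.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Simulated capacity after inserting `entries` distinct keys one at a time,
    // doubling the capacity whenever the entry count exceeds the threshold.
    static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        for (int size = 1; size <= entries; size++) {
            if (size > threshold(capacity, loadFactor)) {
                capacity *= 2;
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));          // 12
        System.out.println(capacityAfter(12, 16, 0.75f));  // 16
        System.out.println(capacityAfter(13, 16, 0.75f));  // 32
    }
}
```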

2013-10-17 12:33:44

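Actually, from my calculations, the "perfect" load factor is closer to log(2) (the natural logarithm, ~0.693), although any load factor below this will yield better performance. I think that .75 was probably pulled out of a hat.

Proof:

Chaining can be avoided, and branch prediction exploited, by predicting whether a bucket is empty. A bucket is probably empty if the probability of its being empty exceeds 0.5.

Let s be the size of the table (the number of buckets) and n the number of keys added. Using the binomial distribution, the probability of a bucket being empty is: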

P(0) = C(n, 0) * (1/s)^0 * (1 - 1/s)^(n - 0)

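This simplifies to P(0) = (1 - 1/s)^n. Requiring P(0) > 0.5 and solving for n shows that a bucket is probably empty if there are fewer than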

log(2)/log(s/(s - 1)) keys

As s goes to infinity, with the number of keys chosen so that P(0) = 0.5, n/s rapidly approaches log(2):

lim (log(2)/log(s/(s - 1)))/s as s -> infinity = log(2) ~ 0.693...
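The limit can be checked numerically. The sketch below assumes keys land uniformly and independently in the s buckets, as the derivation does; the class and method names are invented for the example.

```java
final class EmptyBucketLimitSketch {
    // Probability that a given bucket is empty after n keys are spread
    // uniformly over s buckets: (1 - 1/s)^n.
    static double probabilityEmpty(double s, double n) {
        return Math.pow(1.0 - 1.0 / s, n);
    }

    // Load n/s at which P(0) is exactly 0.5, i.e. log(2)/log(s/(s - 1)) divided by s.
    static double breakEvenLoad(double s) {
        return Math.log(2) / Math.log(s / (s - 1)) / s;
    }

    public static void main(String[] args) {
        // Print the break-even load for growing table sizes next to ln 2.
        for (double s : new double[] {16, 1024, 1 << 20}) {
            System.out.printf("s = %.0f, n/s = %.6f%n", s, breakEvenLoad(s));
        }
        System.out.printf("ln 2 = %.6f%n", Math.log(2));
    }
}
```

Even for a table of 16 buckets the break-even load is already within a few percent of log(2).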

2015-07-14 08:42:10

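I would pick a table size of n * 1.5, computed as n + (n >> 1). This gives a load factor of about .667 without using division, which is slow on most systems, especially on portable systems with no hardware divide.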
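A minimal sketch of that sizing rule, assuming a hand-rolled table rather than java.util.HashMap (which rounds its capacity up to a power of two). The name `tableSizeFor` is hypothetical, and the expression overflows for very large n.

```java
final class TableSizeSketch {
    // Roughly 1.5 * n using only an addition and a shift; n >> 1 rounds down for odd n.
    static int tableSizeFor(int n) {
        return n + (n >> 1);
    }

    public static void main(String[] args) {
        int n = 1000;                    // assumed number of keys
        int tableSize = tableSizeFor(n); // 1500
        // The division here is only for display; sizing the table needed none.
        System.out.printf("table size = %d, load factor = %.4f%n",
                tableSize, (double) n / tableSize);
    }
}
```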

2016-02-16 02:13:15
