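For example, the product of the capacity and the load factor is 16 * 0.75 = 12. This means that once the HashMap holds more than 12 key-value pairs, that is, when the 13th pair is stored, its capacity becomes 32.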
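Actually, by my calculations, the "perfect" load factor is closer to log 2 (~ 0.7), although any load factor below that will yield better performance. I think the .75 was probably pulled out of a hat.
Chaining can be avoided, and branch prediction exploited, by predicting whether a bucket is empty. A bucket is probably empty if the probability of it being empty exceeds .5.
Let s be the size of the table (the number of buckets) and n the number of keys added. Using the binomial distribution, the probability of a bucket being empty is: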
P(0) = C(n, 0) * (1/s)^0 * (1 - 1/s)^(n - 0)
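This simplifies to P(0) = (1 - 1/s)^n, so a bucket is probably empty (P(0) > .5) if there are fewer than
log(2)/log(s/(s - 1)) keys
As s approaches infinity, with the number of keys added chosen so that P(0) = .5, n/s rapidly approaches log(2):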
lim (log(2)/log(s/(s - 1)))/s as s -> infinity = log(2) ~ 0.693...
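The limit can be checked numerically. The following is only a sketch: the class and method names are made up for illustration, and it simply evaluates the expression above for growing table sizes s and prints n/s next to log(2).

```java
// Illustrative sketch; LoadFactorLimit and keysForHalfEmpty are hypothetical names.
class LoadFactorLimit {
    // Number of keys n at which P(0) = (1 - 1/s)^n equals 0.5 for s buckets:
    // n = log(2) / log(s / (s - 1)).
    static double keysForHalfEmpty(double s) {
        return Math.log(2) / Math.log(s / (s - 1));
    }

    public static void main(String[] args) {
        // Table sizes 16, 256, 4096, 65536, 1048576.
        for (int s = 16; s <= 1 << 20; s *= 16) {
            double ratio = keysForHalfEmpty(s) / s;
            System.out.printf("s = %7d  n/s = %.6f%n", s, ratio);
        }
        System.out.printf("log(2)         = %.6f%n", Math.log(2));
    }
}
```

Even for a table of 16 buckets the ratio is already within a few percent of log(2), which is what "rapidly" means here.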
Say you add an object x whose hashCode is 888 to your HashMap, and the bucket for that hash code is free, so x goes straight into the bucket. Now say you add another object y whose hashCode is also 888. y will still be added, but at the end of the bucket, because a bucket is essentially a linked list of nodes storing key, value and next. This has a performance impact: since y is not at the head of the bucket, a lookup for it is no longer O(1); the time taken depends on how many items share the same bucket. This is called a hash collision, and it can happen even when your load factor is less than 1.
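A small sketch of that scenario, assuming a made-up key class (FixedHashKey is hypothetical, not part of any library) whose hashCode always returns 888. Both entries are stored and both can be found again; the map just has to walk the shared bucket and compare keys with equals to find the second one.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Hypothetical key type: every instance reports hashCode 888,
    // so all instances land in the same bucket.
    static final class FixedHashKey {
        final String name;

        FixedHashKey(String name) { this.name = name; }

        @Override public int hashCode() { return 888; }

        @Override public boolean equals(Object other) {
            return other instanceof FixedHashKey
                    && ((FixedHashKey) other).name.equals(name);
        }
    }

    static Map<FixedHashKey, String> build() {
        Map<FixedHashKey, String> map = new HashMap<>();
        map.put(new FixedHashKey("x"), "first");
        // Same hash code, different key: stored in the same bucket as x.
        map.put(new FixedHashKey("y"), "second");
        return map;
    }

    public static void main(String[] args) {
        Map<FixedHashKey, String> map = build();
        System.out.println(map.size());                     // 2
        System.out.println(map.get(new FixedHashKey("y"))); // second
    }
}
```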
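Lower load factor = more free buckets = lower chance of collisions = higher performance = higher space requirement. Higher load factor = fewer free buckets = higher chance of collisions = lower performance = lower space requirement.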
For HashMap, DEFAULT_INITIAL_CAPACITY = 16 and DEFAULT_LOAD_FACTOR = 0.75f. This means the maximum number of entries the HashMap can hold before resizing is 16 * 0.75 = 12. When the 13th element is added, the capacity (array size) of the HashMap is doubled!
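The arithmetic can be sketched as a tiny simulation. This is not java.util.HashMap's actual code; ResizeSketch and its methods are hypothetical names that only mirror the documented rule "double the capacity when the number of entries exceeds capacity * load factor".

```java
// Illustrative model of the resize rule, not the real HashMap internals.
class ResizeSketch {
    // Number of entries the table can hold before the next insertion triggers a resize.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Capacity after inserting `entries` distinct keys one by one,
    // doubling whenever the entry count exceeds the current threshold.
    static int capacityAfter(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        for (int size = 1; size <= entries; size++) {
            if (size > threshold(capacity, loadFactor)) {
                capacity *= 2;
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));         // 12
        System.out.println(capacityAfter(12, 16, 0.75f)); // 16
        System.out.println(capacityAfter(13, 16, 0.75f)); // 32
    }
}
```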
An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets. As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
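The last sentence of that passage suggests a simple way to pre-size a map. A minimal sketch, assuming you know the expected number of entries up front; initialCapacityFor is a hypothetical helper (not a JDK method) that applies the "greater than entries divided by load factor" rule literally, which is slightly conservative.

```java
import java.util.HashMap;
import java.util.Map;

class Presize {
    // Hypothetical helper: smallest int strictly greater than
    // expectedEntries / loadFactor, per the rule quoted above.
    static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;
        float loadFactor = 0.75f;

        // HashMap(int initialCapacity, float loadFactor) is a standard constructor;
        // the map rounds the requested capacity up internally.
        Map<String, Integer> map =
                new HashMap<>(initialCapacityFor(expected, loadFactor), loadFactor);
        for (int i = 0; i < expected; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // 1000
    }
}
```

Pre-sizing trades some memory for the guarantee of no rehashing while the map fills, so it is mostly worth doing when the entry count is known and large.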