我在阅读Java的ArrayList源代码时注意到if语句中有一些比较。
在Java 7中,使用了grow(int)方法
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
在Java 6中,不存在增长。然而,方法ensureCapacity(int)使用
if (newCapacity < minCapacity)
newCapacity = minCapacity;
这一变化背后的原因是什么?是性能问题还是风格问题?
我可以想象,与0比较会更快,但仅仅为了检查它是否为负而执行一个完整的减法,对我来说似乎有点过头了。同样在字节码方面,这将涉及两个指令(ISUB和IF_ICMPGE),而不是一个指令(IFGE)。
查看代码:
int newCapacity = oldCapacity + (oldCapacity >> 1);
如果oldCapacity非常大,它将溢出,而newCapacity将是一个负数。像newCapacity < oldCapacity这样的比较将错误地计算为true,并且ArrayList将无法增长。
相反,所写的代码(newCapacity - minCapacity < 0返回false)将允许在下一行中进一步计算newCapacity的负值,从而通过调用hugeCapacity(newCapacity = hugeCapacity(minCapacity);)来重新计算newCapacity,以允许ArrayList增长到MAX_ARRAY_SIZE。
这就是//溢出意识代码注释试图传达的内容,尽管是间接的。
因此,最重要的是,新的比较可以防止分配一个比预定义的MAX_ARRAY_SIZE更大的数组列表,同时允许它在需要时增长到这个限制。
A < b和A - b < 0意味着两个不同的东西。考虑下面的代码:
int a = Integer.MAX_VALUE;
int b = Integer.MIN_VALUE;
if (a < b) {
System.out.println("a < b");
}
if (a - b < 0) {
System.out.println("a - b < 0");
}
运行时,它只输出a - b < 0。结果是a < b显然是假的,但是a - b溢出变成-1,这是负的。
现在,考虑数组的长度非常接近Integer.MAX_VALUE。数组列表中的代码是这样的:
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
oldCapacity非常接近Integer。MAX_VALUE,因此newCapacity(即oldCapacity + 0.5 * oldCapacity)可能溢出并变成Integer。MIN_VALUE(即为负)。然后,减去minCapacity下流回一个正数。
该检查确保没有执行if语句。如果代码被写成If (newCapacity < minCapacity),在这种情况下是正确的(因为newCapacity是负的),因此无论oldCapacity如何,newCapacity都将被迫变为minCapacity。
此溢出情况由下一个if处理。当newCapacity溢出时,将为true: MAX_ARRAY_SIZE定义为Integer。MAX_VALUE - 8和Integer。MIN_VALUE -(整数。MAX_VALUE - 8) > 0为真。因此newCapacity被正确处理:hugeCapacity方法返回MAX_ARRAY_SIZE或Integer.MAX_VALUE。
注意:这就是这个方法中的//溢出意识代码注释所说的。
我找到了这样的解释:
On Tue, Mar 9, 2010 at 03:02, Kevin L. Stern wrote:
I did a quick search and it appears that Java is indeed two's complement
based. Nonetheless, please allow me to point out that, in general, this
type of code worries me since I fully expect that at some point someone will
come along and do exactly what Dmytro suggested; that is, someone will
change:
if (a - b > 0)
to
if (a > b)
and the entire ship will sink. I, personally, like to avoid obscurities
such as making integer overflow an essential basis for my algorithm unless
there is a good reason to do so. I would, in general, prefer to avoid
overflow altogether and to make the overflow scenario more explicit:
if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) {
// Do something
} else {
// Do something else
}
It's a good point.
In ArrayList we cannot do this (or at least not compatibly), because
ensureCapacity is a public API and effectively already accepts
negative numbers as requests for a positive capacity that cannot be
satisfied.
The current API is used like this:
int newcount = count + len;
ensureCapacity(newcount);
If you want to avoid overflow, you would need to change to something
less natural like
ensureCapacity(count, len);
int newcount = count + len;
Anyway, I'm keeping the overflow-conscious code, but adding more
warning comments, and "out-lining" huge array creation so that
ArrayList's code now looks like:
/**
* Increases the capacity of this <tt>ArrayList</tt> instance, if
* necessary, to ensure that it can hold at least the number of elements
* specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
public void ensureCapacity(int minCapacity) {
modCount++;
// Overflow-conscious code
if (minCapacity - elementData.length > 0)
grow(minCapacity);
}
/**
* The maximum size of array to allocate.
* Some VMs reserve some header words in an array.
* Attempts to allocate larger arrays may result in
* OutOfMemoryError: Requested array size exceeds VM limit
*/
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
/**
* Increases the capacity to ensure that it can hold at least the
* number of elements specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
private void grow(int minCapacity) {
// Overflow-conscious code
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
// minCapacity is usually close to size, so this is a win:
elementData = Arrays.copyOf(elementData, newCapacity);
}
private int hugeCapacity(int minCapacity) {
if (minCapacity < 0) // overflow
throw new OutOfMemoryError();
return (minCapacity > MAX_ARRAY_SIZE) ?
Integer.MAX_VALUE :
MAX_ARRAY_SIZE;
}
Webrev regenerated.
Martin
在Java 6中,如果你使用API为:
int newcount = count + len;
ensureCapacity(newcount);
而newCount溢出(这变成负数),if (minCapacity > oldCapacity)将返回false,你可能会错误地认为数组列表增加了len。
查看代码:
int newCapacity = oldCapacity + (oldCapacity >> 1);
如果oldCapacity非常大,它将溢出,而newCapacity将是一个负数。像newCapacity < oldCapacity这样的比较将错误地计算为true,并且ArrayList将无法增长。
相反,所写的代码(newCapacity - minCapacity < 0返回false)将允许在下一行中进一步计算newCapacity的负值,从而通过调用hugeCapacity(newCapacity = hugeCapacity(minCapacity);)来重新计算newCapacity,以允许ArrayList增长到MAX_ARRAY_SIZE。
这就是//溢出意识代码注释试图传达的内容,尽管是间接的。
因此,最重要的是,新的比较可以防止分配一个比预定义的MAX_ARRAY_SIZE更大的数组列表,同时允许它在需要时增长到这个限制。