我在阅读Java的ArrayList源代码时注意到if语句中有一些比较。

在Java 7中,使用了grow(int)方法

if (newCapacity - minCapacity < 0)
    newCapacity = minCapacity;

在Java 6中,不存在增长。然而,方法ensureCapacity(int)使用

if (newCapacity < minCapacity)
    newCapacity = minCapacity;

这一变化背后的原因是什么?是性能问题还是风格问题?

我可以想象,与0比较会更快,但仅仅为了检查它是否为负而执行一个完整的减法,对我来说似乎有点过头了。同样在字节码方面,这将涉及两个指令(ISUB和IF_ICMPGE),而不是一个指令(IFGE)。


A < b和A - b < 0意味着两个不同的东西。考虑下面的代码:

int a = Integer.MAX_VALUE;
int b = Integer.MIN_VALUE;
if (a < b) {
    System.out.println("a < b");
}
if (a - b < 0) {
    System.out.println("a - b < 0");
}

运行时,它只输出a - b < 0。结果是a < b显然是假的,但是a - b溢出变成-1,这是负的。

现在,考虑数组的长度非常接近Integer.MAX_VALUE。数组列表中的代码是这样的:

int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
    newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
    newCapacity = hugeCapacity(minCapacity);

oldCapacity非常接近Integer。MAX_VALUE,因此newCapacity(即oldCapacity + 0.5 * oldCapacity)可能溢出并变成Integer。MIN_VALUE(即为负)。然后,减去minCapacity下流回一个正数。

该检查确保没有执行if语句。如果代码被写成If (newCapacity < minCapacity),在这种情况下是正确的(因为newCapacity是负的),因此无论oldCapacity如何,newCapacity都将被迫变为minCapacity。

此溢出情况由下一个if处理。当newCapacity溢出时,将为true: MAX_ARRAY_SIZE定义为Integer。MAX_VALUE - 8和Integer。MIN_VALUE -(整数。MAX_VALUE - 8) > 0为真。因此newCapacity被正确处理:hugeCapacity方法返回MAX_ARRAY_SIZE或Integer.MAX_VALUE。

注意:这就是这个方法中的//溢出意识代码注释所说的。


我找到了这样的解释:

On Tue, Mar 9, 2010 at 03:02, Kevin L. Stern wrote: I did a quick search and it appears that Java is indeed two's complement based. Nonetheless, please allow me to point out that, in general, this type of code worries me since I fully expect that at some point someone will come along and do exactly what Dmytro suggested; that is, someone will change: if (a - b > 0) to if (a > b) and the entire ship will sink. I, personally, like to avoid obscurities such as making integer overflow an essential basis for my algorithm unless there is a good reason to do so. I would, in general, prefer to avoid overflow altogether and to make the overflow scenario more explicit: if (oldCapacity > RESIZE_OVERFLOW_THRESHOLD) { // Do something } else { // Do something else } It's a good point. In ArrayList we cannot do this (or at least not compatibly), because ensureCapacity is a public API and effectively already accepts negative numbers as requests for a positive capacity that cannot be satisfied. The current API is used like this: int newcount = count + len; ensureCapacity(newcount); If you want to avoid overflow, you would need to change to something less natural like ensureCapacity(count, len); int newcount = count + len; Anyway, I'm keeping the overflow-conscious code, but adding more warning comments, and "out-lining" huge array creation so that ArrayList's code now looks like: /** * Increases the capacity of this <tt>ArrayList</tt> instance, if * necessary, to ensure that it can hold at least the number of elements * specified by the minimum capacity argument. * * @param minCapacity the desired minimum capacity */ public void ensureCapacity(int minCapacity) { modCount++; // Overflow-conscious code if (minCapacity - elementData.length > 0) grow(minCapacity); } /** * The maximum size of array to allocate. * Some VMs reserve some header words in an array. * Attempts to allocate larger arrays may result in * OutOfMemoryError: Requested array size exceeds VM limit */ private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; /** * Increases the capacity to ensure that it can hold at least the * number of elements specified by the minimum capacity argument. * * @param minCapacity the desired minimum capacity */ private void grow(int minCapacity) { // Overflow-conscious code int oldCapacity = elementData.length; int newCapacity = oldCapacity + (oldCapacity >> 1); if (newCapacity - minCapacity < 0) newCapacity = minCapacity; if (newCapacity - MAX_ARRAY_SIZE > 0) newCapacity = hugeCapacity(minCapacity); // minCapacity is usually close to size, so this is a win: elementData = Arrays.copyOf(elementData, newCapacity); } private int hugeCapacity(int minCapacity) { if (minCapacity < 0) // overflow throw new OutOfMemoryError(); return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE; } Webrev regenerated. Martin

在Java 6中,如果你使用API为:

int newcount = count + len;
ensureCapacity(newcount);

而newCount溢出(这变成负数),if (minCapacity > oldCapacity)将返回false,你可能会错误地认为数组列表增加了len。


查看代码:

int newCapacity = oldCapacity + (oldCapacity >> 1);

如果oldCapacity非常大,它将溢出,而newCapacity将是一个负数。像newCapacity < oldCapacity这样的比较将错误地计算为true,并且ArrayList将无法增长。

相反,所写的代码(newCapacity - minCapacity < 0返回false)将允许在下一行中进一步计算newCapacity的负值,从而通过调用hugeCapacity(newCapacity = hugeCapacity(minCapacity);)来重新计算newCapacity,以允许ArrayList增长到MAX_ARRAY_SIZE。

这就是//溢出意识代码注释试图传达的内容,尽管是间接的。

因此,最重要的是,新的比较可以防止分配一个比预定义的MAX_ARRAY_SIZE更大的数组列表,同时允许它在需要时增长到这个限制。


这两种形式的行为完全相同,除非表达式a - b溢出,在这种情况下它们是相反的。如果a是一个很大的负数,b是一个很大的正数,那么(a < b)显然是真的,但a - b会溢出变成正数,所以(a - b < 0)是假的。

如果您熟悉x86汇编代码,可以考虑(a < b)是由jge实现的,当SF = of时,jge围绕If语句体展开分支。另一方面,(a - b < 0)将像jns一样,当SF = 0时分支。因此,当OF = 1时,它们的行为恰好不同。