Q: Is exception handling in Java really slow?

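Conventional wisdom, as well as a lot of Google results, says that exceptional logic should not be used for normal program flow in Java. Two reasons are usually given:

It is really slow, even an order of magnitude slower than regular code (the reasons given vary),

and

It is messy, because people expect only errors to be handled in exceptional code.

This question is about the first point.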

As an example, this page describes Java exception handling as "very slow" and relates the slowness to the creation of the exception message string - "this string is then used in creating the exception object that is thrown. This is not fast." The article Effective Exception Handling in Java says that "the reason for this is due to the object creation aspect of exception handling, which thereby makes throwing exceptions inherently slow". Another reason out there is that the stack trace generation is what slows it down.

My testing (using Java 1.6.0_07, Java HotSpot 10.0, on 32 bit Linux), indicates that exception handling is no slower than regular code. I tried running a method in a loop that executes some code. At the end of the method, I use a boolean to indicate whether to return or throw. This way the actual processing is the same. I tried running the methods in different orders and averaging my test times, thinking it may have been the JVM warming up. In all my tests, the throw was at least as fast as the return, if not faster (up to 3.1% faster). I am completely open to the possibility that my tests were wrong, but I haven't seen anything out there in the way of the code sample, test comparisons, or results in the last year or two that show exception handling in Java to actually be slow.

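What led me down this path was an API I needed to use that throws exceptions as part of normal control logic. I wanted to correct their usage, but now I may not be able to. Will I instead have to praise them for their forward thinking?

In the paper "Efficient Java exception handling in just-in-time compilation", the authors suggest that the presence of exception handlers alone, even if no exceptions are thrown, is enough to prevent the JIT compiler from optimizing the code properly, thus slowing it down. I haven't tested this theory yet.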


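Current answer

HotSpot is quite capable of removing exception code for system-generated exceptions, as long as it is all inlined. However, explicitly created exceptions, and those otherwise not removed, spend a lot of time creating the stack trace. Override fillInStackTrace to see how this can affect performance.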
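To try that suggestion, here is a minimal sketch. The class names StacklessDemo and StacklessException and the helper methods are made up for illustration, and this is a crude timing loop rather than a rigorous benchmark, so treat the printed numbers as a rough indication only.

```java
public class StacklessDemo {
    // Hypothetical exception type that skips stack-trace capture.
    static class StacklessException extends Exception {
        @Override
        public synchronized Throwable fillInStackTrace() {
            return this; // do not walk the stack
        }
    }

    // Throws either a regular Exception or the stackless variant.
    static void fail(boolean stackless) throws Exception {
        throw stackless ? new StacklessException() : new Exception();
    }

    // Throws and catches `count` exceptions; returns elapsed milliseconds.
    static long timeThrows(boolean stackless, int count) {
        long start = System.nanoTime();
        int caught = 0;
        for (int i = 0; i < count; i++) {
            try {
                fail(stackless);
            } catch (Exception e) {
                caught++;
            }
        }
        if (caught != count) {
            throw new IllegalStateException("missed an exception");
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int count = 1_000_000;
        // Warm-up passes so the JIT has a chance to compile both paths.
        timeThrows(false, count);
        timeThrows(true, count);
        System.out.println("regular:   " + timeThrows(false, count) + " ms");
        System.out.println("stackless: " + timeThrows(true, count) + " ms");
    }
}
```

The stackless variant carries no frames at all, so it is only suitable where nobody needs the trace for debugging.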

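Other answers

Exception performance in Java and C# leaves much to be desired.

As programmers, this forces us to live by the rule "exceptions should be caused infrequently", simply for practical performance reasons.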

However, as computer scientists, we should rebel against this problematic state. The person authoring a function often has no idea how often it will be called, or whether success or failure is more likely. Only the caller has this information. Trying to avoid exceptions leads to unclear API idioms where in some cases we have only clean-but-slow exception versions, and in other cases we have fast-but-clunky return-value errors, and in still other cases we end up with both. The library implementor may have to write and maintain two versions of APIs, and the caller has to decide which of two versions to use in each situation.

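This is kind of a mess. If exceptions had better performance, we could avoid these clunky idioms and use exceptions as they were meant to be used... as a structured error-return facility.

I'd really like to see exception mechanisms implemented using techniques closer to return values, so that we could get performance closer to return values, since that is what we fall back to in performance-sensitive code.

Here is a code sample that compares exception performance to error-return-value performance.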

public class TestIt {

int value;


public int getValue() {
    return value;
}

public void reset() {
    value = 0;
}

public boolean baseline_null(boolean shouldfail, int recurse_depth) {
    if (recurse_depth <= 0) {
        return shouldfail;
    } else {
        return baseline_null(shouldfail,recurse_depth-1);
    }
}

public boolean retval_error(boolean shouldfail, int recurse_depth) {
    if (recurse_depth <= 0) {
        if (shouldfail) {
            return false;
        } else {
            return true;
        }
    } else {
        boolean nested_error = retval_error(shouldfail,recurse_depth-1);
        if (nested_error) {
            return true;
        } else {
            return false;
        }
    }
}

public void exception_error(boolean shouldfail, int recurse_depth) throws Exception {
    if (recurse_depth <= 0) {
        if (shouldfail) {
            throw new Exception();
        }
    } else {
        exception_error(shouldfail,recurse_depth-1);
    }

}

public static void main(String[] args) {
    int i;
    long l;
    TestIt t = new TestIt();
    int failures;

    int ITERATION_COUNT = 100000000;


    // (0) baseline null workload
    for (int recurse_depth = 2; recurse_depth <= 10; recurse_depth+=3) {
        for (float exception_freq = 0.0f; exception_freq <= 1.0f; exception_freq += 0.25f) {            
            int EXCEPTION_MOD = (exception_freq == 0.0f) ? ITERATION_COUNT+1 : (int)(1.0f / exception_freq);            

            failures = 0;
            long start_time = System.currentTimeMillis();
            t.reset();              
            for (i = 1; i < ITERATION_COUNT; i++) {
                boolean shoulderror = (i % EXCEPTION_MOD) == 0;
                t.baseline_null(shoulderror,recurse_depth);
            }
            long elapsed_time = System.currentTimeMillis() - start_time;
            System.out.format("baseline: recurse_depth %s, exception_freqeuncy %s (%s), time elapsed %s ms\n",
                    recurse_depth, exception_freq, failures,elapsed_time);
        }
    }


    // (1) retval_error
    for (int recurse_depth = 2; recurse_depth <= 10; recurse_depth+=3) {
        for (float exception_freq = 0.0f; exception_freq <= 1.0f; exception_freq += 0.25f) {            
            int EXCEPTION_MOD = (exception_freq == 0.0f) ? ITERATION_COUNT+1 : (int)(1.0f / exception_freq);            

            failures = 0;
            long start_time = System.currentTimeMillis();
            t.reset();              
            for (i = 1; i < ITERATION_COUNT; i++) {
                boolean shoulderror = (i % EXCEPTION_MOD) == 0;
                if (!t.retval_error(shoulderror,recurse_depth)) {
                    failures++;
                }
            }
            long elapsed_time = System.currentTimeMillis() - start_time;
            System.out.format("retval_error: recurse_depth %s, exception_freqeuncy %s (%s), time elapsed %s ms\n",
                    recurse_depth, exception_freq, failures,elapsed_time);
        }
    }

    // (2) exception_error
    for (int recurse_depth = 2; recurse_depth <= 10; recurse_depth+=3) {
        for (float exception_freq = 0.0f; exception_freq <= 1.0f; exception_freq += 0.25f) {            
            int EXCEPTION_MOD = (exception_freq == 0.0f) ? ITERATION_COUNT+1 : (int)(1.0f / exception_freq);            

            failures = 0;
            long start_time = System.currentTimeMillis();
            t.reset();              
            for (i = 1; i < ITERATION_COUNT; i++) {
                boolean shoulderror = (i % EXCEPTION_MOD) == 0;
                try {
                    t.exception_error(shoulderror,recurse_depth);
                } catch (Exception e) {
                    failures++;
                }
            }
            long elapsed_time = System.currentTimeMillis() - start_time;
            System.out.format("exception_error: recurse_depth %s, exception_freqeuncy %s (%s), time elapsed %s ms\n",
                    recurse_depth, exception_freq, failures,elapsed_time);              
        }
    }
}

}

And here are the results:

baseline: recurse_depth 2, exception_freqeuncy 0.0 (0), time elapsed 683 ms
baseline: recurse_depth 2, exception_freqeuncy 0.25 (0), time elapsed 790 ms
baseline: recurse_depth 2, exception_freqeuncy 0.5 (0), time elapsed 768 ms
baseline: recurse_depth 2, exception_freqeuncy 0.75 (0), time elapsed 749 ms
baseline: recurse_depth 2, exception_freqeuncy 1.0 (0), time elapsed 731 ms
baseline: recurse_depth 5, exception_freqeuncy 0.0 (0), time elapsed 923 ms
baseline: recurse_depth 5, exception_freqeuncy 0.25 (0), time elapsed 971 ms
baseline: recurse_depth 5, exception_freqeuncy 0.5 (0), time elapsed 982 ms
baseline: recurse_depth 5, exception_freqeuncy 0.75 (0), time elapsed 947 ms
baseline: recurse_depth 5, exception_freqeuncy 1.0 (0), time elapsed 937 ms
baseline: recurse_depth 8, exception_freqeuncy 0.0 (0), time elapsed 1154 ms
baseline: recurse_depth 8, exception_freqeuncy 0.25 (0), time elapsed 1149 ms
baseline: recurse_depth 8, exception_freqeuncy 0.5 (0), time elapsed 1133 ms
baseline: recurse_depth 8, exception_freqeuncy 0.75 (0), time elapsed 1117 ms
baseline: recurse_depth 8, exception_freqeuncy 1.0 (0), time elapsed 1116 ms
retval_error: recurse_depth 2, exception_freqeuncy 0.0 (0), time elapsed 742 ms
retval_error: recurse_depth 2, exception_freqeuncy 0.25 (24999999), time elapsed 743 ms
retval_error: recurse_depth 2, exception_freqeuncy 0.5 (49999999), time elapsed 734 ms
retval_error: recurse_depth 2, exception_freqeuncy 0.75 (99999999), time elapsed 723 ms
retval_error: recurse_depth 2, exception_freqeuncy 1.0 (99999999), time elapsed 728 ms
retval_error: recurse_depth 5, exception_freqeuncy 0.0 (0), time elapsed 920 ms
retval_error: recurse_depth 5, exception_freqeuncy 0.25 (24999999), time elapsed 1121 ms
retval_error: recurse_depth 5, exception_freqeuncy 0.5 (49999999), time elapsed 1037 ms
retval_error: recurse_depth 5, exception_freqeuncy 0.75 (99999999), time elapsed 1141 ms
retval_error: recurse_depth 5, exception_freqeuncy 1.0 (99999999), time elapsed 1130 ms
retval_error: recurse_depth 8, exception_freqeuncy 0.0 (0), time elapsed 1218 ms
retval_error: recurse_depth 8, exception_freqeuncy 0.25 (24999999), time elapsed 1334 ms
retval_error: recurse_depth 8, exception_freqeuncy 0.5 (49999999), time elapsed 1478 ms
retval_error: recurse_depth 8, exception_freqeuncy 0.75 (99999999), time elapsed 1637 ms
retval_error: recurse_depth 8, exception_freqeuncy 1.0 (99999999), time elapsed 1655 ms
exception_error: recurse_depth 2, exception_freqeuncy 0.0 (0), time elapsed 726 ms
exception_error: recurse_depth 2, exception_freqeuncy 0.25 (24999999), time elapsed 17487 ms
exception_error: recurse_depth 2, exception_freqeuncy 0.5 (49999999), time elapsed 33763 ms
exception_error: recurse_depth 2, exception_freqeuncy 0.75 (99999999), time elapsed 67367 ms
exception_error: recurse_depth 2, exception_freqeuncy 1.0 (99999999), time elapsed 66990 ms
exception_error: recurse_depth 5, exception_freqeuncy 0.0 (0), time elapsed 924 ms
exception_error: recurse_depth 5, exception_freqeuncy 0.25 (24999999), time elapsed 23775 ms
exception_error: recurse_depth 5, exception_freqeuncy 0.5 (49999999), time elapsed 46326 ms
exception_error: recurse_depth 5, exception_freqeuncy 0.75 (99999999), time elapsed 91707 ms
exception_error: recurse_depth 5, exception_freqeuncy 1.0 (99999999), time elapsed 91580 ms
exception_error: recurse_depth 8, exception_freqeuncy 0.0 (0), time elapsed 1144 ms
exception_error: recurse_depth 8, exception_freqeuncy 0.25 (24999999), time elapsed 30440 ms
exception_error: recurse_depth 8, exception_freqeuncy 0.5 (49999999), time elapsed 59116 ms
exception_error: recurse_depth 8, exception_freqeuncy 0.75 (99999999), time elapsed 116678 ms
exception_error: recurse_depth 8, exception_freqeuncy 1.0 (99999999), time elapsed 116477 ms

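Checking and propagating return values does add some cost compared with the baseline null call, and that cost is proportional to call depth. At a call-chain depth of 8, the version that checks error return values was about 27% slower than the baseline version that does not check return values.

Exception performance, in comparison, is not a function of call depth but of exception frequency. However, the degradation as exception frequency increases is much more dramatic. At only a 25% error frequency, the code ran 24 times slower. At an error frequency of 100%, the exception version is almost 100 times slower.

This suggests to me that we may be making the wrong tradeoffs in our exception implementations. Exceptions could be faster, either by avoiding costly stack walks or by turning them outright into compiler-supported return-value checking. Until that happens, we are stuck avoiding them when we want our code to run fast.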

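I don't know whether these topics are related, but I once wanted to implement a trick that relied on the current thread's stack trace: I wanted to discover, from inside the instantiated class, the name of the method that triggered the instantiation (yes, the idea is crazy, and I gave it up completely). That is how I discovered that calling Thread.currentThread().getStackTrace() is extremely slow (because of the native dumpThreads method it uses internally).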
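For readers who have not seen the trick, a rough sketch of that kind of lookup follows. The class CallerLookup and its methods are hypothetical names for illustration. The Javadoc for getStackTrace allows a virtual machine to omit frames in some circumstances, so the result is best treated as a hint rather than a guarantee.

```java
public class CallerLookup {
    // Finds the first frame whose method is `methodName` and returns the
    // name of the method one level further out (its caller), or null.
    static String callerOf(String methodName) {
        StackTraceElement[] frames = Thread.currentThread().getStackTrace();
        for (int i = 0; i < frames.length - 1; i++) {
            if (frames[i].getMethodName().equals(methodName)) {
                return frames[i + 1].getMethodName();
            }
        }
        return null; // frame not found (the VM may omit frames)
    }

    // Stands in for the "instantiation" that wants to know who triggered it.
    static String create() {
        return callerOf("create");
    }

    // Stands in for the code that triggers the instantiation.
    static String factory() {
        return create();
    }

    public static void main(String[] args) {
        System.out.println("create() was called from: " + factory());
    }
}
```

Every call captures the whole stack of the current thread, which is why doing this on a hot path is so expensive.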

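Correspondingly, Java's Throwable has a native method, fillInStackTrace. I think the killer catch block described earlier somehow triggers the execution of this method.

But let me tell you another story...

In Scala, some functional features are compiled for the JVM using ControlThrowable, which extends Throwable and overrides its fillInStackTrace in the following way: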

override def fillInStackTrace(): Throwable = this

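So I adapted the test above (the loop count is reduced by a factor of ten, since my machine is a bit slower :):

import scala.util.control.ControlThrowable

class ControlException extends ControlThrowable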

class T {
  var value = 0

  def reset = {
    value = 0
  }

  def method1(i: Int) = {
    value = ((value + i) / i) << 1
    if ((i & 0xfffffff) == 1000000000) {
      println("You'll never see this!")
    }
  }

  def method2(i: Int) = {
    value = ((value + i) / i) << 1
    if ((i & 0xfffffff) == 1000000000) {
      throw new Exception()
    }
  }

  def method3(i: Int) = {
    value = ((value + i) / i) << 1
    if ((i & 0x1) == 1) {
      throw new Exception()
    }
  }

  def method4(i: Int) = {
    value = ((value + i) / i) << 1
    if ((i & 0x1) == 1) {
      throw new ControlException()
    }
  }
}

class Main {
  var l = System.currentTimeMillis
  val t = new T
  for (i <- 1 to 10000000)
    t.method1(i)
  l = System.currentTimeMillis - l
  println("method1 took " + l + " ms, result was " + t.value)

  t.reset
  l = System.currentTimeMillis
  for (i <- 1 to 10000000) try {
    t.method2(i)
  } catch {
    case _ => println("You'll never see this")
  }
  l = System.currentTimeMillis - l
  println("method2 took " + l + " ms, result was " + t.value)

  t.reset
  l = System.currentTimeMillis
  for (i <- 1 to 10000000) try {
    t.method4(i)
  } catch {
    case _ => // do nothing
  }
  l = System.currentTimeMillis - l
  println("method4 took " + l + " ms, result was " + t.value)

  t.reset
  l = System.currentTimeMillis
  for (i <- 1 to 10000000) try {
    t.method3(i)
  } catch {
    case _ => // do nothing
  }
  l = System.currentTimeMillis - l
  println("method3 took " + l + " ms, result was " + t.value)

}

So, the results are:

method1 took 146 ms, result was 2
method2 took 159 ms, result was 2
method4 took 1551 ms, result was 2
method3 took 42492 ms, result was 2

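You see, the only difference between method3 and method4 is that they throw different kinds of exceptions. Yes, method4 is still slower than method1 and method2, but the difference is far more acceptable.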
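Something similar is available in plain Java. Since Java 7, Throwable and its standard subclasses have a protected four-argument constructor whose last flag, writableStackTrace, turns off stack-trace capture. The sketch below uses it. The class name ControlFlowSignal is invented for this example and is not part of any library.

```java
public class ControlFlowSignal extends RuntimeException {
    public ControlFlowSignal(String message) {
        // cause = null, enableSuppression = false, writableStackTrace = false
        super(message, null, false, false);
    }

    public static void main(String[] args) {
        try {
            throw new ControlFlowSignal("stop");
        } catch (ControlFlowSignal s) {
            // With writableStackTrace = false, getStackTrace() returns
            // a zero-length array.
            System.out.println(s.getMessage()
                    + ", frames recorded: " + s.getStackTrace().length);
            // prints "stop, frames recorded: 0"
        }
    }
}
```

Like the Scala ControlThrowable above, this trades debuggability for speed, so it fits control-flow signals better than real error reports.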

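A while back I wrote a class to test the relative performance of converting strings to ints using two approaches: (1) call Integer.parseInt() and catch the exception, or (2) match the string with a regex and call parseInt() only if the match succeeds. I used the regex in the most efficient way I could (i.e., creating the Pattern and Matcher objects before entering the loop), and I didn't print or save the stack traces from the exceptions.

For a list of ten thousand strings, if they were all valid numbers, the parseInt() approach was four times as fast as the regex approach. But if only 80% of the strings were valid, the regex was twice as fast as parseInt(). And if 20% were valid, meaning the exception was thrown and caught 80% of the time, the regex was about twenty times as fast as parseInt().

I was surprised by the result, considering that the regex approach processes valid strings twice: once for the match and again for parseInt(). But throwing and catching exceptions more than made up for that. This kind of situation isn't likely to occur very often in the real world, but if it does, you definitely should not use the exception-catching technique. But if you're only validating user input or something like that, by all means use the parseInt() approach.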
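The original test class is not shown, so the following is only a sketch of the two approaches as described. The class name, the method names, and the regex are assumptions. In particular, the pattern accepts an optional minus sign and at most nine digits so that every match fits in an int, which means it rejects some strings parseInt() would accept (for example "+5" or ten-digit values).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParseComparison {
    // Assumed pattern: optional minus sign, then 1 to 9 digits.
    static final Pattern INT_PATTERN = Pattern.compile("-?\\d{1,9}");

    // Approach 1: let parseInt() throw, and catch the exception.
    static Integer parseWithException(String s) {
        try {
            return Integer.valueOf(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return null; // invalid input
        }
    }

    // Approach 2: validate with a reused Matcher, parse only on a match.
    static Integer parseWithRegex(Matcher matcher, String s) {
        if (matcher.reset(s).matches()) {
            return Integer.valueOf(Integer.parseInt(s));
        }
        return null; // invalid input, no exception involved
    }

    public static void main(String[] args) {
        String[] inputs = {"42", "-7", "abc", "", "12x"};
        // Created once, outside the loop, as described in the answer.
        Matcher matcher = INT_PATTERN.matcher("");
        for (String s : inputs) {
            System.out.println("'" + s + "' -> "
                    + parseWithException(s) + " / " + parseWithRegex(matcher, s));
        }
    }
}
```

Which approach wins depends on the share of invalid inputs, as the numbers above describe. The sketch only shows the shape of the code and is not a measurement.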

It depends how exceptions are implemented. The simplest way is using setjmp and longjmp. That means all registers of the CPU are written to the stack (which already takes some time) and possibly some other data needs to be created... all this already happens in the try statement. The throw statement needs to unwind the stack and restore the values of all registers (and possibly other values in the VM). So try and throw are equally slow, and that is pretty slow. However, if no exception is thrown, exiting the try block takes no time whatsoever in most cases (as everything is put on the stack, which cleans up automatically when the method exits).

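Sun and others recognized that this is possibly suboptimal, and of course VMs get faster and faster over time. There is another way to implement exceptions, which makes try itself lightning fast (in general nothing happens for try at all; everything that needs to happen is already done when the class is loaded by the VM) and makes throw not quite as slow. I don't know which JVM uses this new, better technique...

...but are you writing in Java so that your code later on only runs on one JVM on one specific system? If it may ever run on any other platform or any other JVM version (possibly from another vendor), who says they also use the fast implementation? The fast one is more complicated than the slow one and not easily possible on all systems. You want to stay portable? Then don't rely on exceptions being fast.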

It also makes a big difference what you do within a try block. If you open a try block and never call any method from within it, the try block will be ultra fast, as the JIT can then actually treat a throw like a simple goto. It needs neither to save stack state nor to unwind the stack if an exception is thrown (it only needs to jump to the catch handlers). However, this is not what you usually do. Usually you open a try block and then call a method that might throw an exception, right? And even if you just use the try block within your method, what kind of method will this be that does not call any other method? Will it just calculate a number? Then what do you need exceptions for? There are much more elegant ways to regulate program flow. For pretty much anything but simple math, you will have to call an external method, and this already destroys the advantage of a local try block.

See the following test code:

public class Test {
    int value;


    public int getValue() {
        return value;
    }

    public void reset() {
        value = 0;
    }

    // Calculates without exception
    public void method1(int i) {
        value = ((value + i) / i) << 1;
        // Will never be true
        if ((i & 0xFFFFFFF) == 1000000000) {
            System.out.println("You'll never see this!");
        }
    }

    // Could in theory throw one, but never will
    public void method2(int i) throws Exception {
        value = ((value + i) / i) << 1;
        // Will never be true
        if ((i & 0xFFFFFFF) == 1000000000) {
            throw new Exception();
        }
    }

    // This one will regularly throw one
    public void method3(int i) throws Exception {
        value = ((value + i) / i) << 1;
        // i & 1 is equally fast to calculate as i & 0xFFFFFFF; it is both
        // an AND operation between two integers. The size of the number plays
        // no role. AND on 32 BIT always ANDs all 32 bits
        if ((i & 0x1) == 1) {
            throw new Exception();
        }
    }

    public static void main(String[] args) {
        int i;
        long l;
        Test t = new Test();

        l = System.currentTimeMillis();
        t.reset();
        for (i = 1; i < 100000000; i++) {
            t.method1(i);
        }
        l = System.currentTimeMillis() - l;
        System.out.println(
            "method1 took " + l + " ms, result was " + t.getValue()
        );

        l = System.currentTimeMillis();
        t.reset();
        for (i = 1; i < 100000000; i++) {
            try {
                t.method2(i);
            } catch (Exception e) {
                System.out.println("You'll never see this!");
            }
        }
        l = System.currentTimeMillis() - l;
        System.out.println(
            "method2 took " + l + " ms, result was " + t.getValue()
        );

        l = System.currentTimeMillis();
        t.reset();
        for (i = 1; i < 100000000; i++) {
            try {
                t.method3(i);
            } catch (Exception e) {
                // Do nothing here, as we will get here
            }
        }
        l = System.currentTimeMillis() - l;
        System.out.println(
            "method3 took " + l + " ms, result was " + t.getValue()
        );
    }
}

Result:

method1 took 972 ms, result was 2
method2 took 1003 ms, result was 2
method3 took 66716 ms, result was 2

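The slowdown from the try block is too small to rule out confounding factors such as background processes. But the catch block killed everything and made it 66 times slower!

As I said, the result will not be that bad if you put try/catch and throw all within the same method (method3), but that is a special JIT optimization I would not rely upon. And even with this optimization, the throw is still pretty slow. I don't know what you are trying to do here, but there is definitely a better way of doing it than using try/catch/throw.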

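My answer, unfortunately, is just too long to post here. So let me summarize here and refer you to http://www.fuwjax.com/how-slow-are-java-exceptions/ for the gritty details.

The real question here is not "How slow are 'failures reported as exceptions' compared to 'code that never fails'?", as the accepted answer might have you believe. Instead, the question should be "How slow are 'failures reported as exceptions' compared to failures reported in other ways?" Generally, the two other ways of reporting failures are sentinel values and result wrappers.

Sentinel values are an attempt to return one class in the case of success and another in the case of failure. You can think of it almost as returning an exception instead of throwing one. This requires a parent class shared with the success object, followed by an "instanceof" check and a couple of casts to get the success or failure information.

It turns out that, at the risk of type safety, sentinel values are faster than exceptions, but only by a factor of roughly 2x. Now, that may seem like a lot, but that 2x only covers the cost of the implementation difference. In practice, the factor is much lower, since our methods that might fail are much more interesting than a few arithmetic operators as in the sample code elsewhere on this page.

Result wrappers, on the other hand, do not sacrifice type safety at all. They wrap the success and failure information in a single class. So instead of "instanceof" they provide an "isSuccess()" and getters for both the success and failure objects. However, result objects are roughly 2x slower than using exceptions. It turns out that creating a new wrapper object every time is much more expensive than throwing an exception sometimes.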
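To make the three styles concrete, here is a small sketch. It is not the code from the linked article. The class FailureReporting, the nested Failure and Result types, and the divide methods are all invented for illustration, and it demonstrates the shape of each API rather than its speed.

```java
public class FailureReporting {
    // Sentinel style: success (Integer) and failure (Failure) share only
    // the parent type Object, so the caller needs instanceof and casts.
    static final class Failure {
        final String reason;
        Failure(String reason) { this.reason = reason; }
    }

    static Object divideSentinel(int a, int b) {
        if (b == 0) {
            return new Failure("division by zero");
        }
        return Integer.valueOf(a / b);
    }

    // Result-wrapper style: one type-safe object, allocated on every call.
    static final class Result<T> {
        private final T value;
        private final String error;
        private Result(T value, String error) { this.value = value; this.error = error; }
        static <T> Result<T> success(T value) { return new Result<T>(value, null); }
        static <T> Result<T> failure(String error) { return new Result<T>(null, error); }
        boolean isSuccess() { return error == null; }
        T getValue() { return value; }
        String getError() { return error; }
    }

    static Result<Integer> divideWrapped(int a, int b) {
        if (b == 0) {
            return Result.<Integer>failure("division by zero");
        }
        return Result.success(Integer.valueOf(a / b));
    }

    // Exception style: an object is created only on the failure path.
    static int divideChecked(int a, int b) throws Exception {
        if (b == 0) {
            throw new Exception("division by zero");
        }
        return a / b;
    }

    public static void main(String[] args) {
        Object s = divideSentinel(6, 0);
        if (s instanceof Failure) {
            System.out.println("sentinel failed: " + ((Failure) s).reason);
        } else {
            System.out.println("sentinel ok: " + (Integer) s);
        }

        Result<Integer> r = divideWrapped(6, 3);
        System.out.println(r.isSuccess()
                ? "wrapper ok: " + r.getValue()
                : "wrapper failed: " + r.getError());

        try {
            System.out.println("exception ok: " + divideChecked(6, 0));
        } catch (Exception e) {
            System.out.println("exception failed: " + e.getMessage());
        }
    }
}
```

Note that only the exception version tells the caller, through its signature, that the method can fail.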

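On top of that, exceptions are the language-supplied way of indicating that a method might fail. There is no other way to tell from just the API which methods are expected to always (or mostly) work and which are expected to report failure.

Exceptions are safer than sentinels, faster than result objects, and less surprising than either. I'm not suggesting that try/catch replace if/else, but exceptions are the right way to report failure, even in business logic.

That said, I would like to point out that the two most frequent causes of substantial performance problems I've run across are creating unnecessary objects and nested loops. If you have a choice between creating an exception and not creating one, don't create the exception. If you have a choice between creating an exception sometimes and creating another object all the time, then create the exception.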