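How do I write (and run) a correct micro-benchmark in Java?

I'm looking for code samples and comments illustrating the various things to think about.

Example: should the benchmark measure time/iteration or iterations/time, and why?

Related: Is stopwatch benchmarking acceptable?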


Current answer

Important things for Java benchmarks are:

- Warm up the JIT first by running the code several times before timing it.
- Make sure you run it for long enough to be able to measure the results in seconds or (better) tens of seconds.
- While you can't call System.gc() between iterations, it's a good idea to run it between tests, so that each test will hopefully get a "clean" memory space to work with. (Yes, gc() is more of a hint than a guarantee, but it's very likely that it really will garbage collect in my experience.)
- I like to display iterations and time, and a score of time/iteration which can be scaled such that the "best" algorithm gets a score of 1.0 and others are scored in a relative fashion. This means you can run all algorithms for a longish time, varying both number of iterations and time, but still getting comparable results.
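The points above can be put together in a small hand-rolled harness. This is only a minimal sketch, not a replacement for a real benchmarking framework: the class name `SimpleBench`, its methods, the `sink` field and the two string-building candidates are all made up for illustration, and the iteration counts are placeholders that should be raised until each timed run lasts seconds.

```java
import java.util.function.LongSupplier;

final class SimpleBench {
    // Results are written here so the JIT cannot discard the measured work as dead code.
    static volatile long sink;

    /** Runs the task the given number of times and returns the elapsed nanoseconds. */
    static long timeNanos(LongSupplier task, int iterations) {
        long acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += task.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed;
    }

    /** Warms up, hints a GC, then measures; returns nanoseconds per iteration. */
    static double nanosPerIteration(LongSupplier task, int warmupIterations, int iterations) {
        timeNanos(task, warmupIterations); // warm-up pass, timing discarded
        System.gc();                       // a hint only; issued between tests, never inside the timed loop
        return (double) timeNanos(task, iterations) / iterations;
    }

    /** Scales time/iteration values so the fastest candidate scores 1.0. */
    static double[] relativeScores(double[] nanosPerIter) {
        double best = Double.POSITIVE_INFINITY;
        for (double v : nanosPerIter) {
            best = Math.min(best, v);
        }
        double[] scores = new double[nanosPerIter.length];
        for (int i = 0; i < scores.length; i++) {
            scores[i] = nanosPerIter[i] / best;
        }
        return scores;
    }

    public static void main(String[] args) {
        // Hypothetical candidates: two ways of building the same short string.
        LongSupplier concat = () -> {
            String s = "";
            for (int i = 0; i < 50; i++) s += i;
            return s.length();
        };
        LongSupplier builder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 50; i++) sb.append(i);
            return sb.length();
        };
        // Placeholder counts; raise them until each timed run lasts seconds, not milliseconds.
        int warmup = 100_000, iterations = 1_000_000;
        double[] perIter = {
            nanosPerIteration(concat, warmup, iterations),
            nanosPerIteration(builder, warmup, iterations)
        };
        double[] scores = relativeScores(perIter);
        String[] names = {"concat", "builder"};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-8s %d iterations, %.1f ns/iteration, score %.2f%n",
                    names[i], iterations, perIter[i], scores[i]);
        }
    }
}
```

Because the score is time/iteration divided by the best time/iteration, the candidates do not need to run for the same number of iterations to be comparable.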

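I'm in the process of blogging about the design of a benchmarking framework in .NET. I've got a couple of earlier posts which may be able to give you some ideas. Not everything will be appropriate, of course, but some of it may be.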

Other answers

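http://opt.sourceforge.net/ Java Micro Benchmark: control tasks required to determine the comparative performance characteristics of a computer system on different platforms. It can be used to guide optimization decisions and to compare different Java implementations.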

If you are trying to compare two algorithms, do at least two benchmarks for each, alternating the order. For example:

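for (int i = 0; i < n; i++)
  alg1();  // first pass of alg1
for (int i = 0; i < n; i++)
  alg2();  // first pass of alg2
for (int i = 0; i < n; i++)
  alg2();  // second pass of alg2, order now reversed
for (int i = 0; i < n; i++)
  alg1();  // second pass of alg1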

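I have found some noticeable differences (5-10% sometimes) in the runtime of the same algorithm in different passes.

Also, make sure that n is very large, so that the runtime of each loop is at the very least 10 seconds or so. The more iterations, the more significant figures in your benchmark time and the more reliable that data is.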
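As a rough sketch of the alternating-order idea, the four passes can be timed separately and the two passes of each algorithm compared against each other. Everything named here is an assumption for illustration: `AlternatingOrder`, `runAlternating`, `spreadPercent`, and the two array-summing lambdas standing in for alg1() and alg2(). The value of n is a placeholder.

```java
import java.util.stream.IntStream;

final class AlternatingOrder {
    static volatile long sink; // keeps results observable so the work is not optimized away

    /** Times n calls of the given algorithm, in nanoseconds. */
    static long timeNanos(Runnable alg, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            alg.run();
        }
        return System.nanoTime() - start;
    }

    /** Runs alg1, alg2, alg2, alg1 and returns the four timings in that order. */
    static long[] runAlternating(Runnable alg1, Runnable alg2, int n) {
        long first1 = timeNanos(alg1, n);
        long first2 = timeNanos(alg2, n);
        long second2 = timeNanos(alg2, n);
        long second1 = timeNanos(alg1, n);
        return new long[] {first1, first2, second2, second1};
    }

    /** Difference between two passes of the same algorithm, as a percentage of the faster pass. */
    static double spreadPercent(long passA, long passB) {
        long min = Math.min(passA, passB);
        long max = Math.max(passA, passB);
        return 100.0 * (max - min) / min;
    }

    public static void main(String[] args) {
        int[] data = IntStream.range(0, 10_000).toArray();
        // Hypothetical stand-ins for alg1() and alg2(): two ways of summing an array.
        Runnable alg1 = () -> {
            long sum = 0;
            for (int v : data) sum += v;
            sink = sum;
        };
        Runnable alg2 = () -> sink = IntStream.of(data).asLongStream().sum();
        int n = 20_000; // placeholder; raise it until each pass runs for about 10 seconds
        long[] t = runAlternating(alg1, alg2, n);
        System.out.printf("alg1: %d ns, then %d ns (spread %.1f%%)%n",
                t[0], t[3], spreadPercent(t[0], t[3]));
        System.out.printf("alg2: %d ns, then %d ns (spread %.1f%%)%n",
                t[1], t[2], spreadPercent(t[1], t[2]));
    }
}
```

If the spread between two passes of the same algorithm is about as large as the gap between the two algorithms, the comparison probably says more about run order and noise than about the code.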

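Should the benchmark measure time/iteration or iterations/time, and why?

It depends on what you are trying to test.

If you are interested in latency, use time/iteration; if you are interested in throughput, use iterations/time.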
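Both figures come from the same two measurements, so it costs nothing to report both. A minimal sketch follows, with the hypothetical names `BenchMetrics`, `nanosPerIteration` and `iterationsPerSecond`; the figures in main are illustrative, not measured.

```java
final class BenchMetrics {
    /** Average time per iteration in nanoseconds (the latency-oriented view). */
    static double nanosPerIteration(long elapsedNanos, long iterations) {
        return (double) elapsedNanos / iterations;
    }

    /** Iterations completed per second (the throughput-oriented view). */
    static double iterationsPerSecond(long elapsedNanos, long iterations) {
        return iterations * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        long iterations = 2_000_000L;
        long elapsedNanos = 4_000_000_000L; // illustrative figure: 4 seconds, not a real measurement
        System.out.println(nanosPerIteration(elapsedNanos, iterations) + " ns/iteration");
        System.out.println(iterationsPerSecond(elapsedNanos, iterations) + " iterations/s");
    }
}
```

With these figures, 2,000,000 iterations in 4 seconds works out to 2,000 ns per iteration, or 500,000 iterations per second. Note that this is an average: if the latency of individual calls matters, each call would need to be timed on its own.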

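jmh is a recent addition to OpenJDK and has been written by some performance engineers from Oracle. Certainly worth a look.

jmh is a Java harness for building, running, and analysing nano/micro/macro benchmarks written in Java and other languages targeting the JVM.

There are very interesting pieces of information buried in the comments of the sample tests.

See also:

Avoiding Benchmarking Pitfalls on the JVM
Discussion on the main strengths of jmh.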