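If you were to mandate a minimum percentage of code coverage for unit tests, perhaps even as a requirement for committing to a repository, what would it be?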

Please explain how you arrived at your answer (if all you did was pick a number, I could have done that myself ;)


Current answer

In my opinion, the answer is "it depends on how much time you have". I try to reach 100%, but I don't make a fuss if I can't get there in the time I have.

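When I write unit tests, I wear a different hat from the one I wear when developing production code. I think about what the code under test claims to do and which situations could break it.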

I usually follow these two rules:

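Unit tests should be a form of documentation of my code's expected behavior: the expected output for a given input, and the exceptions it may throw that clients may want to catch. (What should the users of my code know?)

Unit tests should help me discover the what-if conditions I may not yet have thought of. (How do I make my code stable and robust?)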

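If these two rules don't produce 100% coverage, so be it. But once I have the time, I analyze the uncovered blocks and lines and determine whether there are still test cases without unit tests, or whether the code needs to be refactored to eliminate unnecessary code.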

Other answers

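I use Cobertura, and whatever the percentage, I would recommend keeping the values in the cobertura-check task up to date. At a minimum, keep raising totallinerate and totalbranchrate to just below your current coverage, but never lower those values. Also tie the Ant build failure property to this task. If the build fails for lack of coverage, you know someone added code but didn't test it. Example: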

<cobertura-check linerate="0"
                 branchrate="0"
                 totallinerate="70"
                 totalbranchrate="90"
                 failureproperty="build.failed" />
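
As a rough sketch of the ratchet described above (raise the thresholds to just below current coverage, never lower them), the update rule might look like this in Python. The function name `ratchet_threshold` and the `margin` parameter are hypothetical and not part of Cobertura; the returned value would still have to be written into `totallinerate` or `totalbranchrate` by hand or by a build script.

```python
import math


def ratchet_threshold(current_threshold: int, measured_coverage: float, margin: int = 1) -> int:
    """Return the new minimum-coverage threshold, in whole percent.

    The threshold moves up to `margin` points below the measured coverage,
    but it is never lowered, even when the measured coverage drops.
    """
    candidate = math.floor(measured_coverage) - margin
    return max(current_threshold, candidate)


# Coverage improved from a 70% threshold to 76.4% measured: raise the bar to 75.
new_line_threshold = ratchet_threshold(70, 76.4)

# Coverage fell below the threshold: the bar stays at 70 and the build fails.
unchanged_threshold = ratchet_threshold(70, 68.0)
```

The one-point margin is an arbitrary choice that leaves a little slack for rounding; a team could just as well use zero.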

85% would be a good starting point for check-in criteria.

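I would probably choose a variety of higher bars for release criteria, depending on the criticality of the subsystems/components being tested.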

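For a well-designed system where unit tests have driven development from the start, I would say 85% is quite a low number. Small classes designed to be testable should not be hard to cover better than that.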

It is easy to dismiss this question with something like:

Covered lines do not equal tested logic, and one should not read too much into the percentage.

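True, but there are some important points to be made about code coverage. In my experience this metric is actually quite useful when used correctly. Having said that, I have not seen all systems, and I'm sure there are plenty where it is hard to see code coverage analysis adding any real value. Code can look very different from one system to the next, and the scope of the available test frameworks can vary.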

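Also, my reasoning mainly concerns fairly short test feedback loops. For the product I'm developing, the shortest feedback loop is quite flexible, covering everything from class tests to inter-process signalling. Testing a deliverable sub-product typically takes 5 minutes, and with a feedback loop that short it is indeed possible to use the test results (and specifically the code coverage metric we are looking at here) to reject or accept commits to the repository.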

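When using the code coverage metric, you should not just have a fixed (arbitrary) percentage that must be met. In my opinion, doing so does not give you the real benefits of code coverage analysis. Instead, define the following metrics: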

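Low Water Mark (LWM): the lowest number of uncovered lines ever seen in the system under test.

High Water Mark (HWM): the highest code coverage percentage ever seen for the system under test.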

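New code can only be added if we don't go above the LWM and don't go below the HWM. In other words, code coverage is not allowed to decrease, and new code should be covered. Notice that I say should and not must (explained below).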
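
The LWM/HWM rule above can be sketched as a small commit gate. This is only an illustration under assumed names: `Marks` and `check_commit` are hypothetical, and the uncovered-line count and coverage percentage are assumed to come from whatever coverage tool the project already runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marks:
    lwm_uncovered_lines: int  # lowest uncovered-line count seen so far
    hwm_coverage_pct: float   # highest coverage percentage seen so far


def check_commit(marks: Marks, uncovered_lines: int, coverage_pct: float):
    """Return (accepted, marks_after) for one candidate commit.

    A commit is accepted only if it does not exceed the LWM and does not
    fall below the HWM. An accepted commit becomes the new reference point.
    """
    accepted = (
        uncovered_lines <= marks.lwm_uncovered_lines
        and coverage_pct >= marks.hwm_coverage_pct
    )
    if not accepted:
        return False, marks
    return True, Marks(uncovered_lines, coverage_pct)


marks = Marks(lwm_uncovered_lines=120, hwm_coverage_pct=85.0)

# Fully covered new code: uncovered lines unchanged, percentage up. Accepted.
ok, marks_after = check_commit(marks, uncovered_lines=120, coverage_pct=85.5)

# New code with 5 untested lines: percentage may rise, but the LWM is exceeded.
rejected_new_code, _ = check_commit(marks, uncovered_lines=125, coverage_pct=85.2)

# Deleting well-covered code: fewer uncovered lines, but the percentage drops.
rejected_cleanup, _ = check_commit(marks, uncovered_lines=110, coverage_pct=84.0)
```

The last case shows why the rule needs a human override: removing well-tested code can lower the percentage even though nothing got worse.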

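But doesn't this mean that it will be impossible to clean away old, well-tested rubbish that you no longer have any use for? Yes, and that is why you have to be pragmatic about these things. There are situations where the rules have to be broken, but in my experience these metrics are quite useful for typical day-to-day integration. They have the following two implications.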

Testable code is promoted. When adding new code you really have to make an effort to make the code testable, because you will have to try and cover all of it with your test cases. Testable code is usually a good thing.

Test coverage for legacy code is increasing over time. When adding new code and not being able to cover it with a test case, one can try to cover some legacy code instead to get around the LWM rule. This sometimes necessary cheating at least gives the positive side effect that the coverage of legacy code will increase over time, making the seemingly strict enforcement of these rules quite pragmatic in practice.

Again, if the feedback loop is too long, it may be completely impractical to set up something like this in the integration process.

I would also like to mention two more general benefits of the code coverage metric.

Code coverage analysis is part of the dynamic code analysis (as opposed to the static one, i.e. Lint). Problems found during the dynamic code analysis (by tools such as the purify family, http://www-03.ibm.com/software/products/en/rational-purify-family) are things like uninitialized memory reads (UMR), memory leaks, etc. These problems can only be found if the code is covered by an executed test case. The code that is the hardest to cover in a test case is usually the abnormal cases in the system, but if you want the system to fail gracefully (i.e. error trace instead of crash) you might want to put some effort into covering the abnormal cases in the dynamic code analysis as well. With just a little bit of bad luck, a UMR can lead to a segfault or worse.

People take pride in keeping 100% for new code, and people discuss testing problems with a similar passion as other implementation problems. How can this function be written in a more testable manner? How would you go about trying to cover this abnormal case, etc.

And one negative, for completeness.

In a large project with many developers involved, not everyone is going to be a test genius. Some people tend to use the code coverage metric as proof that the code is tested, and this is very far from the truth, as mentioned in many of the other answers to this question. It is ONE metric that can give you some nice benefits if used properly, but if it is misused it can in fact lead to bad testing. Aside from the very valuable side effects mentioned above, a covered line only shows that the system under test can reach that line for some input data and that it can execute without hanging or crashing.

This piece by Alberto Savoia answers exactly this question (in a very entertaining way!):

http://www.artima.com/forums/flat.jsp?forum=106&thread=204677

Testivus On Test Coverage

Early one morning, a programmer asked the great master: “I am ready to write some unit tests. What code coverage should I aim for?”

The great master replied: “Don’t worry about coverage, just write some good tests.”

The programmer smiled, bowed, and left.

...

Later that day, a second programmer asked the same question. The great master pointed at a pot of boiling water and said: “How many grains of rice should I put in that pot?”

The programmer, looking puzzled, replied: “How can I possibly tell you? It depends on how many people you need to feed, how hungry they are, what other food you are serving, how much rice you have available, and so on.”

“Exactly,” said the great master.

The second programmer smiled, bowed, and left.

...

Toward the end of the day, a third programmer came and asked the same question about code coverage.

“Eighty percent and no less!” Replied the master in a stern voice, pounding his fist on the table.

The third programmer smiled, bowed, and left.

...

After this last reply, a young apprentice approached the great master: “Great master, today I overheard you answer the same question about code coverage with three different answers. Why?”

The great master stood up from his chair: “Come get some fresh tea with me and let’s talk about it.”

After they filled their cups with smoking hot green tea, the great master began to answer:

“The first programmer is new and just getting started with testing. Right now he has a lot of code and no tests. He has a long way to go; focusing on code coverage at this time would be depressing and quite useless. He’s better off just getting used to writing and running some tests. He can worry about coverage later.”

“The second programmer, on the other hand, is quite experienced both at programming and testing. When I replied by asking her how many grains of rice I should put in a pot, I helped her realize that the amount of testing necessary depends on a number of factors, and she knows those factors better than I do – it’s her code after all. There is no single, simple, answer, and she’s smart enough to handle the truth and work with that.”

“I see,” said the young apprentice, “but if there is no single simple answer, then why did you answer the third programmer ‘Eighty percent and no less’?”

The great master laughed so hard and loud that his belly, evidence that he drank more than just green tea, flopped up and down.

“The third programmer wants only simple answers – even when there are no simple answers … and then does not follow them anyway.”

The young apprentice and the grizzled great master finished drinking their tea in contemplative silence.

Code coverage is a misleading metric if 100% coverage is your goal (instead of 100% testing of all features).

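You could get to 100% by hitting every line once. However, you could still miss testing a particular sequence (logical path) in which those lines are hit.

You could fall short of 100% and still have tested all of the code paths that are used 80% of the time or most frequently. Having tests for every "throw ExceptionTypeX" or similar defensive-programming guard you've put in is a "nice to have", not a "must have".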
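
To make the first point concrete, here is a small made-up Python sketch. The function `ratio` is purely hypothetical. Two tests execute every line and take every branch in both directions, yet the one combination of branches that fails is never exercised.

```python
def ratio(a: int, reset: bool, invert: bool) -> float:
    """Toy function with two independent branches."""
    divisor = a
    if reset:
        divisor = 0
    if invert:
        return 1 / divisor
    return float(divisor)


# These two calls together hit every line and both outcomes of each `if`,
# so a coverage tool would report 100% line and branch coverage.
first = ratio(4, reset=True, invert=False)   # takes the reset branch
second = ratio(4, reset=False, invert=True)  # takes the invert branch

# The untested path (reset AND invert) divides by zero.
try:
    ratio(4, reset=True, invert=True)
    path_bug_found = False
except ZeroDivisionError:
    path_bug_found = True
```

The coverage report for the first two calls alone would look perfect; only a test written with the combined path in mind reveals the failure.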

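So trust yourself or your developers to be thorough and to cover every path through their code. Be pragmatic and don't chase the magical 100% coverage. If you TDD your code, you should get 90%+ coverage as a bonus. Use code coverage to highlight chunks of code you have missed (this shouldn't happen if you TDD, though, since you write code only to make a test pass; no code can exist without its partner test).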