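If you were to mandate a minimum percentage of code coverage for unit tests, perhaps even as a requirement for committing to a repository, what would it be?
Please explain how you arrived at your answer (since if all you did was pick a number, I could have done that all by myself ;)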
Current answer
It depends greatly on your application. For example, some applications consist mostly of GUI code that cannot be unit tested.
Other answers
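When I think my code isn't unit tested enough and I'm not sure what to test next, I use coverage to help me decide.
If a unit test increases coverage, I know that unit test is worth something.
This holds whether the code is not covered at all, 50% covered, or 97% covered.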
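Code coverage is great, but only as long as the benefits you get from it outweigh the cost and effort of achieving it.
We had been working to a standard of 80% for some time, but we have just decided to abandon it and focus our testing instead, concentrating on complex business logic and the like.
We made this decision because of the growing amount of time we spent chasing code coverage and maintaining existing unit tests. We felt we had reached the point where the benefit we got from code coverage was smaller than the effort we had to put in to achieve it.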
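Jon Limjap makes a good point: there is no single number that makes sense as a standard for every project. Some projects just don't need such a standard. Where the accepted answer falls short, in my opinion, is in not describing how one might make that decision for a given project.
I will take a shot at doing so. I am not an expert in test engineering and would be happy to see a more informed answer.
When to set code coverage requirements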
First, why would you want to impose such a standard in the first place? In general, when you want to introduce empirical confidence in your process. What do I mean by "empirical confidence"? Well, the real goal is correctness. For most software, we can't possibly know this across all inputs, so we settle for saying that code is well-tested. This is more knowable, but is still a subjective standard: It will always be open to debate whether or not you have met it. Those debates are useful and should occur, but they also expose uncertainty.
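Code coverage is an objective measurement: Once you see your coverage report, there is no ambiguity about whether the standard has been met. Does it prove correctness? Not at all, but it has a clear relationship to how well-tested the code is, which in turn is our best way to increase confidence in its correctness. Code coverage is a measurable approximation of immeasurable qualities we care about.
Some specific cases where having an empirical standard could add value: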
To satisfy stakeholders. For many projects, there are various actors who have an interest in software quality who may not be involved in the day-to-day development of the software (managers, technical leads, etc.) Saying "we're going to write all the tests we really need" is not convincing: They either need to trust entirely, or verify with ongoing close oversight (assuming they even have the technical understanding to do so.) Providing measurable standards and explaining how they reasonably approximate actual goals is better.
To normalize team behavior. Stakeholders aside, if you are working on a team where multiple people are writing code and tests, there is room for ambiguity for what qualifies as "well-tested." Do all of your colleagues have the same idea of what level of testing is good enough? Probably not. How do you reconcile this? Find a metric you can all agree on and accept it as a reasonable approximation. This is especially (but not exclusively) useful in large teams, where leads may not have direct oversight over junior developers, for instance. Networks of trust matter as well, but without objective measurements, it is easy for group behavior to become inconsistent, even if everyone is acting in good faith.
To keep yourself honest. Even if you're the only developer and only stakeholder for your project, you might have certain qualities in mind for the software. Instead of making ongoing subjective assessments about how well-tested the software is (which takes work), you can use code coverage as a reasonable approximation, and let machines measure it for you.
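Which metrics to use
Code coverage is not a single metric; there are several different ways of measuring coverage. Which one you set a standard on depends on what you're using that standard to satisfy.
I'll use two common metrics as examples of when you might use them to set standards: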
Statement coverage: What percentage of statements have been executed during testing? Useful to get a sense of the physical coverage of your code: How much of the code that I have written have I actually tested? This kind of coverage supports a weaker correctness argument, but is also easier to achieve. If you're just using code coverage to ensure that things get tested (and not as an indicator of test quality beyond that) then statement coverage is probably sufficient.
Branch coverage: When there is branching logic (e.g. an if), have both branches been evaluated? This gives a better sense of the logical coverage of your code: How many of the possible paths my code may take have I tested? This kind of coverage is a much better indicator that a program has been tested across a comprehensive set of inputs. If you're using code coverage as your best empirical approximation for confidence in correctness, you should set standards based on branch coverage or similar.
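To make the difference between the two metrics concrete, here is a minimal Python sketch. The function apply_discount and its numbers are hypothetical and exist only for illustration; no coverage tool is invoked, the comments simply describe what such a tool would typically report.

```python
def apply_discount(price, is_member):
    """Hypothetical function, used only to illustrate the two metrics."""
    discount = 0.0
    if is_member:
        discount = 0.1
    return price * (1 - discount)


# This single test executes every statement in apply_discount, so a
# statement-coverage tool would report 100%. The `if` has only ever
# evaluated to True, though, so one of its two branches is untested.
assert abs(apply_discount(100.0, True) - 90.0) < 1e-9

# Branch coverage additionally requires a test in which the condition
# is False and the body of the `if` is skipped.
assert apply_discount(100.0, False) == 100.0
```

With only the first test, a bug that affects non-members alone would go unnoticed even at 100% statement coverage, which is why a branch-based standard supports a stronger correctness argument.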
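There are many other metrics (line coverage is similar to statement coverage, but yields different numeric results for multi-line statements, for instance; conditional coverage and path coverage are similar to branch coverage, but reflect a more detailed view of the possible permutations of program execution you might encounter.)
What percentage to require
Finally, back to the original question: If you set code coverage standards, what should that number be?
Hopefully it's clear by now that we're talking about an approximation to begin with, so any number we pick is going to be inherently approximate.
Some numbers that one might choose: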
100%. You might choose this because you want to be sure everything is tested. This doesn't give you any insight into test quality, but does tell you that some test of some quality has touched every statement (or branch, etc.) Again, this comes back to degree of confidence: If your coverage is below 100%, you know some subset of your code is untested. Some might argue that this is silly, and you should only test the parts of your code that are really important. I would argue that you should also only maintain the parts of your code that are really important. Code coverage can be improved by removing untested code, too.
99% (or 95%, other numbers in the high nineties.) Appropriate in cases where you want to convey a level of confidence similar to 100%, but leave yourself some margin to not worry about the occasional hard-to-test corner of code.
80%. I've seen this number in use a few times, and don't entirely know where it originates. I think it might be a weird misappropriation of the 80-20 rule; generally, the intent here is to show that most of your code is tested. (Yes, 51% would also be "most", but 80% is more reflective of what most people mean by most.) This is appropriate for middle-ground cases where "well-tested" is not a high priority (you don't want to waste effort on low-value tests), but is enough of a priority that you'd still like to have some standard in place.
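I haven't seen numbers below 80% in practice, and have a hard time imagining a case where one would set them. The role of these standards is to increase confidence in correctness, and numbers below 80% aren't particularly confidence-inspiring. (Yes, this is subjective, but again, the idea is to make the subjective choice once when you set the standard, and then use an objective measurement going forward.)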
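As a rough sketch of "choose the number once, then measure objectively", a commit gate could be as small as the Python below. The names coverage_percent and passes_gate are hypothetical, the covered and total counts are assumed to come from whatever coverage tool you already run, and treating an empty code base as 100% covered is my own assumption rather than a convention from any particular tool.

```python
def coverage_percent(covered, total):
    """Return coverage as a percentage of measured units (statements, branches, ...)."""
    if total == 0:
        # Assumption: nothing to cover counts as fully covered.
        return 100.0
    return 100.0 * covered / total


def passes_gate(covered, total, minimum_percent):
    """True when measured coverage meets the standard chosen up front."""
    return coverage_percent(covered, total) >= minimum_percent


# Hypothetical numbers: 412 of 500 statements executed, with an 80% standard.
print(passes_gate(412, 500, 80.0))
```

The subjective part is the value passed as minimum_percent; once it is fixed, the check itself leaves nothing to debate.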
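Other notes
The above assumes that correctness is the goal. Code coverage is just information; it may be relevant to other goals. For instance, if you're concerned about maintainability, you probably care about loose coupling, which can be demonstrated by testability, which in turn can be measured (in certain fashions) by code coverage. So your code coverage standard provides an empirical basis for approximating the quality of "maintainability" as well.
Many shops don't value testing at all, so if you are above zero there is at least some appreciation of its worth. Arguably, then, any non-zero number isn't bad, since many are still at zero.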
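In the .NET world people often quote 80% as reasonable, but they mean it at the solution level. I prefer to measure at the project level: 30% might be fine for a UI project if you have Selenium or similar, or manual tests; 20% might be fine for the data layer project; but 95% or more might be quite achievable for the business rules layer, even if it is not strictly necessary. So the overall coverage may be, say, 60%, while the critical business logic may be much higher.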
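The arithmetic behind that last point can be sketched in Python. The line counts below are invented purely for illustration and are not measurements from any real solution; they are chosen so that the per-project percentages match the figures mentioned above.

```python
# Hypothetical (covered_lines, total_lines) per project; not real measurements.
projects = {
    "ui": (900, 3000),
    "data_layer": (200, 1000),
    "business_rules": (3800, 4000),
}


def percent(covered, total):
    """Coverage as a percentage of lines."""
    return 100.0 * covered / total


# Coverage of each project on its own.
per_project = {name: percent(c, t) for name, (c, t) in projects.items()}

# Solution-level coverage is weighted by project size, not a plain average.
overall = percent(
    sum(c for c, _ in projects.values()),
    sum(t for _, t in projects.values()),
)

for name, value in per_project.items():
    print(f"{name}: {value:.1f}%")
print(f"overall: {overall:.2f}%")
```

With these made-up sizes the solution-level figure lands a little above 60% even though the business rules project sits at 95%, which is why a single solution-wide number can hide how well the critical code is tested.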
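I've also heard this: aspire to 100% and you'll hit 80%; but aspire to 80% and you'll hit 40%.
Bottom line: Apply the 80:20 rule, and let your app's bug count guide you.
This piece by Alberto Savoia answers precisely that question (in a nicely entertaining manner at that!):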
http://www.artima.com/forums/flat.jsp?forum=106&thread=204677
Testivus On Test Coverage
Early one morning, a programmer asked the great master: “I am ready to write some unit tests. What code coverage should I aim for?”
The great master replied: “Don’t worry about coverage, just write some good tests.” The programmer smiled, bowed, and left.
...
Later that day, a second programmer asked the same question. The great master pointed at a pot of boiling water and said: “How many grains of rice should I put in that pot?”
The programmer, looking puzzled, replied: “How can I possibly tell you? It depends on how many people you need to feed, how hungry they are, what other food you are serving, how much rice you have available, and so on.”
“Exactly,” said the great master. The second programmer smiled, bowed, and left.
...
Toward the end of the day, a third programmer came and asked the same question about code coverage.
“Eighty percent and no less!” Replied the master in a stern voice, pounding his fist on the table. The third programmer smiled, bowed, and left.
...
After this last reply, a young apprentice approached the great master: “Great master, today I overheard you answer the same question about code coverage with three different answers. Why?”
The great master stood up from his chair: “Come get some fresh tea with me and let’s talk about it.”
After they filled their cups with smoking hot green tea, the great master began to answer:
“The first programmer is new and just getting started with testing. Right now he has a lot of code and no tests. He has a long way to go; focusing on code coverage at this time would be depressing and quite useless. He’s better off just getting used to writing and running some tests. He can worry about coverage later.”
“The second programmer, on the other hand, is quite experienced both at programming and testing. When I replied by asking her how many grains of rice I should put in a pot, I helped her realize that the amount of testing necessary depends on a number of factors, and she knows those factors better than I do – it’s her code after all. There is no single, simple, answer, and she’s smart enough to handle the truth and work with that.”
“I see,” said the young apprentice, “but if there is no single simple answer, then why did you answer the third programmer ‘Eighty percent and no less’?”
The great master laughed so hard and loud that his belly, evidence that he drank more than just green tea, flopped up and down.
“The third programmer wants only simple answers – even when there are no simple answers … and then does not follow them anyway.”
The young apprentice and the grizzled great master finished drinking their tea in contemplative silence.