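I have to admit that I usually don't bother switching between the Debug and Release configurations in my programs. I usually opt for the Debug configuration, even when the program is actually deployed at the customer's site.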

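As far as I know, the only difference between these configurations, if you don't change anything manually, is that Debug defines the DEBUG constant and Release has the "Optimize code" option checked.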
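To make the effect of the DEBUG constant concrete, here is a minimal sketch of the two usual ways code reacts to it: an `#if DEBUG` block and a `[Conditional("DEBUG")]` method. The class and method names (`BuildInfo`, `Configuration`, `Trace`) are made up for illustration and do not come from the question.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public static class BuildInfo
{
    // Returns "Debug" when the DEBUG symbol was defined at compile time,
    // otherwise "Release".
    public static string Configuration()
    {
#if DEBUG
        return "Debug";
#else
        return "Release";
#endif
    }

    // Calls to this method are dropped by the compiler at the call site
    // when DEBUG is not defined; the argument expressions are then not
    // evaluated either.
    [Conditional("DEBUG")]
    public static void Trace(string message)
    {
        Console.WriteLine("[debug] " + message);
    }
}
```

Calling `BuildInfo.Trace("loaded")` compiles in both configurations, but only a build with DEBUG defined actually emits the call.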

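So my question is actually twofold:

Is there much of a performance difference between these two configurations? Is there any specific type of code that will cause big differences in performance, or is it actually not that important?

Is there any type of code that will run fine under the Debug configuration but might fail under the Release configuration, or can you be certain that code that is tested and working under the Debug configuration will also work under the Release configuration?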


Current answer

You should never release a .NET Debug build into production. It may contain ugly code to support Edit-and-Continue or who knows what else. As far as I know, this happens only in VB, not C# (note: the original post is tagged C#), but it should still give you reason to pause as to what Microsoft thinks they are allowed to do with a Debug build. In fact, prior to .NET 4.0, VB code leaks memory proportional to the number of instances of objects with events that you construct in support of Edit-and-Continue. (Though this is reported to be fixed per https://connect.microsoft.com/VisualStudio/feedback/details/481671/vb-classes-with-events-are-not-garbage-collected-when-debugging, the generated code looks nasty, creating WeakReference objects and adding them to a static list while holding a lock.) I certainly don't want any of this kind of debugging support in a production environment!

Other answers

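My experience has been that medium-sized or larger applications are noticeably more responsive in a Release build. Give it a try with your application and see how it feels.

One thing that can bite you with Release builds is that Debug build code can sometimes suppress race conditions and other threading-related bugs. Optimized code can result in instruction reordering, and faster execution can exacerbate certain race conditions.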
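A well-known instance of this kind of threading bug is a worker loop that polls a plain `bool` flag: with optimizations on, the JIT may keep the flag in a register, so the loop may never see the update made by another thread. The sketch below shows the commonly recommended fix, marking the field `volatile`. The names `StopFlagDemo` and `RunWorker` are hypothetical, and the 50 ms sleep is an arbitrary value chosen for the demo.

```csharp
using System.Threading;

// Hypothetical demo class, for illustration only.
public static class StopFlagDemo
{
    // 'volatile' tells the compiler and JIT not to cache this field in a
    // register, so the worker re-reads it on every loop iteration.
    // Without it, an optimized build may spin forever.
    private static volatile bool _stop;

    // Starts a worker that spins until the flag is set, then returns
    // how many iterations it completed.
    public static long RunWorker()
    {
        _stop = false;
        long iterations = 0;

        var worker = new Thread(() =>
        {
            while (!_stop)
            {
                iterations++;
            }
        });

        worker.Start();
        Thread.Sleep(50);   // let the worker spin briefly (arbitrary delay)
        _stop = true;       // volatile write, visible to the worker
        worker.Join();      // after Join, reading 'iterations' is safe

        return iterations;
    }
}
```

Whether the non-volatile variant actually hangs depends on the JIT version and platform, which is exactly why such bugs tend to show up only in some Release builds.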

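In my experience, the worst thing to come out of Release mode is the obscure "release bug". Since the IL (intermediate language) is optimized in Release mode, there is a possibility of bugs that would not have manifested in Debug mode. There are other SO questions covering this problem: Common reasons for bugs in release version not present in debug mode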

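This has happened to me once or twice: a simple console app would run perfectly fine in Debug mode but, given the exact same input, would error out in Release mode. These bugs are EXTREMELY difficult to debug (ironically, by the very definition of Release mode).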
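One common cause of a bug that only appears in Release (not necessarily the one described above) is a side effect placed inside `Debug.Assert`. Because `Debug.Assert` is marked `[Conditional("DEBUG")]`, the entire call, including its argument expression, vanishes when DEBUG is not defined. The sketch below shows the risky form as a comment and a safer form as code; `PendingJobs` and `RemoveJob` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical example class, for illustration only.
public static class PendingJobs
{
    // Risky pattern (shown as a comment only): the whole call, including
    // jobs.Remove(id), is compiled away when DEBUG is not defined, so a
    // Release build would never remove the job.
    //
    //     Debug.Assert(jobs.Remove(id));

    // Safer pattern: perform the side effect unconditionally, then assert
    // on the stored result. Returns the number of jobs left.
    public static int RemoveJob(List<int> jobs, int id)
    {
        bool removed = jobs.Remove(id);   // runs in every configuration
        Debug.Assert(removed, "job id was expected to be present");
        return jobs.Count;
    }
}
```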

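This depends heavily on the nature of your application. If your application is UI-heavy, you'll probably not notice any difference, since the slowest component connected to a modern computer is the user. If you use some UI animations, you might want to test whether you can perceive any noticeable lag when running the DEBUG build.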

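However, if you have many computation-heavy calculations, then you will notice differences (possibly as high as 40%, as @Pieter mentioned, though it depends on the nature of the calculations).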

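It's basically a design tradeoff. If you release a DEBUG build, then when users experience problems you can get a more meaningful traceback and do much more flexible diagnostics. By releasing a DEBUG build, you also avoid the optimizer producing obscure Heisenbugs.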

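I would say:

The performance difference depends largely on your implementation. Usually the difference is not that big. I have done a lot of measurements and often could not see a difference. If you use unmanaged code, lots of huge arrays and things like that, the performance difference is slightly bigger, but not a different world (as it is in C++).

Usually fewer errors are shown in Release code (higher tolerance), so the switch should work fine.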

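The C# compiler itself doesn't alter the emitted IL a great deal in the Release build. Notably, it no longer emits the NOP opcodes that allow you to set a breakpoint on a curly brace. The big one is the optimizer that's built into the JIT compiler. I know it makes the following optimizations: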

Method inlining. A method call is replaced by injecting the code of the method. This is a big one, it makes property accessors essentially free.

CPU register allocation. Local variables and method arguments can stay stored in a CPU register without ever (or less frequently) being stored back to the stack frame. This is a big one, notable for making debugging optimized code so difficult. And giving the volatile keyword a meaning.

Array index checking elimination. An important optimization when working with arrays (all .NET collection classes use an array internally). When the JIT compiler can verify that a loop never indexes an array out of bounds then it will eliminate the index check. Big one.

Loop unrolling. Loops with small bodies are improved by repeating the code up to 4 times in the body and looping less. Reduces the branch cost and improves the processor's super-scalar execution options.

Dead code elimination. A statement like if (false) { /*...*/ } gets completely eliminated. This can occur due to constant folding and inlining. Another case is where the JIT compiler can determine that the code has no possible side effect. This optimization is what makes profiling code so tricky.

Code hoisting. Code inside a loop that is not affected by the loop can be moved out of the loop. The optimizer of a C compiler will spend a lot more time on finding opportunities to hoist. It is however an expensive optimization due to the required data flow analysis, and the jitter can't afford the time, so it only hoists obvious cases. Forcing .NET programmers to write better source code and hoist themselves.

Common sub-expression elimination. x = y + 4; z = y + 4; becomes z = x; Pretty common in statements like dest[ix+1] = src[ix+1]; written for readability without introducing a helper variable. No need to compromise readability.

Constant folding. x = 1 + 2; becomes x = 3; This simple example is caught early by the compiler, but happens at JIT time when other optimizations make this possible.

Copy propagation. x = a; y = x; becomes y = a; This helps the register allocator make better decisions. It is a big deal in the x86 jitter because it has few registers to work with. Having it select the right ones is critical to perf.
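Two of the optimizations above depend on how the source loop is written, so a small sketch may help. The first method uses the loop shape that is usually cited as the easiest for the JIT to prove in-bounds (comparing the index directly against `values.Length`); the second hoists a loop-invariant product by hand, as the code hoisting item suggests. `LoopShapes`, `Sum` and `ScaleInPlace` are hypothetical names, and whether the JIT actually removes the bounds check depends on the runtime version.

```csharp
// Hypothetical example class, for illustration only.
public static class LoopShapes
{
    // The index runs from 0 to values.Length - 1 and is compared against
    // values.Length itself, a shape the JIT can typically verify as safe,
    // making the per-element bounds check a candidate for elimination.
    public static long Sum(int[] values)
    {
        long total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }

    // Manual hoisting: scale * offset does not change inside the loop,
    // so it is computed once up front instead of once per element.
    public static void ScaleInPlace(double[] values, double scale, double offset)
    {
        double factor = scale * offset;   // hoisted by hand
        for (int i = 0; i < values.Length; i++)
        {
            values[i] *= factor;
        }
    }
}
```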

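These are very important optimizations that can make a great deal of difference when, for example, you profile the Debug build of your app and compare it to the Release build. That only really matters, though, when the code is on your critical path, the 5 to 10% of the code you write that actually affects the perf of your program. The JIT optimizer isn't smart enough to know up front what is critical; it can only apply the "turn it to eleven" dial to all the code.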

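The effective result of these optimizations on the execution time of your program is often affected by code that runs elsewhere. Reading a file, executing a dbase query, etc. That makes the work the JIT optimizer does completely invisible. It doesn't mind though :)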

The JIT optimizer is pretty reliable code, mostly because it has been put to the test millions of times. It is extremely rare to have problems in the Release build version of your program. It does happen however. Both the x64 and the x86 jitters have had problems with structs. The x86 jitter has trouble with floating point consistency, producing subtly different results when the intermediates of a floating point calculation are kept in a FPU register at 80-bit precision instead of getting truncated when flushed to memory.
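Given the floating point inconsistency described above, one defensive habit is to avoid exact equality on computed floating point values and compare with a tolerance instead, so that a result differing only in its last few bits between builds does not change program behavior. This is a minimal sketch, not something the answer prescribes; `FloatCompare`, `NearlyEqual` and the default tolerance of 1e-12 are assumptions chosen for illustration.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class FloatCompare
{
    // Returns true when a and b differ by no more than relTol, scaled by
    // the larger magnitude of the two (with a floor of 1.0 so that values
    // near zero are compared with an absolute tolerance).
    public static bool NearlyEqual(double a, double b, double relTol = 1e-12)
    {
        double scale = Math.Max(Math.Abs(a), Math.Abs(b));
        return Math.Abs(a - b) <= relTol * Math.Max(scale, 1.0);
    }
}
```

For example, `FloatCompare.NearlyEqual(0.1 + 0.2, 0.3)` holds even though the two sides are not bit-identical as doubles.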