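While listening to the Stack Overflow podcast, the jab keeps coming up that "real programmers" write in C, and that C is so much faster because it is "close to the machine." Leaving the former assertion for another post, what is special about C that allows it to be faster than other languages? Or put another way: what is to stop other languages from compiling down to binary that runs every bit as fast as C?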


Current answer

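Even the difference between C and C++ can at times be great.

When you are allocating memory for an object, invoking constructors, aligning memory on word boundaries, and so on, the program ends up going through a lot of overhead that is abstracted away from the programmer.

C forces you to look at each thing your program is doing, generally at a very fine level of detail. This makes it harder (although by no means impossible) to write code that performs a lot of tasks that are unnecessary to the immediate goal at hand.

So where in, for example, a BASIC program you would use the INPUT keyword to read a string from STDIN and automatically allocate memory for its variable, in C the programmer will typically have already allocated the memory and can control things like whether the program blocks for I/O, and whether it stops reading input once it has the information it needs or continues reading characters to the end of the line.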
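As a rough illustration of that control, here is a minimal sketch in C. The helper names `read_field` and `discard_line` are hypothetical (they are not from the original answer); the point is only that the caller supplies the buffer and decides whether the rest of the line is consumed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads at most cap-1 characters of the current line
 * from `in` into the caller's buffer and strips a trailing newline.
 * Returns the number of characters stored, or -1 on end of file or error.
 * No memory is allocated here; the caller already owns `buf`. */
int read_field(FILE *in, char *buf, int cap)
{
    if (cap <= 0 || fgets(buf, cap, in) == NULL) {
        return -1;
    }
    size_t n = strcspn(buf, "\n");  /* index of the newline, or the length */
    buf[n] = '\0';
    return (int)n;
}

/* Hypothetical helper: consumes characters up to and including the next
 * newline. Calling it (or not) is the caller's decision. */
void discard_line(FILE *in)
{
    int c;
    while ((c = getc(in)) != '\n' && c != EOF) {
        /* skip */
    }
}
```

A caller that only needs the first few characters of a line can call `read_field` with a small buffer and then either keep reading from where it stopped or call `discard_line` to jump to the next line.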

C also performs a lot less error-checking than other languages, presuming the programmer knows what they're doing. So whereas in PHP if you declare a string $myStr = getInput(); and go on to reference $myStr[20], but the input was only 10 characters long, PHP will catch this and safely return to you a blank string. C assumes that you've either allocated enough memory to hold data past the end of the string or that you know what information comes after the string and are trying to reference that instead. These small factors have a huge impact on overhead in aggregate.
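To make the cost concrete, here is a small sketch in C of the two styles of access. The function names are made up for illustration; the checked version imitates what a safer language does on each indexing operation, and the unchecked version is what plain C indexing amounts to.

```c
#include <stddef.h>

/* Hypothetical helper: returns s[i] if i is inside the buffer, or '\0'
 * otherwise. The comparison and branch run on every single access. */
char char_at_checked(const char *s, size_t len, size_t i)
{
    if (i >= len) {
        return '\0';   /* out of range: hand back an "empty" value */
    }
    return s[i];
}

/* Plain C indexing: no check at all. If i is past the end of the object,
 * the behavior is undefined; the language trusts the caller. */
char char_at_unchecked(const char *s, size_t i)
{
    return s[i];
}
```

One extra compare-and-branch is cheap in isolation; the argument above is that it adds up when it is paid on every access in a hot loop.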

Other answers

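If you spend a month building something in C that runs in 0.05 seconds, and I spend a day writing the same thing in Java, and it runs in 0.10 seconds, then is C really faster?

But to answer your question: well-written C code will generally run faster than well-written code in other languages, because part of writing C code "well" includes doing manual optimizations at a near-machine level.

Although compilers are very clever indeed, they are not yet able to creatively come up with code that competes with hand-massaged algorithms (assuming the "hands" belong to a good C programmer).

Edit:

A lot of comments are along the lines of "I write in C and I don't think about optimizations."

To take a specific example:

In Delphi I could write this: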

function RemoveAllAFromB(a, b: string): string;
var
  before, after :string;
begin
  Result := b;
  if 0 < Pos(a,b) then begin
    before := Copy(b,1,Pos(a,b)-1);  //everything before the match
    after := Copy(b,Pos(a,b)+Length(a),Length(b));
    Result := before + after;
    Result := RemoveAllAFromB(a,Result);  //recursive
  end;
end;

And in C I write this loop:

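/* Copies into result every character of s1 that does not occur in s2.
   s1 and s2 are the original strings, len1 and len2 their lengths.
   result must have room for len1 + 1 characters. */
void remove_chars(const char *s1, int len1, const char *s2, int len2,
                  char *result)
{
   int i, j;
   for (i = 0; i < len1; i++) {
      for (j = 0; j < len2; j++) {
        if (s1[i] == s2[j]) {
          break;
        }
      }
      if (j == len2) {  /* s1[i] is not found in s2 */
        *result = s1[i];
        result++; /* assuming your result array is long enough */
      }
   }
   *result = '\0'; /* terminate the result string */
}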

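But how many optimizations are there in the C version? We make lots of implementation decisions that I don't think about in the Delphi version. How is a string implemented? In Delphi I don't see it. In C, I have decided it will be a pointer to an array of ASCII integers, which we call chars. In C, we test for character existence one at a time. In Delphi, I use Pos.

And this is just a small example. In a large program, a C programmer has to make these kinds of low-level decisions with every few lines of code. It adds up to a hand-crafted, hand-optimized executable.

This is actually a bit of a perpetuated falsehood. While it is true that C programs are frequently faster, this is not always the case, especially if the C programmer isn't very good at it.

One big glaring hole that people tend to forget about is when the program has to block for some sort of IO, such as user input in any GUI program. In these cases, it doesn't really matter what language you use, since you are limited by the rate at which data can come in rather than how fast you can process it. It doesn't matter whether you are using C, Java, C# or even Perl; you just cannot go any faster than the data can come in.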

The other major thing is that using garbage collection and not using proper pointers allows the virtual machine to make a number of optimizations not available in other languages. For instance, the JVM is capable of moving objects around on the heap to defragment it. This makes future allocations much faster since the next index can simply be used rather than looking it up in a table. Modern JVMs also don't have to actually deallocate memory; instead, they just move the live objects around when they GC and the spent memory from the dead objects is recovered essentially for free.
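The "next index can simply be used" idea is often called bump-pointer allocation. Here is a minimal sketch of it in C, under the assumption of a single fixed-size region; the `arena` type and function names are hypothetical, and a real JVM heap is far more involved (it also has to find and move the live objects).

```c
#include <stddef.h>

/* Hypothetical fixed-size region. Free space is always one contiguous
 * block starting at offset `next`, which is what compaction buys you. */
struct arena {
    unsigned char *base;
    size_t capacity;
    size_t next;      /* offset of the first free byte */
};

/* Allocation is a bounds test plus an addition: no free list, no table. */
void *arena_alloc(struct arena *a, size_t size)
{
    size_t aligned = (size + 7u) & ~(size_t)7u;  /* round up to 8 bytes */
    if (aligned < size || aligned > a->capacity - a->next) {
        return NULL;  /* overflow or out of space */
    }
    void *p = a->base + a->next;
    a->next += aligned;
    return p;
}

/* Releasing everything at once is a single store, which is roughly how
 * the space of dead objects is recovered "for free" after live objects
 * have been moved elsewhere. */
void arena_reset(struct arena *a)
{
    a->next = 0;
}
```

The sketch assumes `base` itself is suitably aligned for whatever is stored in the region.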

This also brings up an interesting point about C and even more so in C++. There is something of a design philosophy of "If you don't need it, you don't pay for it." The problem is that if you do want it, you end up paying through the nose for it. For instance, the vtable implementation in Java tends to be a lot better than C++ implementations, so virtual function calls are a lot faster. On the other hand, you have no choice but to use virtual functions in Java and they still cost something, but in programs that use a lot of virtual functions, the reduced cost adds up.
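For readers who have not seen what a virtual call costs underneath, here is a hand-rolled sketch in C of the usual vtable layout. The `shape` types are invented for illustration and do not reflect how any particular C++ compiler or JVM lays things out; the sketch only shows the two extra loads and the indirect call that a virtual dispatch implies.

```c
struct shape;

/* One table of function pointers per "class". */
struct shape_vtable {
    double (*area)(const struct shape *self);
};

/* Every object carries a pointer to its class's table. */
struct shape {
    const struct shape_vtable *vt;
    double w, h;
};

static double rect_area(const struct shape *s) { return s->w * s->h; }
static double tri_area(const struct shape *s)  { return 0.5 * s->w * s->h; }

static const struct shape_vtable RECT_VT = { rect_area };
static const struct shape_vtable TRI_VT  = { tri_area };

/* "Virtual" call: load the table pointer, load the function pointer,
 * then call through it. A direct call to rect_area skips both loads and
 * can be inlined by the compiler. */
double shape_area(const struct shape *s)
{
    return s->vt->area(s);
}
```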

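Setting aside advanced optimization techniques such as hot-spot optimization, pre-compiled meta-algorithms, and various forms of parallelism, the fundamental speed of a language correlates strongly with the implicit behind-the-scenes complexity required to support the operations that are commonly specified within inner loops.

Perhaps the most obvious is validity checking on indirect memory references -- such as checking pointers for null and checking indexes against array boundaries. Most high-level languages perform these checks implicitly, but C does not. However, this is not necessarily a fundamental limitation of these other languages -- a sufficiently clever compiler may be capable of removing these checks from the inner loops of an algorithm through some form of loop-invariant code motion.

The more fundamental advantage of C (and to a similar extent the closely related C++) is a heavy reliance on stack-based memory allocation, which is inherently fast for allocation, deallocation, and access. In C (and C++) the primary call stack can be used for the allocation of primitives, arrays, and aggregates (struct/class).

While C does offer the capability to dynamically allocate memory of arbitrary size and lifetime (using the so-called "heap"), doing so is avoided by default (the stack is used instead).

Tantalizingly, it is sometimes possible to replicate the C memory allocation strategy within the runtime environments of other programming languages. This has been demonstrated by asm.js, which allows code written in C or C++ to be translated into a subset of JavaScript and run safely in a web browser environment at near-native speed.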
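Written out by hand in C, the transformation looks roughly like this. Both functions are hypothetical illustrations: the first pays for a bounds check on every iteration, as a naive checking runtime would, and the second validates the whole range once before the loop, which is the effect a clever compiler would aim for.

```c
#include <stddef.h>

/* Checks the index on every iteration. Returns 0 on success and stores
 * the sum in *out, or returns -1 if any index is out of range. */
int sum_checked_each(const int *a, size_t len, size_t lo, size_t hi, long *out)
{
    long total = 0;
    for (size_t i = lo; i < hi; i++) {
        if (i >= len) {
            return -1;
        }
        total += a[i];
    }
    *out = total;
    return 0;
}

/* Same observable behavior, but the check is hoisted out of the loop:
 * a non-empty range [lo, hi) stays in bounds exactly when hi <= len. */
int sum_checked_once(const int *a, size_t len, size_t lo, size_t hi, long *out)
{
    if (lo < hi && hi > len) {
        return -1;
    }
    long total = 0;
    for (size_t i = lo; i < hi; i++) {
        total += a[i];   /* no per-element check needed here */
    }
    *out = total;
    return 0;
}
```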
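A short sketch of the two strategies side by side, using a made-up `point` struct. In the first function the array lives in the function's stack frame; in the second the size is chosen at run time, so the memory comes from the heap and must be released explicitly.

```c
#include <stdlib.h>

struct point { double x, y; };

/* Stack allocation: reserving the array is part of setting up the call
 * frame, and it is released automatically when the function returns. */
double sum_x_stack(void)
{
    struct point pts[4];
    double total = 0.0;
    for (int i = 0; i < 4; i++) {
        pts[i].x = i;
        pts[i].y = 0.0;
        total += pts[i].x;
    }
    return total;
}

/* Heap allocation: arbitrary size and lifetime, at the price of a call
 * into the allocator and a matching free. Returns -1.0 if malloc fails. */
double sum_x_heap(size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    struct point *pts = malloc(n * sizeof *pts);
    if (pts == NULL) {
        return -1.0;
    }
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        pts[i].x = (double)i;
        pts[i].y = 0.0;
        total += pts[i].x;
    }
    free(pts);
    return total;
}
```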


As somewhat of an aside, another area where C and C++ outshine most other languages for speed is the ability to seamlessly integrate with native machine instruction sets. A notable example of this is the (compiler and platform dependent) availability of SIMD intrinsics which support the construction of custom algorithms that take advantage of the now nearly ubiquitous parallel processing hardware -- while still utilizing the data allocation abstractions provided by the language (lower-level register allocation is managed by the compiler).
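As a small example of what that looks like, here is a sketch that adds two float arrays four elements at a time using the x86 SSE intrinsics from `<xmmintrin.h>`. It assumes a compiler that defines `__SSE__` when SSE is available (GCC and Clang do on x86-64) and falls back to a plain scalar loop otherwise; the function name `add_floats` is made up for illustration.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* out[i] = a[i] + b[i] for i in [0, n). */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* Process four floats per iteration in 128-bit registers. The
     * unaligned load/store variants are used so that the arrays need
     * no special alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop for the remaining elements (or for everything when
     * SSE is not available). */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

The arrays are still ordinary C arrays; which registers hold `va` and `vb` is left to the compiler.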

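With modern optimizing compilers, it is highly unlikely that a pure C program is going to be all that much faster than compiled .NET code, if at all. With the productivity gains that frameworks like .NET give the developer, you can do things in a day that used to take weeks or months in regular C. Coupled with the low cost of hardware compared to a developer's salary, it is just far cheaper to write the stuff in a high-level language and throw hardware at any slowness.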

The reason Jeff and Joel talk about C being the "real programmer" language is because there is no hand-holding in C. You must allocate your own memory, deallocate that memory, do your own bounds-checking, etc. There's no such thing as new object(); There's no garbage collection, classes, OOP, entity frameworks, LINQ, properties, attributes, fields, or anything like that. You have to know things like pointer arithmetic and how to dereference a pointer. And, for that matter, know and understand what a pointer is. You have to know what a stack frame is and what the instruction pointer is. You have to know the memory model of the CPU architecture you're working on. There is a lot of implicit understanding of the architecture of a microcomputer (usually the microcomputer you're working on) when programming in C that simply is not present nor necessary when programming in something like C# or Java. All of that information has been off-loaded to the compiler (or VM) programmer.
