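I'm working with a compiler for a DSP chip that deliberately generates code which accesses one element past the end of an array, from C code that does not!
This is because the loops are structured so that the end of each iteration prefetches some data for the next iteration. The datum prefetched at the end of the last iteration is therefore never actually used.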
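As a rough illustration of that loop shape, here is a minimal sketch in C. It is not the DSP compiler's actual output; the function name `pipelined_sum` is hypothetical, and the sketch assumes the buffer holds at least `n + 1` readable elements so that the extra fetch stays well defined in portable C.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the first n elements of buf using a software-pipelined loop.
 * ASSUMPTION: buf has at least n + 1 readable elements, so the fetch
 * made at the end of the last iteration stays inside the object. */
long pipelined_sum(const int *buf, size_t n)
{
    assert(buf != NULL);

    long total = 0;
    int next = buf[0];              /* prologue: fetch the first element */

    for (size_t i = 0; i < n; i++) {
        int current = next;         /* value fetched by the previous step */
        next = buf[i + 1];          /* prefetch for the next iteration */
        total += current;
    }

    /* next now holds buf[n]: it was loaded but is never used. */
    (void)next;
    return total;
}
```

Calling `pipelined_sum(data, 4)` on a five-element array reads `data[4]` once and discards it. A compiler targeting a machine where that extra load is harmless can emit the same shape without the padding element.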
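Writing C code like that invokes undefined behavior, but that is only a formality from a standards document that concerns itself with maximal portability.
More often than not, a program that accesses out of bounds is not cleverly optimized. It is simply buggy. The code fetches some garbage value and, unlike the optimized loops of the aforementioned compiler, then uses that value in subsequent computations, thereby corrupting them.
It is worth catching bugs like that, and so it is worth making the behavior undefined for that reason alone: so that the run-time can produce a diagnostic message like "array overrun in line 42 of main.c".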
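To make the idea of such a diagnostic concrete, here is a minimal sketch of a checked array read done by hand. The names `checked_get` and `CHECKED_GET` are hypothetical, and real checking implementations instrument the access itself rather than relying on a helper like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: return arr[i] if i is in range; otherwise print
 * a diagnostic naming the call site and return the fallback value. */
int checked_get(const int *arr, size_t len, size_t i, int fallback,
                const char *file, int line)
{
    assert(arr != NULL && file != NULL);

    if (i < len)
        return arr[i];

    fprintf(stderr, "array overrun in line %d of %s\n", line, file);
    return fallback;
}

/* Convenience wrapper that records the caller's file and line. */
#define CHECKED_GET(arr, len, i, fallback) \
    checked_get((arr), (len), (i), (fallback), __FILE__, __LINE__)
```

With `int a[3]`, `CHECKED_GET(a, 3, 3, -1)` prints the diagnostic to `stderr` and yields `-1` instead of reading past the array.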
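On systems with virtual memory, an array could happen to be allocated such that the address that follows it lies in an unmapped area of virtual memory. The access will then crash the program.
As an aside, note that in C we are permitted to create a pointer that points one past the end of an array, and this pointer must compare greater than any pointer into the interior of the array.
This means that a C implementation cannot place an array right at the end of memory, where the one-past-the-end address would wrap around and look smaller than the other addresses in the array.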
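The one-past-the-end rule is what makes the usual pointer-walking loop legal. A small sketch, with a hypothetical function name `count_equal`:

```c
#include <assert.h>
#include <stddef.h>

/* Count the elements of arr[0..n-1] equal to key, walking with a
 * pointer that is compared against the one-past-the-end address. */
size_t count_equal(const int *arr, size_t n, int key)
{
    assert(arr != NULL);

    const int *end = arr + n;   /* legal to form and compare, not to dereference */
    size_t count = 0;

    for (const int *p = arr; p < end; p++) {
        if (*p == key)
            count++;
    }
    return count;
}
```

The loop condition `p < end` only works if `end` compares greater than every pointer into the array, which is exactly the guarantee described above.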
Nevertheless, accessing uninitialized or out-of-bounds values is sometimes a valid optimization technique, even if it is not maximally portable. This is, for instance, why the Valgrind tool does not report accesses to uninitialized data when those accesses happen, but only when the value is later used in some way that could affect the outcome of the program. You get a diagnostic like "conditional branch in xxx:nnn depends on uninitialized value", and it can sometimes be hard to track down where it originates. If all such accesses were trapped immediately, there would be a lot of false positives arising from compiler-optimized code as well as correctly hand-optimized code.
Speaking of which, I was working with a codec from a vendor that produced these errors when ported to Linux and run under Valgrind. But the vendor convinced me that only a few bits of the value being used actually came from uninitialized memory, and that the logic carefully avoided those bits. Only the good bits of the value were being used, and Valgrind was not able to track this down to the individual bit. The uninitialized material came from reading a word past the end of a bit stream of encoded data, but the code knows how many bits are in the stream and will not use more bits than there actually are. Since the access beyond the end of the bit stream array does not cause any harm on the DSP architecture (there is no virtual memory after the array, no memory-mapped ports, and the address does not wrap), it is a valid optimization technique.
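The vendor's code is not shown here, so the following is only a sketch of the general technique: fetch a whole word, then mask off the bits that lie beyond the known stream length. The function name `load_valid_bits`, the 32-bit word size, and the MSB-first bit order are all assumptions, and the sketch assumes the word being read is inside readable storage so that the C itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load word `index` of a bit stream and keep only the bits that belong
 * to the stream, which is total_bits long (MSB-first within a word).
 * ASSUMPTION: words[index] is readable, for example due to padding. */
uint32_t load_valid_bits(const uint32_t *words, size_t index,
                         size_t total_bits)
{
    assert(words != NULL);

    uint32_t word = words[index];      /* whole-word fetch, may hold junk bits */
    size_t first_bit = index * 32u;    /* stream position of this word's MSB */

    if (first_bit >= total_bits)
        return 0;                      /* word lies entirely past the end */

    size_t valid = total_bits - first_bit;
    if (valid >= 32u)
        return word;                   /* every bit is real stream data */

    /* Keep the top `valid` bits and zero the junk below them. */
    uint32_t mask = ~(UINT32_MAX >> valid);
    return word & mask;
}
```

For a 48-bit stream stored in two words, the second call returns only the upper 16 bits of `words[1]`; whatever happens to sit in the lower 16 bits never influences the result.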
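"Undefined behavior" does not really mean much, because according to ISO C, simply including a header that is not defined in the C standard, or calling a function that is defined neither in the program itself nor in the C standard, are examples of undefined behavior. Undefined behavior does not mean "not defined by anyone on the planet", just "not defined by the ISO C standard". But of course, sometimes undefined behavior really is not defined by anyone at all.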