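What is the strict aliasing rule?

When asking about common undefined behavior in C, people sometimes refer to the strict aliasing rule. What are they talking about?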

Current answer

Type punning via pointer casts (as opposed to using a union) is a major example of breaking strict aliasing.

2008-09-19 01:38:01

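Other answers

The best explanation I have found is Mike Acton's "Understanding Strict Aliasing". It focuses a little on PS3 development, but that is basically just GCC.

From the article:

Strict aliasing is an assumption, made by the C (or C++) compiler, that dereferencing pointers to objects of different types will never refer to the same memory location (i.e. alias each other).

So basically, if you have an int* pointing to some memory containing an int, and then you point a float* at that memory and use it as a float, you break the rule. If your code does not respect this, the compiler's optimizer will most likely break your code.

The exception to the rule is a char*, which is allowed to point to any type.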
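As a minimal sketch of the character-type exception, the helper below reads an arbitrary object one byte at a time through an unsigned char pointer. The name byte_sum is hypothetical and used only for illustration; it is not from the article.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums the bytes of any object by reading it through unsigned char*,
   a character type that is allowed to alias every object type. */
static unsigned byte_sum(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; ++i)
        sum += bytes[i];
    return sum;
}
```

Calling byte_sum(&word, sizeof word) on a uint32_t is fine, whereas reading that same uint32_t through a float* would break the rule.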

2008-09-19 01:38:15

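A typical situation where you encounter strict aliasing problems is when overlaying a struct (like a device/network message) onto a buffer of the word size of your system (like a pointer to uint32_ts or uint16_ts). When you overlay a struct onto such a buffer, or a buffer onto such a struct through pointer casting, you can easily violate strict aliasing rules.

In this kind of setup, if I want to send a message to something, I have to have two incompatible pointers pointing to the same chunk of memory. I might then naively code something like this: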

#include <stdint.h>
#include <stdlib.h>

typedef struct Msg
{
    unsigned int a;
    unsigned int b;
} Msg;

void SendWord(uint32_t);

int main(void)
{
    // Get a 32-bit buffer from the system
    uint32_t* buff = malloc(sizeof(Msg));
    
    // Alias that buffer through message
    Msg* msg = (Msg*)(buff);
    
    // Send a bunch of messages    
    for (int i = 0; i < 10; ++i)
    {
        msg->a = i;
        msg->b = i+1;
        SendWord(buff[0]);
        SendWord(buff[1]);   
    }
}

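The strict aliasing rule makes this setup illegal: dereferencing a pointer that aliases an object that is not of a compatible type, or one of the other types allowed by C 2011 6.5 paragraph 7 (listed in the footnote), is undefined behavior. Unfortunately, you can still code this way, maybe get some warnings, have it compile fine, and then see weird unexpected behavior when you run the code.

(GCC appears somewhat inconsistent in its ability to give aliasing warnings, sometimes giving us a friendly warning and sometimes not.)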

To see why this behavior is undefined, we have to think about what the strict aliasing rule buys the compiler. Basically, with this rule, it doesn't have to think about inserting instructions to refresh the contents of buff every run of the loop. Instead, when optimizing, with some annoyingly unenforced assumptions about aliasing, it can omit those instructions, load buff[0] and buff[1] into CPU registers once before the loop is run, and speed up the body of the loop. Before strict aliasing was introduced, the compiler had to live in a state of paranoia that the contents of buff could change by any preceding memory stores. So to get an extra performance edge, and assuming most people don't type-pun pointers, the strict aliasing rule was introduced.
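To make that reasoning concrete, here is a hedged sketch of what the loop might effectively become after optimization. It is hand-written and well-defined C, not actual compiler output. The recording stub for SendWord and the function name send_loop_as_optimized are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SendWord: it only records the words it is given. */
static uint32_t sent[32];
static size_t sent_count = 0;

static void SendWord(uint32_t word)
{
    if (sent_count < sizeof sent / sizeof sent[0])
        sent[sent_count++] = word;
}

/* One transformation the optimizer is permitted to make: because stores
   through a Msg* are assumed not to touch uint32_t objects, both loads
   are hoisted out of the loop and never refreshed. */
static void send_loop_as_optimized(const uint32_t *buff)
{
    uint32_t w0 = buff[0];
    uint32_t w1 = buff[1];
    for (int i = 0; i < 10; ++i) {
        /* msg->a = i; msg->b = i + 1; would go here, treated as unrelated stores */
        SendWord(w0);
        SendWord(w1);
    }
}
```

In this form every iteration sends the values that were in the buffer before the loop started, which is exactly the kind of surprise the rule violation invites.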

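Keep in mind, if you think this example is contrived, that this might even happen if you pass the buffer to another function that does the sending for you, for instance: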

void SendMessage(uint32_t* buff, size_t size32)
{
    for (size_t i = 0; i < size32; ++i)
    {
        SendWord(buff[i]);
    }
}

And rewrote our earlier loop to take advantage of this convenient function:

for (int i = 0; i < 10; ++i)
{
    msg->a = i;
    msg->b = i+1;
    SendMessage(buff, 2);
}

The compiler may or may not be able, or smart enough, to inline SendMessage, and it may or may not decide to load buff again. If SendMessage is part of another API that is compiled separately, it probably has instructions to load buff's contents. Then again, maybe you're in C++ and this is some templated, header-only implementation that the compiler thinks it can inline. Or maybe it's just something you wrote in your .c file for your own convenience. Either way, undefined behavior might still ensue. Even when we know some of what is happening under the hood, it is still a violation of the rule, so no well-defined behavior is guaranteed. Simply wrapping the sends in a function that takes our word-delimited buffer doesn't necessarily help.

So how do I get around this?

Use a union. Most compilers support this without complaining about strict aliasing. This is allowed in C99 and explicitly allowed in C11.

union {
    Msg msg;
    unsigned int asBuffer[sizeof(Msg)/sizeof(unsigned int)];
};

You can disable strict aliasing in your compiler (-fno-strict-aliasing in GCC).

You can use char* for aliasing instead of your system's word. The rules allow an exception for char* (including signed char and unsigned char). It's always assumed that char* aliases other types. However, this won't work the other way: there's no assumption that your struct aliases a buffer of chars.
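The union suggestion can be sketched as a small, self-contained C example. The names MsgBuffer and message_words are hypothetical, and the sketch assumes Msg has no padding between its two members, which holds on common platforms but is not guaranteed by the standard.

```c
typedef struct Msg {
    unsigned int a;
    unsigned int b;
} Msg;

/* Both views share the same storage; in C, reading the member that was
   not last written reinterprets the stored bytes (type punning). */
typedef union MsgBuffer {
    Msg msg;
    unsigned int asBuffer[sizeof(Msg) / sizeof(unsigned int)];
} MsgBuffer;

/* Fills a message through the struct view and copies it out word by word
   through the buffer view. Assumes Msg is exactly two words, no padding. */
static void message_words(unsigned int a, unsigned int b, unsigned int out[2])
{
    MsgBuffer u;
    u.msg.a = a;
    u.msg.b = b;
    out[0] = u.asBuffer[0];
    out[1] = u.asBuffer[1];
}
```

Because both accesses go through the union object itself, the compiler can see that the two views overlap. This applies to C; C++ has different rules for unions.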

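Beginner beware

This is only one potential minefield when overlaying two types onto each other. You should also learn about endianness, word alignment, and how to deal with alignment issues through packing structs correctly.

Footnote

The types that C 2011 6.5 paragraph 7 allows an lvalue to access are: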

- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.

2008-09-19 02:36:24

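Strict aliasing doesn't refer only to pointers; it affects references as well. I wrote a paper about it for the boost developer wiki, and it was so well received that I turned it into a page on my consulting web site. It explains completely what it is, why it confuses people so much, and what to do about it: the Strict Aliasing White Paper. In particular, it explains why unions are risky behavior for C++, and why using memcpy is the only fix portable across both C and C++. Hope this is helpful.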
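A minimal sketch of the memcpy approach, assuming float is a 32-bit IEEE 754 value on the target platform. The helper name float_bits is hypothetical and not taken from the white paper.

```c
#include <stdint.h>
#include <string.h>

/* Copies the object representation of a float into a uint32_t.
   No pointer of one type is ever dereferenced as another type, so the
   aliasing rule is not involved. Assumes sizeof(float) == sizeof(uint32_t). */
static uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Compilers commonly reduce a small fixed-size memcpy like this to a plain register move, so the portable form usually costs nothing at run time.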

2011-06-19 23:46:55

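Strict aliasing means not allowing pointers of different types to point to the same data.

This article should help you understand the issue in full detail.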

2008-09-19 01:33:31
