I have used unions comfortably in the past; today I was alarmed when I read this post and learned that this code

union ARGB
{
    uint32_t colour;

    struct componentsTag
    {
        uint8_t b;
        uint8_t g;
        uint8_t r;
        uint8_t a;
    } components;

} pixel;

pixel.colour = 0xff040201;  // ARGB::colour is the active member from now on

// somewhere down the line, without any edit to pixel

if(pixel.components.a)      // accessing the non-active member ARGB::components

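is actually undefined behaviour, i.e. reading from a member of the union other than the one most recently written to leads to undefined behaviour. If this isn't the intended use of unions, what is? Can someone explain it in detail?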

Update:

I wanted to clarify a few things in hindsight.

The answer to the question isn't the same for C and C++; my ignorant younger self tagged it as both C and C++.

After scouring the C++11 standard, I couldn't conclusively say that it calls out accessing/inspecting a non-active union member as undefined, unspecified or implementation-defined. All I could find was §9.5/1: "If a standard-layout union contains several standard-layout structs that share a common initial sequence, and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of standard-layout struct members." And §9.2/19: "Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members."

In C, on the other hand (C99 TC3 - DR 283 onwards), it's legal to do so (thanks to Pascal Cuoq for bringing this up). However, attempting to do it can still lead to undefined behaviour if the value read happens to be invalid (a so-called "trap representation") for the type it is read through. Otherwise, the value read is implementation-defined.

C89/90 called this out under unspecified behaviour (Annex J), and K&R's book says it's implementation-defined. Quote from K&R: "This is the purpose of a union - a single variable that can legitimately hold any of one of several types. [...] so long as the usage is consistent: the type retrieved must be the type most recently stored. It is the programmer's responsibility to keep track of which type is currently stored in a union; the results are implementation-dependent if something is stored as one type and extracted as another."

Extract from Stroustrup's TC++PL (emphasis mine): "Use of unions can be essential for compactness of data [...] sometimes misused for "type conversion"."

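Above all, this question (whose title has remained unchanged since I asked it) was posed to understand the purpose of unions, not what the standard allows. For example, using inheritance for code reuse is, of course, allowed by the C++ standard, but that wasn't the purpose or original intention of introducing inheritance as a C++ language feature. This is why Andrey's answer remains the accepted one.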


Current answer

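Although this is strictly undefined behaviour, in practice it works with pretty much any compiler. It is such a widely used paradigm that any self-respecting compiler needs to do "the right thing" in cases like this. It is certainly preferable to type punning through pointer casts, which may well generate broken code with some compilers.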

Other answers

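As others have mentioned, unions combined with enumerations and wrapped in structs can be used to implement tagged unions. One practical use is implementing Rust's Result<T, E>, which is originally implemented with a pure enum (Rust enum variants can hold additional data). Here is a C++ example: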

#include <cstdint>

// Sketch of a tagged union. Assumes T and E are distinct, trivial types;
// non-trivial members would need placement new and an explicit destructor.
template <typename T, typename E> struct Result {
    public:
    enum class Success : uint8_t { Ok, Err };
    Result(T val) {
        m_success = Success::Ok;
        m_value.ok = val;
    }
    Result(E val) {
        m_success = Success::Err;
        m_value.err = val;
    }
    // Note: equality compares only the tag, not the stored value.
    inline bool operator==(const Result& other) const {
        return other.m_success == this->m_success;
    }
    inline bool operator!=(const Result& other) const {
        return other.m_success != this->m_success;
    }
    inline T expect(const char* errorMsg) const {
        if (m_success == Success::Err) throw errorMsg;
        else return m_value.ok;
    }
    inline bool is_ok() const {
        return m_success == Success::Ok;
    }
    inline bool is_err() const {
        return m_success == Success::Err;
    }
    // Pointer to the stored value, or nullptr if the other member is active.
    inline const T* ok() const {
        if (is_ok()) return &m_value.ok;
        else return nullptr;
    }
    inline const E* err() const {
        if (is_err()) return &m_value.err;
        else return nullptr;
    }

    // Other methods from https://doc.rust-lang.org/std/result/enum.Result.html

    private:
    Success m_success;                       // tag: which union member is active
    union _val_t { T ok; E err; } m_value;
};

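Technically it's undefined, but in reality most (all?) compilers treat it exactly the same as a reinterpret_cast from one type to the other, the result of which is implementation-defined. I wouldn't lose sleep over your current code.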
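For readers who would rather not depend on compiler leniency, one portable way to get the same byte-level view is std::memcpy. This is only a minimal sketch, assuming the ARGB layout from the question; the names Components, split_argb and alpha_of are made up for illustration:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper struct mirroring ARGB::components from the question.
struct Components {
    std::uint8_t b, g, r, a;
};

static_assert(sizeof(Components) == sizeof(std::uint32_t),
              "sketch assumes the four bytes are packed without padding");

// Copies the object representation of `colour` into a Components value.
// Which byte lands in which field still depends on the platform's
// endianness, just as it does with the union.
inline Components split_argb(std::uint32_t colour) {
    Components c;
    std::memcpy(&c, &colour, sizeof c);
    return c;
}

// Endianness-independent alternative: extract the alpha byte arithmetically.
inline std::uint8_t alpha_of(std::uint32_t colour) {
    return static_cast<std::uint8_t>(colour >> 24);
}
```

Compilers commonly optimise such a fixed-size memcpy into a plain register move, so this usually costs nothing compared with the union.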

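As one more example of unions in actual use, the CORBA framework serializes objects using the tagged union approach. All user-defined classes are members of one (huge) union, and an integer identifier tells the demarshaller how to interpret the union.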
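The idea of an integer tag selecting how to interpret a union can be sketched roughly as follows. This is not CORBA's actual API; WireValue, its Kind tag and the helper functions are hypothetical names used only to illustrate the pattern:

```cpp
#include <cstdint>
#include <string>

// Hypothetical wire value: `kind` says which union member is meaningful.
struct WireValue {
    enum Kind : std::int32_t { Int = 0, Real = 1, Flag = 2 };
    Kind kind;
    union {
        std::int64_t as_int;
        double       as_real;
        bool         as_flag;
    };
};

// Writers set the tag and the matching member together.
inline WireValue make_int(std::int64_t v) {
    WireValue w;
    w.kind = WireValue::Int;
    w.as_int = v;
    return w;
}

inline WireValue make_flag(bool v) {
    WireValue w;
    w.kind = WireValue::Flag;
    w.as_flag = v;
    return w;
}

// The decoder reads only the member selected by the tag,
// so it never touches a non-active member.
inline std::string describe(const WireValue& w) {
    switch (w.kind) {
        case WireValue::Int:  return "int:" + std::to_string(w.as_int);
        case WireValue::Real: return "real:" + std::to_string(w.as_real);
        case WireValue::Flag: return w.as_flag ? "flag:true" : "flag:false";
    }
    return "unknown";
}
```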

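@bobobobo's code is correct, as @Joshua pointed out (sadly I'm not allowed to add comments, so I'm doing it here; IMO disallowing that in the first place was a bad decision):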

https://en.cppreference.com/w/cpp/language/data_members#Standard_layout tells us that it is fine to do so, at least since C++14:

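In a standard-layout union with an active member of non-union class type T1, it is permitted to read a non-static data member m of another union member of non-union class type T2 provided m is part of the common initial sequence of T1 and T2 (except that reading a volatile member through non-volatile glvalue is undefined).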

since in the case at hand T1 and T2 denote the same type anyway.
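To illustrate the common-initial-sequence rule quoted above, here is a small sketch. The KeyEvent and MouseEvent structs and the helper functions are hypothetical, chosen only so that both structs begin with a member of the same type:

```cpp
#include <cstdint>

// Two hypothetical standard-layout structs whose first member has the same
// type, so that member forms their common initial sequence.
struct KeyEvent   { std::uint8_t type; std::uint8_t key_code; };
struct MouseEvent { std::uint8_t type; std::int16_t x; std::int16_t y; };

union Event {
    KeyEvent   key;
    MouseEvent mouse;
};

// Reads `type` through `key` regardless of which struct is active.
// This is permitted because `type` lies in the common initial sequence.
inline std::uint8_t event_type(const Event& e) {
    return e.key.type;
}

// Makes `mouse` the active member, with type code 2 (an arbitrary value).
inline Event make_mouse(std::int16_t x, std::int16_t y) {
    Event e;
    e.mouse = MouseEvent{2, x, y};
    return e;
}
```

Reading e.key.key_code while mouse is active would not be covered by this rule, because key_code and x have different types and so fall outside the common initial sequence.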

In C++, Boost Variant implements a safe version of the union, designed to prevent undefined behaviour as much as possible.

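Its performance is identical to the enum + union construct (it is stack-allocated too, etc.), but it uses a template list of types instead of the enum :)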
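As a rough illustration of the same idea, the sketch below uses std::variant from C++17 rather than Boost Variant itself, so that it needs no external library; the two are similar in spirit but not identical in API. The Value alias and describe_value function are names invented for this example:

```cpp
#include <string>
#include <variant>

// The list of alternatives replaces the hand-written enum tag.
using Value = std::variant<int, std::string>;

// std::get_if returns nullptr unless the requested alternative is the
// active one, so a non-active member can never be read by accident.
inline std::string describe_value(const Value& v) {
    if (const int* i = std::get_if<int>(&v)) {
        return "int:" + std::to_string(*i);
    }
    if (const std::string* s = std::get_if<std::string>(&v)) {
        return "string:" + *s;
    }
    return "empty";
}
```

Unlike a raw union, the variant also runs the destructor of the active alternative, which is what makes non-trivial members such as std::string safe to store.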