什么时候应该使用工会?我们为什么需要它们?
当前回答
许多答案都涉及从一种类型转换到另一种类型。我从具有相同类型的联合中得到最多的使用(即在解析串行数据流时)。它们允许解析/构造一个有框架的包变得很简单。
typedef union
{
UINT8 buffer[PACKET_SIZE]; // Where the packet size is large enough for
// the entire set of fields (including the payload)
struct
{
UINT8 size;
UINT8 cmd;
UINT8 payload[PAYLOAD_SIZE];
UINT8 crc;
} fields;
}PACKET_T;
// This should be called every time a new byte of data is ready
// and point to the packet's buffer:
// packet_builder(packet.buffer, new_data);
void packet_builder(UINT8* buffer, UINT8 data)
{
static UINT8 received_bytes = 0;
// All range checking etc removed for brevity
buffer[received_bytes] = data;
received_bytes++;
// Using the struc only way adds lots of logic that relates "byte 0" to size
// "byte 1" to cmd, etc...
}
void packet_handler(PACKET_T* packet)
{
// Process the fields in a readable manner
if(packet->fields.size > TOO_BIG)
{
// handle error...
}
if(packet->fields.cmd == CMD_X)
{
// do stuff..
}
}
编辑 关于字节序和结构填充的评论是有效的,而且非常值得关注。我几乎完全在嵌入式软件中使用了这段代码,其中大部分我都可以控制管道的两端。
其他回答
COM接口中使用的VARIANT呢?它有两个字段——“type”和一个包含实际值的联合,该值根据“type”字段进行处理。
联合允许互斥的数据成员共享相同的内存。当内存比较稀缺时,例如在嵌入式系统中,这是非常重要的。
示例如下:
union {
int a;
int b;
int c;
} myUnion;
这个联合将占用一个int值的空间,而不是3个独立的int值。如果用户设置了a的值,然后设置了b的值,它将覆盖a的值,因为它们都共享相同的内存位置。
我想说,它可以更容易地重用可能以不同方式使用的内存,即节省内存。例如,你想做一些“变体”结构体,能够保存一个短字符串以及一个数字:
struct variant {
int type;
double number;
char *string;
};
在32位系统中,这将导致每个变体实例至少使用96位或12个字节。
使用联合可以将大小减小到64位或8字节:
struct variant {
int type;
union {
double number;
char *string;
} value;
};
如果你想添加更多不同的变量类型,你甚至可以保存更多。这可能是真的,你可以做类似的事情,强制转换一个空指针-但联合使它更容易访问,以及类型安全。这样的节省听起来并不是很大,但是您节省了用于该结构的所有实例的三分之一的内存。
在C的早期版本中,所有结构声明都共享一组公共字段。考虑到:
struct x {int x_mode; int q; float x_f};
struct y {int y_mode; int q; int y_l};
struct z {int z_mode; char name[20];};
a compiler would essentially produce a table of structures' sizes (and possibly alignments), and a separate table of structures' members' names, types, and offsets. The compiler didn't keep track of which members belonged to which structures, and would allow two structures to have a member with the same name only if the type and offset matched (as with member q of struct x and struct y). If p was a pointer to any structure type, p->q would add the offset of "q" to pointer p and fetch an "int" from the resulting address.
Given the above semantics, it was possible to write a function that could perform some useful operations on multiple kinds of structure interchangeably, provided that all the fields used by the function lined up with useful fields within the structures in question. This was a useful feature, and changing C to validate members used for structure access against the types of the structures in question would have meant losing it in the absence of a means of having a structure that can contain multiple named fields at the same address. Adding "union" types to C helped fill that gap somewhat (though not, IMHO, as well as it should have been).
An essential part of unions' ability to fill that gap was the fact that a pointer to a union member could be converted into a pointer to any union containing that member, and a pointer to any union could be converted to a pointer to any member. While the C89 Standard didn't expressly say that casting a T* directly to a U* was equivalent to casting it to a pointer to any union type containing both T and U, and then casting that to U*, no defined behavior of the latter cast sequence would be affected by the union type used, and the Standard didn't specify any contrary semantics for a direct cast from T to U. Further, in cases where a function received a pointer of unknown origin, the behavior of writing an object via T*, converting the T* to a U*, and then reading the object via U* would be equivalent to writing a union via member of type T and reading as type U, which would be standard-defined in a few cases (e.g. when accessing Common Initial Sequence members) and Implementation-Defined (rather than Undefined) for the rest. While it was rare for programs to exploit the CIS guarantees with actual objects of union type, it was far more common to exploit the fact that pointers to objects of unknown origin had to behave like pointers to union members and have the behavioral guarantees associated therewith.
一个简单而有用的例子是....
想象一下:
你有一个uint32_t数组[2],想要访问字节链的第3个和第4个字节。 你可以做*((uint16_t*) &数组[1])。 但遗憾的是,这打破了严格的混叠规则!
但是已知的编译器允许你做以下事情:
union un
{
uint16_t array16[4];
uint32_t array32[2];
}
严格来说,这仍然是违反规则的。但是所有已知的标准都支持这种用法。