Consider:

struct mystruct_A
{
   char a;
   int b;
   char c;
} x;

struct mystruct_B
{
   int b;
   char a;
} y;

The sizes of the structures are 12 and 8 respectively.

Are these structures padded or packed?

When does padding or packing take place?


Current answer

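Padding aligns structure members to "natural" address boundaries: for example, int members would have offsets that are mod(4) == 0 on a 32-bit platform. Padding is on by default. It inserts the following "gaps" into your first structure: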

struct mystruct_A {
    char a;
    char gap_0[3]; /* inserted by compiler: for alignment of b */
    int b;
    char c;
    char gap_1[3]; /* -"-: for alignment of the whole struct in an array */
} x;
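
One way to check where the compiler put the gaps is to ask for each member's offset. The following is a minimal sketch, not part of the original answer; the helper names gap_before_b and gap_after_c are made up for illustration, and the numbers quoted afterwards assume a platform where int is 4 bytes with 4-byte alignment (typical for x86 and amd64, but not guaranteed by the C standard).

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* offsetof, size_t */

struct mystruct_A {
    char a;
    int  b;
    char c;
};

/* Holds on any conforming compiler: b sits on a boundary suitable for int. */
static_assert(offsetof(struct mystruct_A, b) % _Alignof(int) == 0,
              "b is aligned for int");

/* Padding bytes between the end of `a` and the start of `b`. */
size_t gap_before_b(void)
{
    return offsetof(struct mystruct_A, b)
         - (offsetof(struct mystruct_A, a) + sizeof(char));
}

/* Trailing padding bytes between the end of `c` and the end of the struct. */
size_t gap_after_c(void)
{
    return sizeof(struct mystruct_A)
         - (offsetof(struct mystruct_A, c) + sizeof(char));
}
```

Under the stated assumption both helpers return 3, which matches gap_0 and gap_1 above.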

Packing, on the other hand, prevents the compiler from doing padding. This has to be explicitly requested: under GCC it is __attribute__((__packed__)), so the following:

struct __attribute__((__packed__)) mystruct_A {
    char a;
    int b;
    char c;
};

would produce a structure of size 6 on a 32-bit architecture.

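One caveat, though: unaligned memory access is slower on architectures that allow it (such as x86 and amd64), and is explicitly prohibited on strict-alignment architectures such as SPARC.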
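
To make the caveat concrete, here is a hedged sketch of reading the misaligned member out of a packed image without ever forming a misaligned int pointer. It relies on the GCC/Clang __attribute__((__packed__)) extension; the names packed_A and read_b are hypothetical and chosen only for this example.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcpy */

/* GCC/Clang extension: no padding is inserted between or after members. */
struct __attribute__((__packed__)) packed_A {
    char a;
    int  b;   /* now at offset 1, not a multiple of int's usual alignment */
    char c;
};

static_assert(offsetof(struct packed_A, b) == 1, "b directly follows a");
static_assert(sizeof(struct packed_A) == 2 + sizeof(int), "no padding at all");

/* Read `b` from a raw byte buffer that holds a packed_A image.
 * memcpy copies byte by byte, so no misaligned int pointer is ever
 * dereferenced (that would be undefined behaviour in C and can trap
 * on strict-alignment CPUs). */
int read_b(const unsigned char *bytes)
{
    int value;
    memcpy(&value, bytes + offsetof(struct packed_A, b), sizeof value);
    return value;
}
```

Accessing p.b directly through the packed type is also fine, because the compiler knows the member may be misaligned; the risky pattern is taking &p.b and passing it around as a plain int pointer.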

Other answers

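There are no buts about it! Anyone who wants to grasp the subject should do the following:

- Peruse The Lost Art of Structure Packing by Eric S. Raymond.
- Glance at Eric's code example.
- Last but not least, do not forget the following rule about padding: a struct is aligned to the alignment requirement of its largest type.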

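I know this question is old and most answers here explain padding really well, but while trying to understand it myself I found that having a "visual" image of what is happening helped.

The processor reads memory in "chunks" of a definite size (a word). Say the processor word is 8 bytes long. It will look at memory as a big row of 8-byte building blocks. Every time it needs to get some information from memory, it reaches for one of those blocks and fetches it.

In this picture, it does not matter where a char (1 byte long) sits, since it will always be inside one of those blocks, requiring the CPU to process only 1 word.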

When we deal with data larger than one byte, such as a 4-byte int or an 8-byte double, the way it is aligned in memory affects how many words the CPU has to process. If 4-byte chunks are aligned so that they always fit inside a block (the memory address is a multiple of 4), only one word has to be processed. Otherwise a 4-byte chunk could have part of itself in one block and part in another, requiring the processor to process 2 words to read the data.

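The same applies to an 8-byte double, except that now it must be at a memory address that is a multiple of 8 to guarantee it is always inside one block.

This considers a processor with an 8-byte word, but the concept applies to other word sizes.

Padding works by filling the gaps between those data so that they are aligned with those blocks, thus improving performance when reading memory.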
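
The block picture above can be turned into a small calculation. This is only a sketch of that simplified model (real hardware also involves caches and bus widths); blocks_touched is a made-up helper name, not a standard function.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uintptr_t */

/* Number of word-sized blocks touched by an object of `size` bytes that
 * starts at byte address `addr`, when memory is viewed as consecutive
 * blocks of `word` bytes. */
size_t blocks_touched(uintptr_t addr, size_t size, size_t word)
{
    assert(word != 0 && size != 0);
    uintptr_t first = addr / word;               /* block holding the first byte */
    uintptr_t last  = (addr + size - 1) / word;  /* block holding the last byte  */
    return (size_t)(last - first + 1);
}
```

With 8-byte blocks, a 4-byte int at address 4 touches one block (blocks_touched(4, 4, 8) is 1), while the same int at address 6 straddles two (blocks_touched(6, 4, 8) is 2).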

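However, as others have answered, sometimes space matters more than performance. Maybe you are processing lots of data on a computer that does not have much RAM (swap space could be used, but it is much slower). You could rearrange the variables in the program until the least padding is done (as was nicely exemplified in some other answers), but if that is not enough you can explicitly disable padding, which is what packing is.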
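
As an illustration of rearranging members before resorting to packing, here is a minimal sketch using the members from the question. The struct names loose and tight and the helper bytes_saved are invented for this example.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* size_t */

/* Declaration order from the question: padding before b and after c. */
struct loose {
    char a;
    int  b;
    char c;
};

/* Same members, most strictly aligned first: the two chars sit side by
 * side, so only trailing padding remains. */
struct tight {
    int  b;
    char a;
    char c;
};

static_assert(sizeof(struct tight) <= sizeof(struct loose),
              "reordering does not make this struct larger");

/* Bytes saved per element by the reordering. */
size_t bytes_saved(void)
{
    return sizeof(struct loose) - sizeof(struct tight);
}
```

Assuming a 4-byte int with 4-byte alignment, loose is 12 bytes and tight is 8, so the reordering saves 4 bytes per element while every member stays naturally aligned.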

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it will do this in word sized chunks (e.g. 4 byte chunks on a 32-bit system) or larger. Data alignment means putting the data at a memory address equal to some multiple of the word size, which increases the system’s performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.

To align data in memory, one or more empty bytes are inserted (or left unused) between the addresses allocated to structure members. This is called structure padding. A processor is designed to read one word (4 bytes on a 32-bit processor) from memory at a time. To take advantage of this, the compiler places each member on a boundary suited to its type, which leaves unused bytes between some members. Because of structure padding in C, the size of a structure is not always what we expect.
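
The arithmetic behind padding can be sketched as follows. This is an illustrative model, assuming members are laid out in declaration order with the minimum padding (which is how mainstream ABIs behave, although the C standard does not fix the amount of padding); padding_needed and struct_size are hypothetical helper names.

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* size_t */

/* Padding bytes needed so that `offset` becomes a multiple of `align`. */
size_t padding_needed(size_t offset, size_t align)
{
    assert(align != 0);
    return (align - offset % align) % align;
}

/* Predicted size of a struct whose n members have the given sizes and
 * alignments, laid out in order with minimal padding. */
size_t struct_size(const size_t *sizes, const size_t *aligns, size_t n)
{
    size_t offset = 0, max_align = 1;
    for (size_t i = 0; i < n; i++) {
        offset += padding_needed(offset, aligns[i]);  /* gap before member */
        offset += sizes[i];                           /* the member itself */
        if (aligns[i] > max_align)
            max_align = aligns[i];
    }
    /* Trailing padding rounds the size up to the strictest alignment. */
    return offset + padding_needed(offset, max_align);
}
```

Feeding in char, int, char as sizes {1, 4, 1} with alignments {1, 4, 1} gives 12, and int, char as {4, 1} with alignments {4, 1} gives 8, which are the two sizes from the question.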

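Variables are stored at any address that is divisible by their alignment (generally, by their size). So padding/packing is not just for structs. In fact, all data has its own alignment requirement: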

#include <stdint.h>  // int32_t

int main(void) {
    // For convenience, assume `c` is stored in the first byte of a machine
    // word. If `c` were stored in the last byte of the previous word, no
    // padding would be needed before `i`, because `i` would already be
    // aligned at the start of a new word.

    char      c;  // starts at any address divisible by 1 (any address).
    char pad[3];  // unused memory so that `i` starts at an aligned address.
    int32_t   i;  // starts at any address divisible by 4.

    return 0;
}
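
The claim that every variable lands on an address divisible by its alignment can be checked at run time. The sketch below assumes that converting a pointer to uintptr_t yields the plain byte address, which is true on mainstream platforms but is implementation-defined; is_aligned and locals_are_aligned are names invented for this example.

```c
#include <assert.h>   /* assert */
#include <stdbool.h>  /* bool */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* int32_t, uintptr_t */

/* True if the address held in `p` is a multiple of `align`. */
bool is_aligned(const void *p, size_t align)
{
    assert(align != 0);
    return (uintptr_t)p % align == 0;
}

/* Checks the addresses the compiler actually chose for two locals. */
bool locals_are_aligned(void)
{
    char    c = 'x';
    int32_t i = 42;
    return is_aligned(&c, _Alignof(char)) &&
           is_aligned(&i, _Alignof(int32_t));
}
```

Note that the compiler is free to place locals in any order, so the explicit pad array in the snippet above is only a mental model; what is guaranteed is that each object ends up suitably aligned.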

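This is similar for a struct, but there are some differences. First, we can say there are two kinds of padding: a) to make each member start at a properly aligned address, some bytes are inserted between members; b) to make the next struct instance start at a properly aligned address, some bytes are appended to each struct: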

// Example for rule 1 below.
struct st {
    char      c;  // starts from any addresses divisible by 4, not 1.
    char pad[3];  // not-used memory for `i` to start from its address.
    int32_t   i;  // starts from any addresses divisible by 4.
};

// Example for rule 2 below.
struct st {
    int32_t   i;  // starts from any addresses divisible by 4.
    char      c;  // starts from any addresses.
    char pad[3];  // not-used memory for next `st`(or anything that has same
                  // alignment requirement) to start from its own address.
};

1) The struct's first member always starts at an address divisible by the struct's own alignment requirement, which is determined by the largest alignment requirement among its members (here 4, the alignment of int32_t). This differs from ordinary variables: an ordinary variable can start at any address divisible by its own alignment, but that is not the case for a struct's first member. As you know, the address of a struct is the same as the address of its first member.

2) There can be additional trailing padding bytes inside a struct, so that the next struct (or the next element in an array of structs) starts at a properly aligned address. Think of struct st arr[2];. To make arr[1] (that is, arr[1]'s first member) start at an address divisible by 4, 3 bytes are appended at the end of each struct.
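
Both rules can be checked against a real compiler. The following is a small sketch built on the second struct st above; array_stride is a hypothetical helper introduced only for this illustration.

```c
#include <assert.h>  /* static_assert */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* int32_t */

struct st {
    int32_t i;
    char    c;
    /* the compiler appends trailing padding here */
};

/* Rule 1: the struct is at least as strictly aligned as its strictest member. */
static_assert(_Alignof(struct st) >= _Alignof(int32_t),
              "struct alignment covers int32_t");

/* Rule 2: the size is a multiple of that alignment, so array elements
 * stay aligned. */
static_assert(sizeof(struct st) % _Alignof(struct st) == 0,
              "size is rounded up to the struct's alignment");

/* Distance in bytes between arr[0] and arr[1]. */
size_t array_stride(void)
{
    struct st arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}
```

The stride always equals sizeof(struct st), trailing padding included, which is why arr[1].i ends up on a 4-byte boundary whenever arr[0] does.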

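This is what I learned from The Lost Art of Structure Packing.

NOTE: You can investigate the alignment requirement of a data type with the _Alignof operator. Likewise, you can get the offset of a member inside a struct with the offsetof macro.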