为什么我得到一个分割错误时写入一个“char *s”初始化的字符串文字，而不是“char s[]”?

下面的代码在第2行接收到seg错误:

char *str = "string";
str[0] = 'z';  // could be also written as *str = 'z'
printf("%s\n", str);

虽然这个方法非常有效:

char str[] = "string";
str[0] = 'z';
printf("%s\n", str);

用MSVC和GCC测试。

当前回答

在第一个代码中，"string"是一个字符串常量，字符串常量永远不应该被修改，因为它们通常被放置在只读内存中。"str"是一个用来修改常量的指针。

在第二段代码中，"string"是一个数组初始化器，类似于

char str[7] =  { 's', 't', 'r', 'i', 'n', 'g', '\0' };

"str"是堆栈上分配的数组，可以自由修改。

2008-10-02 19:50:27

其他回答

为什么我得到一个分割错误时写入字符串?

c99n1256草案

字符串字面量有两种不同的用法:

Initialize char[]: char c[] = "abc"; This is "more magic", and described at 6.7.8/14 "Initialization": An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array. So this is just a shortcut for: char c[] = {'a', 'b', 'c', '\0'}; Like any other regular array, c can be modified. Everywhere else: it generates an: unnamed array of char What is the type of string literals in C and C++? with static storage that gives UB if modified So when you write: char *c = "abc"; This is similar to: /* __unnamed is magic because modifying it gives UB. */ static char __unnamed[] = "abc"; char *c = __unnamed; Note the implicit cast from char[] to char *, which is always legal. Then if you modify c[0], you also modify __unnamed, which is UB. This is documented at 6.4.5 "String literals": 5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...] 6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

6.7.8/32“初始化”给出了一个直接的例子:

EXAMPLE 8: The declaration char s[] = "abc", t[3] = "abc"; defines "plain" char array objects s and t whose elements are initialized with character string literals. This declaration is identical to char s[] = { 'a', 'b', 'c', '\0' }, t[] = { 'a', 'b', 'c' }; The contents of the arrays are modifiable. On the other hand, the declaration char *p = "abc"; defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

GCC 4.8 x86-64 ELF实现

计划:

#include <stdio.h>

int main(void) {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

编译和反编译:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

输出包含:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00 
        c: R_X86_64_32S .rodata

结论:GCC将char* it存储在.rodata部分，而不是在.text中。

如果我们对char[]做同样的操作:

 char s[] = "abc";

我们获得:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

因此它被存储在堆栈中(相对于%rbp)。

但是请注意，默认的链接器脚本将.rodata和.text放在同一个段中，该段有执行权限，但没有写权限。这可以观察到:

readelf -l a.out

它包含:

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata

2015-06-05 08:50:02

当您试图访问不可访问的内存时，会导致分割错误。

Char *str是一个指向不可修改的字符串的指针(这是导致segfault的原因)。

而char str[]是一个数组，可以修改。

2013-01-03 19:03:45

char *str = "string";

分配一个指向字符串字面量的指针，编译器将其放入可执行文件中不可修改的部分;

char str[] = "string";

分配并初始化一个可修改的本地数组

2008-10-02 19:51:06

不变的记忆

由于字符串字面量在设计上是只读的，所以它们存储在内存的Constant部分。存储在那里的数据是不可变的，即不能被更改。因此，在C代码中定义的所有字符串字面值在这里都获得一个只读内存地址。

栈内存

内存的堆栈部分是存放局部变量地址的地方，例如，函数中定义的变量。

正如@matli的回答所暗示的，有两种方法来处理这些常量字符串。

1. 指向字符串字面量的指针

当我们定义指向字符串字面量的指针时，我们是在Stack内存中创建一个指针变量。它指向底层字符串字面值所在的只读地址。

#include <stdio.h>

int main(void) {
  char *s = "hello";
  printf("%p\n", &s);  // Prints a read-only address, e.g. 0x7ffc8e224620
  return 0;
}

如果我们试图通过插入来修改s

s[0] = 'H';

我们得到一个分割错误(核心转储)。我们试图访问不应该访问的内存。我们正在尝试修改只读地址0x7ffc8e224620的值。

2. 字符数组

对于示例而言，假设存储在常量内存中的字符串字面值“Hello”具有与上述地址相同的只读内存地址0x7ffc8e224620。

#include <stdio.h>

int main(void) {
  // We create an array from a string literal with address 0x7ffc8e224620.
  // C initializes an array variable in the stack, let's give it address
  // 0x7ffc7a9a9db2.
  // C then copies the read-only value from 0x7ffc8e224620 into 
  // 0x7ffc7a9a9db2 to give us a local copy we can mutate.
  char a[] = "hello";

  // We can now mutate the local copy
  a[0] = 'H';

  printf("%p\n", &a);  // Prints the Stack address, e.g. 0x7ffc7a9a9db2
  printf("%s\n", a);   // Prints "Hello"

  return 0;
}

注意:当使用指针指向字符串字面量时，如1。，最好的做法是使用const关键字，如const *s = "hello"。这样可读性更强，并且当它被违反时，编译器将提供更好的帮助。然后它将抛出类似error:分配只读位置' *s '的错误，而不是seg错误。编辑器中的linter也可能在手动编译代码之前发现错误。

2021-08-31 21:49:58

像“String”这样的字符串文字可能在可执行文件的地址空间中作为只读数据分配(通过编译器)。当你去触摸它时，它会害怕你在它的泳衣区，并让你知道一个隔离错误。

在第一个例子中，你得到一个指向const数据的指针。在第二个示例中，使用const数据的副本初始化一个7个字符的数组。

2008-10-02 19:51:59

为什么我得到一个分割错误时写入一个“char *s”初始化的字符串文字，而不是“char s[]”?

推荐文章

最新文章

标签