下面的代码在第2行接收到seg错误:

char *str = "string";
str[0] = 'z';  // could be also written as *str = 'z'
printf("%s\n", str);

虽然这个方法非常有效:

char str[] = "string";
str[0] = 'z';
printf("%s\n", str);

用MSVC和GCC测试。


当前回答

char *str = "string";

分配一个指向字符串字面量的指针,编译器将其放入可执行文件中不可修改的部分;

char str[] = "string";

分配并初始化一个可修改的本地数组

其他回答

为什么我得到一个分割错误时写入字符串?

c99n1256草案

字符串字面量有两种不同的用法:

Initialize char[]: char c[] = "abc"; This is "more magic", and described at 6.7.8/14 "Initialization": An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array. So this is just a shortcut for: char c[] = {'a', 'b', 'c', '\0'}; Like any other regular array, c can be modified. Everywhere else: it generates an: unnamed array of char What is the type of string literals in C and C++? with static storage that gives UB if modified So when you write: char *c = "abc"; This is similar to: /* __unnamed is magic because modifying it gives UB. */ static char __unnamed[] = "abc"; char *c = __unnamed; Note the implicit cast from char[] to char *, which is always legal. Then if you modify c[0], you also modify __unnamed, which is UB. This is documented at 6.4.5 "String literals": 5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...] 6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

6.7.8/32“初始化”给出了一个直接的例子:

EXAMPLE 8: The declaration char s[] = "abc", t[3] = "abc"; defines "plain" char array objects s and t whose elements are initialized with character string literals. This declaration is identical to char s[] = { 'a', 'b', 'c', '\0' }, t[] = { 'a', 'b', 'c' }; The contents of the arrays are modifiable. On the other hand, the declaration char *p = "abc"; defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

GCC 4.8 x86-64 ELF实现

计划:

#include <stdio.h>

int main(void) {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

编译和反编译:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

输出包含:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00 
        c: R_X86_64_32S .rodata

结论:GCC将char* it存储在.rodata部分,而不是在.text中。

如果我们对char[]做同样的操作:

 char s[] = "abc";

我们获得:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

因此它被存储在堆栈中(相对于%rbp)。

但是请注意,默认的链接器脚本将.rodata和.text放在同一个段中,该段有执行权限,但没有写权限。这可以观察到:

readelf -l a.out

它包含:

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata
char *str = "string";  

上面的代码将str设置为指向在程序的二进制映像中硬编码的字面值“string”,它在内存中可能被标记为只读。

因此str[0]=试图写入应用程序的只读代码。我猜这可能依赖于编译器。

要理解这个错误或问题,您应该首先了解指针和数组的差异b/w 所以在这里,我首先要解释一下它们的区别

字符串数组

 char strarray[] = "hello";

在存储器数组中存储的是连续存储器单元,存储为[h][e][l][l][o][\0] =>[]是1个char字节大小的存储器单元,而这个连续存储器单元可以通过名为strarray的名称在这里访问。这里string数组strarray本身包含初始化的所有字符串。在这种情况下,"hello" 因此,我们可以通过访问每个字符的索引值来轻松地更改其内存内容

`strarray[0]='m'` it access character at index 0 which is 'h'in strarray

它的值变成了m所以strarray的值变成了mello;

这里需要注意的一点是,我们可以通过一个字符一个字符地改变字符串数组的内容,但不能像strarray="new string"这样直接初始化其他字符串,这是无效的

指针

我们都知道指针指向内存中的内存位置, 未初始化的指针指向随机内存位置,初始化后指向特定内存位置

char *ptr = "hello";

这里的指针ptr被初始化为字符串“hello”,这是一个存储在只读存储器(ROM)中的常量字符串,所以“hello”不能被更改,因为它存储在ROM中

PTR存储在堆栈部分并指向常量字符串"hello"

所以ptr[0]='m'是无效的,因为你不能访问只读内存

但是ptr可以直接初始化为其他字符串值,因为它只是一个指针,所以它可以指向其数据类型变量的任何内存地址

ptr="new string"; is valid

参见C常见问题,问题1.32

Q: What is the difference between these initializations? char a[] = "string literal"; char *p = "string literal"; My program crashes if I try to assign a new value to p[i]. A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways: As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values of the characters in that array (and, if necessary, its size). Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element. Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).

5.5节K&R的字符指针和功能也讨论了这个主题:

There is an important difference between these definitions: char amessage[] = "now is the time"; /* an array */ char *pmessage = "now is the time"; /* a pointer */ amessage is an array, just big enough to hold the sequence of characters and '\0' that initializes it. Individual characters within the array may be changed but amessage will always refer to the same storage. On the other hand, pmessage is a pointer, initialized to point to a string constant; the pointer may subsequently be modified to point elsewhere, but the result is undefined if you try to modify the string contents.