What exactly is a C pointer, if not a memory address?

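In a reputable source about C, the following is said after a discussion of the & operator:

...It is a bit unfortunate that the terminology [address of] remains, because it confuses those who don't know what addresses are about, and misleads those who do: thinking about pointers as if they were addresses usually leads to grief...

Other material I have read (from equally reputable sources, I would say) has always unabashedly described pointers and the & operator in terms of memory addresses. I would love to keep searching for the truth of the matter, but that is rather difficult when reputable sources disagree.

Now I am slightly confused: if a pointer is not a memory address, then what exactly is it?

P.S.

The author later says: ...I will continue to use the term "address of" though, because inventing a different one would be even worse.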

Current answer

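Short summary (which I will also put at the top):

(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.

(1) But on many, perhaps most, compilers, pointers to functions are not addresses. They are bigger than an address (typically 2x, sometimes more), or are actually pointers to a struct in memory that contains the address of the function and things like a constant pool.

(2) Pointers to data members and pointers to methods are often even stranger.

(3) Legacy x86 code has the FAR and NEAR pointer problem.

(4) Several examples, most notably the IBM AS/400, have secure "fat pointers".

I am sure you can find more.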

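Detail:

UMMPPHHH!! Many of the answers so far are fairly typical "programmer weenie" answers, but not compiler weenie or hardware weenie answers. Since I pretend to be a hardware weenie, and often work with compiler weenies, let me throw in my two cents:

On many, probably most, C compilers, a pointer to data of type T is, in fact, the address of the T.

Fine.

But even on many of these compilers, certain pointers are NOT addresses. You can tell this by looking at sizeof(ThePointer).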

For example, pointers to functions are sometimes quite a lot bigger than ordinary addresses. Or, they may involve a level of indirection. This article provides one description, involving the Intel Itanium processor, but I have seen others. Typically, to call a function you must know not only the address of the function code, but also the address of the function's constant pool - a region of memory from which constants are loaded with a single load instruction, rather than the compiler having to generate a 64 bit constant out of several Load Immediate and Shift and OR instructions. So, rather than a single 64 bit address, you need 2 64 bit addresses. Some ABIs (Application Binary Interfaces) move this around as 128 bits, whereas others use a level of indirection, with the function pointer actually being the address of a function descriptor that contains the 2 actual addresses just mentioned. Which is better? Depends on your point of view: performance, code size, and some compatibility issues - often code assumes that a pointer can be cast to a long or a long long, but may also assume that the long long is exactly 64 bits. Such code may not be standards compliant, but nevertheless customers may want it to work.

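Many of us have painful memories of the old Intel x86 segmented architecture, with NEAR pointers and FAR pointers. Thankfully these are nearly extinct by now, so only a quick summary: in 16-bit real mode, the actual linear address is

LinearAddress = (SegmentRegister[SegNum].base << 4) + Offset

whereas in protected mode it might be

LinearAddress = SegmentRegister[SegNum].base + Offset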

with the resulting address being checked against a limit set in the segment. Some programs used the not really standard C/C++ FAR and NEAR pointer declarations, but many just said T * and relied on compiler and linker switches so that, for example, code pointers might be near pointers, just a 32-bit offset against whatever is in the CS (Code Segment) register, while data pointers might be FAR pointers, specifying both a 16-bit segment number and a 32-bit offset for a 48-bit value. Now, both of these quantities are certainly related to the address, but since they aren't the same size, which of them is the address? Moreover, the segments also carried permissions (read-only, read-write, executable) in addition to information related to the actual address.

A more interesting example, IMHO, is (or, perhaps, was) the IBM AS/400 family. This computer was one of the first to implement an OS in C++. Pointers on this machine were typically 2x the actual address size: e.g. as this presentation says, 128-bit pointers, while the actual addresses were 48 to 64 bits, plus some extra info, what is called a capability, that provided permissions such as read and write, as well as a limit to prevent buffer overflow. Yes: you can do this compatibly with C/C++, and if this were ubiquitous, the Chinese PLA and slavic mafia would not be hacking into so many Western computer systems. But historically most C/C++ programming has neglected security for performance. Most interestingly, the AS/400 family allowed the operating system to create secure pointers that could be given to unprivileged code, but which the unprivileged code could not forge or tamper with. Again, this is security, and while it is standards compliant, much sloppy non-standards-compliant C/C++ code will not work in such a secure system. Again, there are official standards, and there are de facto standards.

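Now, I'll get off my security soapbox and mention some other ways in which pointers (of various types) are often not really addresses: pointers to data members, pointers to member functions (methods), and the static versions thereof are bigger than an ordinary address. As this post says:

"There are many ways of solving this [problems related to single versus multiple inheritance, and virtual inheritance]. Here's how the Visual Studio compiler decides to handle it: a pointer to a member function of a multiply-inherited class is really a structure." And they go on to say: "Casting a function pointer can change its size!"

As you can probably guess from my pontificating on security, I have been involved in C/C++ hardware/software projects where a pointer was treated more like a capability than a raw address.

I could go on, but I hope you get the idea.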

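Short summary (which I will also put at the top):

(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.

(1) But on many, perhaps most, compilers, pointers to functions are not addresses. They are bigger than an address (typically 2x, sometimes more), or are actually pointers to a struct in memory that contains the address of the function and things like a constant pool.

(2) Pointers to data members and pointers to methods are often even stranger.

(3) Legacy x86 code has the FAR and NEAR pointer problem.

(4) Several examples, most notably the IBM AS/400, have secure "fat pointers".

I am sure you can find more.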

2013-03-15 01:40:25

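Other answers

A pointer is a variable that HOLDS a memory address; it is not the address itself. However, you can dereference a pointer and so get access to the memory location.

For example: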

int q = 10; /*say q is at address 0x10203040*/
int *p = &q; /*means let p contain the address of q, which is 0x10203040*/
*p = 20; /*set whatever is at the address pointed by "p" as 20*/

That's it. It's that simple.

A program demonstrating what I am saying, together with its output, is here:

http://ideone.com/rcSUsb

The program:

#include <stdio.h>

int main(int argc, char *argv[])
{
  /* POINTER AS AN ADDRESS */
  int q = 10;
  int *p = &q;

  printf("address of q is %p\n", (void *)&q);
  printf("p contains %p\n", (void *)p);

  p = NULL;
  printf("NULL p now contains %p\n", (void *)p);
  return 0;
}

2013-03-01 05:52:59

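An address is used to identify a piece of fixed-size storage, usually one byte, by means of an integer. Strictly speaking this is a byte address, which is also what ISO C uses. There can be other ways of constructing addresses, e.g. one per bit. However, byte addresses are used so often that we usually omit the word "byte".

Technically, an address is never a value in C, because the definition of the term "value" in (ISO) C is:

precise meaning of the contents of an object when interpreted as having a specific type

(The emphasis on "a specific type" is mine.) However, there is no such thing as an "address type" in C.

Pointers are different. A pointer is a kind of type in the C language. There are several distinct pointer types, and they do not necessarily obey identical language rules, e.g. the effect of ++ on a value of type int* versus one of type char*.

A value in C can be of a pointer type. This is called a pointer value. To be clear, a pointer value is not a pointer in the C language. But we are accustomed to mixing the two, because in C this is unlikely to be ambiguous: if we call an expression p a "pointer", it is merely a pointer value and not a type, since a named type in C is not expressed by an expression but by a type-name or a typedef-name.

Some other things are subtle. As a C user, one should first know what "object" means:

region of data storage in the execution environment, the contents of which can represent values

An object is an entity that represents values of a specific type. A pointer type is an object type. So if we declare int* p;, then p denotes "an object of pointer type", or a "pointer object".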

Note that there is no "variable" normatively defined by the standard (in fact ISO C never uses the word as a noun in normative text). However, informally, we call an object a variable, as some other languages do. (But still not exactly so: e.g. in C++ a variable can normatively be of reference type, which is not an object.) The phrases "pointer object" and "pointer variable" are sometimes treated like "pointer value" as above, with a probable slight difference. (One more set of examples is "array".)

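Since a pointer is a type, and an address is effectively "typeless" in C, a pointer value roughly "contains" an address. An expression of pointer type can yield an address, e.g.

ISO C11 6.5.3.2

The unary & operator yields the address of its operand.

Note that this wording was introduced by WG14/N1256, i.e. ISO C99:TC3. In C99 there is

The unary & operator returns the address of its operand.

The change reflects the committee's opinion: an address is not a pointer value returned by the unary & operator.

Despite the wording above, there is still some mess even in the standards.

ISO C11 6.6

An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator

ISO C++11 5.19

3 ... An address constant expression is a prvalue core constant expression of pointer type that evaluates to the address of an object with static storage duration, to the address of a function, or to a null pointer value, or a prvalue core constant expression of type std::nullptr_t. ...

(Recent C++ standard drafts use different wording, so the problem does not arise there.)

Actually, both "address constant" in C and "address constant expression" in C++ are constant expressions of pointer type (or at least of "pointer-like" type since C++11).

The built-in unary & operator is called "address-of" in both C and C++; similarly, std::addressof was introduced in C++11.

These names can be misleading. The resulting expression is of pointer type, so it should be read as: the result contains or yields an address, rather than is an address.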

2014-11-09 07:42:49

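A pointer is just another variable, one that usually contains the memory address of another variable. Being a variable, a pointer has a memory address of its own too.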

2013-03-01 06:16:43

A pointer, like any other variable in C, is fundamentally a collection of bits which may be represented by one or more concatenated unsigned char values (as with any other type of variable, sizeof(some_variable) will indicate the number of unsigned char values). What makes a pointer different from other variables is that a C compiler will interpret the bits in a pointer as identifying, somehow, a place where a variable may be stored. In C, unlike some other languages, it is possible to request space for multiple variables, and then convert a pointer to any value in that set into a pointer to any other variable within that set.

Many compilers implement pointers by using their bits to store actual machine addresses, but that is not the only possible implementation. An implementation could keep one array, not accessible to user code, listing the hardware address and allocated size of all of the memory objects (sets of variables) which a program was using, and have each pointer contain an index into that array along with an offset from the start of the indexed object. Such a design would allow a system not only to restrict code to operating only upon memory that it owned, but also to ensure that a pointer to one memory item could not be accidentally converted into a pointer to another memory item (in a system that uses hardware addresses, if foo and bar are arrays of 10 items that are stored consecutively in memory, a pointer to the "eleventh" item of foo might instead point to the first item of bar, but in a system where each "pointer" is an object ID and an offset, the system could trap if code tried to index a pointer to foo beyond its allocated range). It would also be possible for such a system to eliminate memory-fragmentation problems, since the physical addresses associated with any pointers could be moved around.

Note that while pointers are somewhat abstract, they're not quite abstract enough to allow a fully-standards-compliant C compiler to implement a garbage collector. The C standard specifies that every variable, including pointers, is represented as a sequence of unsigned char values. Given any variable, one can decompose it into a sequence of numbers and later convert that sequence of numbers back into a variable of the original type. Consequently, it would be possible for a program to calloc some storage (receiving a pointer to it), store something there, decompose the pointer into a series of bytes, display those on the screen, and then erase all reference to them. If the program then accepted some numbers from the keyboard, reconstituted those into a pointer, and then tried to read data from that pointer, and if the user entered the same numbers that the program had earlier displayed, the program would be required to output the data that had been stored in the calloc'ed memory. Since there is no conceivable way the computer could know whether the user had made a copy of the numbers that were displayed, there would be no conceivable way the computer could know whether the aforementioned memory might ever be accessed in future.

2013-03-02 21:22:52

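Here is how I have explained it to some confused people in the past: a pointer has two attributes that affect its behavior. It has a value, which (in typical environments) is a memory address, and a type, which tells you the type and size of the object it points at.

For example, given: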

union {
    int i;
    char c;
} u;

You can have three different pointers all pointing to that same object:

void *v = &u;
int *i = &u.i;
char *c = &u.c;

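If you compare the values of these pointers, they are all equal (the casts are there because C does not allow an int * to be compared directly with a char *):

v == (void *)i && (void *)i == (void *)c

However, if you increment each pointer, you will see that the type they point to becomes relevant.

i++;
c++;
// You can't perform arithmetic on a void pointer, so no v++
(void *)i != (void *)c

At this point the variables i and c hold different values, because i++ makes i contain the address of the next accessible integer, while c++ makes c point to the next addressable character. Typically, integers take up more memory than characters do, so i ends up with a larger value than c after both are incremented.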

2013-03-02 19:15:34
