在关于C的一个有信誉的来源中,在讨论&操作符后给出了以下信息:

... 有点不幸的是,术语[地址的]仍然存在,因为它混淆了那些不知道地址是关于什么的人,并误导了那些知道地址的人:将指针视为地址通常会导致悲伤……

我读过的其他材料(来自同样有名望的来源,我想说)总是毫不掩饰地将指针和&操作符作为内存地址。我很愿意继续寻找事情的真相,但当有信誉的消息来源不同意时,这有点困难。

现在我有点困惑了——如果指针不是内存地址,那么它到底是什么?

P.S.

作者后来说:……不过,我将继续使用“地址”这个术语,因为发明一个不同的(术语)会更糟糕。


当前回答

地址用于标识一个固定大小的存储空间,通常为每个字节,作为一个整数。这被精确地称为字节地址,它也被ISO c使用。可以有一些其他方法来构造地址,例如为每一位。然而,只有字节地址是如此经常使用,我们通常省略“字节”。

从技术上讲,一个地址在C中从来都不是一个值,因为在(ISO) C中术语“值”的定义是:

对象的内容在解释为具有特定类型时的精确含义

(我强调了一下。)然而,在C语言中没有这样的“地址类型”。

指针不一样。指针是C语言中的一种类型。有几种不同的指针类型。它们不一定遵守相同的语言规则集,例如++对int*类型值和char*类型值的影响。

C语言中的值可以是指针类型。这叫做指针值。需要明确的是,指针值在C语言中不是指针。但是我们习惯把它们混在一起,因为在C语言中,它不太可能是模棱两可的:如果我们把表达式p称为“指针”,它只是一个指针值,而不是一个类型,因为C语言中的命名类型不是由表达式表示,而是由type-name或typedef-name表示。

其他一些事情是微妙的。作为C语言的使用者,首先要知道object是什么意思:

数据存储在执行环境中的区域,其中的内容可以表示 值

对象是表示特定类型的值的实体。指针是一种对象类型。因此,如果我们声明int* p;,则p表示“指针类型的对象”,或“指针对象”。

Note there is no "variable" normatively defined by the standard (in fact it is never being used as a noun by ISO C in normative text). However, informally, we call an object a variable, as some other language does. (But still not so exactly, e.g. in C++ a variable can be of reference type normatively, which is not an object.) The phrases "pointer object" or "pointer variable" are sometimes treated like "pointer value" as above, with a probable slight difference. (One more set of examples is "array".)

由于指针是一种类型,而地址在C语言中实际上是“无类型的”,因此指针值大致“包含”一个地址。指针类型的表达式可以产生一个地址,例如。

Iso c11 6.5.2.3

一元&操作符产生其操作数的地址。

请注意,这个措辞是由WG14/N1256引入的,即ISO C99:TC3。在C99中有

一元&操作符返回其操作数的地址。

它反映了委员会的观点:地址不是由一元操作符&返回的指针值。

尽管有上述措辞,但即使在标准上也存在一些混乱。

Iso c11 6.6

地址常量是一个空指针,一个指向左值的指针,该左值指定一个static对象 存储持续时间,或指向函数指示符的指针

Iso c++ 11 5.19

3.一个地址 常量表达式是指针类型的prvalue核心常量表达式,计算结果为对象的地址 具有静态存储持续时间的对象,转换为函数的地址、空指针值或prvalue核心 类型std::nullptr_t. ...的常量表达式

(最近的c++标准草案使用了另一种措辞,所以不存在这个问题。)

实际上,C中的“地址常量”和c++中的“地址常量表达式”都是指针类型的常量表达式(或者至少从c++ 11开始是“类指针”类型)。

内置的一元&运算符在C和c++中被称为“address-of”;类似地,std::addressof是在c++ 11中引入的。

这些命名可能会带来误解。结果表达式是指针类型的,所以它们被解释为:结果包含/产生一个地址,而不是一个地址。

其他回答

A pointer, like any other variable in C, is fundamentally a collection of bits which may be represented by one or more concatenated unsigned char values (as with any other type of cariable, sizeof(some_variable) will indicate the number of unsigned char values). What makes a pointer different from other variables is that a C compiler will interpret the bits in a pointer as identifying, somehow, a place where a variable may be stored. In C, unlike some other languages, it is possible to request space for multiple variables, and then convert a pointer to any value in that set into a pointer to any other variable within that set.

Many compilers implement pointers by using their bits store actual machine addresses, but that is not the only possible implementation. An implementation could keep one array--not accessible to user code--listing the hardware address and allocated size of all of the memory objects (sets of variables) which a program was using, and have each pointer contain an index into an array along with an offset from that index. Such a design would allow a system to not only restrict code to only operating upon memory that it owned, but also ensure that a pointer to one memory item could not be accidentally converted into a pointer to another memory item (in a system that uses hardware addresses, if foo and bar are arrays of 10 items that are stored consecutively in memory, a pointer to the "eleventh" item of foo might instead point to the first item of bar, but in a system where each "pointer" is an object ID and an offset, the system could trap if code tried to index a pointer to foo beyond its allocated range). It would also be possible for such a system to eliminate memory-fragmentation problems, since the physical addresses associated with any pointers could be moved around.

Note that while pointers are somewhat abstract, they're not quite abstract enough to allow a fully-standards-compliant C compiler to implement a garbage collector. The C compiler specifies that every variable, including pointers, is represented as a sequence of unsigned char values. Given any variable, one can decompose it into a sequence of numbers and later convert that sequence of numbers back into a variable of the original type. Consequently, it would be possible for a program to calloc some storage (receiving a pointer to it), store something there, decompose the pointer into a series of bytes, display those on the screen, and then erase all reference to them. If the program then accepted some numbers from the keyboard, reconstituted those to a pointer, and then tried to read data from that pointer, and if user entered the same numbers that the program had earlier displayed, the program would be required to output the data that had been stored in the calloc'ed memory. Since there is no conceivable way the computer could know whether the user had made a copy of the numbers that were displayed, there would be no conceivable may the computer could know whether the aforementioned memory might ever be accessed in future.

A pointer value is an address. A pointer variable is an object that can store an address. This is true because that's what the standard defines a pointer to be. It's important to tell it to C novices because C novices are often unclear on the difference between a pointer and the thing it points to (that is to say, they don't know the difference between an envelope and a building). The notion of an address (every object has an address and that's what a pointer stores) is important because it sorts that out.

然而,标准在特定的抽象层次上进行讨论。作者所说的那些“知道地址是关于什么的”,但对C不熟悉的人,必须在不同的抽象级别上学习地址——也许是通过编写汇编语言。不能保证C实现使用与cpu操作码相同的地址表示(在本文中称为“存储地址”),这些人已经知道。

He goes on to talk about "perfectly reasonable address manipulation". As far as the C standard is concerned there's basically no such thing as "perfectly reasonable address manipulation". Addition is defined on pointers and that is basically it. Sure, you can convert a pointer to integer, do some bitwise or arithmetic ops, and then convert it back. This is not guaranteed to work by the standard, so before writing that code you'd better know how your particular C implementation represents pointers and performs that conversion. It probably uses the address representation you expect, but it it doesn't that's your fault because you didn't read the manual. That's not confusion, it's incorrect programming procedure ;-)

简而言之,C使用了比作者更抽象的地址概念。

The author's concept of an address of course is also not the lowest-level word on the matter. What with virtual memory maps and physical RAM addressing across multiple chips, the number that you tell the CPU is "the store address" you want to access has basically nothing to do with where the data you want is actually located in hardware. It's all layers of indirection and representation, but the author has chosen one to privilege. If you're going to do that when talking about C, choose the C level to privilege!

Personally I don't think the author's remarks are all that helpful, except in the context of introducing C to assembly programmers. It's certainly not helpful to those coming from higher level languages to say that pointer values aren't addresses. It would be far better to acknowledge the complexity than it is to say that the CPU has the monopoly on saying what an address is and thus that C pointer values "are not" addresses. They are addresses, but they may be written in a different language from the addresses he means. Distinguishing the two things in the context of C as "address" and "store address" would be adequate, I think.

指针是表示内存位置的抽象。请注意,这句话并没有说把指针当作内存地址是错误的,它只是说它“通常会导致悲伤”。换句话说,它会让你产生错误的期望。

The most likely source of grief is certainly pointer arithmetic, which is actually one of C's strengths. If a pointer was an address, you'd expect pointer arithmetic to be address arithmetic; but it's not. For example, adding 10 to an address should give you an address that is larger by 10 addressing units; but adding 10 to a pointer increments it by 10 times the size of the kind of object it points to (and not even the actual size, but rounded up to an alignment boundary). With an int * on an ordinary architecture with 32-bit integers, adding 10 to it would increment it by 40 addressing units (bytes). Experienced C programmers are aware of this and put it to all kinds of good uses, but your author is evidently no fan of sloppy metaphors.

There's the additional question of how the contents of the pointer represent the memory location: As many of the answers have explained, an address is not always an int (or long). In some architectures an address is a "segment" plus an offset. A pointer might even contain just the offset into the current segment ("near" pointer), which by itself is not a unique memory address. And the pointer contents might have only an indirect relationship to a memory address as the hardware understands it. But the author of the quote cited doesn't even mention representation, so I think it was conceptual equivalence, rather than representation, that they had in mind.

快速总结:C地址是一个值,通常表示为具有特定类型的机器级内存地址。

非限定词“指针”有歧义。C语言有指针对象(变量)、指针类型、指针表达式和指针值。

使用“指针”这个词来表示“指针对象”是非常常见的,这可能会导致一些混淆——这就是为什么我试图将“指针”作为形容词而不是名词使用。

C标准,至少在某些情况下,使用“指针”这个词来表示“指针值”。例如,malloc的描述说它“返回空指针或指向已分配空间的指针”。

那么C中的地址是什么呢?它是一个指针值,即某个特定指针类型的值。(除了空指针值不一定被称为“地址”,因为它不是任何东西的地址)。

标准对一元&操作符的描述说它“产生其操作数的地址”。在C标准之外,单词“address”通常用于指(物理或虚拟)内存地址,通常是一个单词大小(无论给定系统上的“word”是什么)。

C“地址”通常实现为机器地址——就像C int值通常实现为机器字一样。但是C地址(指针值)不仅仅是一个机器地址。它是一个通常表示为机器地址的值,它是一个具有特定类型的值。

C或c++指针与简单内存地址的另一个不同之处是,我在其他答案中没有看到不同的指针类型(尽管考虑到它们的总大小,我可能忽略了它)。但它可能是最重要的一个,因为即使是经验丰富的C/ c++程序员也会被它绊倒:

编译器可能会假设不兼容类型的指针不指向相同的地址,即使它们很明显指向相同的地址,这可能会导致简单的pointer==address模型不可能出现的行为。考虑以下代码(假设sizeof(int) = 2*sizeof(short)):

unsigned int i = 0;
unsigned short* p = (unsigned short*)&i;
p[0]=p[1]=1;

if (i == 2 + (unsigned short)(-1))
{
  // you'd expect this to execute, but it need not
}

if (i == 0)
{
  // you'd expect this not to execute, but it actually may do so
}

注意,char*有一个例外,所以使用char*操作值是可能的(尽管不是很可移植)。