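In a reputable source about C, the following is said right after the & operator is discussed:

... It's a bit unfortunate that the terminology [address of] remains, because it confuses those who don't know what addresses are about, and misleads those who do: thinking about pointers as if they were addresses usually leads to grief...

Other materials I have read (from equally reputable sources, I would say) have always unabashedly described pointers and the & operator in terms of memory addresses. I would love to keep digging for the truth of the matter, but that is hard to do when reputable sources kind of disagree.

Now I am slightly confused: if a pointer is not a memory address, then what exactly is it?

P.S.

The author later says: ...I will continue to use the term 'address of' though, because to invent a different one [term] would be even worse.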


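Current answer

I'm not sure about your source, but the kind of language you're describing comes from the C standard:

6.5.3.2 Address and indirection operators […] 3. The unary & operator yields the address of its operand. […]

So... yes, pointers point to memory addresses. At least that is the meaning the C standard suggests.

To put it more clearly, a pointer is a variable that holds the value of some address. The address of an object (which may be stored in a pointer) is obtained with the unary & operator.

I can store the address "42 Wallaby Way, Sydney" in a variable (and that variable would be a "pointer" of sorts, but since that is not a memory address, it is not something we would properly call a "pointer"). Your computer has addresses for its buckets of memory. Pointers store the value of an address (in the analogy, a pointer stores the value "42 Wallaby Way, Sydney", which is an address).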
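As a minimal sketch of the point above (the function names store_through and address_of_demo are made up for illustration), the unary & operator produces a value that a pointer variable can hold, and writing through that pointer changes the original object:

```c
#include <assert.h>

/* Writes `value` into whatever int object `target` refers to. */
void store_through(int *target, int value)
{
    *target = value;
}

/* Returns 1 if a write made through &x was observed in x itself. */
int address_of_demo(void)
{
    int x = 42;
    int *p = &x;          /* unary & yields the address of x; p holds that value */

    assert(*p == 42);     /* dereferencing p reads x */
    store_through(p, 7);  /* writing through p modifies x */
    return x == 7;
}
```

Nothing here depends on what the bits inside p look like; the code only relies on p referring to x.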

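Edit: I want to expand on Alexey Frunze's comment.

What exactly is a pointer? Let's look at the C standard:

6.2.5 Types […] 20. […] A pointer type may be derived from a function type or an object type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes called "pointer to T". The construction of a pointer type from a referenced type is called "pointer type derivation". A pointer type is a complete object type.

Basically, pointers store a value that provides a reference to some object or function. Kind of. Pointers are intended to store a value that provides a reference to some object or function, but that is not always the case:

6.3.2.3 Pointers […] 5. An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.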

The above quote says that we can turn an integer into a pointer. If we do that (that is, if we stuff an integer value into a pointer instead of a specific reference to an object or function), then the pointer "might not point to an entity of reference type" (i.e. it may not provide a reference to an object or function). It might provide us with something else. And this is one place where you might stick some kind of handle or ID in a pointer (i.e. the pointer isn't pointing to an object; it's storing a value that represents something, but that value may not be an address).

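So yes, as Alexey Frunze says, it is possible that a pointer isn't storing the address of an object or function. A pointer may instead be storing some kind of "handle" or ID, and you can arrange that by assigning some arbitrary integer value to the pointer. What this handle or ID represents depends on the system/environment/context. As long as your system/implementation can make sense of the value, you are in good shape (but that depends on the specific value and the specific system/implementation).

Normally, a pointer stores the address of an object or function. If it isn't storing an actual address (of an object or function), the result is implementation-defined (meaning that exactly what happens, and what the pointer now represents, depends on your system and implementation; it might be a handle or ID on one particular system, while the same code/value on another system might crash your program).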
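A rough sketch of the "handle in a pointer" idea follows. The function names are hypothetical, and the code assumes the implementation provides the optional type uintptr_t. The integer-to-pointer direction is implementation-defined, so treat the handle part as something that happens to work on common platforms, not something the standard promises:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: packs a small integer ID into a void * "handle".
 * The result of this conversion is implementation-defined (6.3.2.3 p5),
 * and the handle must never be dereferenced. */
void *handle_from_id(uintptr_t id)
{
    return (void *)id;
}

/* Recovers the ID; getting the same integer back is NOT guaranteed by the
 * standard, only by the documentation of a particular implementation. */
uintptr_t id_from_handle(void *handle)
{
    return (uintptr_t)handle;
}

/* The guaranteed direction: a valid void * converted to uintptr_t and back
 * compares equal to the original pointer. */
int round_trip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}
```

Only round_trip_ok relies on a guarantee from the standard; the two handle functions rely on how the particular implementation converts between integers and pointers.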

That ended up being longer than I thought it would be...

Other answers

A pointer value is an address. A pointer variable is an object that can store an address. This is true because that's what the standard defines a pointer to be. It's important to tell it to C novices because C novices are often unclear on the difference between a pointer and the thing it points to (that is to say, they don't know the difference between an envelope and a building). The notion of an address (every object has an address and that's what a pointer stores) is important because it sorts that out.

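However, the standard talks at a particular level of abstraction. The people the author describes as those who "know what addresses are about", but who are new to C, must have learned about addresses at a different level of abstraction, perhaps by programming in assembly language. There is no guarantee that the C implementation uses the same representation for addresses as the CPU's opcodes do (called "the store address" in this passage), which is the representation these people already know.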

He goes on to talk about "perfectly reasonable address manipulation". As far as the C standard is concerned there's basically no such thing as "perfectly reasonable address manipulation". Addition is defined on pointers and that is basically it. Sure, you can convert a pointer to integer, do some bitwise or arithmetic ops, and then convert it back. This is not guaranteed to work by the standard, so before writing that code you'd better know how your particular C implementation represents pointers and performs that conversion. It probably uses the address representation you expect, but if it doesn't, that's your fault because you didn't read the manual. That's not confusion, it's incorrect programming procedure ;-)
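For contrast, here is a small sketch of the pointer arithmetic the standard does define: stepping within one array (including forming the one-past-the-end pointer) and subtracting two pointers into the same array. The names sum_ints and distance are invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints by stepping a pointer from the first element to one past the last. */
int sum_ints(const int *first, size_t n)
{
    assert(first != NULL);
    const int *end = first + n;   /* one past the end: fine to form, not to dereference */
    int total = 0;
    for (const int *p = first; p != end; ++p)
        total += *p;
    return total;
}

/* Element distance between two pointers into the same array. */
ptrdiff_t distance(const int *from, const int *to)
{
    return to - from;
}
```

Both functions work in units of elements, not bytes, and neither needs to know how the implementation represents a pointer.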

In short, C uses a more abstract concept of an address than the author does.

The author's concept of an address of course is also not the lowest-level word on the matter. What with virtual memory maps and physical RAM addressing across multiple chips, the number that you tell the CPU is "the store address" you want to access has basically nothing to do with where the data you want is actually located in hardware. It's all layers of indirection and representation, but the author has chosen one to privilege. If you're going to do that when talking about C, choose the C level to privilege!

Personally I don't think the author's remarks are all that helpful, except in the context of introducing C to assembly programmers. It's certainly not helpful to those coming from higher level languages to say that pointer values aren't addresses. It would be far better to acknowledge the complexity than it is to say that the CPU has the monopoly on saying what an address is and thus that C pointer values "are not" addresses. They are addresses, but they may be written in a different language from the addresses he means. Distinguishing the two things in the context of C as "address" and "store address" would be adequate, I think.

Come to think about it, I think it's a matter of semantics. I don't think the author is right, since the C standard refers to a pointer as holding an address to the referenced object as others have already mentioned here. However, address!=memory address. An address can be really anything as per C standard although it will eventually lead to a memory address, the pointer itself can be an id, an offset + selector (x86), really anything as long as it can describe (after mapping) any memory address in the addressable space.

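It says "because it confuses those who don't know what addresses are about", and that's true: once you learn what addresses are about, you won't be confused. In theory, a pointer is a variable that points to another variable; in practice it holds an address, namely the address of the variable it points to. I don't know why anyone should hide this fact; it's not rocket science. If you understand pointers, you are one step closer to understanding how computers work. Go ahead!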

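Quick summary (which I will also put at the top):

(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.

(1) But on many, perhaps most, compilers, pointers to functions are not addresses; they are bigger than an address (typically 2x, sometimes more), or are actually pointers to a struct in memory that contains the address of the function and things like a constant pool.

(2) Pointers to data members and pointers to methods are often even stranger.

(3) Legacy x86 code with FAR and NEAR pointer issues.

(4) Several examples, most notably the IBM AS/400, with secure "fat pointers".

I am sure you can find more.

DETAIL:

UMMPPHHH!! Many of the answers so far are fairly typical "programmer weenie" answers, but not compiler weenie or hardware weenie answers. Since I pretend to be a hardware weenie, and often work with compiler weenies, let me throw in my two cents:

On many, probably most, C compilers, a pointer to data of type T is, in fact, the address of T.

Fine.

But even on many of these compilers, certain pointers are NOT addresses. You can tell this by looking at sizeof(ThePointer).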

For example, pointers to functions are sometimes quite a lot bigger than ordinary addresses. Or, they may involve a level of indirection. This article provides one description, involving the Intel Itanium processor, but I have seen others. Typically, to call a function you must know not only the address of the function code, but also the address of the function's constant pool - a region of memory from which constants are loaded with a single load instruction, rather than the compiler having to generate a 64 bit constant out of several Load Immediate and Shift and OR instructions. So, rather than a single 64 bit address, you need 2 64 bit addresses. Some ABIs (Application Binary Interfaces) move this around as 128 bits, whereas others use a level of indirection, with the function pointer actually being the address of a function descriptor that contains the 2 actual addresses just mentioned. Which is better? Depends on your point of view: performance, code size, and some compatibility issues - often code assumes that a pointer can be cast to a long or a long long, but may also assume that the long long is exactly 64 bits. Such code may not be standards compliant, but nevertheless customers may want it to work.
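A hedged sketch of the two ideas in the paragraph above: comparing the sizes of a data pointer and a function pointer, and modelling a "function descriptor" that carries both a code address and a constant pool address. The struct layout and all names are invented for illustration; they are not the Itanium ABI or any real one. Note also that in plain C on most mainstream targets the two sizes come out the same.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes of an object pointer and a function pointer; C does not require them to match. */
size_t data_pointer_size(void)     { return sizeof(void *); }
size_t function_pointer_size(void) { return sizeof(void (*)(void)); }

/* Hypothetical function descriptor: code entry point plus the constant pool it needs. */
struct func_descriptor {
    int (*entry)(const int *pool, int arg);
    const int *constant_pool;
};

/* Calling through a descriptor costs one extra load to find both addresses. */
int call_through(const struct func_descriptor *fd, int arg)
{
    assert(fd != NULL && fd->entry != NULL);
    return fd->entry(fd->constant_pool, arg);
}

/* Example callee: reads a constant from its pool rather than from an immediate. */
int add_bias(const int *pool, int x)
{
    return x + pool[0];
}
```

In this model the "function pointer" a program passes around would be the address of a struct func_descriptor, which is the level-of-indirection variant described above.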

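Many of us have painful memories of the old Intel x86 segmented architecture, with NEAR pointers and FAR pointers. Thankfully these are nearly extinct by now, so only a quick summary: in 16 bit real mode, the actual linear address was

LinearAddress = (SegmentRegister[SegNum].base << 4) + Offset

whereas in protected mode, it might be

LinearAddress = SegmentRegister[SegNum].base + Offset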

with the resulting address being checked against a limit set in the segment. Some programs used not really standard C/C++ FAR and NEAR pointer declarations, but many just said *T --- but there were compiler and linker switches so, for example, code pointers might be near pointers, just a 32 bit offset against whatever is in the CS (Code Segment) register, while the data pointers might be FAR pointers, specifying both a 16 bit segment number and a 32 bit offset for a 48 bit value. Now, both of these quantities are certainly related to the address, but since they aren't the same size, which of them is the address? Moreover, the segments also carried permissions - read-only, read-write, executable - in addition to stuff related to the actual address.
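To make the real-mode case concrete, here is a small sketch. It assumes the 16-bit segment value itself acts as the base, and it ignores A20 wrap-around; struct far_ptr is a made-up stand-in for a compiler's FAR pointer representation. It shows that two far pointers with different bits can name the same byte, which is one reason "the pointer is the address" gets murky:

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit real mode: linear address formed from segment:offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    assert(linear <= 0x10FFEFu);  /* largest value reachable, at FFFF:FFFF */
    return linear;
}

/* Hypothetical far pointer value: a segment plus an offset. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Returns 1 if both far pointers refer to the same linear address. */
int far_ptr_same_location(struct far_ptr a, struct far_ptr b)
{
    return real_mode_linear(a.segment, a.offset)
        == real_mode_linear(b.segment, b.offset);
}
```

For example, 1234:0010 and 1235:0000 both map to linear address 0x12350, so comparing the raw bits of two far pointers does not tell you whether they point to the same place.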

A more interesting example, IMHO, is (or, perhaps, was) the IBM AS/400 family. This computer was one of the first to implement an OS in C++. Pointers on this machine were typically 2X the actual address size - e.g. as this presentation says, 128 bit pointers, but the actual addresses were 48-64 bits, and, again, some extra info, what is called a capability, that provided permissions such as read, write, as well as a limit to prevent buffer overflow. Yes: you can do this compatibly with C/C++ -- and if this were ubiquitous, the Chinese PLA and slavic mafia would not be hacking into so many Western computer systems. But historically most C/C++ programming has neglected security for performance. Most interestingly, the AS400 family allowed the operating system to create secure pointers, that could be given to unprivileged code, but which the unprivileged code could not forge or tamper with. Again, security, and while standards compliant, much sloppy non-standards compliant C/C++ code will not work in such a secure system. Again, there are official standards, and there are de-facto standards.
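A very rough software model of the capability idea follows. The struct layout, flag names, and functions are all hypothetical, and this is not the AS/400 pointer format. On a real capability machine the checks are enforced by hardware and the pointer cannot be forged, whereas here they are ordinary code that any caller could bypass:

```c
#include <assert.h>
#include <stddef.h>

enum { CAP_READ = 1, CAP_WRITE = 2 };

/* Hypothetical "fat pointer": an address plus a bound and permission bits. */
struct fat_ptr {
    unsigned char *base;
    size_t length;
    unsigned perms;
};

/* Reads one byte; returns 0 on success, -1 if out of bounds or not permitted. */
int fat_load(struct fat_ptr p, size_t index, unsigned char *out)
{
    assert(out != NULL);
    if (!(p.perms & CAP_READ) || index >= p.length)
        return -1;
    *out = p.base[index];
    return 0;
}

/* Writes one byte; returns 0 on success, -1 if out of bounds or not permitted. */
int fat_store(struct fat_ptr p, size_t index, unsigned char value)
{
    if (!(p.perms & CAP_WRITE) || index >= p.length)
        return -1;
    p.base[index] = value;
    return 0;
}
```

The point of the sketch is only that such a pointer carries more than an address, so its size and its bit pattern cannot be those of a plain address.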

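Now I'll get off my security soapbox and mention some other ways in which pointers (of various types) are often not really addresses: pointers to data members, pointers to member functions (methods), and the static versions thereof are bigger than an ordinary address. As this post says:

"There are many ways of solving this [problems related to single versus multiple inheritance, and virtual inheritance]. Here's how the Visual Studio compiler decides to handle it: a pointer to a member function of a multiply-inherited class is really a structure." And they go on to say: "Casting a function pointer can change its size!"

As you can probably guess from my pontificating on (in)security, I've been involved in C/C++ hardware/software projects where a pointer was treated more like a capability than a raw address.

I could go on, but I hope you get the idea.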

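Quick summary (which I will also put at the top):

(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.

(1) But on many, perhaps most, compilers, pointers to functions are not addresses; they are bigger than an address (typically 2X, sometimes more), or are actually pointers to a struct in memory that contains the address of the function and things like a constant pool.

(2) Pointers to data members and pointers to methods are often even stranger.

(3) Legacy x86 code with FAR and NEAR pointer issues.

(4) Several examples, most notably the IBM AS/400, with secure "fat pointers".

I am sure you can find more.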

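You are right and sane. Normally, a pointer is just an address, so you can cast it to an integer and do any arithmetic on it.

But sometimes a pointer is only part of an address. On some architectures a pointer is turned into an address by adding a base, or with the help of another CPU register.

But these days, on PC and ARM architectures with a flat memory model and natively compiled C, it is OK to think of a pointer as an integer address of some place in one-dimensional addressable RAM.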
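Under that flat-memory assumption (and only under it), the integer view can be checked directly. The sketch below uses a made-up helper, byte_gap, and assumes uintptr_t is available. The result is implementation-defined as far as the standard is concerned, but on typical PC and ARM toolchains adjacent array elements differ by exactly sizeof(element):

```c
#include <assert.h>
#include <stdint.h>

/* Byte distance between two addresses, computed on their integer forms.
 * Meaningful only on implementations with a flat memory model. */
uintptr_t byte_gap(const void *lower, const void *higher)
{
    uintptr_t lo = (uintptr_t)lower;
    uintptr_t hi = (uintptr_t)higher;
    assert(hi >= lo);
    return hi - lo;
}
```

On such a platform, byte_gap(&a[0], &a[2]) for an int array a gives 2 * sizeof(int); on a segmented or capability machine the same subtraction might mean nothing.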