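Brief summary
(also repeated at the end):
(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.
(1) But on many, perhaps most, compilers, pointers to functions are not addresses. They are either bigger than an address (typically 2x, sometimes more), or are actually pointers to a struct in memory that contains the address of the function and things like a constant pool.
(2) Pointers to data members and pointers to methods are often even stranger.
(3) Legacy x86 code has FAR and NEAR pointer issues.
(4) There are several examples, most notably the IBM AS/400, of secure "fat pointers".
I am sure you can find more.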
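Details:
UMMPPHHH!! Many of the answers so far are fairly typical "programmer weenie" answers, but not compiler weenie or hardware weenie answers. Since I pretend to be a hardware weenie, and often work with compiler weenies, let me throw in my two cents:
On many, probably most, C compilers, a pointer to data of type T is, in fact, the address of T.
Fine.
But, even on many of these compilers, certain pointers are NOT addresses. You can tell this by looking at sizeof(ThePointer).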
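One way to check this on your own toolchain is simply to compare sizes. The following is a minimal sketch; the names `PointerSizes`, `pointer_sizes` and `print_pointer_sizes` are made up for illustration, and the numbers it reports depend entirely on the compiler and ABI in use.

```cpp
#include <cstddef>
#include <cstdio>

// Sizes, in bytes, of a few pointer types on the current toolchain.
// (Hypothetical helper type, used only for this illustration.)
struct PointerSizes {
    std::size_t int_pointer;       // pointer to ordinary data
    std::size_t void_pointer;      // generic data pointer
    std::size_t function_pointer;  // pointer to a free function
};

// Collect the sizes; the values are whatever this compiler and ABI chose.
constexpr PointerSizes pointer_sizes() {
    return PointerSizes{sizeof(int*), sizeof(void*), sizeof(int (*)(int))};
}

// Print the sizes so they can be compared by eye.
inline void print_pointer_sizes() {
    constexpr PointerSizes s = pointer_sizes();
    std::printf("int*         : %zu bytes\n", s.int_pointer);
    std::printf("void*        : %zu bytes\n", s.void_pointer);
    std::printf("int (*)(int) : %zu bytes\n", s.function_pointer);
}
```

Calling `print_pointer_sizes()` from `main` on a mainstream x86-64 or AArch64 toolchain will probably show three equal sizes. The differences described in this answer show up on other ABIs and with other kinds of pointers.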
For example, pointers to functions are sometimes quite a lot bigger than ordinary addresses. Or they may involve a level of indirection. This article provides one description, involving the Intel Itanium processor, but I have seen others. Typically, to call a function you must know not only the address of the function's code, but also the address of the function's constant pool: a region of memory from which constants are loaded with a single load instruction, so that the compiler does not have to build a 64-bit constant out of several Load Immediate, Shift and OR instructions. So, rather than a single 64-bit address, you need two 64-bit addresses. Some ABIs (Application Binary Interfaces) move this around as 128 bits, whereas others use a level of indirection, with the function pointer actually being the address of a function descriptor that contains the two addresses just mentioned. Which is better? It depends on your point of view: performance, code size, and some compatibility issues. Code often assumes that a pointer can be cast to a long or a long long, and may also assume that a long long is exactly 64 bits. Such code may not be standards compliant, but customers may nevertheless want it to work.
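The two flavors just described can be modeled in ordinary C++. This is only a sketch of the idea, not any real ABI's layout: `FunctionDescriptor`, `add_big_constant`, `kPool` and the two call helpers are hypothetical names, and a real compiler would do all of this behind your back.

```cpp
#include <cstdint>

// Hypothetical model of a function descriptor: the code address plus the
// address of the constant pool that the code expects to have at hand.
struct FunctionDescriptor {
    std::int64_t (*entry)(const std::int64_t* pool, std::int64_t arg);
    const std::int64_t* constant_pool;
};

// A "function" that fetches a large constant from its pool with one load
// instead of assembling it from immediates.
inline std::int64_t add_big_constant(const std::int64_t* pool, std::int64_t arg) {
    return arg + pool[0];
}

const std::int64_t kPool[] = {0x0123456789ABCDEF};
const FunctionDescriptor kAddBigDescriptor = {&add_big_constant, kPool};

// Flavor 1: the "function pointer" is the address of a descriptor, so it is
// the size of one address but costs an extra level of indirection.
using IndirectFunctionPointer = const FunctionDescriptor*;

inline std::int64_t call_via_descriptor(IndirectFunctionPointer fp, std::int64_t arg) {
    return fp->entry(fp->constant_pool, arg);
}

// Flavor 2: the "function pointer" is the descriptor itself, passed by
// value, so it is as wide as two addresses.
inline std::int64_t call_wide(FunctionDescriptor fp, std::int64_t arg) {
    return fp.entry(fp.constant_pool, arg);
}
```

In this model, `call_via_descriptor(&kAddBigDescriptor, 1)` and `call_wide(kAddBigDescriptor, 1)` compute the same result; what differs is what the caller has to carry around as the "pointer".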
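Many of us have painful memories of the old Intel x86 segmented architecture, with its NEAR and FAR pointers. Thankfully these are nearly extinct by now, so only a quick summary: in 16-bit real mode, the actual linear address is
LinearAddress = (SegmentRegister[SegNum] << 4) + Offset
whereas in protected mode it might be
LinearAddress = SegmentRegister[SegNum].base + Offset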
with the resulting address being checked against a limit set in the segment. Some programs used nonstandard C/C++ FAR and NEAR pointer declarations, but many just said T*, and there were compiler and linker switches so that, for example, code pointers might be NEAR pointers, just a 32-bit offset against whatever is in the CS (Code Segment) register, while data pointers might be FAR pointers, specifying both a 16-bit segment number and a 32-bit offset for a 48-bit value. Now, both of these quantities are certainly related to the address, but since they are not the same size, which of them is the address? Moreover, the segments also carried permissions (read-only, read-write, executable) in addition to the information related to the actual address.
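The address arithmetic above can be written out as a small sketch. The types `FarPointer16` and `SegmentDescriptor` and both functions are illustrative names, not anything from a real compiler or OS header, and the protected-mode check is simplified (real descriptors also have granularity bits, expand-down segments and permission checks).

```cpp
#include <cstdint>
#include <optional>

// A 16-bit real-mode FAR pointer: a segment value plus an offset.
struct FarPointer16 {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Real mode: shift the segment value left by 4 bits, then add the offset.
// The shift is parenthesized because in C and C++ '+' binds tighter than '<<'.
constexpr std::uint32_t real_mode_linear(FarPointer16 p) {
    return (static_cast<std::uint32_t>(p.segment) << 4) + p.offset;
}

// Simplified protected-mode segment: a base address and the highest valid offset.
struct SegmentDescriptor {
    std::uint32_t base;
    std::uint32_t limit;
};

// Protected mode: check the offset against the limit, then add the base.
// Returns std::nullopt where real hardware would raise a fault.
inline std::optional<std::uint32_t> protected_mode_linear(const SegmentDescriptor& seg,
                                                          std::uint32_t offset) {
    if (offset > seg.limit) {
        return std::nullopt;
    }
    return seg.base + offset;
}
```

Note that `real_mode_linear` maps both 0x1234:0x0005 and 0x1000:0x2345 to the linear address 0x12345, so two FAR pointers with different bit patterns can name the same byte. That is one more reason a FAR pointer is not simply "the address".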
A more interesting example, IMHO, is (or, perhaps, was) the IBM AS/400 family. This computer was one of the first to implement an OS in C++. Pointers on this machine were typically 2x the actual address size: as this presentation says, the pointers were 128 bits, but the actual addresses were 48 to 64 bits. The rest was, again, extra information, called a capability, that provided permissions such as read and write, as well as a limit to prevent buffer overflow. Yes, you can do this compatibly with C/C++, and if this were ubiquitous, the Chinese PLA and Slavic mafia would not be hacking into so many Western computer systems. But historically most C/C++ programming has neglected security for performance. Most interestingly, the AS/400 family allowed the operating system to create secure pointers that could be given to unprivileged code, but which the unprivileged code could not forge or tamper with. Again, this is about security: although such a system is standards compliant, much sloppy, non-standards-compliant C/C++ code will not work on it. Again, there are official standards, and there are de facto standards.
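To make the idea of a "fat pointer" concrete, here is a minimal software sketch of a pointer that carries a bound and permissions along with the address. It is not the AS/400 format: `FatPointer`, `checked_load`, `checked_store` and `restrict_to_read_only` are hypothetical names, and plain C++ cannot reproduce the key hardware property that unprivileged code is unable to forge or tamper with such a pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Permission bits carried alongside the address (illustrative encoding).
constexpr unsigned kRead = 1u << 0;
constexpr unsigned kWrite = 1u << 1;

// Address plus the extra information that makes the pointer "fat".
struct FatPointer {
    std::uint8_t* base;    // the actual address
    std::size_t length;    // number of bytes this pointer may touch
    unsigned permissions;  // combination of kRead and kWrite
};

// Load one byte if the pointer allows reading and the index is in bounds.
inline std::optional<std::uint8_t> checked_load(const FatPointer& p, std::size_t index) {
    if ((p.permissions & kRead) == 0 || index >= p.length) {
        return std::nullopt;
    }
    return p.base[index];
}

// Store one byte if the pointer allows writing and the index is in bounds.
inline bool checked_store(const FatPointer& p, std::size_t index, std::uint8_t value) {
    if ((p.permissions & kWrite) == 0 || index >= p.length) {
        return false;
    }
    p.base[index] = value;
    return true;
}

// Derive a weaker pointer: same address and bound, but without write permission.
inline FatPointer restrict_to_read_only(FatPointer p) {
    p.permissions &= ~kWrite;
    return p;
}
```

With a 4-byte buffer wrapped in a `FatPointer` of length 4, an access at index 4 is refused instead of overflowing the buffer, and a pointer passed through `restrict_to_read_only` can still be read but no longer written.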
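Now, I'll get off my security soapbox and mention some other ways in which pointers (of various types) are often not really addresses: pointers to data members, pointers to member functions (methods), and the static versions thereof are often bigger than an ordinary address. As this post says:
"There are many ways of solving this [problems related to single versus multiple inheritance, and virtual inheritance]. Here's how the Visual Studio compiler decides to handle it: a pointer to a member function of a multiply-inherited class is really a structure."
And they go on to say: "Casting a function pointer can change its size!"
As you can probably guess from my pontificating on (in)security, I have been involved in C/C++ hardware/software projects where a pointer was treated more like a capability than a raw address.
I could go on, but I hope you get the idea.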
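The claim about member pointers can be checked with a short sketch like the one below. The class names and helper functions are invented for illustration. The sizes are whatever your compiler chooses: as far as I know, compilers following the Itanium C++ ABI (GCC, Clang) typically use two words for every pointer to member function, while Visual C++ varies the size with the inheritance model, but treat both statements as assumptions to verify.

```cpp
#include <cstddef>

struct Base1 { int a = 1; int get_a() const { return a; } };
struct Base2 { int b = 2; int get_b() const { return b; } };
struct Single : Base1 { int c = 3; int get_c() const { return c; } };
struct Multiple : Base1, Base2 { int d = 4; int get_d() const { return d; } };
struct Virtual : virtual Base1 { int e = 5; int get_e() const { return e; } };

// Sizes, in bytes, of several pointer kinds (hypothetical helper type).
struct MemberPointerSizes {
    std::size_t plain_function;        // void (*)()
    std::size_t data_member;           // int Base1::*
    std::size_t single_inheritance;    // member function of Single
    std::size_t multiple_inheritance;  // member function of Multiple
    std::size_t virtual_inheritance;   // member function of Virtual
};

inline MemberPointerSizes member_pointer_sizes() {
    return MemberPointerSizes{
        sizeof(void (*)()),
        sizeof(int Base1::*),
        sizeof(int (Single::*)() const),
        sizeof(int (Multiple::*)() const),
        sizeof(int (Virtual::*)() const),
    };
}

// Calling a Base2 method through a Multiple member pointer requires a 'this'
// adjustment, which is why such a pointer needs more than a code address.
inline int call_on_multiple(const Multiple& m, int (Multiple::*pmf)() const) {
    return (m.*pmf)();
}
```

For a `Multiple` object `m`, `call_on_multiple(m, &Multiple::get_b)` reaches the `Base2` subobject correctly even though that subobject does not sit at the start of `m`. The extra bookkeeping for this is what the quoted post means by "really a structure".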
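Brief summary
(also placed at the top):
(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.
(1) But on many, perhaps most, compilers, pointers to functions are not addresses. They are either bigger than an address (typically 2x, sometimes more), or are actually pointers to a struct in memory that contains the address of the function and things like a constant pool.
(2) Pointers to data members and pointers to methods are often even stranger.
(3) Legacy x86 code has FAR and NEAR pointer issues.
(4) There are several examples, most notably the IBM AS/400, of secure "fat pointers".
I am sure you can find more.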