What is array-to-pointer decay? Is it related to array pointers?


Current answer

Arrays are automatically passed by pointer in C. The rationale behind this can only be speculated on.

int a[5], int *a and int (*a)[5] are all glorified addresses, but the compiler treats arithmetic and dereference operators on them differently depending on the type, so even when they refer to the same address they are not treated the same. int a[5] differs from the other two in that the address is implicit: it is not stored on the stack or in the executable as part of the array itself, and the compiler uses it only to resolve certain operations, such as taking the array's address or doing pointer arithmetic. int a[5] is therefore an array as well as an implicit address, but as soon as you talk about the address itself and place it on the stack, that address is no longer an array. It can only be a pointer to an array or a decayed array, i.e. a pointer to the first member of the array.

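For example, on int (*a)[5], the first dereference of a produces an int[5], which decays to int * in most expressions (so the same address, just a different type). Pointer arithmetic on a, i.e. a+1 or *(a+1), is in terms of the size of an array of 5 ints (the data type it points to), and the second dereference produces an int. On int a[5], however, the first dereference produces an int, and pointer arithmetic is in terms of the size of an int.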
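A minimal sketch of the paragraph above, assuming C++11 or later. The names arr, p_arr, p_first and the two step_of_* helpers are made up for illustration; the static_asserts check the types at compile time, and the helpers measure how many bytes "+ 1" moves each kind of pointer.

```cpp
#include <cstddef>
#include <type_traits>

int arr[5] = {10, 20, 30, 40, 50};
int (*p_arr)[5] = &arr;   // pointer to the whole array
int *p_first = arr;       // decayed: pointer to the first element

// The first dereference is an lvalue of type int[5]; decltype reports int(&)[5].
static_assert(std::is_same<decltype(*p_arr), int (&)[5]>::value,
              "*p_arr is an array");
// Using it in arithmetic forces the array-to-pointer conversion: int*.
static_assert(std::is_same<decltype(*p_arr + 0), int *>::value,
              "*p_arr decays to int*");
// The second dereference is an int.
static_assert(std::is_same<decltype(**p_arr), int &>::value,
              "**p_arr is an int");

// Bytes covered by "+ 1" on a pointer to an array of 5 ints.
std::size_t step_of_array_pointer() {
    return static_cast<std::size_t>(reinterpret_cast<char *>(p_arr + 1) -
                                    reinterpret_cast<char *>(p_arr));
}

// Bytes covered by "+ 1" on a pointer to a single int.
std::size_t step_of_element_pointer() {
    return static_cast<std::size_t>(reinterpret_cast<char *>(p_first + 1) -
                                    reinterpret_cast<char *>(p_first));
}
```

On a typical platform step_of_array_pointer() returns 5 * sizeof(int) while step_of_element_pointer() returns sizeof(int), even though both pointers start at the same address.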

To a function, you can only pass int * and int (*)[5], and the argument is converted to whatever the parameter type is, so within the function you have a choice whether to treat the address being passed as a decayed array or as a pointer to an array (where the function has to specify the size of the array being passed). If you pass a to a function and a is defined as int a[5], then a resolves to an address, so you are passing an address, and an address can only be a pointer type. In the function, the parameter it accesses is then an address on the stack or in a register, which can only be a pointer type and not an array type. It is an actual address on the stack and is therefore clearly not the array itself.

You lose the size of the array because the parameter, being an address, has a pointer type rather than an array type, and a pointer type carries no array size. You can see this with sizeof, which works on the type of the expression passed to it. The parameter type int a[5] is allowed in place of int *a, but it is treated as int *. Arguably it should be disallowed outright, because it is misleading: it makes you think the size information can be used. You can only get that by declaring the parameter as int (*a)[5], and then the function itself has to specify the size of the array, because the bound in that type must be a compile-time constant (C99 variable-length arrays aside).
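A short sketch of that size loss, assuming C++11 or later. The three helper functions are hypothetical names invented for this example; the last one shows the usual workaround of passing the length alongside the pointer.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: written as int a[5], but the compiler adjusts the
// parameter to int*, so the 5 carries no information.
std::size_t size_via_array_param(int a[5]) {
    static_assert(std::is_same<decltype(a), int *>::value,
                  "the parameter is really int*");
    return sizeof(decltype(a));   // size of a pointer, not of 5 ints
}

// Hypothetical helper: a pointer to an array keeps the bound in its type.
std::size_t size_via_pointer_to_array(int (*a)[5]) {
    return sizeof(*a);            // 5 * sizeof(int)
}

// Hypothetical helper: the length travels as a separate argument.
int sum(const int *a, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += a[i];
    }
    return total;
}
```

Given int data[5], the calls would be size_via_array_param(data), size_via_pointer_to_array(&data) and sum(data, 5). Only the second one knows the array size from the type alone.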

Other answers

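It's said that arrays "decay" into pointers. A C++ array declared as int numbers[5] cannot be re-pointed, i.e. you can't say numbers = 0x5a5aff23. More importantly, the term decay signifies loss of type and dimension; numbers decays into int* by losing the dimension information (count 5), and the type is no longer int[5]. Look here for cases where the decay doesn't happen.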

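If you're passing an array by value, what you're really doing is copying a pointer: a pointer to the array's first element is copied to the parameter (whose type should also be a pointer to the array element's type). This works because of the array's decaying nature; once decayed, sizeof no longer gives the complete array's size, because it has essentially become a pointer. This is why it's preferred (among other reasons) to pass by reference or pointer.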
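One visible consequence, as a rough sketch: since only a pointer is copied, the callee writes into the caller's array. Both function names below are made up for the example.

```cpp
// Hypothetical helper: looks like pass-by-value, but only a pointer is copied.
void set_first_to_zero(int array[]) {
    array[0] = 0;   // writes through the pointer into the caller's array
}

// Hypothetical helper: returns what the caller sees after the call.
int first_after_call() {
    int numbers[5] = {1, 2, 3, 4, 5};
    set_first_to_zero(numbers);   // numbers decays to &numbers[0]
    return numbers[0];            // the caller's element was modified
}
```

If the array had really been copied, first_after_call() would still see 1 in numbers[0].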

Three ways to pass in an array [1]:

void by_value(const T* array);   // const T array[] means the same
void by_pointer(const T (*array)[U]);
void by_reference(const T (&array)[U]);

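The last two will give proper sizeof info, while the first one won't, since the array argument has decayed by the time it is assigned to the parameter.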

[1] The constant U should be known at compile time.

Here's what the standard says (C99 6.3.2.1/3 - Other operands - Lvalues, arrays, and function designators):

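Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue.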

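This means that pretty much any time the array name is used in an expression, it is automatically converted to a pointer to the first item in the array.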
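The rule and its three exceptions can be checked at compile time. This is only a sketch, written in C++11 rather than C so that static_assert and decltype are available; items, first and word are names invented for the example.

```cpp
#include <type_traits>

int items[3] = {7, 8, 9};

// Used as a value, the array name converts to a pointer to its first element.
auto first = items;   // deduced as int*
static_assert(std::is_same<decltype(first), int *>::value,
              "items decayed to int*");

// Exception 1: as the operand of sizeof, no conversion takes place.
static_assert(sizeof(items) == 3 * sizeof(int),
              "sizeof sees the whole array");

// Exception 2: as the operand of unary &, the result points to the array.
static_assert(std::is_same<decltype(&items), int (*)[3]>::value,
              "&items is int(*)[3]");

// Exception 3: a string literal that initializes an array is copied into it.
char word[] = "abc";
static_assert(sizeof(word) == 4,
              "three characters plus the terminator");
```

If any of these assumptions were wrong, the translation unit would fail to compile, so no runtime output is needed.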

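Note that function names act in a similar way, but function pointers are used far less and in a much more specialized way, so this doesn't cause nearly as much confusion as the automatic conversion of array names to pointers.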
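For comparison, a tiny sketch of the function case. The names twice, fp, fp_explicit and call_through_pointer are hypothetical.

```cpp
int twice(int x) { return 2 * x; }

// The function name converts to a pointer to the function, so & is optional.
int (*fp)(int) = twice;
int (*fp_explicit)(int) = &twice;

int call_through_pointer(int x) {
    return fp(x);   // equivalent to (*fp)(x)
}
```

Both pointers hold the same address, and calling through either one runs twice.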

The C++ standard (4.2 Array-to-pointer conversion) loosens the conversion requirement to (emphasis mine):

An lvalue or rvalue of type "array of N T" or "array of unknown bound of T" *can* be converted to an rvalue of type "pointer to T."

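So the conversion doesn't have to happen the way it pretty much always does in C (this lets function overloads or templates match on the array type).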
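A small sketch of what "templates match on the array type" buys you, assuming C++11 or later. The three function templates are hypothetical names for this example: one recovers the bound N, and the other two show that deduction by value sees a pointer while deduction by reference sees the array.

```cpp
#include <cstddef>
#include <type_traits>

// The array is bound by reference, so N is deduced from the argument's type.
template <typename T, std::size_t N>
std::size_t element_count(const T (&)[N]) {
    return N;
}

// By value: the argument decays first, so T is deduced as a pointer type.
template <typename T>
bool is_array_by_value(T) {
    return std::is_array<T>::value;
}

// By reference: no decay, so T is deduced as the array type itself.
template <typename T>
bool is_array_by_ref(T &) {
    return std::is_array<T>::value;
}
```

With int v[4], element_count(v) yields 4, is_array_by_value(v) is false and is_array_by_ref(v) is true.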

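This is also why in C you should avoid using array parameters in function prototypes/definitions (in my opinion - I'm not sure if there's any general agreement). They cause confusion and are a fiction anyway. Use pointer parameters and the confusion might not go away entirely, but at least the parameter declaration isn't lying.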

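I might be so bold as to think there are four (4) ways of passing an array as a function argument. Here is some short but working code for your perusal.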

#include <iostream>
#include <string>
#include <vector>
#include <cassert>

using namespace std;

// test data
// note the native array initialized with braces and no "=",
// which is not possible in C
// (__TIMESTAMP__ is a common compiler extension, not standard C++)
const char* specimen[]{ __TIME__, __DATE__, __TIMESTAMP__ };

// ONE
// simple, dangerous and useless
template<typename T>
void as_pointer(const T* array) { 
    // a pointer
    assert(array != nullptr); 
}

// TWO
// const T array[] would mean the same as ONE;
// a size can be written too, but it is only a hint to the reader
// and does not stop the array decaying into T*,
// so the size information is still lost
template<typename T>
void by_value_no_size(const T array[0xFF]) { 
    // decayed to a pointer
    assert( array != nullptr ); 
}

// THREE
// size information is preserved
// but pointer is asked for
template<typename T, size_t N>
void pointer_to_array(const T (*array)[N])
{
   // dealing with native pointer 
    assert( array != nullptr ); 
}

// FOUR
// no C equivalent
// array by reference
// size is preserved
template<typename T, size_t N>
void reference_to_array(const T (&array)[N])
{
    // array is not a pointer here
    // it is (almost) a container
    // most of the std:: lib algorithms 
    // do work on array reference, for example
    // range for requires std::begin() and std::end()
    // on the type passed as range to iterate over
    for (auto && elem : array )
    {
        cout << endl << elem ;
    }
}

int main()
{
     // ONE
     as_pointer(specimen);
     // TWO
     by_value_no_size(specimen);
     // THREE
     pointer_to_array(&specimen);
     // FOUR
     reference_to_array( specimen ) ;
}

I might also think this shows the superiority of C++ over C, at least in reference (pun intended) to passing an array by reference.

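Of course there are extremely strict projects with no heap allocation, no exceptions and no std:: lib. C++ native array handling is a mission-critical language feature, one might say.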

"Decay" refers to the implicit conversion of an expression from an array type to a pointer type. In most contexts, when the compiler sees an array expression it converts the type of the expression from "N-element array of T" to "pointer to T" and sets the value of the expression to the address of the first element of the array. The exceptions to this rule are when an array is an operand of either the sizeof or & operators, or the array is a string literal being used as an initializer in a declaration.

Assume the following code:

char a[80];
strcpy(a, "This is a test");

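The type of the expression a is "80-element array of char" and the type of the expression "This is a test" is "15-element array of char" (in C; in C++ string literals are arrays of const char). However, in the call to strcpy(), neither expression is an operand of the sizeof or & operators, so their types are implicitly converted to "pointer to char", and their values are set to the address of the first element in each. What strcpy() receives are not arrays, but pointers, as seen in its prototype: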

char *strcpy(char *dest, const char *src);

This is not the same thing as a pointer to an array. For example:

char a[80];
char *ptr_to_first_element = a;
char (*ptr_to_array)[80] = &a;

Both ptr_to_first_element and ptr_to_array have the same value: the base address of a. However, they are different types and are treated differently, as shown below:

a[i] == ptr_to_first_element[i] == (*ptr_to_array)[i] != *ptr_to_array[i] != ptr_to_array[i]

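Remember that the expression a[i] is interpreted as *(a+i) (which only works if the array type is converted to a pointer type), so both a[i] and ptr_to_first_element[i] work the same. The expression (*ptr_to_array)[i] is interpreted as *(*ptr_to_array+i). The expressions *ptr_to_array[i] and ptr_to_array[i] may lead to compiler warnings or errors depending on the context; they'll definitely do the wrong thing if you're expecting them to evaluate to a[i].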
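The equivalences above can be spot-checked with a sketch like the following (C++11 assumed). The three variables are the ones from the answer; the function all_forms_agree and the decision to initialize a directly instead of calling strcpy are additions made for this example.

```cpp
#include <type_traits>

char a[80] = "This is a test";
char *ptr_to_first_element = a;
char (*ptr_to_array)[80] = &a;

// (*ptr_to_array)[i] is a single char ...
static_assert(std::is_same<decltype((*ptr_to_array)[0]), char &>::value,
              "(*ptr_to_array)[i] is a char");
// ... but ptr_to_array[i] is a whole char[80], which is why it is not a[i].
static_assert(std::is_same<decltype(ptr_to_array[0]), char (&)[80]>::value,
              "ptr_to_array[i] is an array of 80 char");

// Hypothetical helper: true when every valid spelling yields the same element.
bool all_forms_agree(int i) {
    return a[i] == *(a + i) &&
           a[i] == ptr_to_first_element[i] &&
           a[i] == (*ptr_to_array)[i] &&
           a[i] == *(*ptr_to_array + i);
}
```

For any i from 0 to 79, all_forms_agree(i) should hold; indexing ptr_to_array itself with a nonzero i would step past the only array that exists.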

sizeof a == sizeof *ptr_to_array == 80

Again, when an array is an operand of sizeof, it's not converted to a pointer type.

sizeof *ptr_to_first_element == sizeof (char) == 1
sizeof ptr_to_first_element == sizeof (char *) == whatever the pointer size
                                                  is on your platform

ptr_to_first_element is a simple pointer to char.

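It's when an array rots and is being pointed at ;-)

Actually, it's just that if you want to pass an array somewhere but a pointer is passed instead (because who the hell would pass the whole array for you), people say that the poor array decayed to a pointer.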