它是链表还是数组?我四处搜寻,只发现人们在猜测。我的C语言知识还不够好,不能看源代码。


当前回答

我发现这篇文章对于理解如何使用python代码实现列表非常有帮助。

class OhMyList:
    def __init__(self):
        self.length = 0
        self.capacity = 8
        self.array = (self.capacity * ctypes.py_object)()

    def append(self, item):
        if self.length == self.capacity:
            self._resize(self.capacity*2)
        self.array[self.length] = item
        self.length += 1

    def _resize(self, new_cap):
        new_arr = (new_cap * ctypes.py_object)()
        for idx in range(self.length):
            new_arr[idx] = self.array[idx]
        self.array = new_arr
        self.capacity = new_cap

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        return self.array[idx]

其他回答

正如上面其他人所述,列表(当相当大时)是通过分配固定数量的空间来实现的,如果该空间应该被填满,则分配更大数量的空间并复制元素。

To understand why the method is O(1) amortized, without loss of generality, assume we have inserted a = 2^n elements, and we now have to double our table to 2^(n+1) size. That means we're currently doing 2^(n+1) operations. Last copy, we did 2^n operations. Before that we did 2^(n-1)... all the way down to 8,4,2,1. Now, if we add these up, we get 1 + 2 + 4 + 8 + ... + 2^(n+1) = 2^(n+2) - 1 < 4*2^n = O(2^n) = O(a) total insertions (i.e. O(1) amortized time). Also, it should be noted that if the table allows deletions the table shrinking has to be done at a different factor (e.g 3x)

Python中的列表类似于数组,可以存储多个值。列表是可变的,这意味着你可以改变它。你应该知道的更重要的事情是,当我们创建一个列表时,Python会自动为该列表变量创建一个reference_id。如果你通过分配其他变量来改变它,主列表也会改变。让我们举个例子:

list_one = [1,2,3,4]

my_list = list_one

#my_list: [1,2,3,4]

my_list.append("new")

#my_list: [1,2,3,4,'new']
#list_one: [1,2,3,4,'new']

我们添加了my_list,但是主列表已经改变了。那意味着列表没有赋值为一个复制列表赋值为它的引用。

实际上,C代码非常简单。扩展一个宏并删除一些不相关的注释,基本结构在listobject.h中,它将列表定义为:

typedef struct {
    PyObject_HEAD
    Py_ssize_t ob_size;

    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     */
    Py_ssize_t allocated;
} PyListObject;

PyObject_HEAD包含引用计数和类型标识符。这是一个过度分配的向量/数组。当数组满时,调整数组大小的代码在listobject.c中。它实际上并没有使数组加倍,而是通过分配来增长

new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
new_allocated += newsize;

到每次的容量,其中newsize是请求的大小(不一定分配+ 1,因为您可以扩展任意数量的元素,而不是逐个添加它们)。

参见Python常见问题解答。

在CPython中,list是作为动态数组实现的,因此当我们追加时,不仅添加了一个宏,而且分配了更多的空间,这样每次都不应该添加新的空间。

我推荐Laurent Luce的文章“Python列表实现”。对我来说真的很有用,因为作者解释了如何在CPython中实现列表,并为此使用了出色的图表。

List object C structure A list object in CPython is represented by the following C structure. ob_item is a list of pointers to the list elements. allocated is the number of slots allocated in memory. typedef struct { PyObject_VAR_HEAD PyObject **ob_item; Py_ssize_t allocated; } PyListObject; It is important to notice the difference between allocated slots and the size of the list. The size of a list is the same as len(l). The number of allocated slots is what has been allocated in memory. Often, you will see that allocated can be greater than size. This is to avoid needing calling realloc each time a new elements is appended to the list.

...

附加 我们在列表中添加一个整数:l.p append(1)。会发生什么呢? 我们继续添加一个元素:l.p append(2)。当n+1 = 2时调用List_resize,但因为分配的大小是4,所以不需要分配更多的内存。当我们再加两个整数l.append(3), l.append(4)时也会发生同样的事情。下面的图表显示了我们目前所拥有的。

...

插入 让我们在位置1插入一个新的整数(5):l.insert(1,5),看看内部发生了什么。

...

Pop When you pop the last element: l.pop(), listpop() is called. list_resize is called inside listpop() and if the new size is less than half of the allocated size then the list is shrunk. You can observe that slot 4 still points to the integer but the important thing is the size of the list which is now 4. Let’s pop one more element. In list_resize(), size – 1 = 4 – 1 = 3 is less than half of the allocated slots so the list is shrunk to 6 slots and the new size of the list is now 3. You can observe that slot 3 and 4 still point to some integers but the important thing is the size of the list which is now 3.

...

删除 Python列表对象有一个删除特定元素的方法:l.remove(5)。