它是链表还是数组?我四处搜寻,只发现人们在猜测。我的C语言知识还不够好,不能看源代码。


当前回答

正如上面其他人所述,列表(当相当大时)是通过分配固定数量的空间来实现的,如果该空间应该被填满,则分配更大数量的空间并复制元素。

To understand why the method is O(1) amortized, without loss of generality, assume we have inserted a = 2^n elements, and we now have to double our table to 2^(n+1) size. That means we're currently doing 2^(n+1) operations. Last copy, we did 2^n operations. Before that we did 2^(n-1)... all the way down to 8,4,2,1. Now, if we add these up, we get 1 + 2 + 4 + 8 + ... + 2^(n+1) = 2^(n+2) - 1 < 4*2^n = O(2^n) = O(a) total insertions (i.e. O(1) amortized time). Also, it should be noted that if the table allows deletions the table shrinking has to be done at a different factor (e.g 3x)

其他回答

根据文档,

Python的列表实际上是变长数组,而不是lisp风格的链表。

我推荐Laurent Luce的文章“Python列表实现”。对我来说真的很有用,因为作者解释了如何在CPython中实现列表,并为此使用了出色的图表。

List object C structure A list object in CPython is represented by the following C structure. ob_item is a list of pointers to the list elements. allocated is the number of slots allocated in memory. typedef struct { PyObject_VAR_HEAD PyObject **ob_item; Py_ssize_t allocated; } PyListObject; It is important to notice the difference between allocated slots and the size of the list. The size of a list is the same as len(l). The number of allocated slots is what has been allocated in memory. Often, you will see that allocated can be greater than size. This is to avoid needing calling realloc each time a new elements is appended to the list.

...

附加 我们在列表中添加一个整数:l.p append(1)。会发生什么呢? 我们继续添加一个元素:l.p append(2)。当n+1 = 2时调用List_resize,但因为分配的大小是4,所以不需要分配更多的内存。当我们再加两个整数l.append(3), l.append(4)时也会发生同样的事情。下面的图表显示了我们目前所拥有的。

...

插入 让我们在位置1插入一个新的整数(5):l.insert(1,5),看看内部发生了什么。

...

Pop When you pop the last element: l.pop(), listpop() is called. list_resize is called inside listpop() and if the new size is less than half of the allocated size then the list is shrunk. You can observe that slot 4 still points to the integer but the important thing is the size of the list which is now 4. Let’s pop one more element. In list_resize(), size – 1 = 4 – 1 = 3 is less than half of the allocated slots so the list is shrunk to 6 slots and the new size of the list is now 3. You can observe that slot 3 and 4 still point to some integers but the important thing is the size of the list which is now 3.

...

删除 Python列表对象有一个删除特定元素的方法:l.remove(5)。

在CPython中,list是作为动态数组实现的,因此当我们追加时,不仅添加了一个宏,而且分配了更多的空间,这样每次都不应该添加新的空间。

它是一个动态数组。实践证明:无论索引如何,索引都需要相同的时间(当然差异非常小(0.0013µs !)):

...>python -m timeit --setup="x = [None]*1000" "x[500]"
10000000 loops, best of 3: 0.0579 usec per loop

...>python -m timeit --setup="x = [None]*1000" "x[0]"
10000000 loops, best of 3: 0.0566 usec per loop

如果IronPython或Jython使用链表,我会感到震惊——它们会破坏许多广泛使用的库的性能,这些库建立在链表是动态数组的假设之上。

我发现这篇文章对于理解如何使用python代码实现列表非常有帮助。

class OhMyList:
    def __init__(self):
        self.length = 0
        self.capacity = 8
        self.array = (self.capacity * ctypes.py_object)()

    def append(self, item):
        if self.length == self.capacity:
            self._resize(self.capacity*2)
        self.array[self.length] = item
        self.length += 1

    def _resize(self, new_cap):
        new_arr = (new_cap * ctypes.py_object)()
        for idx in range(self.length):
            new_arr[idx] = self.array[idx]
        self.array = new_arr
        self.capacity = new_cap

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        return self.array[idx]