NumPy相对于常规Python列表的优势是什么?

我有大约100个金融市场系列,我将创建一个包含100x100x100 = 100万个单元格的立方体数组。我将用每个y和z回归(3个变量)每个x,用标准误差填充数组。

我听说对于“大型矩阵”,出于性能和可伸缩性的考虑,我应该使用NumPy而不是Python列表。问题是,我知道Python列表,它们似乎对我有用。

如果我转移到NumPy会有什么好处?

如果我有1000个系列(即立方体中有10亿个浮点单元)会怎样?


当前回答

所有这些都强调了numpy数组和python列表之间几乎所有的主要区别,我将在这里简要介绍一下:

Numpy arrays have a fixed size at creation, unlike python lists (which can grow dynamically). Changing the size of ndarray will create a new array and delete the original. The elements in a Numpy array are all required to be of the same data type (we can have the heterogeneous type as well but that will not gonna permit you mathematical operations) and thus will be the same size in memory Numpy arrays are facilitated advances mathematical and other types of operations on large numbers of data. Typically such operations are executed more efficiently and with less code than is possible using pythons build in sequences

其他回答

NumPy不仅效率更高;它也更方便。你可以免费得到很多向量和矩阵的运算,有时可以避免不必要的工作。它们也得到了有效的实施。

例如,你可以直接从文件中读入一个数组:

x = numpy.fromfile(file=open("data"), dtype=float).reshape((100, 100, 100))

沿着第二个维度求和:

s = x.sum(axis=1)

找出超过阈值的单元格:

(x > 0.5).nonzero()

移除第三维度上的每一个偶数索引切片:

x[:, :, ::2]

此外,许多有用的库都使用NumPy数组。例如,统计分析和可视化库。

即使您没有性能问题,学习NumPy也是值得的。

以下是scipy.org网站常见问题解答中的一个很好的答案:

NumPy数组比(嵌套的)Python列表有什么优势?

Python’s lists are efficient general-purpose containers. They support (fairly) efficient insertion, deletion, appending, and concatenation, and Python’s list comprehensions make them easy to construct and manipulate. However, they have certain limitations: they don’t support “vectorized” operations like elementwise addition and multiplication, and the fact that they can contain objects of differing types mean that Python must store type information for every element, and must execute type dispatching code when operating on each element. This also means that very few list operations can be carried out by efficient C loops – each iteration would require type checks and other Python API bookkeeping.

The standard mutable multielement container in Python is the list. Because of Python's dynamic typing, we can even create heterogeneous list. To allow these flexible types, each item in the list must contain its own type info, reference count, and other information. That is, each item is a complete Python object. In the special case that all variables are of the same type, much of this information is redundant; it can be much more efficient to store data in a fixed-type array (NumPy-style). Fixed-type NumPy-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.

NumPy不是另一种编程语言,而是Python扩展模块。它对同构数据数组提供了快速高效的操作。 Numpy有固定的创建大小。 在Python中:列表用方括号编写。 这些列表可以是同构的,也可以是异构的 使用Numpy数组而不是Python列表的主要优点: 它消耗的内存更少。 与python列表相比速度快。 使用方便。

所有这些都强调了numpy数组和python列表之间几乎所有的主要区别,我将在这里简要介绍一下:

Numpy arrays have a fixed size at creation, unlike python lists (which can grow dynamically). Changing the size of ndarray will create a new array and delete the original. The elements in a Numpy array are all required to be of the same data type (we can have the heterogeneous type as well but that will not gonna permit you mathematical operations) and thus will be the same size in memory Numpy arrays are facilitated advances mathematical and other types of operations on large numbers of data. Typically such operations are executed more efficiently and with less code than is possible using pythons build in sequences