How do I get the row count of a Pandas DataFrame?

How do I get the number of rows of a pandas DataFrame df?

Current answer

…building on Jan Philip Gehrcke's answer.

The reason len(df) or len(df.index) is faster than df.shape[0]:

Look at the code. df.shape is a @property that runs a DataFrame method which calls len twice.

df.shape??
Type:        property
String form: <property object at 0x1127b33c0>
Source:
# df.shape.fget
@property
def shape(self):
    """
    Return a tuple representing the dimensionality of the DataFrame.
    """
    return len(self.index), len(self.columns)

And under the hood of len(df):

df.__len__??
Signature: df.__len__()
Source:
    def __len__(self):
        """Returns length of info axis, but here we use the index """
        return len(self.index)
File:      ~/miniconda2/lib/python2.7/site-packages/pandas/core/frame.py
Type:      instancemethod

len(df.index) will be slightly faster than len(df) because it has one less function call, but either one is always faster than df.shape[0].
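A minimal sketch for checking this on your own machine. The frame `df` here is a hypothetical example built just for illustration, and absolute timings depend on the pandas version and hardware, so no numbers are claimed:

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical example frame; any DataFrame would do.
df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=["a", "b", "c"])

# All three expressions return the same row count.
n_index = len(df.index)
n_len = len(df)
n_shape = df.shape[0]

# Time each expression; results vary by environment.
timings = {
    stmt: timeit.timeit(stmt, globals={"df": df}, number=10_000)
    for stmt in ("len(df.index)", "len(df)", "df.shape[0]")
}
for stmt, seconds in timings.items():
    print(f"{stmt:<15} {seconds:.6f} s for 10,000 calls")
```

The differences are on the order of a function call or two, so for most code the choice comes down to readability.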

2017-12-07 23:37:11

Other answers

Suppose the dataset is "data", the DataFrame is named "data_fr", and the number of rows in data_fr is "nu_rows":

import pandas as pd

# Import the data frame. The extension could differ, e.g. csv, xlsx, etc.
data_fr = pd.read_csv('data.csv')

# Print the number of rows
nu_rows = data_fr.shape[0]
print(nu_rows)
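For a version that runs without a file on disk, here is a small sketch that reads the CSV from an in-memory string. The column names and values are made up for illustration. Note that the header line is not counted as a row:

```python
import io

import pandas as pd

# Hypothetical CSV content: one header line plus three data lines.
csv_text = "name,score\nann,1\nbob,2\ncat,3\n"

data_fr = pd.read_csv(io.StringIO(csv_text))

# The header becomes the column labels, so only the data lines are counted.
nu_rows = data_fr.shape[0]
print(nu_rows)
```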

2021-01-02 23:04:44

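I'm not sure whether this is viable (the data may be omitted), but this might work:

*dataframe name*.tail(1)

You can then find the number of rows by running the snippet and reading the row label it shows. With the default zero-based index, that label is one less than the row count.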
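A small sketch of how the `tail(1)` idea behaves, using a hypothetical frame with the default index. It suggests the last label only tracks the row count while the index is an untouched 0, 1, 2, ... range, and stops doing so after filtering:

```python
import pandas as pd

# Hypothetical example frame with the default RangeIndex (0, 1, 2, ...).
df = pd.DataFrame({"x": [10, 20, 30, 40]})

last_label = df.tail(1).index[0]  # label of the last row
row_count = len(df)               # actual number of rows
print(last_label, row_count)

# After filtering, the labels are kept, so they no longer track the count.
filtered = df[df["x"] > 20]
filtered_last_label = filtered.tail(1).index[0]
filtered_count = len(filtered)
print(filtered_last_label, filtered_count)
```

For that reason `len(df)` is the safer way to get the count.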

2020-04-05 19:49:33

If you want to get the row count in the middle of a chained operation, you can use:

df.pipe(len)

Example:

import numpy as np
import pandas as pd

row_count = (
    pd.DataFrame(np.random.rand(3, 4))
    .reset_index()
    .pipe(len)
)

This can be useful if you don't want to put a long statement inside the len() function.

You could use __len__() instead, but __len__() looks a bit odd.
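As a slightly longer sketch of the same idea, the chain below filters a hypothetical frame in two steps and ends with `.pipe(len)`. The column names and values are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"group": ["a", "b", "a", "b", "a"], "value": np.arange(5)})

row_count = (
    df
    .loc[lambda d: d["value"] >= 1]    # keep rows with value 1..4
    .loc[lambda d: d["group"] == "a"]  # of those, keep group "a"
    .pipe(len)                         # same as len(<result of the chain>)
)
print(row_count)
```

`DataFrame.pipe(func)` calls `func` with the frame as its first argument, so ending a chain with `.pipe(len)` returns a plain integer.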

2018-02-22 02:58:24

Use len(df) :-).

__len__() is documented with "Returns length of index".

Timing info, set up the same way as in root's answer:

In [7]: timeit len(df.index)
1000000 loops, best of 3: 248 ns per loop

In [8]: timeit len(df)
1000000 loops, best of 3: 573 ns per loop

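Because of the one extra function call, you can of course say that it is a bit slower than calling len(df.index) directly. But in most cases that does not matter. I find len(df) very readable.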

2013-08-19 15:02:45

Suppose df is your DataFrame; then:

count_row = df.shape[0]  # Gives number of rows
count_col = df.shape[1]  # Gives number of columns

Or, more succinctly,

r, c = df.shape
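A short runnable sketch of the tuple unpacking, using a hypothetical two-column frame. It also shows that an empty selection still reports its column count:

```python
import pandas as pd

# Hypothetical example frame: 3 rows, 2 columns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

n_rows, n_cols = df.shape
print(n_rows, n_cols)

# An empty slice has zero rows but keeps its columns.
empty_rows, empty_cols = df.iloc[0:0].shape
print(empty_rows, empty_cols)
```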

2016-02-20 13:30:05
