I want to find the number of NaN values in each column of my data.


Current answer

If you need the non-NA (non-None) and NA (None) counts for each group produced by a groupby:

gdf = df.groupby(['ColumnToGroupBy'])

def countna(x):
    return (x.isna()).sum()

gdf.agg(['count', countna, 'size'])

This returns the non-NA count, the NA count, and the total number of entries for each group.

Other answers
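To make the groupby approach concrete, here is a minimal runnable sketch. The sample DataFrame, the column names group and value, and the variable summary are assumptions made up for illustration; only the count / countna / size aggregation comes from the answer above. It selects a single column so the result has flat column names.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'group' plays the role of ColumnToGroupBy.
df = pd.DataFrame({
    "group": ["x", "x", "y", "y", "y"],
    "value": [1.0, np.nan, np.nan, np.nan, 4.0],
})

def countna(x):
    # Number of missing entries in one group's column
    return x.isna().sum()

# count = non-NA entries, countna = NA entries, size = all entries
summary = df.groupby("group")["value"].agg(["count", countna, "size"])
print(summary)
```

For every group, count plus countna should equal size, which is a quick sanity check on the result.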

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.count.html#pandas.Series.count

pandas.Series.count
Series.count(level=None)

Returns the number of non-NA/null observations in the Series.
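Since Series.count returns the non-NA count, the NaN count can be derived by subtracting it from the length of the Series. The following is a small sketch of that idea, not something the linked documentation spells out; the Series s and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example Series with one missing value
s = pd.Series([1.0, 2.0, np.nan])

non_na = s.count()        # number of non-NA entries
n_nan = len(s) - non_na   # whatever is left must be NA

print(f"{non_na} non-NA values, {n_nan} NaN values")
```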

import pandas as pd
import numpy as np

# example DataFrame
df = pd.DataFrame({'a':[1,2,np.nan], 'b':[np.nan,1,np.nan]})

# count the NaNs in a column
num_nan_a = df.loc[ (pd.isna(df['a'])) , 'a' ].shape[0]
num_nan_b = df.loc[ (pd.isna(df['b'])) , 'b' ].shape[0]

# print the DataFrame and the NaN counts
print(df)
print(' ')
print(f"There are {num_nan_a} NaNs in column a")
print(f"There are {num_nan_b} NaNs in column b")

This gives the output:

     a    b
0  1.0  NaN
1  2.0  1.0
2  NaN  NaN

There are 1 NaNs in column a
There are 2 NaNs in column b

df.isnull().sum()
# type: <class 'pandas.core.series.Series'>

or

df.column_name.isnull().sum()
# type: <class 'numpy.int64'>

df.isnull().sum() gives the number of missing values in each column.
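As a quick illustration of the column-wise sum, here is a sketch on a small DataFrame. The data and the names nan_per_column and total_nan are assumptions for the example; isna is used here, and isnull is an alias for it.

```python
import numpy as np
import pandas as pd

# Hypothetical example DataFrame
df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})

# isna() marks missing cells as True; sum() adds them up per column
nan_per_column = df.isna().sum()

# Summing the per-column counts gives the total for the whole DataFrame
total_nan = int(nan_per_column.sum())

print(nan_per_column)
print(f"Total NaNs: {total_nan}")
```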

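If you want the total number of missing values in a specific column, the following code works:

I use this loop to count the missing values in each column: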
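The snippet for this answer did not survive on the page, so what follows is only a minimal sketch of one way to count missing values in a single column. The DataFrame and the column name a are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example DataFrame
df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})

# Select one column, flag its missing entries, and add them up
missing_in_a = df["a"].isna().sum()

print(f"Column a has {missing_in_a} missing value(s)")
```

Bracket access (df["a"]) is used instead of attribute access (df.a) because it also works for column names that contain spaces or clash with DataFrame method names.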

# check missing values
import numpy as np, pandas as pd

# np.str was removed in NumPy 1.24, so format the count with an f-string
for col in df:
    print(f"{col}: {df[col].isna().sum()}")