float('nan') represents NaN (not a number). But how do I check for it?
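Current answer
For strings in pandas, use pd.isnull:
if not pd.isnull(atext):
    for word in nltk.word_tokenize(atext):
        ...  # loop body omitted here; see the full function below
As a feature-extraction function for NLTK:
import nltk
import pandas as pd
def act_features(atext):
    # default_stopwords is assumed to be defined elsewhere (a collection of stop words)
    features = {}
    # Skip missing text: pd.isnull is True for NaN and None
    if not pd.isnull(atext):
        for word in nltk.word_tokenize(atext):
            if word not in default_stopwords:
                features['cont({})'.format(word.lower())] = True
    return features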
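To illustrate why the guard is needed, here is a minimal sketch. It assumes a hypothetical Series called texts with one missing entry, and it uses str.split as a stand-in for nltk.word_tokenize so that it runs without NLTK. A missing cell in a text column is a float NaN, and calling string methods on it would fail.

```python
import pandas as pd

# Hypothetical data: the middle entry is missing (NaN), as happens
# when a CSV with an empty cell is loaded into a text column.
texts = pd.Series(["hello world", float("nan"), "foo bar baz"])

def count_words(atext):
    """Return the number of whitespace-separated tokens, or 0 for missing text."""
    if pd.isnull(atext):
        return 0
    # str.split stands in for nltk.word_tokenize in this sketch.
    return len(atext.split())

word_counts = [count_words(t) for t in texts]
print(word_counts)
```

Without the pd.isnull check, the NaN entry would reach atext.split() and raise an AttributeError, because a float has no split method.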
Other answers
Here are three ways to test whether a variable is NaN.
import pandas as pd
import numpy as np
import math
# For a single value, all three functions return a single boolean
x1 = float("nan")
print(f"It's pd.isna: {pd.isna(x1)}")
print(f"It's np.isnan: {np.isnan(x1)}")
print(f"It's math.isnan: {math.isnan(x1)}")
Output
It's pd.isna: True
It's np.isnan: True
It's math.isnan: True
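The three functions agree on a single value, but they differ once you pass an array. The following is a small sketch, assuming a hypothetical NumPy array called values: np.isnan and pd.isna work element-wise, while math.isnan only accepts one number at a time.

```python
import math
import numpy as np
import pandas as pd

# Hypothetical sample data with one NaN in the middle.
values = np.array([1.0, float("nan"), 3.0])

# NumPy and pandas check every element and return a boolean array.
np_mask = np.isnan(values)
pd_mask = pd.isna(values)
print(np_mask)
print(pd_mask)

# math.isnan expects a single number, so a multi-element array raises TypeError.
try:
    math.isnan(values)
    math_accepts_array = True
except TypeError:
    math_accepts_array = False
print(math_accepts_array)
```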
For float types
>>> import pandas as pd
>>> value = float('nan')
>>> type(value)
<class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
<class 'str'>
>>> pd.isnull(value)
False
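As the session shows, the string 'nan' is not treated as missing. If your data may contain NaN in text form, one option is to try a float conversion first. This is only a sketch, and is_nan_like is a hypothetical helper name, not part of pandas or the standard library.

```python
import math

def is_nan_like(value):
    """Return True for a float NaN and for strings that parse to NaN.

    Hypothetical helper for illustration only.
    """
    try:
        return math.isnan(float(value))
    except (TypeError, ValueError, OverflowError):
        # None, non-numeric strings, and other unconvertible objects are not NaN.
        return False

for candidate in (float("nan"), "nan", "NaN", "hello", None, 1.5):
    print(repr(candidate), is_nan_like(candidate))
```

Whether the text 'nan' should count as missing depends on your data, so treat this as a choice rather than a rule.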
numpy.isnan(number) tells you whether it is NaN.
Checking whether the value is unequal to itself (x != x) appears to be the fastest.
import pandas as pd
import numpy as np
import math
x = float('nan')
%timeit x != x
44.8 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit math.isnan(x)
94.2 ns ± 0.955 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit pd.isna(x)
281 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit np.isnan(x)
1.38 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
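The x != x trick works because, under IEEE 754, NaN compares unequal to every value, including itself. The timings above use IPython's %timeit magic. Outside IPython, a rough equivalent with the standard timeit module might look like the sketch below. The loop count n is an arbitrary choice, and the numbers will vary with machine and Python version.

```python
import math
import timeit

x = float("nan")

# NaN is the only float value that compares unequal to itself.
assert x != x
assert not (1.0 != 1.0)

# Time each check; absolute numbers depend on the machine and Python version.
n = 100_000
t_self = timeit.timeit("x != x", globals={"x": x}, number=n)
t_math = timeit.timeit("math.isnan(x)", globals={"x": x, "math": math}, number=n)

print(f"x != x:        {t_self / n * 1e9:.1f} ns per loop")
print(f"math.isnan(x): {t_math / n * 1e9:.1f} ns per loop")
```

The difference is a matter of nanoseconds per call, so math.isnan is often the better choice for readability.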
All the ways to tell whether a variable is NaN or None:
None type
In [1]: import math
In [2]: a = None
In [3]: not a
Out[3]: True
In [4]: len(a or ()) == 0
Out[4]: True
In [5]: a == None
Out[5]: True
In [6]: a is None
Out[6]: True
In [7]: a != a
Out[7]: False
In [9]: math.isnan(a)
Traceback (most recent call last):
File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
math.isnan(a)
TypeError: a float is required
In [10]: len(a) == 0
Traceback (most recent call last):
File "<ipython-input-10-65b72372873e>", line 1, in <module>
len(a) == 0
TypeError: object of type 'NoneType' has no len()
NaN type
In [11]: b = float('nan')
In [12]: b
Out[12]: nan
In [13]: not b
Out[13]: False
In [14]: b != b
Out[14]: True
In [15]: math.isnan(b)
Out[15]: True
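Putting the two sessions together: math.isnan raises a TypeError for None, and `not a` is also True for 0 and empty containers, so neither check covers both cases on its own. A minimal sketch of a combined check follows. The name is_missing is hypothetical, and the sketch assumes that only float values (including subclasses of float) can be NaN.

```python
import math

def is_missing(value):
    """Return True if value is None or a float NaN (hypothetical helper)."""
    if value is None:
        return True
    # Only call math.isnan on floats; this avoids the TypeError that
    # math.isnan raises for None, strings, and other non-numeric objects.
    return isinstance(value, float) and math.isnan(value)

for candidate in (None, float("nan"), 0.0, "", []):
    print(repr(candidate), is_missing(candidate))
```

Unlike `not a`, this does not report 0.0, '' or [] as missing.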