float(nan')表示nan(不是数字)。但我该如何检查呢?


当前回答

判断变量是NaN还是None的所有方法:

无类型

In [1]: from numpy import math

In [2]: a = None
In [3]: not a
Out[3]: True

In [4]: len(a or ()) == 0
Out[4]: True

In [5]: a == None
Out[5]: True

In [6]: a is None
Out[6]: True

In [7]: a != a
Out[7]: False

In [9]: math.isnan(a)
Traceback (most recent call last):
  File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
    math.isnan(a)
TypeError: a float is required

In [10]: len(a) == 0
Traceback (most recent call last):
  File "<ipython-input-10-65b72372873e>", line 1, in <module>
    len(a) == 0
TypeError: object of type 'NoneType' has no len()

NaN类型

In [11]: b = float('nan')
In [12]: b
Out[12]: nan

In [13]: not b
Out[13]: False

In [14]: b != b
Out[14]: True

In [15]: math.isnan(b)
Out[15]: True

其他回答

对于panda中的字符串,请使用pd.isnull:

if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):

NLTK的特征提取功能

def act_features(atext):
features = {}
if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):
    if word not in default_stopwords:
      features['cont({})'.format(word.lower())]=True
return features

测试NaN的通常方法是查看它是否等于自身:

def isNaN(num):
    return num != num

在Python 3.6中,检查字符串值x math.isnan(x)和np.issan(x)会引发错误。所以我无法检查给定值是否为NaN,如果我事先不知道它是一个数字。以下内容似乎解决了这个问题

if str(x)=='nan' and type(x)!='str':
    print ('NaN')
else:
    print ('non NaN')

似乎检查它是否等于自身(x!=x)是最快的。

import pandas as pd 
import numpy as np 
import math 

x = float('nan')

%timeit x != x
44.8 ns ± 0.152 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit math.isnan(x)
94.2 ns ± 0.955 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit pd.isna(x)
281 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit np.isnan(x)
1.38 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

事实上我刚碰到这个,但对我来说,它是在检查nan、-inf或inf

if float('-inf') < float(num) < float('inf'):

这对于数字是正确的,对于nan和inf都是错误的,对于字符串或其他类型(这可能是一件好事)会引发异常。此外,这不需要导入任何库,如math或numpy(numpy非常大,它的大小是任何编译应用程序的两倍)。