比较两个双精度浮点数或两个浮点数最有效的方法是什么?
简单地这样做是不正确的:
bool CompareDoubles1 (double A, double B)
{
return A == B;
}
比如:
bool CompareDoubles2 (double A, double B)
{
diff = A - B;
return (diff < EPSILON) && (-diff < EPSILON);
}
似乎是浪费加工。
有人知道更聪明的浮点比较器吗?
/// testing whether two doubles are almost equal. We consider two doubles
/// equal if the difference is within the range [0, epsilon).
///
/// epsilon: a positive number (supposed to be small)
///
/// if either x or y is 0, then we are comparing the absolute difference to
/// epsilon.
/// if both x and y are non-zero, then we are comparing the relative difference
/// to epsilon.
bool almost_equal(double x, double y, double epsilon)
{
double diff = x - y;
if (x != 0 && y != 0){
diff = diff/y;
}
if (diff < epsilon && -1.0*diff < epsilon){
return true;
}
return false;
}
我在我的小项目中使用了这个函数,它是有效的,但注意以下几点:
双精度误差可以为你制造惊喜。假设epsilon = 1.0e-6,那么根据上面的代码,1.0和1.000001不应该被认为是相等的,但在我的机器上,函数认为它们是相等的,这是因为1.000001不能精确地转换为二进制格式,它可能是1.0000009xxx。我用1.0和1.0000011测试了它,这次我得到了预期的结果。
我的课程是基于之前发布的答案。非常类似于谷歌的代码,但我使用了一个偏差,将所有NaN值推到0xFF000000以上。这样可以更快地检查NaN。
这段代码是为了演示概念,而不是通用的解决方案。谷歌的代码已经展示了如何计算所有平台特定的值,我不想复制所有这些。我对这段代码做了有限的测试。
typedef unsigned int U32;
// Float Memory Bias (unsigned)
// ----- ------ ---------------
// NaN 0xFFFFFFFF 0xFF800001
// NaN 0xFF800001 0xFFFFFFFF
// -Infinity 0xFF800000 0x00000000 ---
// -3.40282e+038 0xFF7FFFFF 0x00000001 |
// -1.40130e-045 0x80000001 0x7F7FFFFF |
// -0.0 0x80000000 0x7F800000 |--- Valid <= 0xFF000000.
// 0.0 0x00000000 0x7F800000 | NaN > 0xFF000000
// 1.40130e-045 0x00000001 0x7F800001 |
// 3.40282e+038 0x7F7FFFFF 0xFEFFFFFF |
// Infinity 0x7F800000 0xFF000000 ---
// NaN 0x7F800001 0xFF000001
// NaN 0x7FFFFFFF 0xFF7FFFFF
//
// Either value of NaN returns false.
// -Infinity and +Infinity are not "close".
// -0 and +0 are equal.
//
class CompareFloat{
public:
union{
float m_f32;
U32 m_u32;
};
static bool CompareFloat::IsClose( float A, float B, U32 unitsDelta = 4 )
{
U32 a = CompareFloat::GetBiased( A );
U32 b = CompareFloat::GetBiased( B );
if ( (a > 0xFF000000) || (b > 0xFF000000) )
{
return( false );
}
return( (static_cast<U32>(abs( a - b ))) < unitsDelta );
}
protected:
static U32 CompareFloat::GetBiased( float f )
{
U32 r = ((CompareFloat*)&f)->m_u32;
if ( r & 0x80000000 )
{
return( ~r - 0x007FFFFF );
}
return( r + 0x7F800000 );
}
};
你写的代码有bug:
return (diff < EPSILON) && (-diff > EPSILON);
正确的代码应该是:
return (diff < EPSILON) && (diff > -EPSILON);
(…是的,这是不同的)
我想知道晶圆厂是否会让你在某些情况下失去懒惰的评价。我会说这取决于编译器。你可能想两种都试试。如果它们在平均水平上是相等的,则采用晶圆厂实现。
如果你有一些关于两个浮点数中哪一个比另一个更大的信息,你可以根据比较的顺序来更好地利用惰性求值。
最后,通过内联这个函数可能会得到更好的结果。不过不太可能有太大改善……
编辑:OJ,谢谢你纠正你的代码。我相应地删除了我的评论
在这个版本中,你可以检查,这些数字之间的差异并不比某些分数(比如,0.0001%)更大:
bool floatApproximatelyEquals(const float a, const float b) {
if (b == 0.) return a == 0.; // preventing division by zero
return abs(1. - a / b) < 1e-6;
}
请注意Sneftel关于浮动可能的分数限制的评论。
还要注意的是,它不同于使用绝对的epsilon的方法——这里你不需要担心“数量级”——数字可能是,比如说1e100,或者1e-100,它们总是会被一致地比较,而且你不必为每一种情况更新epsilon。
我最终花了相当多的时间在这个伟大的线程通过材料。我怀疑每个人都想花这么多时间,所以我将强调我所学到的总结和我实施的解决方案。
快速的总结
Is 1e-8 approximately same as 1e-16? If you are looking at noisy sensor data then probably yes but if you are doing molecular simulation then may be not! Bottom line: You always need to think of tolerance value in context of specific function call and not just make it generic app-wide hard-coded constant.
For general library functions, it's still nice to have parameter with default tolerance. A typical choice is numeric_limits::epsilon() which is same as FLT_EPSILON in float.h. This is however problematic because epsilon for comparing values like 1.0 is not same as epsilon for values like 1E9. The FLT_EPSILON is defined for 1.0.
The obvious implementation to check if number is within tolerance is fabs(a-b) <= epsilon however this doesn't work because default epsilon is defined for 1.0. We need to scale epsilon up or down in terms of a and b.
There are two solution to this problem: either you set epsilon proportional to max(a,b) or you can get next representable numbers around a and then see if b falls into that range. The former is called "relative" method and later is called ULP method.
Both methods actually fails anyway when comparing with 0. In this case, application must supply correct tolerance.
实用函数实现(c++ 11)
//implements relative method - do not use for comparing with zero
//use this most of the time, tolerance needs to be meaningful in your context
template<typename TReal>
static bool isApproximatelyEqual(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = std::fabs(a - b);
if (diff <= tolerance)
return true;
if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
//supply tolerance that is meaningful in your context
//for example, default tolerance may not work if you are comparing double with float
template<typename TReal>
static bool isApproximatelyZero(TReal a, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
if (std::fabs(a) <= tolerance)
return true;
return false;
}
//use this when you want to be on safe side
//for example, don't start rover unless signal is above 1
template<typename TReal>
static bool isDefinitelyLessThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = a - b;
if (diff < tolerance)
return true;
if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
template<typename TReal>
static bool isDefinitelyGreaterThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = a - b;
if (diff > tolerance)
return true;
if (diff > std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
//implements ULP method
//use this when you are only concerned about floating point precision issue
//for example, if you want to see if a is 1.0 by checking if its within
//10 closest representable floating point numbers around 1.0.
template<typename TReal>
static bool isWithinPrecisionInterval(TReal a, TReal b, unsigned int interval_size = 1)
{
TReal min_a = a - (a - std::nextafter(a, std::numeric_limits<TReal>::lowest())) * interval_size;
TReal max_a = a + (std::nextafter(a, std::numeric_limits<TReal>::max()) - a) * interval_size;
return min_a <= b && max_a >= b;
}
'返回fabs(a - b) < EPSILON;
这是可以的,如果:
输入的数量级变化不大
极少数相反的符号可以被视为相等
否则就会给你带来麻烦。双精度数的分辨率约为小数点后16位。如果您正在比较的两个数字在量级上大于EPSILON*1.0E16,那么您可能会说:
return a==b;
我将研究一种不同的方法,假设您需要担心第一个问题,并假设第二个问题对您的应用程序很好。解决方案应该是这样的:
#define VERYSMALL (1.0E-150)
#define EPSILON (1.0E-8)
bool AreSame(double a, double b)
{
double absDiff = fabs(a - b);
if (absDiff < VERYSMALL)
{
return true;
}
double maxAbs = max(fabs(a) - fabs(b));
return (absDiff/maxAbs) < EPSILON;
}
这在计算上是昂贵的,但有时是需要的。这就是我们公司必须做的事情,因为我们要处理一个工程库,输入可能相差几十个数量级。
无论如何,关键在于(并且适用于几乎所有的编程问题):评估你的需求是什么,然后想出一个解决方案来满足你的需求——不要认为简单的答案就能满足你的需求。如果在您的评估后,您发现fabs(a-b) < EPSILON将足够,完美-使用它!但也要注意它的缺点和其他可能的解决方案。