What is the most efficient way to compare two double or two float values?

Simply doing this is not correct:

bool CompareDoubles1 (double A, double B)
{
   return A == B;
}

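But something like:

bool CompareDoubles2 (double A, double B)
{
   // EPSILON is a tolerance constant assumed to be defined elsewhere
   double diff = A - B;
   return (diff < EPSILON) && (-diff < EPSILON);
}

seems like a waste of processing.

Does anyone know of a smarter float comparer?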


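Current answer

Comparing floating-point numbers depends on context. Since even changing the order of operations can produce different results, it is important to know how "equal" you want the numbers to be.

Bruce Dawson's article "Comparing Floating Point Numbers" is a good place to start when looking at floating-point comparison.

The following definitions are from Knuth's "The Art of Computer Programming":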

bool approximatelyEqual(float a, float b, float epsilon)
{
    return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool essentiallyEqual(float a, float b, float epsilon)
{
    return fabs(a - b) <= ( (fabs(a) > fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool definitelyGreaterThan(float a, float b, float epsilon)
{
    return (a - b) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool definitelyLessThan(float a, float b, float epsilon)
{
    return (b - a) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

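Of course, which one to choose depends on the context and on how equal you want the numbers to be.

Another way to compare floating-point numbers is to look at the ULP (units in the last place) of the numbers. Although it does not deal specifically with comparisons, the paper "What Every Computer Scientist Should Know About Floating-Point Arithmetic" is a good resource for understanding how floating point works and what the pitfalls are, including what a ULP is.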
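To make the difference between the first two definitions concrete, here is a minimal sketch. It restates them with std::max and std::min so that it compiles on its own; the sample values (100, 96 and an epsilon of 0.041) and the helper names looseResult and strictResult are made up for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Tolerance scaled by the LARGER magnitude: the looser of the two tests.
bool approximatelyEqual(float a, float b, float epsilon)
{
    return std::fabs(a - b) <= std::max(std::fabs(a), std::fabs(b)) * epsilon;
}

// Tolerance scaled by the SMALLER magnitude: the stricter of the two tests.
bool essentiallyEqual(float a, float b, float epsilon)
{
    return std::fabs(a - b) <= std::min(std::fabs(a), std::fabs(b)) * epsilon;
}

// Hypothetical sample: 100 and 96 differ by 4.
//   100 * 0.041 = 4.1   >= 4  -> approximately equal
//    96 * 0.041 = 3.936 <  4  -> not essentially equal
bool looseResult()  { return approximatelyEqual(100.0f, 96.0f, 0.041f); }
bool strictResult() { return essentiallyEqual(100.0f, 96.0f, 0.041f); }
```

The same pair of values passes one test and fails the other, which is why the choice has to be made per problem.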
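As a rough illustration of the ULP idea, the sketch below counts how many representable floats lie between two values by comparing their bit patterns. It assumes float is a 32-bit IEEE 754 type; the names orderedBits, ulpDistance and almostEqualUlps are invented for this example and are not taken from the article or the paper.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::int32_t), "sketch assumes a 32-bit float");

// Map the float's bit pattern to an integer whose ordering matches the float ordering.
std::int32_t orderedBits(float f)
{
    std::int32_t i;
    std::memcpy(&i, &f, sizeof i);  // well-defined way to read the bits
    // Floats with the sign bit set are sign-magnitude; remap them so that
    // -0.0f lands on 0 and more negative floats give more negative integers.
    return i < 0 ? std::numeric_limits<std::int32_t>::min() - i : i;
}

// Number of representable steps between a and b (0 means same value, or +0 vs -0).
std::int64_t ulpDistance(float a, float b)
{
    std::int64_t d = static_cast<std::int64_t>(orderedBits(a)) - orderedBits(b);
    return d < 0 ? -d : d;
}

// NaN never compares equal; otherwise accept values at most maxUlps steps apart.
bool almostEqualUlps(float a, float b, std::int64_t maxUlps)
{
    if (std::isnan(a) || std::isnan(b)) return false;
    return ulpDistance(a, b) <= maxUlps;
}
```

A tolerance expressed in ULPs scales with the magnitude of the operands automatically, but it behaves poorly near zero, where a tiny absolute difference can span a huge number of ULPs.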

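Other answers

As others have pointed out, using a fixed-exponent epsilon (such as 0.0000001) is useless for values far from the epsilon value. For example, if your two values are 10000.000977 and 10000, then there are no 32-bit floating-point values between these two numbers: 10000 and 10000.000977 are as close as you can possibly get without being bit-for-bit identical. Here, an epsilon of less than 0.0009 is meaningless; you might as well use the plain equality operator.

Likewise, as the two values approach epsilon in size, the relative error grows to 100%.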
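The claim about 10000 can be checked directly. The following sketch assumes float is an IEEE 754 32-bit type; gapAbove10000 and fixedEpsilonEqual are names made up for this illustration.

```cpp
#include <cmath>

// Distance from 10000.0f to the next representable float above it.
// For IEEE 754 binary32 this is 2^-10 = 0.0009765625.
float gapAbove10000()
{
    float next = std::nextafter(10000.0f, 20000.0f);
    return next - 10000.0f;
}

// A fixed-tolerance test with a caller-supplied epsilon.
bool fixedEpsilonEqual(float a, float b, float epsilon)
{
    return std::fabs(a - b) < epsilon;
}
```

With an epsilon of 0.0000001, fixedEpsilonEqual(10000.0f, x, ...) can only return true when x is exactly 10000.0f, so at this magnitude it is no more tolerant than ==.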

Thus, trying to mix a fixed point number such as 0.00001 with floating-point values (where the exponent is arbitrary) is a pointless exercise. This will only ever work if you can be assured that the operand values lie within a narrow domain (that is, close to some specific exponent), and if you properly select an epsilon value for that specific test. If you pull a number out of the air ("Hey! 0.00001 is small, so that must be good!"), you're doomed to numerical errors. I've spent plenty of time debugging bad numerical code where some poor schmuck tosses in random epsilon values to make yet another test case work.

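If you do numerical programming of any kind and believe you need to reach for fixed-point epsilons, read Bruce's article on comparing floating-point numbers.

Comparing Floating Point Numbers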

General-purpose comparison of floating-point numbers is generally meaningless. How to compare really depends on the problem at hand. In many problems, numbers are sufficiently discretized to allow comparing them within a given tolerance. Unfortunately, there are just as many problems where such a trick doesn't really work. For one example, consider working with a Heaviside (step) function of a number in question (digital stock options come to mind) when your observations are very close to the barrier. Performing tolerance-based comparison wouldn't do much good, as it would effectively shift the issue from the original barrier to two new ones. Again, there is no general-purpose solution for such problems, and the particular solution might require going as far as changing the numerical method in order to achieve stability.

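Be very careful using any of the other suggestions. It all depends on context.

I have spent a long time tracing bugs in a system that presumed a==b if |a-b|<epsilon. The underlying problems were: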

1. The implicit presumption in an algorithm that if a==b and b==c then a==c.
2. Using the same epsilon for lines measured in inches and lines measured in mils (.001 inch). That is, a==b but 1000a!=1000b. (This is why AlmostEqual2sComplement asks for the epsilon or max ULPS.)
3. The use of the same epsilon for both the cosine of angles and the length of lines!
4. Using such a compare function to sort items in a collection. (In this case using the builtin C++ operator == for doubles produced correct results.)
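The first two problems are easy to reproduce. The sketch below uses a hypothetical nearlyEqual helper and made-up values; it is only meant to show the failure modes, not the code of the system described above.

```cpp
#include <cmath>

// Hypothetical fixed-tolerance comparison of the kind described above.
bool nearlyEqual(double a, double b, double epsilon)
{
    return std::fabs(a - b) < epsilon;
}

// Problem 1: the relation is not transitive.
// a ~ b and b ~ c hold, yet a ~ c does not.
bool transitivityBroken()
{
    const double eps = 1.0;
    const double a = 0.0, b = 0.75, c = 1.5;
    return nearlyEqual(a, b, eps) && nearlyEqual(b, c, eps) && !nearlyEqual(a, c, eps);
}

// Problem 2: the result changes with the unit of measurement.
// Two lengths "equal" in inches stop being "equal" once converted to mils.
bool scaleBreaksEquality()
{
    const double eps = 0.001;
    const double a = 1.0, b = 1.0005;  // inches
    return nearlyEqual(a, b, eps) && !nearlyEqual(1000.0 * a, 1000.0 * b, eps);
}
```

Both functions return true, meaning each inconsistency really occurs. The lack of transitivity is also what makes such a comparison unsafe as an ordering predicate for sorting.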

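Like I said, it all depends on context and on the expected size of a and b.

By the way, std::numeric_limits<double>::epsilon() is the "machine epsilon". It is the difference between 1.0 and the next value representable by a double. I guess it could be used in a compare function, but only if the expected values are less than 1. (This is in response to @cdv's answer...)

Also, if you basically have int arithmetic in doubles (here we use doubles to hold int values in certain cases), your arithmetic will be correct. For example, 4.0/2.0 will be the same as 1.0+1.0. This holds as long as you do not do things that result in fractions (4.0/3.0) and do not go outside the range of an int.

I found another interesting implementation at https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon: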
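A short check of what machine epsilon does and does not cover, assuming double is an IEEE 754 64-bit type. The function names are invented for this example.

```cpp
#include <cmath>
#include <limits>

// Machine epsilon is exactly the gap between 1.0 and the next double above it.
bool epsilonIsGapAtOne()
{
    const double eps = std::numeric_limits<double>::epsilon();
    const double gap = std::nextafter(1.0, 2.0) - 1.0;
    return gap == eps;
}

// At larger magnitudes the spacing between doubles is wider than epsilon,
// so adding epsilon to one million does not change the value at all.
bool epsilonVanishesAtOneMillion()
{
    const double eps = std::numeric_limits<double>::epsilon();
    const double big = 1.0e6;
    const double sum = big + eps;  // rounds back to big
    return sum == big;
}
```

This is why an unscaled epsilon() is a poor tolerance for values much larger than 1: it is smaller than the distance between neighbouring doubles there.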
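The point about integers stored in doubles can be sketched as follows, assuming a 32-bit int and an IEEE 754 64-bit double (which has a 53-bit significand). The function names are illustrative only.

```cpp
#include <limits>

// Small whole numbers and their sums and exact quotients are represented exactly.
bool smallIntegerArithmeticIsExact()
{
    return 4.0 / 2.0 == 1.0 + 1.0;
}

// The extreme int values survive a round trip through double unchanged.
bool intLimitsRoundTrip()
{
    const int hi = std::numeric_limits<int>::max();
    const int lo = std::numeric_limits<int>::min();
    return static_cast<int>(static_cast<double>(hi)) == hi
        && static_cast<int>(static_cast<double>(lo)) == lo;
}

// Beyond 2^53 consecutive integers are no longer all representable:
// adding 1 to 2^53 rounds back to 2^53.
bool exactnessEndsAbove2Pow53()
{
    const double limit = 9007199254740992.0;  // 2^53
    const double next = limit + 1.0;
    return next == limit;
}
```

So == is reliable for whole-number values kept within this range, which matches the observation above.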

#include <cmath>
#include <limits>
#include <iomanip>
#include <iostream>
#include <type_traits>
#include <algorithm>



template<class T>
typename std::enable_if<!std::numeric_limits<T>::is_integer, bool>::type
    almost_equal(T x, T y, int ulp)
{
    // the machine epsilon has to be scaled to the magnitude of the values used
    // and multiplied by the desired precision in ULPs (units in the last place)
    return std::fabs(x-y) <= std::numeric_limits<T>::epsilon() * std::fabs(x+y) * ulp
        // unless the result is subnormal
        || std::fabs(x-y) < std::numeric_limits<T>::min();
}

int main()
{
    double d1 = 0.2;
    double d2 = 1 / std::sqrt(5) / std::sqrt(5);
    std::cout << std::fixed << std::setprecision(20) 
        << "d1=" << d1 << "\nd2=" << d2 << '\n';

    if(d1 == d2)
        std::cout << "d1 == d2\n";
    else
        std::cout << "d1 != d2\n";

    if(almost_equal(d1, d2, 2))
        std::cout << "d1 almost equals d2\n";
    else
        std::cout << "d1 does not almost equal d2\n";
}