Today I was looking through some C++ code (written by someone else) and found this section:
double someValue = ...
if (someValue < std::numeric_limits<double>::epsilon() &&
    someValue > -std::numeric_limits<double>::epsilon()) {
    someValue = 0.0;
}
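I'm trying to figure out whether this even makes sense.
The documentation for epsilon() says:
The function returns the difference between 1 and the smallest value greater than 1 that is representable [by a double].
Does this apply to 0 as well, i.e. is epsilon() the smallest value greater than 0? Or are there numbers between 0 and 0 + epsilon that can be represented by a double?
If not, then isn't the comparison equivalent to someValue == 0.0?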
Assuming a 64-bit IEEE double, there is a 52-bit mantissa and an 11-bit exponent. Let's break it down into bits:
1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^0 = 1
The smallest representable number greater than 1:
1.0000 00000000 00000000 00000000 00000000 00000000 00000001 × 2^0 = 1 + 2^-52
Therefore:
epsilon = (1 + 2^-52) - 1 = 2^-52
Are there any numbers between 0 and epsilon? Plenty... For example, the smallest positive representable (normal) number is:
1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^-1022 = 2^-1022
In fact, there are (1022 - 52 + 1)×2^52 = 4372995238176751616 numbers between 0 and epsilon, which is 47% of all the positive representable numbers...
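Suppose we are working with toy floating-point numbers that fit in a 16-bit register. There is a sign bit, a 5-bit exponent, and a 10-bit mantissa.
The value of this floating-point number is the mantissa, interpreted as a binary fraction, times two to the power of the exponent.
Around 1 the exponent equals 0, so the smallest digit of the mantissa is one part in 1024.
Near 1/2 the exponent is -1, so the smallest part of the mantissa is half as large. With a 5-bit exponent it can reach -16, at which point the smallest part of the mantissa is worth about one part in 67 million (2^-26). And at an exponent of -16, the value itself is around one part in 64k (2^-16), much closer to zero than the epsilon around 1 that we calculated above!
This is a toy floating-point model that does not reflect all the quirks of a real floating-point system, but its ability to represent values smaller than epsilon is reasonably similar to that of real floating-point values.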
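An approximation of epsilon (the smallest possible difference) around a given number (1.0, 0.0, ...) can be printed with the program below. Its output is:
epsilon for 0.0 is 4.940656e-324
epsilon for 1.0 is 2.220446e-16
A little thought makes it clear that the epsilon gets smaller as the number we evaluate it at gets smaller, because the exponent can adjust to the size of that number.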
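#include <stdio.h>
#include <assert.h>

/* Approximate the smallest power of two that still changes m when added to it. */
double getEps (double m) {
    double approx=1.0;
    double lastApprox=0.0;
    /* Halve the candidate until adding it no longer changes m. */
    while (m+approx!=m) {
        lastApprox=approx;
        approx/=2.0;
    }
    /* Fails if even 1.0 does not change m, i.e. for very large m. */
    assert (lastApprox!=0);
    return lastApprox;
}

int main () {
    printf ("epsilon for 0.0 is %e\n", getEps (0.0));
    printf ("epsilon for 1.0 is %e\n", getEps (1.0));
    return 0;
}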
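With IEEE floating point, between the smallest non-zero positive value and the smallest non-zero negative value, there exist two values: positive zero and negative zero. Testing whether a value lies between the smallest non-zero values is equivalent to testing for equality with zero; the assignment, however, may have an effect, since it would change a negative zero to a positive zero.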
It would be conceivable that a floating-point format might have three values between the smallest finite positive and negative values: positive infinitesimal, unsigned zero, and negative infinitesimal. I am not familiar with any floating-point formats that in fact work that way, but such a behavior would be perfectly reasonable and arguably better than that of IEEE (perhaps not enough better to be worth adding extra hardware to support it, but mathematically 1/(1/INF), 1/(-1/INF), and 1/(1-1) should represent three distinct cases illustrating three different zeroes). I don't know whether any C standard would mandate that signed infinitesimals, if they exist, would have to compare equal to zero. If they do not, code like the above could usefully ensure that e.g. dividing a number repeatedly by two would eventually yield zero rather than being stuck on "infinitesimal".
Also, a good reason for having such a function is to remove "denormals" (those very small numbers that can no longer use the implied leading "1" and have a special FP representation). Why would you want to do this? Because some machines (in particular, some older Pentium 4s) get really, really slow when processing denormals. Others just get somewhat slower. If your application doesn't really need these very small numbers, flushing them to zero is a good solution. Good places to consider this are the last steps of any IIR filters or decay functions.
See also: Why does changing 0.1f to 0 slow down performance by 10x?
and http://en.wikipedia.org/wiki/Denormal_number
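The difference between X and the next representable value after X varies with X.
epsilon() is only the difference between 1 and the next value after 1.
The difference between 0 and the next value after 0 is not epsilon().
Instead, you can use std::nextafter to compare a double value with 0, as follows: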
#include <cmath>
#include <limits>

// True when b equals a or one of the two representable neighbours of a.
bool same(double a, double b)
{
    return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b
        && std::nextafter(a, std::numeric_limits<double>::max()) >= b;
}

double someValue = ...
if (same (someValue, 0.0)) {
    someValue = 0.0;
}