What is the difference between float and double?

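I've read about the difference between double precision and single precision. However, in most cases float and double seem to be interchangeable, i.e. using one or the other does not seem to affect the results. Is this really the case? When are floats and doubles interchangeable? What are the differences between them?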

Current answer

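Quantitatively, as other answers have pointed out, type double has about twice the precision of type float (53 significand bits versus 24 on typical IEEE 754 implementations) and a much wider range (11 exponent bits versus 8, so a maximum near 1.7e308 instead of about 3.4e38). How much wider depends on how you count.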

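But perhaps even more important is the qualitative difference. Type float has good precision, which will often be good enough for whatever you're doing. Type double, on the other hand, has excellent precision, which will almost always be good enough for whatever you're doing.

The upshot, which is not nearly as well known as it should be, is that you should almost always use type double. Unless you have some particularly special need, you should almost never use type float.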

As everyone knows, "roundoff error" is often a problem when you're doing floating-point work. Roundoff error can be subtle, difficult to track down, and difficult to fix. Most programmers don't have the time or expertise to track down and fix numerical errors in floating-point algorithms, because unfortunately the details end up being different for every algorithm. But type double has enough precision that, much of the time, you don't have to worry. You'll get good results anyway. With type float, on the other hand, alarming-looking issues with roundoff crop up all the time.

And the thing that's not necessarily different between type float and double is execution speed. On most of today's general-purpose processors, scalar arithmetic operations on type float and double take more or less the same amount of time, because the floating-point hardware works on the full double width at once. So you don't pay a speed penalty for the greater range and precision of type double. That's why it's safe to recommend that you almost never use type float: using double shouldn't cost you anything in speed, it shouldn't cost you much in space, and it will almost certainly pay off handsomely in freedom from precision and roundoff woes.

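(With that said, one of the "special needs" where you may want type float is embedded work on a microcontroller, or code optimized for a GPU. On those processors, type double can be significantly slower, or practically nonexistent, so in those cases programmers typically choose type float for speed, and may pay for it in precision.)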

2022-02-26 12:34:26

Other answers

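If you work in embedded processing, the underlying hardware (for example an FPGA or a specific processor or microcontroller model) may implement float efficiently in hardware, while double falls back to software routines. So if the precision of float is enough for your needs, the program will run several times faster with float than with double. As noted in other answers, beware of accumulated errors.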

2020-05-07 13:36:32

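Huge difference.

As the name implies, a double has 2x the precision of float[1]. In general, a double has 15 decimal digits of precision, while a float has 7.

Here's how the number of digits is calculated: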

double has 52 mantissa bits + 1 hidden bit: log(2^53) ÷ log(10) = 15.95 digits

float has 23 mantissa bits + 1 hidden bit: log(2^24) ÷ log(10) = 7.22 digits

This loss of precision can let larger truncation errors accumulate when a calculation is repeated, for example:

float a = 1.f / 81;
float b = 0;
for (int i = 0; i < 729; ++ i)
    b += a;
printf("%.7g\n", b); // prints 9.000023

while

double a = 1.0 / 81;
double b = 0;
for (int i = 0; i < 729; ++ i)
    b += a;
printf("%.15g\n", b); // prints 8.99999999999996

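Also, the maximum value of float is about 3e38, but for double it is about 1.7e308, so with float you can hit "infinity" (a special floating-point value) much more easily than with double for something simple, e.g. computing the factorial of 60.

During testing, a few test cases may contain such huge numbers, which can cause your program to fail if you use floats.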

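Of course, sometimes even double isn't accurate enough, hence we sometimes have long double[1] (the above example gives 9.000000000000000066 on a Mac), but all floating-point types suffer from round-off errors, so if precision is very important (e.g. money processing) you should use int or a fraction class.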

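Furthermore, don't use += to sum lots of floating-point numbers, as the errors accumulate quickly. If you're using Python, use fsum. Otherwise, try to implement the Kahan summation algorithm.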

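[1]: The C and C++ standards do not specify the representation of float, double and long double. It is possible that all three are implemented as IEEE double-precision. Nevertheless, for most architectures (gcc, MSVC; x86, x64, ARM) float is indeed an IEEE single-precision floating-point number (binary32), and double is an IEEE double-precision floating-point number (binary64).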

2010-03-05 13:06:43

I just ran into a bug that took me forever to figure out, and it may give you a good example of float precision.

#include <iostream>
#include <iomanip>

int main(){
  for(float t=0;t<1;t+=0.01){
     std::cout << std::fixed << std::setprecision(6) << t << std::endl;
  }
}

In the output, the precision drops noticeably after 0.83.

However, if I declare t as double, the issue does not occur.

It took me five hours to track down this minor error, which ruined my program.

2015-10-20 06:51:04

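The size of the numbers involved in floating-point calculations is not the most relevant thing. What matters is the calculation being performed.

In essence, if you perform a calculation whose result is an irrational number or a recurring decimal, there will be rounding errors when that number is squashed into the finite-size data structure you're using. Since double is twice the size of float, the rounding error will be a lot smaller.

The tests may specifically use numbers that cause this kind of error, and thereby check that you used the appropriate type in your code.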

2010-03-05 13:05:56

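When using floating-point numbers, you can't trust that your local tests will behave exactly the same as the tests run on the server side. The environment and the compiler are probably different between your local system and where the final tests run. I have seen this problem many times in TopCoder competitions, especially when comparing two floating-point numbers.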

2010-03-05 13:00:57
