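Biggest integer that can be stored in a double

What is the biggest "no-floating" integer that can be stored in an IEEE 754 double type without losing precision?

In other words, what would the following code fragment return: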

UInt64 i = 0;
Double d = 0;

while (i == d)
{
        i += 1; 
        d += 1;
}
Console.WriteLine("Largest Integer: {0}", i-1);

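Current answer

The largest integer that can be represented in an IEEE 754 double (64-bit) is the same as the largest value the type can represent, since that value is itself an integer.

This is represented as 0x7FEFFFFFFFFFFFFF, which is made up of:

The sign bit 0 (positive) rather than 1 (negative).
The maximum exponent 0x7FE (2046, which represents 1023 once the bias is subtracted) rather than 0x7FF (2047, which indicates NaN or infinity).
The maximum mantissa 0xFFFFFFFFFFFFF, which is 52 bits, all 1.

In binary, the value is the implicit 1, followed by another 52 ones from the mantissa, then 971 zeros (1023 - 52 = 971) from the exponent.

The exact decimal value is: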

179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368

This is approximately 1.8 x 10^308.
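As a rough check of the breakdown above, here is a minimal Python sketch (Python floats are IEEE 754 doubles on common platforms, which is an assumption about your setup). It reinterprets the bit pattern 0x7FEFFFFFFFFFFFFF as a double, splits it into its three fields, and rebuilds the same value with exact integer arithmetic. The variable names are only for illustration.

```python
import struct
import sys

# Reinterpret the 64-bit pattern as a big-endian IEEE 754 double.
bits = 0x7FEFFFFFFFFFFFFF
largest = struct.unpack(">d", struct.pack(">Q", bits))[0]

sign = bits >> 63                    # 0 means positive
exponent = (bits >> 52) & 0x7FF      # 0x7FE = 2046, i.e. 1023 once the bias is removed
mantissa = bits & ((1 << 52) - 1)    # the 52 explicitly stored bits, all ones here

# The same value as an exact integer: 53 one-bits followed by 971 zero-bits.
as_integer = ((1 << 53) - 1) << 971

print(hex(exponent), hex(mantissa))
print(largest == sys.float_info.max)   # compares against the platform's largest double
print(largest.is_integer())            # the largest double has no fractional part
print(int(largest) == as_integer)
```

Printing `as_integer` should give the long decimal value quoted above.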

2016-09-30 04:59:58

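Other answers

The biggest/largest integer that can be stored in a double without losing precision is the same as the largest possible value of a double. That is, DBL_MAX or approximately 1.8 × 10^308 (if your double is an IEEE 754 64-bit double). It's an integer. It's represented exactly. What more do you want?

Go on, ask me what the largest integer is, such that it and all smaller integers can be stored in IEEE 64-bit doubles without losing precision. An IEEE 64-bit double has 52 bits of mantissa, so I think it's 2^53:

2^53 + 1 cannot be stored, because the 1 at the start and the 1 at the end have too many zeros in between.
Anything less than 2^53 can be stored, with 52 bits explicitly stored in the mantissa, and then the exponent in effect giving you another one.
2^53 obviously can be stored, since it's a small power of 2.

Or another way of looking at it: once the bias has been taken off the exponent, and ignoring the sign bit as irrelevant to the question, the value stored by a double is a power of 2, plus a 52-bit integer multiplied by 2^(exponent − 52). So with exponent 52 you can store all values from 2^52 through to 2^53 − 1. Then with exponent 53, the next number you can store after 2^53 is 2^53 + 1 × 2^(53 − 52). So loss of precision first occurs with 2^53 + 1.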
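To make the argument concrete, here is a small Python sketch, assuming Python's float is an IEEE 754 double. The helper name `survives_round_trip` is made up for this example. It checks whether an integer comes back unchanged after being converted to a double.

```python
LIMIT = 2 ** 53  # 9007199254740992

def survives_round_trip(n: int) -> bool:
    """True when converting n to a double and back returns n unchanged."""
    return int(float(n)) == n

print(survives_round_trip(LIMIT - 1))  # True: still within 53 significant bits
print(survives_round_trip(LIMIT))      # True: a power of two
print(survives_round_trip(LIMIT + 1))  # False: would need 54 significant bits
print(survives_round_trip(LIMIT + 2))  # True: beyond 2^53 only even integers fit
```

So 2^53 + 1 is the first integer that is lost, even though many larger integers (every even one up to 2^54, for instance) are still exact.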

2009-12-04 18:21:27

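DOUBLES, A "SIMPLE" EXPLANATION

The largest "double" number (double-precision floating-point number) is typically a 64-bit or 8-byte number, expressed as: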

1.79E308
or
1.79 x 10 (to the power of) 308

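As you can guess, 10 to the power of 308 is a GIGANTIC NUMBER, like 170000000000000000000000000000000000000000000000000000000 and even larger!

At the other end of the scale, double-precision floating-point 64-bit numbers support tiny decimal fractions using the "point" notation, the smallest being: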

4.94E-324
or
4.94 x 10 (to the power of) -324

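Any number multiplied by 10 to a negative power is a very, very small decimal, like 0.00000000000000000000000000000000000000494 and even smaller.

But what confuses people is that they will hear computer nerds and math experts say, "but that number has a range of only 15 digits". It turns out that the values described above are the overall MAXIMUM and MINIMUM values the computer can store and display from memory. But doubles lose accuracy, and the ability to represent every number, long before they get that large. So most programmers AVOID the largest doubles and try to stay within a known, much smaller range.

But why? And what is the best maximum double to use? I read dozens of bad explanations on math sites online and could not find the answer. So the SIMPLE explanation below may help you. It helped me!!

DOUBLE FACTS AND FLAWS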

JavaScript (which also uses the 64-bit double precision storage system for numbers in computers) uses double precision floating point numbers for storing all known numerical values. It thus uses the same MAX and MIN ranges shown above. But most languages use a typed numerical system with ranges to avoid accuracy problems. The double and float number storage systems, however, seem to all share the same flaw of losing numerical precision as they get larger and smaller. I will explain why as it affects the idea of "maximum" values...

To address this, JavaScript has what is called a Number.MAX_SAFE_INTEGER value, which is 9007199254740991. This is the most accurate number it can represent for Integers, but is NOT the largest number that can be stored. It is accurate because it guarantees any number equal to or less than that value can be viewed, calculated, stored, etc. Beyond that range, there are "missing" numbers. The reason is that double precision numbers AFTER 9007199254740991 use an additional number to multiply them to larger and larger values, including the true max number of 1.79E308. That new number is called an exponent.
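To see the "missing" numbers, here is a rough Python sketch, assuming Python floats are IEEE 754 doubles. The helper `next_double` is a hypothetical name for this example. It steps to the next representable double by adding 1 to the raw bit pattern, which works for positive finite values below the maximum.

```python
import struct

def next_double(x: float) -> float:
    """Next representable double above a positive finite x (illustrative helper)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return struct.unpack(">d", struct.pack(">Q", bits + 1))[0]

# Print the gap between each value and the next double that can be stored.
for value in (1.0, 9007199254740991.0, 9007199254740992.0, 1e300):
    gap = next_double(value) - value
    print(f"{value!r}: next double is {gap!r} away")
```

Up to 9007199254740991 the gap is at most 1, so no integer is skipped. From 9007199254740992 onward the gap is 2 or more, and it keeps doubling as the values grow.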

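THE EVIL EXPONENT

In fact, this max value of 9007199254740991 is also the largest value you can hold in the 53 bits of precision that the 64-bit storage system gives you (52 bits are stored explicitly and one leading bit is implied). The 53-bit number 9007199254740991 is the largest value that can be held directly in the mantissa section of a typical double-precision floating-point number as used by JavaScript.

By the way, 9007199254740991 is in a format we call Base10 or decimal, the numbers humans use. But it is also stored in computer memory as 53 bits, like this value...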

11111111111111111111111111111111111111111111111111111

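This is the maximum number of bits that a computer using the 64-bit number storage system can actually store for the integer part of a double-precision number.

To get to the even larger max value possible (1.79E308), JavaScript has to use an extra trick called an exponent to multiply it to larger and larger values. So alongside the 53-bit mantissa value in computer memory there is an 11-bit exponent number that allows the number to grow much larger or much smaller, creating the final range doubles are expected to represent. (There is also a single bit for the sign, positive or negative.)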

After the computer reaches this limit of max Integer value (around 9 quadrillion) and fills up the mantissa section of memory with 53 bits, JavaScript relies on a separate 11-bit storage area for the exponent, which allows much larger integers (up to 10 to the power of 308!) and much smaller decimals (10 to the power of -324!). Thus, this exponent number allows a full range of large and small decimals to be created, with the floating radix or decimal point moving up and down the number, creating the complex fractional or decimal values you expect to see. Again, this exponent is another number stored in 11 bits, so it can take 2048 different values (0 to 2047).

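You will notice that 9007199254740991 is a max integer, but it does not explain the larger MAX value possible in storage or the MINIMUM decimal number, or even how decimal fractions get created and stored. How does this computer bit value create all that?

The answer is again, through the exponent!

It turns out that the exponent's 11-bit value is itself divided into positive and negative values, so that it can create large integers but also small decimal fractions.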

To do so, it has its own positive and negative range, created by subtracting a bias of 1023 from the stored value (0 to 2047). That gives exponents from -1022 to +1023, with the stored values 0 and 2047 reserved for special cases such as zero and infinity. To then get the FINAL DOUBLE NUMBER, the mantissa (9007199254740991) is multiplied by the exponent (plus the single bit sign added) to get the final value! This allows the exponent to multiply the mantissa value to even larger integer ranges beyond 9 quadrillion, but also go the opposite way with the decimal to very tiny fractions.

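However, the -+1023 number stored in the exponent is not multiplied directly with the mantissa to get the DOUBLE. Instead it is used to raise the number 2 to the power of the exponent. The exponent is written as a decimal number, but it is not applied as a decimal exponent such as 10 to the power of 1023. It is applied in the Base2 system again and creates a value of 2 to the power of (the exponent number).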

That value generated is then multiplied to the mantissa to get the MAX and MIN number allowed to be stored in JavaScript, as well as all the larger and smaller values within the range. It uses "2" rather than 10 for precision purposes, so with each increase in the exponent value, it only doubles the mantissa value. This reduces the loss of numbers. But this exponent multiplier also means it will lose an increasing range of numbers in doubles as it grows, to the point where as you reach the MAX stored exponent and mantissa possible, very large swaths of numbers disappear from the final calculated number, and so certain numbers are now not possible in math calculations!
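The recipe described here can be sketched in Python, assuming Python floats are IEEE 754 doubles. The function names `decode` and `rebuild` are made up for this example, and the formula only covers ordinary (normal) doubles, not zeros, subnormals, infinities or NaN. The sketch pulls the three fields out of a double and rebuilds its exact value as (-1)^sign × (1 + fraction / 2^52) × 2^(stored exponent − 1023).

```python
import struct
from fractions import Fraction

def decode(x: float):
    """Split a double into (sign, stored_exponent, fraction) bit fields."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    stored_exponent = (bits >> 52) & 0x7FF   # 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)        # 52 explicitly stored mantissa bits
    return sign, stored_exponent, fraction

def rebuild(sign: int, stored_exponent: int, fraction: int) -> Fraction:
    """Exact value of a normal double, computed from its fields."""
    significand = Fraction((1 << 52) + fraction, 1 << 52)  # implied leading 1
    scale = Fraction(2) ** (stored_exponent - 1023)        # 2 raised to the unbiased exponent
    return (-1) ** sign * significand * scale

for value in (1.0, 6.5, 0.1, 9007199254740991.0):
    fields = decode(value)
    print(value, fields, rebuild(*fields) == Fraction(value))
```

Using `Fraction` keeps the arithmetic exact, so the comparison checks the formula itself and not a rounded approximation of it.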

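That is why most people use the SAFE max integer range (9007199254740991 or less), as most know that very large and very small numbers in JavaScript are highly inaccurate! Also note that 2 raised to large negative powers (down to 2 to the power of -1074) gets the MIN number, or the small decimal fractions you associate with a typical "float". So the exponent is what is used to turn the mantissa integer into very large and very small numbers, up to the max and min range it can store.

Notice that 2 to the power of 1023 translates to a decimal exponent of about 10 to the power of 308 for max values. That lets you see the result of the binary calculation in human values, the Base10 numerical format. Math experts often do not explain that all these values are the same number, just in different bases or formats.

THE TRUE MAX FOR DOUBLES IS INFINITY

Finally, what happens when integers reach the largest number possible, or the smallest decimal fraction possible?

It turns out that double-precision floating-point numbers reserve a set of bit values in the exponent and mantissa of the 64 bits to store four other possible numbers:

+Infinity
-Infinity
+0
-0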

For example, +0 in double numbers stored in 64-bit memory is a large row of empty bits in computer memory. Below is what happens after you go beyond the smallest decimal possible (4.94E-324) in using a Double precision floating point number. It becomes +0 after it runs out of memory! The computer will return +0, but stores 0 bits in memory. Below is the FULL 64-bit storage design in bits for a double in computer memory. The first bit controls +(0) or -(1) for positive or negative numbers, the 11-bit exponent is next (all zeros is a reserved pattern, used for zero and for the very tiniest values), and then comes the large block of 52 stored bits for the mantissa or significand, which here represents 0. So +0 is represented by all zeroes!

0 00000000000 0000000000000000000000000000000000000000000000000000
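For comparison, here is a short Python sketch, assuming Python floats are IEEE 754 doubles. The helper name `bit_pattern` is made up for this example. It prints the same sign / exponent / mantissa layout for all four special values.

```python
import math
import struct

def bit_pattern(x: float) -> str:
    """Return the 64 bits of a double as 'sign exponent mantissa'."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    s = format(bits, "064b")
    return f"{s[0]} {s[1:12]} {s[12:]}"

specials = (("+0", 0.0), ("-0", -0.0), ("+Infinity", math.inf), ("-Infinity", -math.inf))
for name, value in specials:
    print(f"{name:>9}: {bit_pattern(value)}")
```

The two zeros differ only in the sign bit, and the infinities use an exponent field of all ones with an all-zero mantissa.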

If the double reaches its positive max or min, or its negative max or min, many languages will always return one of those values in some form. However, some return NaN, or overflow, exceptions, etc. How that is handled is a different discussion. But often these four values are your TRUE min and max values for double. By returning these special values, you at least have a representation of the max and min in doubles that explains the last forms of the double type that cannot be stored or explained rationally.

SUMMARY

So the MAXIMUM and MINIMUM ranges for positive and negative doubles are as follows:

MAXIMUM TO MINIMUM POSITIVE VALUE RANGE
1.79E308 to 4.94E-324 (+Infinity to +0 for out of range)

MAXIMUM TO MINIMUM NEGATIVE VALUE RANGE
-4.94E-324 to -1.79E308 (-0 to -Infinity for out of range)

But the SAFE and ACCURATE MAX and MIN range is really:
9007199254740991 (max) to -9007199254740991 (min)

So you can see that with +-Infinity and +-0 added, doubles have extra max and min ranges to help you when you go beyond the max and min.

As mentioned above, when you go from the largest positive value to the smallest positive decimal value or fraction, the bits zero out and you get 0. Past 4.94E-324 the double cannot store any smaller decimal fraction value, so it collapses to +0 in the bit registry. The same thing happens for tiny negative decimals, which collapse past their value to -0. As you know, -0 = +0, so although they are not the same values stored in memory, in applications they are often coerced to 0. But be aware that many applications do deliver signed zeros!

The opposite happens with large values... past 1.79E308 they turn into +Infinity, or -Infinity for the negative version. This is what creates all the strange number ranges in languages like JavaScript. Double-precision numbers have weird returns!
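Here is a minimal Python sketch of those out-of-range results, assuming Python floats are IEEE 754 doubles with the default rounding. Other languages may raise an error or behave differently at these limits, so treat it as an illustration only.

```python
import math
import sys

largest = sys.float_info.max   # about 1.79E308
smallest = 5e-324              # smallest positive double, about 4.94E-324

overflow_pos = largest * 2     # too big to store
overflow_neg = -largest * 2
underflow_pos = smallest / 2   # too small to store
underflow_neg = -smallest / 2

print(overflow_pos, overflow_neg)     # the two infinities
print(underflow_pos, underflow_neg)   # the two signed zeros
print(underflow_pos == underflow_neg) # signed zeros compare equal
print(math.copysign(1.0, underflow_neg))  # the sign bit is still there
```

Multiplication is used for the overflow case on purpose: in Python, float multiplication that overflows yields infinity, while some other operations (such as `**`) raise `OverflowError` instead.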

Note that the MINIMUM SAFE RANGE for decimals/fractions is not shown above, as it varies based on the precision needed in the fraction. When you combine the integer with the fractional part, the decimal place accuracy drops away quickly as it goes smaller. There are many discussions and debates about this online. No one ever has an answer. The list below might help. You might need to change these ranges listed to much smaller values if you want guaranteed precision. As you can see, if you want to support up to 9-decimal place accuracy in floats, you will need to limit MAX values in the mantissa to these values. Precision means how many decimal places you need with accuracy. Unsafe means that past these values, the number will lose precision and have missing numbers:

            Precision   Unsafe
            1           562,949,953,421,312
            2           70,368,744,177,664
            3           8,796,093,022,208
            4           549,755,813,888
            5           68,719,476,736
            6           8,589,934,592
            7           536,870,912
            8           67,108,864
            9           8,388,608
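The entries in the list look like powers of two. One way to produce values of that shape is sketched below in Python. This is an assumption about how such a table could be derived, not the author's stated method: reserve enough of the 53 mantissa bits to hold the wanted number of decimal places, and give the rest to the integer part. The function name `unsafe_threshold` is made up for this example.

```python
import math

MANTISSA_BITS = 53  # precision of an IEEE 754 double

def unsafe_threshold(decimal_places: int) -> int:
    """Integer magnitude at which the requested decimal places stop fitting."""
    # Each decimal digit needs log2(10), about 3.32, bits; round up to whole bits.
    bits_for_fraction = math.ceil(decimal_places * math.log2(10))
    return 2 ** (MANTISSA_BITS - bits_for_fraction)

print("Precision   Unsafe")
for places in range(1, 10):
    print(f"{places:<12}{unsafe_threshold(places):,}")
```

For example, 9 decimal places need 30 bits, which leaves 23 bits for the integer part, so the threshold is 2^23.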

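It took me a while to understand the true limits of double-precision floating-point numbers and computers. I created the simple explanation above after reading so much MASS CONFUSION from math experts online, who are great at creating numbers but terrible at explaining anything! I hope I helped you on your coding journey - Peace :)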

2022-09-29 16:40:45

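Like others have noted, I will assume that the OP asked for the largest floating-point value such that all whole numbers less than itself are precisely representable.

You can use FLT_MANT_DIG and DBL_MANT_DIG defined in float.h to avoid relying on explicit values (e.g., 53):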

#include <stdio.h>
#include <float.h>

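int main(void)
{
    /* 1LL keeps the shift in a type of at least 64 bits; a 32-bit long would overflow at 2^53. */
    printf("%d, %.1f\n", FLT_MANT_DIG, (float)(1LL << FLT_MANT_DIG));
    printf("%d, %.1f\n", DBL_MANT_DIG, (double)(1LL << DBL_MANT_DIG));
    return 0;
}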

Output:

24, 16777216.0
53, 9007199254740992.0
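For readers without a C compiler at hand, the same idea can be sketched in Python, whose `sys.float_info.mant_dig` mirrors DBL_MANT_DIG. Python has no built-in single-precision float, so only the double case is shown, and the value 53 assumes an IEEE 754 platform.

```python
import sys

mant_dig = sys.float_info.mant_dig   # mirrors DBL_MANT_DIG from float.h
limit = 1 << mant_dig                # 2 raised to the number of mantissa digits

print(f"{mant_dig}, {float(limit):.1f}")
print(float(limit) + 1 == float(limit))  # adding 1 no longer changes the value
```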

2021-05-09 07:29:51

9007199254740992 (that's 9,007,199,254,740,992 or 2^53), with no guarantees :)

Program

#include <stdio.h>

int main(void) {
  /* A little less than 2^53; starting from 0 gives the same result but takes far longer. */
  double dbl = 9007199254000000.0;
  /* Stop at the first value where adding 1 no longer changes the double. */
  while (dbl + 1 != dbl) dbl++;
  printf("%.0f\n", dbl - 1);
  printf("%.0f\n", dbl);
  printf("%.0f\n", dbl + 1);
  return 0;
}

Result

9007199254740991
9007199254740992
9007199254740992

2009-12-04 18:54:31

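Wikipedia has this to say in the same context, with a link to IEEE 754:

On a typical computer system, a "double precision" (64-bit) binary floating-point number has a coefficient of 53 bits (one of which is implied), an exponent of 11 bits, and one sign bit.

2^53 is just over 9 * 10^15.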

2009-12-04 18:20:52
