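What is the largest "no-floating" integer that can be stored in an IEEE 754 double type without losing precision?
In other words, what would the following code fragment return: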
UInt64 i = 0;
Double d = 0;
while (i == d)
{
    i += 1;
    d += 1;
}
Console.WriteLine("Largest Integer: {0}", i-1);
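Current Answer
Doubles, a "Simple" Explanation
The largest "double" number (double precision floating point number) is typically a 64-bit or 8-byte number, expressed as: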
1.79E308
or
1.79 x 10 (to the power of) 308
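As you can guess, 10 to the power of 308 is a GIGANTIC number, like 170000000000000000000000000000000000000000000000000000000 and even larger!
On the other end of the scale, double precision floating point 64-bit numbers support tiny decimal fractions using the "point" notation, the smallest being: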
4.94E-324
or
4.94 x 10 (to the power of) -324
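Any number multiplied by 10 to a negative power is a very tiny decimal fraction, like 0.00000000000000000000000000000000000000494 and even smaller.
But what confuses people is that they will hear computer nerds and math experts say, "but that number only has a range of 15 digits". It turns out that the values described above are the absolute MAXIMUM and MINIMUM values the computer can store and present from memory. But they lose accuracy, and the ability to represent every number, LONG BEFORE they get that large. So most programmers AVOID the largest double values and try to stay within a known, much smaller range.
But why? And what is the best maximum double number to use? I could not find the answer after reading dozens of bad explanations on math sites online. So the SIMPLE explanation below may help you. It helped me!!
Double Facts and Flaws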
JavaScript (which also uses the 64-bit double precision storage system for numbers in computers) uses double precision floating point numbers for storing the values of its Number type. It thus uses the same MAX and MIN ranges shown above. But most languages use a typed numerical system with ranges to avoid accuracy problems. The double and float number storage systems, however, all share the same flaw of losing numerical precision as they get larger and smaller. I will explain why, as it affects the idea of "maximum" values...
To address this, JavaScript has what is called a Number.MAX_SAFE_INTEGER value, which is 9007199254740991. This is the largest Integer it can represent with full accuracy, but it is NOT the largest number that can be stored. It is accurate because every Integer equal to or less than that value can be viewed, calculated, stored, etc. Beyond that range, there are "missing" numbers. The reason is that double precision numbers AFTER 9007199254740991 no longer have enough bits to hold every digit, so they rely on an additional number to multiply them to larger and larger values, including the true max number of 1.79E308. That additional number is called an exponent.
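The Evil Exponent
It so happens that this max value of 9007199254740991 is also the largest value you can hold in the 53 bits of mantissa that the 64-bit storage system uses (52 bits are actually stored, and the leading 1 is implied). The 53-bit number 9007199254740991 is the largest value that can be held directly in the mantissa section of a typical double precision floating point number as used by JavaScript.
By the way, 9007199254740991 is in a format we call Base10, or decimal, the number system humans use. But it is also held in computer memory as 53 bits, like this value...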
11111111111111111111111111111111111111111111111111111
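This is the maximum number of bits computers can actually use for the integer part of a double precision number in the 64-bit number storage system.
To get to the even LARGER max number possible (1.79E308), JavaScript has to use an extra trick called the exponent to multiply it to larger and larger values. So next to the 53-bit mantissa value in computer memory there is an 11-bit exponent number that allows the number to grow much larger or much smaller, creating the final range doubles are expected to represent. (There is also a single bit for positive and negative numbers.)
Once the computer reaches this limit of max Integer value (around 9 quadrillion) and has filled the mantissa section of memory with 53 bits, JavaScript relies on the 11-bit storage area for the exponent, which allows much larger integers to grow (up to 10 to the power of 308!) and much smaller decimals to get smaller (10 to the power of -324!). Thus, this exponent number allows a full range of large and small decimals to be created, with the floating radix or decimal point moving up and down the number, creating the complex fractional or decimal values you expect to see. Again, this exponent is another number stored in 11 bits, which gives it 2048 possible values with a max of 2047.
You will notice that 9007199254740991 is a max integer, but it does not explain the larger MAX value possible in storage or the MINIMUM decimal number, or even how decimal fractions get created and stored. How does this computer bit value create all that?
The answer is again through the exponent!
It turns out that the 11-bit exponent value is itself divided into a positive and a negative range, so that it can create large integers but also small decimal numbers.
To do so, the stored 11-bit value (0 to 2047) has a bias of 1023 subtracted from it, which gives a usable exponent range from -1022 to +1023 (the stored values 0 and 2047 are reserved for zero, subnormal numbers, Infinity and NaN). To then get the FINAL DOUBLE NUMBER, the mantissa is scaled by the exponent (with the single sign bit applied) to get the final value! This allows the exponent to scale the mantissa value to even larger integer ranges beyond 9 quadrillion, but also to go the opposite way with the decimal to very tiny fractions.
However, the -+1023 number stored in the exponent is not multiplied by the mantissa directly. It is used as the power to which the number 2 is raised. The exponent is written as a decimal number, but it is not applied as a decimal exponent like 10 to the power of 1023. It is applied in the Base2 system and creates a value of 2 to the power of the exponent.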
The value generated is then multiplied by the mantissa to get the MAX and MIN numbers allowed to be stored in JavaScript, as well as all the larger and smaller values within the range. It uses "2" rather than 10 for precision purposes, so each increase in the exponent value only doubles the mantissa value. This reduces the loss of numbers. But this exponent multiplier also means a double loses an increasing range of numbers as it grows, to the point where, as you reach the MAX stored exponent and mantissa possible, very large swaths of numbers disappear from the final calculated number, and so certain numbers are simply not possible in math calculations!
That is why most people use the SAFE max integer range (9007199254740991 or less), as most people know that very large and very small numbers in JavaScript are highly inaccurate! Also note that negative powers of 2, down to 2 to the power of -1074, give you the MIN number (4.94E-324) and the tiny decimal fractions associated with typical "floats". The exponent is thus used to translate the mantissa integer into very large and very small numbers, up to the MAX and MIN ranges it can store.
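Note that 2 to the power of 1023 translates to a decimal exponent using 10 to the power of 308 for max values. That lets you see the result of the binary calculation in Human values, or the Base10 numerical format. Math experts often do not explain that all these values are the same number, just in different bases or formats.
The True Max for Doubles is Infinity
Finally, what happens when integers reach the MAX number possible, or the smallest decimal fraction possible?
It turns out that double precision floating point numbers reserve a set of bit values in the 64-bit exponent and mantissa to store four other possible numbers:
+Infinity -Infinity +0 -0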
For example, +0 in double numbers stored in 64-bit memory is a large row of empty bits in computer memory. This is what happens after you go beyond the smallest decimal possible (4.94E-324) when using a double precision floating point number. It becomes +0 after it runs out of memory! The computer will return +0, and stores all 0 bits in memory. Below is the FULL 64-bit storage design in bits for a double in computer memory. The first bit controls +(0) or -(1) for positive or negative numbers, the 11-bit exponent is next (all zeros is a reserved pattern used for zero and the tiniest, subnormal numbers), and then comes the large block of 52 stored bits for the mantissa or significand, which represents 0. So +0 is represented by all zeroes!
0 00000000000 0000000000000000000000000000000000000000000000000000
If the double reaches its positive max or min, or its negative max or min, many languages will always return one of those values in some form. However, some return NaN, or raise overflow exceptions, etc. How that is handled is a different discussion. But often these four values are your TRUE min and max values for double. By returning these special values, you at least have a representation of the max and min in doubles that covers the last forms of the double type that cannot be stored as ordinary numbers.
Summary
So the MAXIMUM and MINIMUM ranges for positive and negative doubles are as follows:
MAXIMUM TO MINIMUM POSITIVE VALUE RANGE
1.79E308 to 4.94E-324 (+Infinity to +0 for out of range)
MAXIMUM TO MINIMUM NEGATIVE VALUE RANGE
-4.94E-324 to -1.79E308 (-0 to -Infinity for out of range)
But the SAFE and ACCURATE MAX and MIN range is really:
9007199254740991 (max) to -9007199254740991 (min)
So you can see that with +-Infinity and +-0 added, doubles have an extra max and min range to help you when you go beyond the max and min.
As mentioned above, when you go from the largest positive value to the smallest positive decimal value or fraction, the bits zero out and you get 0. Past 4.94E-324 the double cannot store any smaller decimal fraction, so it collapses to +0 in the bit registry. The same thing happens for tiny negative decimals, which collapse past their value to -0. As you know, -0 = +0, so although they are not the same values stored in memory, in applications they are often coerced to 0. But be aware that many applications do deliver signed zeros!
The opposite happens with large values... past 1.79E308 they turn into +Infinity, and -Infinity for the negative version. This is what creates all the weird number ranges in languages like JavaScript. Double precision numbers have weird returns!
Note that the MINIMUM SAFE RANGE for decimals/fractions is not shown above, as it varies based on the precision needed in the fraction. When you combine the integer with the fractional part, the decimal place accuracy drops away quickly as the integer part grows. There are many discussions and debates about this online. No one ever has an answer. The list below might help. You might need to change the ranges listed to much smaller values if you want guaranteed precision. As you can see, if you want to support up to 9-decimal place accuracy in floats, you will need to limit MAX values in the mantissa to these values. Precision means how many decimal places you need with accuracy. Unsafe means that past these values, the number will lose precision and have missing numbers:
Precision   Unsafe
1           562,949,953,421,312
2           70,368,744,177,664
3           8,796,093,022,208
4           549,755,813,888
5           68,719,476,736
6           8,589,934,592
7           536,870,912
8           67,108,864
9           8,388,608
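It took me a while to understand the TRUE LIMITS of double precision floating point numbers and computers. I created the simple explanation above after reading so much MASS CONFUSION online from math experts, who are great at creating numbers but terrible at explaining anything! I hope I helped you on your coding journey - Peace :)
Other Answers
UPDATE 1:
I just realized that 5 ^ 1074 is NOT the true upper limit of what you can get for free out of IEEE 754 double precision floating point, because I only counted the denormalized exponents and forgot the fact that the mantissa itself can fit another 22 powers of 5. So to the best of my understanding, the largest power of 5 one can get for free out of the double precision format is:
Largest power of 5:
5 ^ 1096
Largest odd number:
5 ^ 1074 x 9007199254740991
5 ^ 1074 x (2 ^ 53 - 1)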
BEGIN { CONVFMT = "IEEE754 :: 4-byte word :: %.16lX"; print "", sprintf("%.*g", …), sprintf("%.*g", …), sprintf("%.*g", …) }
4.940656458412465441765687928682213723650598026143247644255856825006755072702087518652998363616359923797965646954457177309266567103559397963987747960107818781263007131903114045278458171678489821036887186360569987307230500063874091535649843873124733972731696151400317153853980741262385655911710266585566867681870395603106249319452715914924553293054565444011274801297099995419319894090804165633245247571478690147267801593552386115501348035264934720193790268107107491703332226844753335720832431936092382893458368060106011506169809753078342277318329247904982524730776375927247874656084778203734469699533647017972677717585125660551199131504891101451037862738167250955837389733598993664809941164205702637090279242767544565229087538682506419718265533447265625e-324 — IEEE754 :: 4-byte word :: 0000000000000001 494065645841246544176568792......682506419718265533447265625 } 751 dgts : 5^1,074
1.1779442926436580280698985883431944188238616052015418158187524855152976686244219586021896275559329804892458073984282439492384355315111632261247033977765604928166883306272301781841416768261169960586755720044541328685833215865788678015827760393916926318959465387821953663477851727634395732669139543975751084522891987808004020022041120326339133484493650064495265010111570347355174765803347028811562651566216206901711944564705815590623254860079132843479610128658074120767908637153514231969910697784644086106916351461663273587631725676246505444808791274797874748064938487833137213363849587926231550453981511635715193075144590522172925785791614297511667878003519179715722536405560955202126362715257889359212587458533154881546706053453699158950485070818103849887847900390625e-308 — IEEE754 :: 4-byte word :: 000878678326EAC9 117794429264365802806989858......070818103849887847900390625 } 767 dgts : 5^1,096
4.4501477170144022721148195934182639518696390927032912960468522194496444440421538910330590478162701758282983178260792422137401728773891892910553144148156412434867599762821265346585071045737627442980259622449029037796981144446145705102663115100318287949527959668236039986479250965780342141637013812613333119898765515451440315261253813266652951306000184917766328660755595837392240989947807556594098101021612198814605258742579179000071675999344145086087205681577915435923018910334964869420614052182892431445797605163650903606514140377217442262561590244668525767372446430075513332450079650686719491377688478005309963967709758965844137894433796621993967316936280457084866613206797017728916080020698679408551343728867675409720757232455434770912461317493580281734466552734375e-308 — IEEE754 :: 4-byte word :: 001FFFFFFFFFFFFF 445014771701440227211481959......317493580281734466552734375 } 767 dgts : 5^1,074 6361 69431 20394401
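And here is a quick awk code snippet that prints out every positive power of 2 up to 1023, every positive power of 5 up to 1096, and their common power of zero, optimized for running both with and without a bigint library:
{m,g,n}awk 'BEGIN {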
CONVFMT = "%." ((_+=_+=_^=_<_)*_+--_*_++)(!++_) "g"
OFMT = "%." (_*_) "g"
if (((_+=_+_)^_%(_+_))==(_)) {
print __=_=\
int((___=_+=_+=_*=++_)^!_)
OFS = ORS
while (--___) {
print int(__+=__), int(_+=_+(_+=_))
}
__=((_+=_+=_^=!(__=_))^--_+_*_) substr("",_=__)
do {
print _+=_+(_+=_) } while (--__)
exit
} else { _=_<_ }
__=((___=_+=_+=++_)^++_+_*(_+_--))
_=_^(-(_^_--))*--_^(_++^_^--_-__)
_____=-log(_<_)
__^=_<_
___=-___+--___^___
while (--___) {
print ____(_*(__+=__+(__+=__))) }
do {
print ____(_) } while ((_+=_)<_____)
}
function ____(__,_) {
return (_^=_<_)<=+__ \
? sprintf( "%.f", __) \
: substr("", _=sprintf("%.*g", (_+=++_)^_*(_+_),__),
gsub("^[+-]*[0][.][0]*|[.]|[Ee][+-]?[[:digit:]]+$","",_))_
}'
= = = = = = = = = = = = = = = = = = = = = = = = = = = = =
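It depends on how flexible you are with the definitions of "represented" and "representable".
Despite what the typical literature says, the integer that is actually "largest" in IEEE 754 double precision, without any bigint library or external function call, with a completely full mantissa, and that is computable, storable, and printable, is:
9,007,199,254,740,991 * 5 ^ 1074 (~2546.750773909… bits)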
4450147717014402272114819593418263951869639092703291
2960468522194496444440421538910330590478162701758282
9831782607924221374017287738918929105531441481564124
3486759976282126534658507104573762744298025962244902
9037796981144446145705102663115100318287949527959668
2360399864792509657803421416370138126133331198987655
1545144031526125381326665295130600018491776632866075
5595837392240989947807556594098101021612198814605258
7425791790000716759993441450860872056815779154359230
1891033496486942061405218289243144579760516365090360
6514140377217442262561590244668525767372446430075513
3324500796506867194913776884780053099639677097589658
4413789443379662199396731693628045708486661320679701
7728916080020698679408551343728867675409720757232455
434770912461317493580281734466552734375
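I used xxhash to compare this with gnu-bc and confirmed that it is indeed identical, with no precision lost. There is nothing "denormalized" about this number at all, despite the exponent range being labeled as such.
If you do not believe it, try it on your own system. (I got this printout via off-the-shelf mawk.) You can get to it fairly easily too, with:
one (1) exponentiation/power (^ aka **) op, one (1) multiplication (*) op, one (1) sprintf() call, and either one (1) of substr() or regex-gsub() to perform the necessary cleanup.
Just like the 1.79…E308 number frequently mentioned,
both are mantissa limited, both are exponent limited, both have ridiculously large ULPs (unit in the last place), and both are exactly one step away from overflowing or underflowing the floating point unit while still giving back a usable answer.
Negate the binary exponents of the workflow and you can do the ops entirely in this space, then just invert it once more at the tail end of the workflow to get back to the side we typically consider "larger",
but keep in mind that in the inverted exponent realm there is no "gradual overflow".
- The 4Chan Teller
9007199254740992 (that's 9,007,199,254,740,992 or 2^53) with no guarantees :)
Program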
#include <math.h>
#include <stdio.h>
int main(void) {
    double dbl = 0; /* I started with 9007199254000000, a little less than 2^53 */
    while (dbl + 1 != dbl) dbl++;
    printf("%.0f\n", dbl - 1);
    printf("%.0f\n", dbl);
    printf("%.0f\n", dbl + 1);
    return 0;
}
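Result
9007199254740991
9007199254740992
9007199254740992
It is true that, for a 64-bit IEEE754 double, all integers up to 9007199254740992 == 2^53 can be exactly represented.
However, it is also worth mentioning that all representable numbers beyond 4503599627370496 == 2^52 are integers. Beyond 2^52 it becomes meaningless to test whether or not they are integers, because they are all implicitly rounded to a nearby representable value.
In the range 2^51 to 2^52, the only non-integer values are the midpoints ending with ".5", meaning any integer test after a calculation must be expected to yield at least 50% false answers.
Below 2^51 we also have ".25" and ".75", so comparing a number with its rounded counterpart in order to determine whether it may be an integer starts to make some sense.
TLDR: If you want to test whether a calculated result may be an integer, avoid numbers larger than 2251799813685248 == 2^51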
1.7976931348623157 × 10^308
http://en.wikipedia.org/wiki/Double_precision_floating-point_format
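As others have noted, I will assume that the OP asked for the largest floating-point value such that all whole numbers less than itself are precisely representable.
You can use FLT_MANT_DIG and DBL_MANT_DIG, defined in float.h, to avoid relying on explicit values (e.g., 53):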
#include <stdio.h>
#include <float.h>
int main(void)
{
    /* 1LL keeps the 53-bit shift well defined even where long is only 32 bits */
    printf("%d, %.1f\n", FLT_MANT_DIG, (float)(1LL << FLT_MANT_DIG));
    printf("%d, %.1f\n", DBL_MANT_DIG, (double)(1LL << DBL_MANT_DIG));
    return 0;
}
Output:
24, 16777216.0
53, 9007199254740992.0