Bug in Half-Precision Floating Point Object
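My post on May 8 was about "half-precision" and "quarter-precision" arithmetic. I also added code for objects fp16 and fp8 to Cleve's Laboratory. A few days ago I heard from Pierre Blanchard and my good friend Nick Higham at the University of Manchester about a serious bug in the constructors for those objects.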
The Bug
Let
    format long
    e = eps(fp16(1))
    e =
       9.765625000000000e-04
The value of e is
    1/1024
    ans =
       9.765625000000000e-04
This is the relative precision of half-precision floating point numbers, which is the spacing of half-precision numbers in the interval between 1 and 2. So in binary the next number after 1 is
    disp(binary(1+e))
    0 01111 0000000001
And the last number before 2 is
    disp(binary(2-e))
    0 01111 1111111111
The three fields displayed are the sign, which is one bit, the exponent, which has five bits, and the fraction, which has ten bits.
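For readers without the fp16 class at hand, the same three fields can be inspected in Python, whose struct module has an 'e' format code for IEEE 754 half precision. This is only a sketch: the helper name binary16_fields is made up for illustration and is not part of Cleve's Laboratory.

```python
import struct

def binary16_fields(x):
    """Return the sign, exponent, and fraction bit strings of x in half precision."""
    # Pack as big-endian binary16, then reread the same two bytes as an unsigned integer.
    (u,) = struct.unpack(">H", struct.pack(">e", x))
    bits = format(u, "016b")
    return bits[0], bits[1:6], bits[6:]

e = 2.0 ** -10  # spacing of half-precision numbers between 1 and 2

print(" ".join(binary16_fields(1 + e)))  # 0 01111 0000000001
print(" ".join(binary16_fields(2 - e)))  # 0 01111 1111111111
```

The exponent field 01111 is 15, which is the bias, so both numbers have a true exponent of zero.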
So far, so good. The bug shows up when I try to convert any number between 2-e and 2 to half-precision. There aren't any more half-precision numbers between those limits. The values in the lower half of the interval should round down to 2-e and the values in the upper half should round up to 2. The round-to-even convention says that the midpoint, 2-e/2, should round to 2.
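The expected behavior can be cross-checked against an independent half-precision conversion. The sketch here uses Python's struct module, on the assumption that its 'e' format rounds to nearest with ties to even, which is what CPython appears to do. The helper name to_half is hypothetical.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

e = 2.0 ** -10

lower = to_half(2 - 0.75 * e)  # in the lower half of the interval
upper = to_half(2 - 0.25 * e)  # in the upper half of the interval
mid = to_half(2 - e / 2)       # the exact midpoint, a tie

print(lower == 2 - e)  # True
print(upper == 2.0)    # True
print(mid == 2.0)      # True
```

The tie goes to 2 because the fraction of 2 is all zeros, hence even, while the fraction of 2-e ends in a one.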
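But I wasn't careful about how I did the rounding. I just used the MATLAB round function, which doesn't follow the round-to-even convention. Worse yet, I didn't check whether the fraction rounded up into the exponent. I tried to do everything in one statement.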
    dbtype oldfp16 48:49
    48    u = bitxor(uint16(round(1024*f)), ...
    49        bitshift(uint16(e+15),10));
For values between 2-e/2 and 2, round(1024*f) is 1024, which requires eleven bits. The bitxor then clobbers the exponent field. I won't show the result here. If you have the May half-precision object on your machine, give it a try.
This doesn't just happen for values a little bit less than 2; it happens close to any power of 2.
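For anyone who no longer has the May code installed, the clobbering can be emulated in a few lines of Python. This is a rough sketch, not the original constructor: it assumes x has been normalized as (1+f)*2^e with 0 <= f < 1, which is what the e+15 bias in the old statement suggests, and the names old_pack, decode, and matlab_round are made up for the example.

```python
import math

def matlab_round(s):
    """Round s >= 0 half away from zero, as MATLAB's round does."""
    return math.floor(s + 0.5)

def old_pack(x):
    """Emulate the old one-statement packing for a positive normal x."""
    m, p = math.frexp(x)       # x = m * 2**p with 0.5 <= m < 1
    f, e = 2 * m - 1, p - 1    # assumed form: x = (1 + f) * 2**e, 0 <= f < 1
    return matlab_round(1024 * f) ^ ((e + 15) << 10)

def decode(u):
    """Value of a positive normal half-precision bit pattern u."""
    return (1 + (u & 0x3FF) / 1024) * 2.0 ** ((u >> 10) - 15)

eps16 = 2.0 ** -10
for target in (0.5, 2.0, 8.0):
    x = target * (1 - eps16 / 8)   # slightly below target; should round up to it
    u = old_pack(x)
    print(f"{target}: bits {u:016b} decode to {decode(u)}")
```

In this emulation each of the three inputs decodes to a quarter of the intended value, because the eleventh bit of the fraction flips the low bit of an odd exponent field from 1 to 0 instead of carrying into it. When the biased exponent e+15 happens to be even, the XOR sets the same bit a carry would have set, so under these assumptions the damage shows up when e+15 is odd.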
The Fix
We need a proper round-to-even function.
    dbtype fp16 31
    31    rndevn = @(s) round(s-(rem(s,2)==0.5));
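To see why subtracting the logical value works, here is a Python transcription of the one-liner. It is a sketch under stated assumptions: matlab_round stands in for MATLAB's round (ties away from zero) and to_uint16 stands in for MATLAB's saturating uint16 cast; both helper names are hypothetical. The ties that plain rounding would push up to an odd integer are 0.5, 2.5, 4.5, and so on, and those are exactly the values with rem(s,2) equal to 0.5, so they are shifted down by one before rounding.

```python
import math

def matlab_round(s):
    """Round half away from zero, as MATLAB's round does."""
    return math.copysign(math.floor(abs(s) + 0.5), s)

def rndevn(s):
    """Transcription of the rndevn one-liner, for s >= 0."""
    # The comparison is True (counted as 1) only at the ties 0.5, 2.5, 4.5, ...
    return matlab_round(s - (math.fmod(s, 2) == 0.5))

def to_uint16(v):
    """Saturating cast, mimicking MATLAB's uint16 for integer-valued v."""
    return min(max(int(v), 0), 65535)

for s in (0.5, 1.5, 2.5, 3.5, 1023.5):
    print(s, matlab_round(s), to_uint16(rndevn(s)))
```

One detail worth noting: at s = 0.5 the shifted value is -0.5, which rounds to -1, so the result only becomes 0 after the saturating unsigned cast. Python's built-in round already rounds ties to even, which makes it a convenient check on the transcription.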
Then don't try to do it all at once.
    dbtype fp16 50:56
    50    % normal
    51    t = uint16(rndevn(1024*f));
    52    if t==1024
    53        t = uint16(0);
    54        e = e+1;
    55    end
    56    u = bitxor(t, bitshift(uint16(e+15),10));
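The same round-then-carry logic can be sketched in Python and compared with the struct module's half-precision packing. This is an illustration rather than a port of the constructor: the normalization x = (1+f)*2^e is an assumption, Python's built-in round (ties to even) stands in for rndevn, and pack_normal and reference_bits are hypothetical names.

```python
import math
import struct

def pack_normal(x):
    """Pack a positive normal x into half-precision bits, rounding to even with carry."""
    m, p = math.frexp(x)       # x = m * 2**p with 0.5 <= m < 1
    f, e = 2 * m - 1, p - 1    # assumed form: x = (1 + f) * 2**e, 0 <= f < 1
    t = round(1024 * f)        # ten-bit fraction, ties to even
    if t == 1024:              # the fraction overflowed its ten bits
        t = 0
        e += 1                 # carry into the exponent
    return t ^ ((e + 15) << 10)

def reference_bits(x):
    """Bit pattern produced by struct's IEEE 754 binary16 format."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

eps16 = 2.0 ** -10
for x in (2 - eps16 / 4, 2 - eps16 / 2, 1 + eps16 / 2, 8 * (1 - eps16 / 8)):
    print(f"{pack_normal(x):016b} {reference_bits(x):016b}")
```

With the carry handled explicitly, the XOR is safe again because the fraction and the shifted exponent never overlap.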
It turns out that the branch for denormals is OK, once round is replaced by rndevn. The exponent for denormals is all zeros, so when the fraction encroaches on the exponent field it produces the correct result.
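A small Python sketch shows why no explicit carry is needed there. It assumes the denormal branch amounts to rounding x divided by the denormal spacing 2^-24 and storing the result with a zero exponent field; the actual MATLAB code is not reproduced here, and pack_denormal is a hypothetical name.

```python
import struct

def pack_denormal(x):
    """Pack 0 <= x < 2**-14: the exponent field is zero, so the bits are just the fraction."""
    return round(x * 2.0 ** 24)    # denormal spacing is 2**-24; ties to even

def reference_bits(x):
    """Bit pattern produced by struct's IEEE 754 binary16 format."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

x = 2.0 ** -14 - 2.0 ** -26        # just below the smallest normal number
u = pack_denormal(x)
print(format(u, "016b"))                   # 0000010000000000
print(format(reference_bits(x), "016b"))   # 0000010000000000
```

The fraction rounds to 1024, whose eleventh bit lands in the lowest exponent bit. The pattern 0 00001 0000000000 is exactly 2^-14, the smallest normal number, which is the correctly rounded answer.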
A similar fix is required for the quarter-precision constructor, fp8.
Cleve's Laboratory
I am updating the code on the MATLAB Central File Exchange. Only @fp16/fp16 and @fp8/fp8 are affected. (Give me a few days to complete the update process.)
Thanks
Thanks to Pierre and Nick.