To put this into context for readers who may not be familiar with FP32, FP64, etc.:
You may get the (mis)impression from this discussion that the FP64 data type corresponds to ultra-high-precision numbers needed only for the most abstruse or specialized types of calculations, and that the ability to do computations natively with such precision is found only in specialized data center*/workstation hardware.
That's actually not the case at all. In fact, FP64 (equivalent to a "double" in C/C++) is the default capability of most bog-standard CPUs, and translates to only about 16 digits of decimal precision. [FP64 has 64 bits, with a 53-bit "significand" (52 bits stored explicitly plus one implicit leading bit), which is the number in front of the exponent (e.g., "1.2345..." in "1.2345... x 10^-12"). And 53 binary digits is equivalent to 53 x ln(2)/ln(10) = 15.954... ≈ 16 decimal digits.]
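For anyone who wants to check these numbers on their own machine, here is a small Python sketch. It assumes the Python float is an IEEE 754 FP64, which is true on essentially all mainstream platforms; the variable names are just for illustration.

```python
import math
import sys

# Significand width of the platform's float type (53 for IEEE 754 FP64).
bits = sys.float_info.mant_dig

# Equivalent number of decimal digits: bits * log10(2).
digits = bits * math.log10(2)

# 2**-53 is half the spacing between 1.0 and the next FP64 value,
# so adding it to 1.0 is rounded away entirely.
absorbed = (1.0 + 2.0**-53 == 1.0)

# A familiar symptom of the finite significand.
mismatch = (0.1 + 0.2 != 0.3)

print(bits, digits, absorbed, mismatch)
```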
[I've read that some Intel CPUs also offer FP80, which provides 64 bits for the significand (what "long double" maps to in C/C++ with some compilers), but I'm not personally familiar with this.]
So it's not that FP64 is particularly high in precision, but rather that it's high relative to what's typical for GPUs, which are specialized to do many lower-precision calculations very fast.
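To make the FP32 vs. FP64 gap concrete, here is a rough Python sketch that rounds an FP64 value to FP32 using the standard struct module. The helper to_fp32 is my own illustrative name, not a library function. FP32 has a 24-bit significand (about 7 decimal digits), so it cannot even represent every integer above 2^24.

```python
import struct


def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value, returned as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 0.1 is not exactly representable in either format, but FP32 is much coarser.
tenth32 = to_fp32(0.1)

# 2**24 + 1 is exact in FP64 but falls between two adjacent FP32 values.
big32 = to_fp32(16777217.0)

print(tenth32, abs(tenth32 - 0.1))
print(big32)
```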
Indeed, when using Mathematica, I sometimes run into problems that will give unacceptable errors when run using FP64 (what Mathematica calls "Machine Precision") calculations. Fortunately, with Mathematica, you have a choice of:
(a) emulating whatever precision you please in all intermediate calculations (up to practical run-time limits); if you do this Mathematica will actually track the precision loss as it does the calculation and give you its estimate of the precision of the final number;
(b) (for some numeric functions) setting a precision goal for the final value, in which case Mathematica will use whatever precision emulation it needs to achieve this (unless it needs more than you've allotted it, in which case it will say you need to give it more);
(c) (with some exceptions) doing calculations with exact, symbolic values (my default preference), so that there is no precision loss (you can then convert the final result to whatever precision you like).
The tradeoff is that using high-precision emulation is much slower than just running with Machine Precision, i.e., "on the hardware". [Though for certain operations the symbolic approach can be faster than working in Machine Precision.]
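If you don't have Mathematica, Python's standard library offers rough analogues of (a) and (c), sketched below. This is only an approximation of the idea: the decimal module lets you pick a working precision but does not track precision loss the way Mathematica does, and I'm not aware of a standard-library equivalent of (b).

```python
from decimal import Decimal, localcontext
from fractions import Fraction

# (a)-style: pick a working precision (here 50 significant digits).
with localcontext() as ctx:
    ctx.prec = 50
    root2 = Decimal(2).sqrt()

# (c)-style: exact rational arithmetic; convert only at the end.
exact_total = Fraction(0)
for _ in range(10):
    exact_total += Fraction(1, 10)

# The same sum in hardware FP64 picks up rounding error at each step.
fp64_total = 0.0
for _ in range(10):
    fp64_total += 0.1

print(root2)
print(exact_total, fp64_total)
```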
Here's a fun pathological example known as Rump's function, after its designer, Siegfried Rump:
f(a,b) = 333.75 * b^6 + a^2 * (11 * a^2 * b^2 – b^6 – 121 * b^4 – 2 ) + 5.5 * b^8 + a/(2b)
Compute f(a,b) where a = 77617 and b = 33096
We need to carry 44 decimal digits (nearly three times the 16 digits offered by FP64) through the calculation to obtain an answer good to only six digits:
Exact result: –(54767/66192)
Exact result numericized to six decimal digits: –0.827396
Machine Precision (FP64) result: +1.18059 * 10^21 (!!)
Result with 40 decimal digit emulation (2.5 times the digits of FP64): –0.827 (Mathematica is only able to provide the result to three digits)
Result with 44 decimal digit emulation (minimum needed to give correct result to six figures): –0.827396
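Here is a Python sketch of the same experiment for anyone who wants to try it without Mathematica. The names rump and num are my own, and the details will differ from the Mathematica numbers above: the exact FP64 garbage value can depend on the platform and evaluation order, and Python's decimal module rounds each operation to a fixed number of digits rather than tracking precision, so the digit counts at which it succeeds need not match Mathematica's.

```python
from decimal import Decimal, localcontext
from fractions import Fraction


def rump(a, b, num):
    """Rump's function; `num` builds the constants 333.75 and 5.5 in the working type."""
    return (num("333.75") * b**6
            + a**2 * (11 * a**2 * b**2 - b**6 - 121 * b**4 - 2)
            + num("5.5") * b**8
            + a / (2 * b))


# Hardware FP64: every operation is rounded to a 53-bit significand.
fp64 = rump(77617.0, 33096.0, float)

# Exact rational arithmetic: no rounding anywhere.
exact = rump(Fraction(77617), Fraction(33096), Fraction)

# Software decimal arithmetic at 50 significant digits.
with localcontext() as ctx:
    ctx.prec = 50
    dec50 = rump(Decimal(77617), Decimal(33096), Decimal)

print("FP64:     ", fp64)
print("exact:    ", exact, "=", float(exact))
print("50 digits:", dec50)
```

The large terms are of order 10^36 and cancel almost completely, which is why the FP64 rounding errors end up swamping the true answer.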
Here's a neat real-world (and non-pathological) example of an error caused by insufficient numerical precision:
*I've read the IBM POWER9 CPU is capable of native FP128, which has a 113-bit significand (112 bits stored explicitly plus one implicit leading bit), equivalent to 113 x ln(2)/ln(10) = 34.016... ≈ 34 decimal digits. That, I think, is properly considered high precision. But I don't know how often that is used in practice.