Are floating-point numbers fundamentally different from the sets of rational or real numbers in any way? Are there any good mathematical treatments of floating-point numbers? Are they even interesting to study mathematically?
How can one formalize and prove things about floating-point numbers, just as one can do with rational or real numbers?
1 Answer
They are fundamentally different in that (given any particular floating-point representation) there are only finitely many of them. And the basic arithmetic operations on floating-point numbers do not satisfy nearly as many nice laws as arithmetic on true reals or rationals do -- for example, $(a+b)+c = a+(b+c)$ is not true for floating point; neither is $(a+b)-b = a$.
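Both failures are easy to observe directly. Here is a minimal sketch in Python, using IEEE 754 double precision (Python's `float`); the specific values are illustrative choices, not the only ones that work:

```python
# Associativity failure with IEEE 754 doubles:
# (0.1 + 0.2) + 0.3 and 0.1 + (0.2 + 0.3) round differently.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
assert left != right

# (a + b) - b != a: b is so large that adding a is absorbed by rounding,
# because 2^64 + 1 is not representable in a 53-bit significand.
a, b = 1.0, 2.0**64
assert (a + b) - b == 0.0  # a was lost entirely, not recovered
```

The second example is exactly the phenomenon described in the comment below: once the exponent gap exceeds the precision, the smaller addend vanishes.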
Numerical analysis does carry out rigorous mathematical reasoning about floating-point arithmetic, in particular bounding how much error various formulas can introduce relative to true arithmetic on reals. Those analyses often (but not always) have a somewhat conservative character, overestimating the errors rather than going to the trouble of modeling them exactly.
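The conservative character of such bounds can be seen concretely. The sketch below compares the actual rounding error of naive left-to-right summation against the classical first-order worst-case bound of roughly $(n-1)u$ on the relative error, where $u$ is the unit roundoff; the choice of summing a million copies of 0.1 is an arbitrary illustration:

```python
import math
import sys

# Unit roundoff u for IEEE 754 double precision: eps/2 = 2^-53.
u = sys.float_info.epsilon / 2

xs = [0.1] * 10**6

# Naive left-to-right summation accumulates a rounding error at each step.
naive = 0.0
for x in xs:
    naive += x

# math.fsum returns the correctly rounded sum of the given doubles,
# so it serves as the "true" reference value here.
exact = math.fsum(xs)

actual_rel_err = abs(naive - exact) / exact

# Classical a priori bound for recursive summation of n terms:
# relative error <= (n - 1) * u, to first order.
n = len(xs)
bound = (n - 1) * u

print(f"actual relative error: {actual_rel_err:.3e}")
print(f"worst-case bound:      {bound:.3e}")
```

On a typical run the bound holds comfortably but overestimates the actual error by about an order of magnitude, which is the conservatism mentioned above.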
Also, some people have developed complete machine-readable formalizations of floating-point arithmetic, in the context of using them to create computer-verified proofs that specific microprocessor designs implement it correctly.
If you use a finite number of bits, you have the same problem as above: take $a=1$, $b=2^{64}$. Then $a+b=2^{64}$ in double precision (use a larger exponent if you have higher-precision numbers), so $(a+b)-b=0$, but $a=1\neq0$... – 2012-04-24