I just came across this in my work and find it rather baffling:
The format and arithmetic of floating point numbers is
described by several standards. For instance, IEEE 754 is a
widely used standard for the case b=2. It considers, amongst
other things,
single-precision numbers:
- 32 bits: 1 bit sign, 8 bits exponent, 23 bits mantissa, bias 127
- range: approx. ± 3.403 × 10^38
- numbers closest to 0: approx. ± 1.175 × 10^-38
double-precision numbers:
- 64 bits: 1 bit sign, 11 bits exponent, 52 bits mantissa, bias 1023
- range: approx. ± 1.798 × 10^308
- numbers closest to 0: approx. ± 2.225 × 10^-308
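To make the bit layouts above more concrete for myself, I tried unpacking the fields with Python's struct module (this is my own little experiment, not part of the notes, and the helper names are just my labels):

```python
import struct

def fields32(x):
    """Return (sign, exponent, mantissa) of x stored as an IEEE 754 single."""
    (n,) = struct.unpack(">I", struct.pack(">f", x))
    return n >> 31, (n >> 23) & 0xFF, n & ((1 << 23) - 1)   # 1 + 8 + 23 bits

def fields64(x):
    """Return (sign, exponent, mantissa) of x stored as an IEEE 754 double."""
    (n,) = struct.unpack(">Q", struct.pack(">d", x))
    return n >> 63, (n >> 52) & 0x7FF, n & ((1 << 52) - 1)  # 1 + 11 + 52 bits

print(fields32(12.25))   # (0, 130, 4456448): exponent field 130 = 3 + bias 127
print(fields64(12.25))   # (0, 1026, 2392537302040576): 1026 = 3 + bias 1023
```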
I'm not too sure how to interpret all of this, so I'll start with a basic question:
What is the main functional difference between single and double precision numbers?
Looking earlier in my notes, I found the following:
The representation of real numbers in computers is based on
the following principle:
Any r ∈ ℝ can be approximated by a floating point number
r ≈ s × m × b^e,
where
- s ∈ {1,-1} is the sign,
- m ∈ ℝ is the mantissa,
- b ∈ ℕ is the base and
- e ∈ ℤ is the exponent.
Example: 12.25 = 1 × 0.0001225 × 10^5 [ = 1 × 122.5 × 10^-1]
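To see how this general form looks for base b = 2, I also tried math.frexp, which splits a float into a mantissa and an exponent in exactly that shape (again my own experiment, not from the notes; the names s, m, e just mirror the notes):

```python
import math

r = 12.25
m, e = math.frexp(r)          # m = 0.765625, e = 4, with 0.5 <= |m| < 1
s = 1 if m >= 0 else -1
print(s, abs(m), e)           # 1 0.765625 4
print(s * abs(m) * 2**e)      # 12.25, i.e. r = s × m × b^e with b = 2
```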
Is there any relation between the two concepts?