Java uses the IEEE 754 standard for floating point representation. In this representation, floats are encoded using 1 sign bit, 8 exponent bits, and 23 mantissa bits. Doubles are encoded and used exactly the same way, except they use 1 sign bit, 11 exponent bits, and 52 mantissa bits. These bits encode the values of s, the sign; M, the significand; and E, the exponent. Floating point numbers are then calculated as (-1)^s * M * 2^E.
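
The field layout described above can be inspected directly. The following is a minimal sketch, not part of the original guideline, that uses Float.floatToIntBits to split a float into its three fields. The class and method names (FloatFields, sign, biasedExponent, mantissa) are illustrative assumptions; the exponent field is stored with a bias of 127 for floats.

```java
class FloatFields {
    // Sign bit: bit 31 of the raw IEEE 754 encoding.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Biased exponent field: bits 30..23 (8 bits).
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Stored mantissa field: bits 22..0 (23 bits; the implied leading 1 is not stored).
    static int mantissa(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = -6.5f; // -1.625 * 2^2
        System.out.println("sign            = " + sign(f));
        System.out.println("biased exponent = " + biasedExponent(f));
        System.out.println("unbiased E      = " + (biasedExponent(f) - 127));
        System.out.println("mantissa bits   = 0x" + Integer.toHexString(mantissa(f)));
    }
}
```

For -6.5f the sign is 1, the biased exponent is 129 (so E is 2), and the stored mantissa is 0x500000, which is the fraction .625 of the significand 1.625.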

Ordinarily all of the mantissa bits are used to express significant figures, in addition to a leading 1, which is implied and, therefore, left out. Thus, floats ordinarily have 24 significant bits of precision, and doubles ordinarily have 53 significant bits of precision. Such numbers are called normalized numbers. All floating point numbers are limited in the sense that they have fixed precision.
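
The fixed precision described above can be observed at the point where consecutive integers stop being representable. The sketch below is an illustration only; the class name FloatPrecision and the helper canAddOne are hypothetical. It checks whether adding 1 changes a value near 2^24 (float) and 2^53 (double).

```java
class FloatPrecision {
    // Returns true if adding 1 to f yields a different float value.
    static boolean canAddOne(float f) {
        return f + 1.0f != f;
    }

    // Returns true if adding 1 to d yields a different double value.
    static boolean canAddOne(double d) {
        return d + 1.0 != d;
    }

    public static void main(String[] args) {
        float floatLimit = 0x1p24f;   // 2^24: one more bit than a float significand holds
        double doubleLimit = 0x1p53;  // 2^53: one more bit than a double significand holds
        System.out.println("float  2^24 - 1 : " + canAddOne(floatLimit - 1.0f));
        System.out.println("float  2^24     : " + canAddOne(floatLimit));
        System.out.println("double 2^53 - 1 : " + canAddOne(doubleLimit - 1.0));
        System.out.println("double 2^53     : " + canAddOne(doubleLimit));
    }
}
```

Below the limit the addition is exact; at the limit the sum needs one more significant bit than is available and rounds back to the original value.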

Mantissa bits are also used to express extremely small numbers that are too small to encode normally because no smaller exponent is available. Using mantissa bits in this way extends the possible range of exponents. Because these bits no longer function as significant bits of precision, the total precision of extremely small numbers is less than usual. Such numbers are called denormalized, and they are more limited than normalized numbers. However, even using normalized numbers where precision is required can pose a risk. See recommendation "NUM07-J. Avoid using floating point numbers when precise computation is needed" for more information.
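
One way to see the loss of precision is to compare the gap to the next representable value (Math.ulp) with the value itself. The following sketch is an illustration under that assumption; the class name SubnormalPrecision and the helper relativeSpacing are hypothetical names, not part of the guideline.

```java
class SubnormalPrecision {
    // Gap between f and the next representable float, relative to the magnitude of f.
    static double relativeSpacing(float f) {
        return (double) Math.ulp(f) / Math.abs(f);
    }

    public static void main(String[] args) {
        float normal = 0x1p-125f;    // normalized: full 24-bit significand
        float denormal = 0x1p-140f;  // denormalized: only 10 significant bits remain
        System.out.println("normalized   : " + relativeSpacing(normal));   // 2^-23
        System.out.println("denormalized : " + relativeSpacing(denormal)); // 2^-9
    }
}
```

The relative spacing grows from 2^-23 for the normalized value to 2^-9 for the denormalized one, so the denormalized value carries far fewer significant bits.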

Denormalized numbers can severely impair the precision of floating point numbers and should not be used.

Print Representation of Denormalized Numbers

Denormalized numbers can also be troublesome because their printed representation is unusual. Floats and normalized doubles, when formatted with the %a specifier, begin with a leading nonzero digit. Denormalized doubles can begin with a leading zero to the left of the decimal point in the mantissa.

The following program prints normalized and denormalized floats and doubles using both the %e and %a specifiers:

class FloatingPointFormats {
    public static void main(String[] args) {
        float x = 0x1p-125f;
        double y = 0x1p-1020;
        System.out.format("normalized float with %%e    : %e\n", x);
        System.out.format("normalized float with %%a    : %a\n", x);
        x = 0x1p-140f;
        System.out.format("denormalized float with %%e  : %e\n", x);
        System.out.format("denormalized float with %%a  : %a\n", x);
        System.out.format("normalized double with %%e   : %e\n", y);
        System.out.format("normalized double with %%a   : %a\n", y);
        y = 0x1p-1050;
        System.out.format("denormalized double with %%e : %e\n", y);
        System.out.format("denormalized double with %%a : %a\n", y);
    }
}

The program produces the following output:

normalized float with %e    : 2.350989e-38
normalized float with %a    : 0x1.0p-125
denormalized float with %e  : 7.174648e-43
denormalized float with %a  : 0x1.0p-140
normalized double with %e   : 8.900295e-308
normalized double with %a   : 0x1.0p-1020
denormalized double with %e : 8.289046e-317
denormalized double with %a : 0x0.0000001p-1022
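
In the output above, the denormalized float does not show a leading zero under %a, which is consistent with the float being widened to double before formatting. As a hedged sketch of an alternative, Float.toHexString formats the value in the float's own exponent range, so the denormalized form is visible. The class name SubnormalHex and its helper methods are illustrative assumptions.

```java
class SubnormalHex {
    // Formats with the %a conversion; a float argument is printed as the equivalent double.
    static String viaFormat(float f) {
        return String.format("%a", f);
    }

    // Formats with Float.toHexString, which keeps the float exponent range.
    static String viaToHexString(float f) {
        return Float.toHexString(f);
    }

    public static void main(String[] args) {
        float x = 0x1p-140f; // denormalized as a float, normalized as a double
        System.out.println("%a                : " + viaFormat(x));
        System.out.println("Float.toHexString : " + viaToHexString(x));
    }
}
```

Code that needs to recognize denormalized floats from their printed form should therefore not rely on %a alone.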

Noncompliant Code Example

This code attempts to reduce a floating point number to a denormalized value and then restore the value.

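class DenormalizedFloat {
    public static void main(String[] args) {
        float x = 1/3.0f;
        System.out.println("Original      : " + x);
        x = x * 7e-45f;  // product falls below Float.MIN_NORMAL
        System.out.println("Denormalized? : " + x);
        x = x / 7e-45f;  // the precision lost above cannot be recovered
        System.out.println("Restored      : " + x);
    }
}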

This operation is very imprecise. The code produces the following output:

Original      : 0.33333334
Denormalized? : 2.8E-45
Restored      : 0.4

Compliant Solution

Do not use code that could produce denormalized numbers. If calculations using float are producing denormalized numbers, use double instead.

class NormalizedDouble {
    public static void main(String[] args) {
        double x = 1/3.0;
        System.out.println("Original      : " + x);
        x = x * 7e-45;  // product stays well above Double.MIN_NORMAL
        System.out.println("Denormalized? : " + x);
        x = x / 7e-45;
        System.out.println("Restored      : " + x);
    }
}

This code produces the following output:

Original      : 0.3333333333333333
Denormalized? : 2.333333333333333E-45
Restored      : 0.3333333333333333
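
To confirm whether an intermediate result has become denormalized, its magnitude can be compared against Float.MIN_NORMAL or Double.MIN_NORMAL. The following is a minimal sketch of such a check; the class name SubnormalCheck and the method isDenormalized are hypothetical and are not part of any standard API.

```java
class SubnormalCheck {
    // True if f is nonzero and smaller in magnitude than the smallest normalized float.
    static boolean isDenormalized(float f) {
        return f != 0.0f && Math.abs(f) < Float.MIN_NORMAL;
    }

    // True if d is nonzero and smaller in magnitude than the smallest normalized double.
    static boolean isDenormalized(double d) {
        return d != 0.0 && Math.abs(d) < Double.MIN_NORMAL;
    }

    public static void main(String[] args) {
        float f = (1/3.0f) * 7e-45f;  // the float product from the noncompliant example
        double d = (1/3.0) * 7e-45;   // the double product from the compliant solution
        System.out.println("float product denormalized?  " + isDenormalized(f));
        System.out.println("double product denormalized? " + isDenormalized(d));
    }
}
```

The float product is reported as denormalized and the double product is not. Zero and NaN are both reported as not denormalized by this sketch.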

Risk Assessment

Floating point numbers are an approximation; denormalized (subnormal) floating point numbers are a worse approximation.

Rule     Severity  Likelihood  Remediation Cost  Priority  Level
NUM08-J  low       probable    high              P2        L3

Related Guidelines

CERT C Secure Coding Standard FLP05-C. Don't use denormalized numbers

Bibliography

[IEEE 754]
[Bryant 2003] Computer Systems: A Programmer's Perspective. Section 2.4, Floating Point

