Java uses the IEEE 754 standard for floating-point representation. In this representation, floats are encoded using 1 sign bit, 8 exponent bits, and 23 mantissa bits. Doubles are encoded and used exactly the same way, except they use 1 sign bit, 11 exponent bits, and 52 mantissa bits. These bits encode the values of s, the sign; M, the significand; and E, the exponent. Floating-point numbers are then calculated as (-1)^s * M * 2^E.
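
As a rough illustration of this layout, the following sketch extracts the three bit fields of a float with Float.floatToIntBits. The class and method names are hypothetical; the field widths and the exponent bias of 127 come from the IEEE 754 single-precision format.

```java
// Illustrative sketch; FloatFields and its methods are hypothetical names.
class FloatFields {
    // Sign bit (bit 31): 0 for positive values, 1 for negative values.
    static int signBit(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Raw 8-bit exponent field (bits 30..23), still biased by 127.
    static int exponentField(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Raw 23-bit mantissa field (bits 22..0), without the implied leading bit.
    static int mantissaField(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = -6.5f; // -1.625 * 2^2
        System.out.println("sign     = " + signBit(f));
        System.out.println("exponent = " + (exponentField(f) - 127));
        System.out.println("mantissa = 0x" + Integer.toHexString(mantissaField(f)));
    }
}
```

For -6.5f the sign bit is 1, the unbiased exponent is 2, and the mantissa field holds the fractional bits of 1.625.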

Ordinarily, all of the mantissa bits are used to express significant figures, in addition to a leading 1, which is implied and, as a result, left out. Floats, consequently, have 24 significant bits of precision; doubles have 53 significant bits of precision. Such numbers are called normalized numbers. All floating-point numbers are limited in this sense because they have fixed precision.
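
One way to observe the 24-bit limit is to check which int values survive a conversion to float. The sketch below uses only language-level casts; FloatPrecision and fitsInFloat are illustrative names, not part of any library.

```java
// Illustrative sketch; FloatPrecision and fitsInFloat are hypothetical names.
class FloatPrecision {
    // True when the int value is exactly representable as a float.
    // The comparison is done in long so that a float equal to 2^31
    // is not saturated back to Integer.MAX_VALUE by the cast.
    static boolean fitsInFloat(int n) {
        return (long) (float) n == n;
    }

    public static void main(String[] args) {
        int limit = 1 << 24; // 2^24 needs exactly 24 significant bits or fewer
        System.out.println(fitsInFloat(limit));     // true
        System.out.println(fitsInFloat(limit + 1)); // false: needs 25 significant bits
    }
}
```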

When the value to be represented is too small to encode normally, it is encoded in denormalized form, indicated by an exponent value of Float.MIN_EXPONENT - 1 or Double.MIN_EXPONENT - 1. Denormalized floating-point numbers have an assumed 0 in the ones place and have zero or more leading zeros in the represented portion of their mantissa. These leading zero bits no longer function as significant bits of precision; consequently, the total precision of denormalized floating-point numbers is less than that of normalized floating-point numbers. Note that even using normalized numbers where precision is required can pose a risk. See rule "NUM04-J. Avoid using floating-point numbers when precise computation is required" for more information.
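
The exponent marker and the loss of precision can both be inspected directly. The following is a minimal sketch with hypothetical names (DenormalPrecision, hasDenormalExponent, significantBits); it relies on Math.getExponent, which is documented to return Float.MIN_EXPONENT - 1 for zero and for denormalized arguments.

```java
// Illustrative sketch; the class and method names are hypothetical.
class DenormalPrecision {
    // Zero shares the denormal exponent value, so it is excluded explicitly.
    static boolean hasDenormalExponent(float f) {
        return f != 0 && Math.getExponent(f) == Float.MIN_EXPONENT - 1;
    }

    // Number of significant bits carried by a finite float.
    // (NaN and infinities are not meaningful inputs; they report 24.)
    static int significantBits(float f) {
        int bits = Float.floatToIntBits(f);
        int exponentField = (bits >>> 23) & 0xFF;
        int mantissaField = bits & 0x7FFFFF;
        if (exponentField != 0) {
            return 24; // normalized: implied leading 1 plus 23 stored bits
        }
        // Denormalized (or zero): leading zeros of the mantissa carry no precision.
        return 32 - Integer.numberOfLeadingZeros(mantissaField);
    }

    public static void main(String[] args) {
        System.out.println(significantBits(1.0f));            // 24
        System.out.println(significantBits(0x1p-140f));       // 10
        System.out.println(significantBits(Float.MIN_VALUE)); // 1
    }
}
```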

Using denormalized numbers can severely impair the precision of floating-point calculations; as a result, denormalized numbers must not be used.

Detecting Denormalized Numbers

The following code tests whether a float value is denormalized in strictfp mode, or for platforms that lack extended range support. Testing for denormalized numbers in the presence of extended range support is platform dependent; see rule "NUM06-J. Use the strictfp modifier for floating point calculation consistency across platforms" for additional information.

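public static strictfp boolean isDenormalized(float val) {
  // Zero is not a denormalized number, although it lies inside the range tested below.
  if (val == 0) {
    return false;
  }
  // Any other value strictly between -MIN_NORMAL and MIN_NORMAL is denormalized.
  if ((val > -Float.MIN_NORMAL) && (val < Float.MIN_NORMAL)) {
    return true;
  }
  return false;
}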

Testing whether values of type double are denormalized is exactly analogous.
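
For completeness, a sketch of the double variant follows. The class name DenormalCheck is hypothetical. The strictfp modifier is omitted on the assumption of Java 17 or later, where all floating-point evaluation is strict; on older releases, add strictfp as in the float version.

```java
// Illustrative sketch; DenormalCheck is a hypothetical class name.
class DenormalCheck {
    // Assumes strict floating-point semantics (the default from Java 17 onward).
    static boolean isDenormalized(double val) {
        if (val == 0) {
            return false; // zero is not denormalized
        }
        // NaN fails this comparison, so it is reported as not denormalized.
        return Math.abs(val) < Double.MIN_NORMAL;
    }

    public static void main(String[] args) {
        System.out.println(isDenormalized(0x1p-1020)); // false
        System.out.println(isDenormalized(0x1p-1050)); // true
    }
}
```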

Print Representation of Denormalized Numbers

Denormalized numbers can also be troublesome because their printed representation is unusual. Floats and normalized doubles, when formatted with the %a specifier, begin with a leading nonzero digit; this holds even for denormalized floats, because a float is widened to double for formatting and every denormalized float is a normalized double. Denormalized doubles can begin with a leading zero to the left of the decimal point in the mantissa.

The following program, shown with its output, prints normalized and denormalized values using the %e and %a specifiers:

class FloatingPointFormats {
    public static void main(String[] args) {
        float x = 0x1p-125f;
        double y = 0x1p-1020;
        System.out.format("normalized float with %%e    : %e\n", x);
        System.out.format("normalized float with %%a    : %a\n", x);
        x = 0x1p-140f;
        System.out.format("denormalized float with %%e  : %e\n", x);
        System.out.format("denormalized float with %%a  : %a\n", x);
        System.out.format("normalized double with %%e   : %e\n", y);
        System.out.format("normalized double with %%a   : %a\n", y);
        y = 0x1p-1050;
        System.out.format("denormalized double with %%e : %e\n", y);
        System.out.format("denormalized double with %%a : %a\n", y);
    }
}

normalized float with %e    : 2.350989e-38
normalized float with %a    : 0x1.0p-125
denormalized float with %e  : 7.174648e-43
denormalized float with %a  : 0x1.0p-140
normalized double with %e   : 8.900295e-308
normalized double with %a   : 0x1.0p-1020
denormalized double with %e : 8.289046e-317
denormalized double with %a : 0x0.0000001p-1022

Noncompliant Code Example

This code attempts to reduce a floating-point number to a denormalized value and then restore the value.

float x = 1/3.0f;
System.out.println("Original      : " + x);
x = x * 7e-45f;
System.out.println("Denormalized? : " + x);
x = x / 7e-45f;
System.out.println("Restored      : " + x);

This operation is imprecise because the denormalized intermediate value retains only 2 significant bits. The code produces the following output:

Original      : 0.33333334
Denormalized? : 2.8E-45
Restored      : 0.4
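
The result 0.4 can be traced step by step. The literal 7e-45f is itself denormalized and rounds to 5 * Float.MIN_VALUE; the product rounds to 2 * Float.MIN_VALUE; and dividing the two yields 2/5. The sketch below checks each step. DenormalRoundTrip and scaleAndRestore are hypothetical names, and strict floating-point evaluation (the default from Java 17 onward) is assumed.

```java
// Illustrative sketch; DenormalRoundTrip and scaleAndRestore are hypothetical names.
class DenormalRoundTrip {
    // Multiplies by a tiny factor and divides it back out again.
    static float scaleAndRestore(float x, float tiny) {
        float scaled = x * tiny; // may be rounded to a denormalized value
        return scaled / tiny;
    }

    public static void main(String[] args) {
        float third = 1 / 3.0f;
        // The scale factor is stored as 5 units of the smallest denormal.
        System.out.println(7e-45f == 5 * Float.MIN_VALUE);         // true
        // The product keeps only 2 units, discarding most of the mantissa.
        System.out.println(third * 7e-45f == 2 * Float.MIN_VALUE); // true
        // 2 units divided by 5 units gives 0.4 rather than one third.
        System.out.println(scaleAndRestore(third, 7e-45f));        // 0.4
    }
}
```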

Compliant Solution

Do not write code that could produce denormalized numbers. When calculations using float produce denormalized numbers, use of double can provide sufficient precision.

double x = 1/3.0;
System.out.println("Original      : " + x);
x = x * 7e-45;
System.out.println("Denormalized? : " + x);
x = x / 7e-45;
System.out.println("Restored      : " + x);

This code produces the following output:

Original      : 0.3333333333333333
Denormalized? : 2.333333333333333E-45
Restored      : 0.3333333333333333
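
Where it is unclear whether a calculation stays in the normalized range, one possible approach is to check intermediate results and fail fast. This is an assumption of the sketch below, not something the rule prescribes; DenormalGuard and requireNotDenormalized are hypothetical names, and the choice of ArithmeticException is arbitrary.

```java
// Illustrative sketch; DenormalGuard and requireNotDenormalized are hypothetical names.
class DenormalGuard {
    // Returns val unchanged, or throws if val is a denormalized float.
    static float requireNotDenormalized(float val) {
        if (val != 0 && Math.abs(val) < Float.MIN_NORMAL) {
            throw new ArithmeticException("denormalized result: " + val);
        }
        return val;
    }

    public static void main(String[] args) {
        float x = requireNotDenormalized(1 / 3.0f); // passes through unchanged
        try {
            x = requireNotDenormalized(x * 7e-45f); // product is denormalized
        } catch (ArithmeticException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```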

Exceptions

NUM05-EX0: Denormalized numbers are acceptable when competent numerical analysis demonstrates that the computed values will meet all accuracy and behavioral requirements that are appropriate to the application. Note that "competent numerical analysis" generally requires a specialized professional numerical analyst; lesser levels of rigor fail to qualify for this exception.

Risk Assessment

Floating-point numbers are an approximation; denormalized floating-point numbers are a less precise approximation. Use of denormalized numbers can cause unexpected loss of precision, possibly leading to incorrect or unexpected results. Although the severity stated below for violations of this rule is low, applications that require accurate results should consider the severity of this violation to be high.

Rule      Severity   Likelihood   Remediation Cost   Priority   Level
NUM05-J   low        probable     high               P2         L3

Related Guidelines

Bibliography

[IEEE 754 2006]

[Bryant 2003] Computer Systems: A Programmer's Perspective. Section 2.4 Floating Point

