Friday, December 16, 2005

Java Floating point

Good link on Java floating point: http://people.uncw.edu/tompkinsj/133/Numbers/Reals.htm

IEEE 754:
The IEEE 754 standard uses the 1-plus form of the normalized binary fraction (rounded). The fraction part is called the mantissa.
In 1-plus normalized scientific notation, base two, a number is written as

N = ± (1.b1b2b3b4...)_2 x 2^E

The leading 1 is understood to be there and is not recorded.
The Java primitive data type float is 4 bytes, or 32 bits: 1 sign bit, 8 exponent bits, and 23 mantissa bits. The fields are:

Sign: 0 -> positive, 1 -> negative.
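A quick way to see the sign bit from Java is sketched below. The class and method names (SignBit, signBit) are made up for illustration; Float.floatToIntBits is the standard library call that exposes the 32-bit pattern.

```java
final class SignBit {
    // Returns the sign bit of a float: 0 for positive, 1 for negative.
    static int signBit(float x) {
        // floatToIntBits returns the IEEE 754 bit pattern; the sign is bit 31.
        return Float.floatToIntBits(x) >>> 31;
    }

    public static void main(String[] args) {
        System.out.println(signBit(6.5f));   // 0
        System.out.println(signBit(-6.5f));  // 1
        System.out.println(signBit(-0.0f));  // 1, negative zero keeps its sign bit
    }
}
```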

Exponent: excess-127 format for float, excess-1023 format for double.

* Float: Emin = -126, Emax = 127
* Double: Emin = -1022, Emax = 1023
* Consider an 8-bit number, which has a range of 0 - 255. We use the formula excess - 127 = E to assign the value of our exponent, with excess = 127 representing E = 0.
* In this manner, excess = 120 is an exponent of -7 since 120 - 127 = -7, and excess = 155 is the exponent +28 since 155 - 127 = +28. The excess values 0 (which would be Emin - 1) and 255 (which would be Emax + 1) have special meanings: 0 encodes zero and denormalized numbers, and 255 encodes infinity and NaN.
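The excess-127 arithmetic above can be checked with a few lines of Java. This is only a sketch for float; the names ExcessExponent, excess and exponent are invented here, while Float.floatToIntBits, Math.getExponent, Float.MIN_EXPONENT and Float.MAX_EXPONENT are standard library members. The hex literals 0x1p-7f and 0x1p28f are just 2^-7 and 2^28 written exactly.

```java
final class ExcessExponent {
    static final int BIAS = 127;

    // The stored 8-bit exponent field (the "excess" value), bits 30..23.
    static int excess(float x) {
        return (Float.floatToIntBits(x) >>> 23) & 0xFF;
    }

    // True exponent E = excess - 127. Only meaningful for normalized
    // numbers, i.e. when the excess value is between 1 and 254.
    static int exponent(float x) {
        return excess(x) - BIAS;
    }

    public static void main(String[] args) {
        System.out.println(excess(1.0f));        // 127, so E = 0
        System.out.println(excess(0x1p-7f));     // 120
        System.out.println(exponent(0x1p-7f));   // -7
        System.out.println(excess(0x1p28f));     // 155
        System.out.println(exponent(0x1p28f));   // 28

        // The two reserved excess values.
        System.out.println(excess(0.0f));                     // 0
        System.out.println(excess(Float.POSITIVE_INFINITY));  // 255

        // The library exposes the same limits and the same decoding.
        System.out.println(Float.MIN_EXPONENT);         // -126
        System.out.println(Float.MAX_EXPONENT);         // 127
        System.out.println(Math.getExponent(0x1p-7f));  // -7
    }
}
```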

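Mantissa: normalized 1-plus fraction with the 1 to the left of the radix point not recorded, float: b1b2b3b4…b23, double: b1b2b3b4…b52. This value is rounded using the bits that are not recorded, starting with b24 for float and b53 for double. As a first approximation, a 1 in that bit increments the least significant recorded bit. Strictly, the IEEE 754 default mode is round to nearest: if the discarded bits are exactly a 1 followed by zeros (a tie), the result is the neighbour whose least significant bit is 0.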
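To tie the three fields together, here is a small sketch that pulls out the 23 mantissa bits of a float, rebuilds the value from sign, excess and mantissa, and then shows rounding when a double is narrowed to a float. MantissaBits, mantissa and rebuild are hypothetical names, and rebuild only handles normalized numbers. The rounding cases rely on Java's double-to-float cast using round to nearest, with ties going to the even mantissa.

```java
final class MantissaBits {
    // Low 23 bits of the pattern: b1..b23. The hidden leading 1 is not stored.
    static int mantissa(float x) {
        return Float.floatToIntBits(x) & 0x7FFFFF;
    }

    // Rebuilds a normalized float as (-1)^s * (1 + m / 2^23) * 2^(excess - 127).
    static double rebuild(float x) {
        int bits = Float.floatToIntBits(x);
        int sign = bits >>> 31;
        int excess = (bits >>> 23) & 0xFF;
        int m = bits & 0x7FFFFF;
        double fraction = 1.0 + m / (double) (1 << 23);  // restore the hidden 1
        double value = fraction * Math.pow(2, excess - 127);
        return sign == 0 ? value : -value;
    }

    public static void main(String[] args) {
        // 6.5 = (1.101)_2 x 2^2, so b1b2b3 = 101 and the rest are zeros.
        System.out.println(Integer.toHexString(mantissa(6.5f)));  // 500000
        System.out.println(rebuild(-6.5f));                       // -6.5

        // b24 is 1 and a later bit is also 1: more than halfway, rounds up.
        double aboveHalf = 1.0 + 0x1p-24 + 0x1p-30;
        System.out.println((float) aboveHalf == Math.nextUp(1.0f));  // true

        // b24 is 1 and nothing follows: an exact tie, rounds to the even
        // neighbour, which here is 1.0f (mantissa all zeros).
        double tie = 1.0 + 0x1p-24;
        System.out.println((float) tie == 1.0f);  // true
    }
}
```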
