IEEE Floating Point

Binary formats such as two's complement provide an excellent way to represent integer values. However, they cannot represent values with a fractional part, such as 0.5 or 2.75. The IEEE 754 standard for floating point numbers was created to solve this problem. It is the most commonly used binary format for representing such values. Conceptually, this representation lets us express values using a base 2 version of scientific notation.
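To make the idea of base 2 scientific notation concrete, here is a minimal Python sketch. The helper name base2_scientific is hypothetical and is introduced only for illustration. It rewrites a finite, nonzero number as m × 2^e with 1 ≤ |m| < 2, using the standard library function math.frexp.

```python
import math

def base2_scientific(x):
    """Return (m, e) such that x == m * 2**e and 1 <= |m| < 2.

    Assumes x is finite and nonzero.
    """
    m, e = math.frexp(x)   # frexp returns m with 0.5 <= |m| < 1
    return m * 2, e - 1    # rescale so that 1 <= |m| < 2

m, e = base2_scientific(5.75)
print(m, e)                # 1.4375 2, i.e. 1.0111 (binary) x 2**2
```

In decimal scientific notation 5.75 is 5.75 × 10^0. In base 2 it is 1.0111 × 2^2, because 1.0111 in binary equals 1 + 1/4 + 1/8 + 1/16 = 1.4375, and 1.4375 × 4 = 5.75.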

This binary format for floating point values is notably more complex than two's complement. There are four categories of floating point values. They are summarized in the table below.

Value Type      Description
Normalized      Most numbers are in this category
Denormalized    Values close to and including 0
Infinity        This category has two values: +∞ and -∞
NaN             There is a special value called "Not A Number" that represents the absence of a valid number (e.g. the result of zero divided by zero)

The binary format for floating point values contains three fields: the sign, the exponent, and the fraction. The values of the exponent and fraction fields determine which of the four categories a value belongs to. You can use the interactive visualization, variables table, and decoding flowchart below to see how floating point values are encoded.
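The following Python sketch shows one way to pull the three fields out of a value and use them to pick a category. It assumes the 32-bit single precision layout (1 sign bit, 8 exponent bits, 23 fraction bits), which the text above does not specify. The names split_fields and category are hypothetical. The classification follows the usual IEEE 754 rules: an exponent field of all ones means infinity (fraction of zero) or NaN (nonzero fraction), an exponent field of all zeros means denormalized, and anything else is normalized.

```python
import struct

# Assumption: single precision layout (1 sign, 8 exponent, 23 fraction bits).
EXP_BITS = 8
FRAC_BITS = 23

def split_fields(x):
    """Return the (sign, exponent, fraction) fields of x as unsigned integers."""
    # Pack x as a 32-bit float, then reinterpret the same bytes as an integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> (EXP_BITS + FRAC_BITS)
    exponent = (bits >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = bits & ((1 << FRAC_BITS) - 1)
    return sign, exponent, fraction

def category(x):
    """Name the category of x based on its exponent and fraction fields."""
    _, exponent, fraction = split_fields(x)
    if exponent == (1 << EXP_BITS) - 1:      # exponent field is all ones
        return "Infinity" if fraction == 0 else "NaN"
    if exponent == 0:                        # exponent field is all zeros
        return "Denormalized"
    return "Normalized"

print(split_fields(1.0))         # (0, 127, 0)
print(category(1.0))             # Normalized
print(category(0.0))             # Denormalized
print(category(float("inf")))    # Infinity
```

Note that the sign field plays no part in choosing the category. It only selects between the positive and negative version of a value.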

Variables

Term    Description
s       The value of the sign bit as an unsigned integer (0 or 1)
S       The sign value
        S = (-1)^s, so S is +1 if s = 0 and -1 if s = 1
bias    The value subtracted from e to calculate E
        It is equal to 2^(number of exponent bits - 1) - 1
e       The value of the exponent field as an unsigned integer
E       The value of the exponent after the bias has been removed
        E = e - bias, if the value is normalized
        E = 1 - bias, if the value is denormalized
f       The value of the fraction
        Reading from left to right, the bits in the fraction field represent 1/2, 1/4, 1/8, etc.
M       The value of the significand (aka the mantissa)
        M = 1 + f, if the value is normalized
        M = 0 + f, if the value is denormalized
V       The value of the floating point number
        V = S × M × 2^E, if the value is normalized or denormalized
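As a rough illustration of how these variables fit together, the Python sketch below decodes a raw bit pattern into its value. The function name decode and its default field widths (8 exponent bits and 23 fraction bits, as in 32-bit single precision) are assumptions made for this example. It uses the standard relationships S = (-1)^s and V = S × M × 2^E, and it treats an exponent field of all ones as infinity or NaN.

```python
def decode(bits, exp_bits=8, frac_bits=23):
    """Decode an unsigned integer bit pattern into a floating point value."""
    s = (bits >> (exp_bits + frac_bits)) & 1              # sign bit
    e = (bits >> frac_bits) & ((1 << exp_bits) - 1)       # exponent field
    # Fraction field read as 1/2, 1/4, 1/8, ... from left to right.
    f = (bits & ((1 << frac_bits) - 1)) / (1 << frac_bits)

    bias = 2 ** (exp_bits - 1) - 1
    S = (-1) ** s

    if e == (1 << exp_bits) - 1:          # all ones: infinity or NaN
        return S * float("inf") if f == 0 else float("nan")
    if e == 0:                            # all zeros: denormalized
        E, M = 1 - bias, 0 + f
    else:                                 # anything else: normalized
        E, M = e - bias, 1 + f
    return S * M * 2.0 ** E

print(decode(0x3F800000))   # 1.0
print(decode(0xC0200000))   # -2.5
```

For 0xC0200000 the fields are s = 1, e = 128, and f = 0.25, so S = -1, E = 128 - 127 = 1, M = 1.25, and V = -1 × 1.25 × 2^1 = -2.5.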

Decoding Flowchart