In-Depth Analysis of Floating Point Numbers in C: Understanding Overflow and Underflow

Today, we will discuss a concept that is very important yet often overlooked when working with floating point numbers (float) in C: overflow and underflow.

Do you remember the integer overflow phenomenon we encountered when discussing integer types? When a value exceeds the range that its type can express, an unsigned integer wraps around, while signed overflow is undefined behavior in C and in practice often shows up as an unexpected negative number. Floating point numbers exhibit similar “behavior,” but their manifestations and terminology are entirely different from integers. Understanding this is beneficial for your future studies in computer architecture, operating systems, and even graduate school interviews!

Why are we discussing this today?

Because in practical programming, especially in scenarios such as scientific computing and graphics processing where numerical precision is critical, improper handling of floating point overflow can lead to inaccurate results, program crashes, or even more severe security issues. Therefore, today we will delve into the overflow and underflow of the float type.

01

Maximum and Minimum Values of Float Type: No More Guessing!

In integer types, we have macros like INT_MAX, INT_MIN, and UINT_MAX (from limits.h) to represent maximum and minimum values. Floating point numbers also have such convenient methods!

To obtain the maximum (or minimum) value of the float type, we need to include a system-provided header file: float.h.

#include <stdio.h>
#include <float.h> // Include float.h header file
int main() {
    // Get the maximum value of float type
    float maxFloat = FLT_MAX;
    printf("The maximum value of float type is: %e\n", maxFloat); // Display in scientific notation using %e
    // Get the smallest positive normalized float (not the most negative value)
    float minFloat = FLT_MIN;
    printf("The minimum normalized number of float type is: %e\n", minFloat);
    return 0;
}

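FLT_MAX represents the largest finite value of the float type, while FLT_MIN represents the smallest positive normalized number of the float type. Note two common misreadings: FLT_MIN is not the most negative float (that is -FLT_MAX), and it is not the closest value to zero either, because the denormalized numbers discussed in section 03 are smaller still. Both macros are defined in float.h, which is a “benefit” provided by the C standard library!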

Tip: In many IDEs, holding down the CTRL key and clicking on FLT_MAX or FLT_MIN opens the float.h header file, where you can view their specific definitions and further understand how the C language handles these values at a low level. These library files are an important pathway to learning the “inner workings” of C language!

02

Floating Point Numbers’ “Temperament”: Overflow Phenomenon

When the result of a floating point calculation exceeds the maximum value that the float type can represent, overflow occurs. Unlike integer overflow, which may “wrap around,” floating point overflow produces a special “value”: infinity.

Professional Correction and Supplement: When we write “1000.0F,” we deliberately write 1000.0F instead of 1000. A bare 1000 is an int, and a bare 1000.0 is a double. In C language, when a float is combined with an int, implicit type conversion occurs: the int is converted to float, which can lose precision for large values (although 1000 can be represented exactly here). When a float is combined with a double, the float is converted and the whole expression has type double. To keep the operation in float and avoid potential issues caused by unnecessary implicit conversions, we should explicitly mark the literal as a float by adding the F suffix to a decimal literal (e.g., 1000.0F).

#include <stdio.h>
#include <float.h>
int main() {
    float maxFloat = FLT_MAX;
    printf("The maximum value of float type is: %e\n", maxFloat);
    // Attempt to multiply the maximum value by 1000.0F to cause overflow
    float overflowResult = maxFloat * 1000.0F; // Note the .0F suffix here
    printf("Overflow result (maxFloat * 1000.0F) is: %e\n", overflowResult);
    // On IEEE 754 platforms 1.0F / 0.0F also produces infinity (via division by zero, not overflow)
    float infinityByDivision = 1.0F / 0.0F;
    printf("1.0F / 0.0F result is: %e\n", infinityByDivision);
    return 0;
}

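Running the above code on an IEEE 754 platform, you will find that both overflowResult and infinityByDivision print as inf or INF (abbreviation for infinity). The first is a typical manifestation of floating point overflow. The second reaches the same value through division by zero rather than overflow, and some compilers warn about it.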

Concept Deepening: What is “Infinity”?

In the IEEE 754 floating point standard, when the result of a calculation exceeds the maximum range that float or double can represent, it is not “wrapped around” like integers, but is represented as positive infinity (+inf) or negative infinity (-inf). This representation allows the program to continue executing to some extent rather than crashing immediately. You can think of it as a mathematical concept that is larger than any representable number.

03

Floating Point Numbers’ “Exactness”: Underflow Phenomenon and Precision Loss

In contrast to overflow, underflow occurs when the result of a floating point calculation is non-zero and very close to zero, with a magnitude smaller than the smallest normalized number that the float type can represent.

#include <stdio.h>
#include <float.h>
int main() {
    float minFloat = FLT_MIN; // Minimum normalized positive number of float
    printf("The minimum normalized number of float is: %e\n", minFloat);
    // Attempt to divide the minimum value by 1000.0F to cause underflow
    float underflowResult = minFloat / 1000.0F;
    printf("Underflow result (minFloat / 1000.0F) is: %e\n", underflowResult);
    return 0;
}

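Running this code, you may see that underflowResult is 0.000000e+00 (i.e., zero) on platforms that flush tiny results to zero, or a very small denormalized number with noticeable precision loss. For example, the exact quotient is about 1.175494e-41, but a typical IEEE 754 platform prints roughly 1.1755e-41, so only about four significant digits survive. Dividing by 1000.0F again leaves roughly 1.1e-44 with a single significant digit, and a third division gives zero.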

Concept Deepening: Why does “Underflow” occur? And precision loss!

The metaphor of “not being able to see water molecules” is very vivid!

Floating point representation follows the IEEE 754 standard, which consists of a sign bit, exponent bits, and mantissa bits. The float type is typically 32 bits: 1 sign bit, 8 exponent bits, and 23 mantissa bits. The mantissa determines its precision (approximately 6-7 significant decimal digits; FLT_DIG in float.h guarantees 6).

When we divide by a very large number, making the result very close to zero, the value may become too small for the standard “normalized” form, in which the mantissa M has an implicit leading bit of 1 and the exponent is lowered to keep it there. Once the exponent has reached its minimum, the floating point number enters a “denormalized” (also called subnormal) state. In denormalized numbers, the leading bit of the mantissa is 0, which means precision is sacrificed to represent smaller values.

When we mention the metaphor of “implicit bits being dragged out,” it refers to denormalized numbers giving up the hidden leading 1 of normalized numbers and using a leading 0 instead, with the exponent pinned at its smallest value. Every further step toward zero then shifts significant bits off the end of the mantissa, so the number of significant figures shrinks, resulting in precision loss.

In simple terms, the CPU will try its best to represent this value, but when it is small enough to exceed its precision “bottom line,” it can only find the closest value it can represent. It’s like what was said before, “it will find a number that is close to it, and that approximate number is the only one available.” That’s the precision loss phenomenon we observe. Your program may not get the exact “number of water molecules,” but only an “approximate value,” or the result may even become zero.

Tip: Different compilers, CPU architectures, and even operating systems may have slight differences in handling floating point underflow (for example, some optimization flags and CPU modes flush denormalized results to zero), leading to results that appear slightly different. It’s like “a person cutting in line, who doesn’t reason with you; you can’t reason with them.” You just need to understand the core principle: exceeding the minimum representable precision of the computer leads to approximation and precision loss. That is more meaningful than getting caught up in specific values!

04

Summary

Today, we explored the overflow and underflow phenomena of floating point numbers.

  • Overflow: When the result is too large in magnitude, float becomes positive or negative infinity (INF).

  • Underflow: When the result is too small (close to zero), float will become zero or a denormalized number with precision loss.

Understanding these concepts is crucial. They reveal the limitations and characteristics of floating point representation within computers. In practical programming, we should always pay attention to the range and precision issues of floating point numbers to avoid logical errors in programs due to overflow or underflow. This is especially important when performing extensive iterative calculations or handling extreme values.

Remember, “Depiction is not Endorsement.” Today, we have merely “depicted” the “temperament” of floating point numbers, not “endorsed” their behavior leading to precision loss. But as professional developers, we need to understand and avoid these potential issues.
