Essential Knowledge for Cross-Compilation: Choosing the Wrong -mfloat-abi Can Slow Floating-Point Operations by Up to 100 Times (With a Practical Cortex-M4 Example)

Hello everyone, I am a programmer who loves to share. I am happy to share my experiences and understanding from my work.

-begin-

In embedded development, the way floating-point operations are handled directly affects both the performance and the compatibility of a program, and -mfloat-abi is the key option that controls it. It determines whether the compiler uses the hardware floating-point unit (FPU) or software emulation, and how floating-point values are passed between functions. Choosing incorrectly may lead to runtime exceptions (such as a fault on an undefined floating-point instruction) or to a significant drop in performance.

Function of the option:

-mfloat-abi=type is used to specify the application binary interface (ABI) for floating point operations, which defines how floating point data is passed (via registers or stack) and how operations are implemented (hardware or software). There are three common types:

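soft: All floating-point operations are carried out by software library routines. No FPU instructions are generated and floating-point arguments are passed in integer registers. Best compatibility, but slowest.

softfp: The compiler may use FPU instructions for the arithmetic, but floating-point arguments are still passed in integer registers, exactly as in soft. Code built this way can be linked with soft libraries, but it needs an FPU at run time and pays a small cost for moving values between integer and FPU registers.

hard: FPU instructions do the arithmetic and floating-point arguments are passed in FPU registers. Fastest, but it requires an FPU, and every library linked into the program must be built the same way.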
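A quick way to see which of the three modes a given file was actually built with is to ask the compiler itself. The sketch below relies on the macros __ARM_PCS_VFP and __SOFTFP__, which GCC predefines on ARM targets; the function name float_abi_name is a hypothetical helper made up for this article, and the mapping assumes a GCC-style toolchain.

```c
/* Returns a short name for the float ABI this file was compiled with.
 * __arm__, __ARM_PCS_VFP and __SOFTFP__ are compiler-predefined macros;
 * float_abi_name is a hypothetical helper, not part of any library. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "non-arm";   /* e.g. a host build on a PC */
#elif defined(__ARM_PCS_VFP)
    return "hard";      /* FP arguments travel in FPU registers */
#elif defined(__SOFTFP__)
    return "soft";      /* no FPU instructions, library calls do the math */
#else
    return "softfp";    /* FPU instructions, integer-register calling convention */
#endif
}
```

Printing this string once at startup (or inspecting it in a debugger) is usually enough to catch a build that silently fell back to soft.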

Usage scenarios:

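If the target CPU has no FPU (such as ARM Cortex-M0), use soft.

If the target CPU has an FPU (such as ARM Cortex-M4F or Cortex-A7), prefer hard to get the full hardware performance.

If the target has an FPU but you must link against existing libraries built with the soft calling convention, choose softfp (performance is not quite as good as hard). Note that softfp code still executes FPU instructions, so a single binary that has to run on devices both with and without an FPU must be built with soft.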

Detailed example:

Taking an embedded device based on ARM Cortex-M4F (with FPU) as an example, compile a program that includes floating point operations (such as temperature sensor data calibration).
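The original temp_calib.c is not reproduced here, so the following is only a minimal sketch of what such a calibration routine might look like. The function name, the 12-bit ADC range, the 3.3 V reference, and the gain and offset values are all assumptions chosen for illustration. It contains only the arithmetic; a real firmware image also needs startup code and a main function.

```c
/* temp_calib.c (hypothetical sketch; all names and constants are illustrative). */

/* Converts a raw 12-bit ADC reading to degrees Celsius with a linear
 * gain/offset correction. The f suffix keeps every constant in single
 * precision, which matters because the Cortex-M4F FPU only handles
 * single-precision values in hardware. */
float calibrate_temperature(unsigned raw_adc)
{
    const float volts_per_count = 3.3f / 4095.0f; /* assumed 3.3 V reference */
    const float gain   = 100.0f;                  /* assumed degrees C per volt */
    const float offset = -50.0f;                  /* assumed degrees C at 0 V */

    float volts = (float)raw_adc * volts_per_count;
    return volts * gain + offset;
}
```

Even this tiny function contains an integer-to-float conversion, two multiplications and one addition, so the choice of float ABI affects every line of it.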

1. Incorrect choice: Compile using soft mode:

arm-none-eabi-gcc -mcpu=cortex-m4 -mfloat-abi=soft -o temp_calib temp_calib.c

At this point, the compiler emits calls to software floating-point routines (for example __aeabi_fmul for a single-precision multiplication) even though the Cortex-M4F has an FPU. A multiplication that the FPU could finish in about one cycle becomes a library call that costs tens of cycles, and the gap is larger still for operations such as division.
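To get a feel for where those extra instructions go: before a software routine can multiply two floats, it has to take each operand apart with integer shifts and masks, then multiply the mantissas, add the exponents, normalize and round. The sketch below shows only that first unpacking step. It assumes float is an IEEE-754 binary32 value, and f32_parts and f32_unpack are hypothetical names, not the actual libgcc implementation.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE-754 binary32 value. */
struct f32_parts {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, biased by 127 */
    uint32_t mantissa;  /* 23 bits, without the implicit leading 1 */
};

/* Splits a float into its bit fields using integer operations only. */
struct f32_parts f32_unpack(float value)
{
    uint32_t bits;
    struct f32_parts p;

    memcpy(&bits, &value, sizeof bits);  /* well-defined way to read the bit pattern */
    p.sign     = bits >> 31;
    p.exponent = (bits >> 23) & 0xFFu;
    p.mantissa = bits & 0x7FFFFFu;
    return p;
}
```

A full software multiply repeats this for both operands and then does the real work, which is why it cannot compete with a single FPU instruction.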

2. Correct choice: Compile using hard mode:

arm-none-eabi-gcc -mcpu=cortex-m4 -mfloat-abi=hard -mfpu=fpv4-sp-d16 -o temp_calib temp_calib.c

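-mcpu=cortex-m4 selects the target core.

-mfloat-abi=hard enables hardware floating-point operations and the FPU-register calling convention.

-mfpu=fpv4-sp-d16 names the FPU model (the single-precision FPU of the Cortex-M4F). Older GCC releases need it spelled out alongside hard; recent releases can usually derive it from -mcpu, but stating it explicitly does no harm.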

At this point, the compiler generates FPU instructions (such as vadd.f32 for hardware addition), floating-point arguments are passed in FPU registers (such as s0 and s1), and floating-point code can run roughly 10 to 100 times faster than in soft mode, depending on the mix of operations.
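The difference is easiest to see on a function that is as small as possible. The example below is a hypothetical function written for this article; compiling it with each -mfloat-abi value and reading the generated assembly (for instance with the -S option) shows the behavior summarized in its comment. The register assignments follow the ARM procedure call standard; the exact instruction sequence depends on the compiler version and optimization level.

```c
/* Hypothetical example function for comparing the three float ABIs.
 *
 * Where the arguments and the result live:
 *   hard          : a in s0, b in s1, result in s0
 *   soft / softfp : a in r0, b in r1, result in r0
 *
 * How the addition is typically carried out:
 *   hard / softfp : a vadd.f32 instruction on the FPU
 *   soft          : a call to the runtime helper __aeabi_fadd
 */
float add_offsets(float a, float b)
{
    return a + b;
}
```

With softfp you will also see vmov instructions copying r0 and r1 into FPU registers and the result back again, which is the extra cost that hard avoids.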

3. Verify the effect:

Use readelf to check the build attributes recorded in the ELF file and determine whether the floating-point mode is correct:

arm-none-eabi-readelf -A temp_calib

If the output contains the line "Tag_ABI_VFP_args: VFP registers", the program was built with the hard ABI; if that line is missing, soft or softfp was used. To tell those two apart, disassemble the program with arm-none-eabi-objdump -d temp_calib: FPU instructions starting with v (such as vldr, vstr, vadd.f32) show that the FPU is being used, while calls to helpers such as __aeabi_fadd or __aeabi_fmul show that the arithmetic is emulated in software.

Notes:

hard mode should be combined with -mfpu to name the specific FPU model (such as fpv4-sp-d16 for Cortex-M4F or neon-vfpv4 for Cortex-A7). Recent GCC releases can often infer it from -mcpu, but if the selected target has no FPU, compilation reports an error.

All files in the same project (including library files) must use the same calling convention. Mixing hard with soft or softfp objects leads to a linking error (such as "uses VFP register arguments, ... does not").

In embedded Linux systems, the application must use the same float ABI as the C library and the other shared libraries in the root filesystem. If they were built with hard, the application must also be built with hard, otherwise it may fail to link or start, or floating-point data may be passed incorrectly.
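One way to keep every file in a project on the same float ABI is to let the compiler enforce it. The fragment below is a hypothetical guard, assuming the project has standardized on -mfloat-abi=hard and uses a GCC-style toolchain that predefines __ARM_PCS_VFP for the hard ABI. Placed in a header that every source file includes, it stops the build as soon as one file is compiled with different flags.

```c
/* float_abi_guard.h (hypothetical file name).
 * On ARM targets, refuse to compile any file that is not built with the
 * hard float ABI. On other targets, such as host-side unit tests, the
 * check is skipped. */
#if defined(__arm__) && !defined(__ARM_PCS_VFP)
#error "This project expects -mfloat-abi=hard; check the compiler flags for this file."
#endif
```

This does not cover prebuilt third-party libraries, which still have to be checked with readelf, but it catches mistakes in the project's own build scripts early.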

Lessons learned from practice:

In one project, team members mixed hard and softfp modes, and the program showed "floating point value disorder" at runtime: modules compiled with hard placed floating-point arguments in FPU registers, while modules compiled with softfp expected them in integer registers (or on the stack), so the values did not match. The problem was resolved by switching everything to hard mode.

The core of -mfloat-abi is "matching hardware capabilities": if the hardware has an FPU, use it rather than leave it idle; if it does not, never generate FPU instructions, because they will crash the program. In resource-constrained embedded scenarios, the performance gained from hardware floating point may directly determine whether the product can meet its real-time requirements (for example, the floating-point operations for drone attitude control must complete within microseconds). Tomorrow we will discuss the -march option that controls the instruction set, and see how it ensures compatibility across CPU models.


-end-
