Organizer: Engineer Huang
Reference Source: Arm Official Website
Readers who have used Arm Compiler 6 in Keil MDK have probably noticed that it compiles much faster than Arm Compiler 5.
(Note: V6 here means version 6 of the compiler, not version 6 of MDK.)
But what else differs between Arm Compiler V6 and V5, and what do the optimization options that MDK exposes for V6 actually do?
1. About Arm Compiler 6
Arm Compiler 6 (AC6) is a compilation toolchain for Arm processors. At the time of writing, the latest version is Arm Compiler V6.14.
Arm Compiler is one of several compilers that target Cortex-M processors. It is commonly used in Keil MDK and Arm Development Studio (DS-5), and it can also be installed as a standalone toolchain.
Other compilers that target Cortex-M include GNU Compiler, IAR Compiler, and CCS Compiler.
Arm Compiler 6 toolchain includes:
armclang: A compiler and integrated assembler based on LLVM and Clang technology.
armasm: The legacy assembler for assembly code written in armasm syntax. Use the armclang integrated assembler for all new assembly files.
armar: The archiver, which collects sets of ELF object files together.
armlink: The linker that combines objects and libraries to generate executable files.
fromelf: Image converter and disassembler.
Arm C libraries: Runtime support libraries for embedded systems.
Arm C++ libraries: Libraries based on the LLVM libc++ project.
Arm Compiler 5 and earlier versions use the armcc compiler. Arm Compiler 6 replaces armcc with armclang, which is based on LLVM and has different command-line options, directives, and so on, so it should be treated as a new compiler.
2. AC5 and AC6
Arm Compiler 5 (AC5) is a widely used compiler generation that shipped with Keil MDK V4 and early V5 releases.
AC6 was released in 2015 and has been integrated into subsequent MDK releases. The latest MDK at the time of writing integrates AC6.13 (the compiler version can be changed).
Advantages of AC6 over AC5
AC6 changes a great deal compared to earlier versions of the compiler. The most noticeable improvements are much faster compilation and smaller code size.
Beyond speed and size, AC6 has other advantages: it supports the C++14 standard, it can use TrustZone for Armv8-M to build secure and non-secure code for a device, and it is compatible with source code written for GCC, meaning that source code that GCC can compile can also be compiled by AC6.
This is the official code size comparison:
Upgrading from AC5 to AC6
AC5 and AC6 are different compilers with compatibility differences, so existing projects need to be migrated. Arm provides official documentation for the migration process:
https://developer.arm.com/docs/100068/0614/migrating-from-arm-compiler-5-to-arm-compiler-6
Of course, you can also refer to the article I shared earlier:
What needs to be done to upgrade the MDK-ARM compiler from V5 to V6?
3. Keil MDK Optimization Options
In Keil MDK, selecting AC6 instead of AC5 adds several optimization options, covering code size, speed, a balance of the two, and so on.
Optimization options include:
Optimization Level -O0
-O0 disables all optimizations and is the default level. It gives the fastest compile and build times, but the generated code is slower than at any other optimization level, and its code size and stack usage are significantly higher. The generated code corresponds closely to the source code, but much more of it is generated, including unnecessary code.
Optimization Level -O1
-O1 enables the core optimizations in the compiler. It gives better code quality than -O0 and lower stack usage while still providing a good debugging experience. Arm recommends this option for a good debugging experience.
-O1 differs from -O0 in that:
- Optimizations are enabled, which may reduce the fidelity of debug information.
- Inlining and tail calls are enabled, so a backtrace may not show every function activation you would expect from reading the source.
- A function with no side effects may not be called where expected, or may be omitted altogether, if its result is not needed.
- The value of a variable may be unavailable within its scope once the variable is no longer used. For example, its stack location may have been reused.
Optimization Level -O2
-O2 optimizes more for performance than -O1. It adds a few new optimizations and changes the optimization heuristics. This is the first optimization level at which the compiler may generate vector instructions. It also degrades the debugging experience.
-O2 differs from -O1 in that:
- The threshold at which the compiler considers inlining a call site profitable may increase.
- The amount of loop unrolling performed may increase.
- Vector instructions may be generated for simple loops and for correlated sequences of independent scalar operations.
You can prevent the generation of vector instructions with the armclang command-line option -fno-vectorize.
Optimization Level -O3
-O3 optimizes more for performance than -O2. It enables optimizations that require significant compile-time analysis and resources, and changes the optimization heuristics compared to -O2. -O3 instructs the compiler to optimize for the performance of the generated code and to disregard its size, which may increase code size.
-O3 differs from -O2 in that:
- The threshold at which the compiler considers inlining a call site profitable increases.
- The amount of loop unrolling performed increases.
- More aggressive instruction optimizations are enabled in the compiler pipeline.
Optimization Level -Os
-Os aims to provide high performance without a significant increase in code size. Depending on your application, its performance may be similar to that of -O2 or -O3.
-Os reduces code size compared to -O3, but may degrade the debugging experience.
-Os differs from -O3 in that:
- The threshold at which the compiler considers inlining a call site profitable is lowered.
- The amount of loop unrolling performed is significantly reduced.
Optimization Level -Oz
-Oz aims to provide the smallest possible code size. Arm recommends this option for the best code size. This optimization level degrades the debugging experience.
-Oz differs from -Os in that:
- The compiler optimizes for code size only and disregards performance optimizations, which may make the code slower.
- Function inlining is not disabled. In some cases inlining reduces overall code size, for example when a function is called only once. The inlining heuristics are tuned to inline only when code size is expected to decrease.
- Optimizations that may increase code size, such as loop unrolling and loop vectorization, are disabled.
- Loops are generated as while loops instead of do-while loops.
Optimization Level -Ofast
-Ofast performs the optimizations from level -O3, plus the optimizations enabled by the armclang option -ffast-math.
This level also performs other aggressive optimizations that may violate strict compliance with language standards.
Compared to -O3, this level degrades the debugging experience and may increase code size.
Optimization Level -Omax
-Omax is the maximum optimization level and specifically targets performance. It enables all the optimizations from level -Ofast, together with link-time optimization (LTO).
At this optimization level, Arm Compiler may violate strict compliance with language standards. This level gives the fastest performance.
Compared to -Ofast, this level degrades the debugging experience and may increase code size.
If you compile with -Omax and have separate compile and link steps, you must also pass -Omax on the armlink command line.
That covers the compiler and a comparison of its optimization levels. I hope it helps.