Detailed Explanation of ARM Compilers (armcc/armclang)

Learning the ARM Compiler

First, let’s understand the compiler, which is usually divided into three parts: frontend + optimizer + backend.
  • Frontend: Lexical, syntax, and semantic analysis, converting source code into an abstract syntax tree, generating intermediate code.
  • Optimizer: Optimizes the obtained intermediate code to make it more efficient.
  • Backend: Converts the optimized code into machine code for each platform.
In simpler terms, the work of a compiler is: source code -> preprocessing -> compilation -> object code -> linking -> executable program.
Next, let’s briefly look at the history of some compilers, such as GCC, LLVM, and Clang, as well as the armcc and armclang introduced in this article.
  1. GCC (GNU Compiler Collection) is a compiler developed by GNU, licensed as free software under GPL;
  2. GCC originally handled only C, but it now also handles C++, Objective-C, Fortran, Ada, Go, and other languages.
  3. Apple used to use GCC as the compiler, but GCC’s support for Objective-C was not very good, and many new features were not added, so Apple began to seek alternatives to the compiler.
  4. At this time LLVM appeared. Chris Lattner started it during his master's and doctoral studies. Early on it relied on GCC's frontend for parsing and semantic analysis, with LLVM performing optimization and generating object code; this combination is known as LLVM-GCC.
  5. Later, Apple decided to bypass GCC entirely and hired Dr. Chris Lattner to develop a compiler. The result was Clang, a C/C++/Objective-C compiler frontend built on LLVM and intended to replace and surpass GCC.
  6. armcc is the compiler developed by ARM and integrated into the KEIL and ARM DS-5 IDEs. Its frontend is based on the Edison Design Group (EDG) frontend. Development stopped at version 5.06 (AC5), and it is no longer maintained.
  7. armclang is the successor to armcc. It is based on the newer Clang and LLVM architecture and is ARM's sixth-generation compiler (AC6), the compiler ARM promotes going forward.

armcc Compiler

armcc is the compiler developed by ARM. It is integrated into the KEIL IDE, which ARM acquired in 2005, and into ARM's in-house DS-5 IDE. Documentation for the compiler and the IDEs can be downloaded from ARM's official website.
The documentation is divided into several parts: the armcc compiler, the armasm assembler, the armlink linker, the armar archiver, and the fromelf image conversion utility.

1. armcc

The armcc compiler compiles .c/.cpp source files into object files; its features are controlled through command-line options. Some common options are listed below.
The general syntax of the armcc compiler is as follows:
armcc [options] [source] 
For example:
armcc -I ../common/ -I ../driver  -g --apcs=interwork --cpu=Cortex-R5 -c ../common/led.c -o ../out/led.o
  • -c/-C/-E/-o/-D: -c compiles only, without entering the linking step. -E runs only the preprocessor, and -C keeps comments in the preprocessor output. For example:
armcc -c -C -E  -I ../common/ -I ../driver -g --apcs=interwork --cpu=Cortex-R5 ../common/led.c -o ../out/led.i
After this, you can inspect the preprocessed output in led.i, for example the result of macro expansion, which is convenient when analyzing problems.
-o specifies the output file name.

-D defines a macro name, for example -DLOG -DUART=1; -U removes a macro name that is already defined.
#define LOG
#define UART 1

Specifying the macros above on the compiler command line is equivalent to writing these definitions in the program.
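As a minimal sketch of how source code typically reacts to such command-line macros, the file below behaves differently depending on whether it is built with -DLOG and -DUART=1. The function names and the fallback value 0 for UART are assumptions made for illustration; they do not come from any ARM header.

```c
#include <stdio.h>

/* Fallback so the file also builds without -DUART=...; the value 0 is an assumption. */
#ifndef UART
#define UART 0
#endif

/* Returns 1 when the file was compiled with -DLOG, otherwise 0. */
int log_enabled(void)
{
#ifdef LOG
    return 1;
#else
    return 0;
#endif
}

/* Returns the UART number selected with -DUART=n (or the fallback). */
int uart_port(void)
{
    return UART;
}

/* Prints a message only in builds compiled with -DLOG. */
void log_msg(const char *msg)
{
#ifdef LOG
    printf("[uart%d] %s\n", UART, msg);
#else
    (void)msg; /* logging is compiled out */
#endif
}
```

Building the same file once with -DLOG -DUART=1 and once without any -D option gives two different object files from identical source, which is why such switches are usually kept in the build script rather than in the code.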
  • -I: Specifies an include directory. If a required directory is missing, compilation fails with an error saying that the header file cannot be found. Most people have run into this error at some point.

  • --c99/--c90: Select the version of the C language standard used for the source.

  • --cpu=name: Selects the target processor, for example --cpu=Cortex-R5.

  • -M/--md: Both generate the compilation dependencies of each source file. --md writes them to a .d file that lists the header files the object file depends on. This is very useful for incremental builds: with the dependencies known, only the modified files and the files that depend on them need to be recompiled.

armcc  -c -M  -I ..\SYSTEM\sys  -I ...  sys.c --no_depend_single_line --md  
  • --diag_error/--diag_suppress/--diag_warning: Manage compilation warnings and errors, for example suppressing a particular diagnostic or changing its severity.
--diag_error=warning                      Treat all warnings as errors
--diag_suppress=3017,1256,1148            Suppress the diagnostic messages numbered 3017, 1256, and 1148
--diag_warning=1234,5678                  Set the diagnostic messages numbered 1234 and 5678 to warning severity
--diag_warning=error                      Downgrade errors to warnings where the compiler allows it

The numbers are the diagnostic codes printed in the compiler output, for example 20 or 223.
  • --feedback=filename: Compilation feedback, mainly used to remove unused code and data. It works together with linker options and usually requires building twice: the first link writes the feedback file, and the second compilation reads it.
--feedback=unused_section.txt   Compiler: place the unused functions listed in the feedback file in separate sections so that the linker can remove them. Linker: write the list of unused functions to this file.
--feedback_image=none           Linker: when writing the feedback file, ignore the layout checks of the scatter file and do not generate an .axf image.
--remove                        Linker: remove unused sections.
--keep memory_layout.o(rw)      Linker: prevent the rw section in memory_layout.o from being removed.
With feedback, the space required for a dual-core bin dropped from 950k to 800k.
  • --inline/--forceinline
    The former lets the compiler decide whether to inline functions, while the latter treats every function declared inline as if it were qualified with __forceinline. To force inlining of a single function, qualify that function with __forceinline.
    • Note that not every function can be inlined, for example recursive functions.
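The difference between an easy inlining candidate and a recursive function can be sketched in portable C. This example uses the standard `static inline` rather than the armcc-specific `__forceinline`, and the function names are hypothetical.

```c
/* Small and non-recursive: a typical candidate for inlining at the call site. */
static inline int square(int x)
{
    return x * x;
}

/* Recursive: the call depth depends on n at run time, so the body
 * cannot simply be pasted into every call site. */
static int factorial(int n)
{
    return (n <= 1) ? 1 : n * factorial(n - 1);
}

/* With inlining enabled, both calls to square() can be replaced by a multiply. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

/* The recursive helper is normally still emitted and called as a real function. */
int factorial_of(int n)
{
    return factorial(n);
}
```

Whether a given call is actually inlined remains the compiler's decision and depends on the optimization level; the keyword only marks the function as a candidate.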
  • --littleend/--bigend: Select little-endian or big-endian data for the generated code.
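Code that must work under either endianness setting usually avoids depending on the in-memory byte order. The sketch below, with hypothetical helper names, shows a run-time check of the byte order and a byte-by-byte load that gives the same result in both builds.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the least significant byte of a 32-bit word is stored first. */
int is_little_endian(void)
{
    uint32_t word = 1;
    unsigned char first;
    memcpy(&first, &word, 1); /* look at the byte at the lowest address */
    return first == 1;
}

/* Reads a 32-bit value stored in little-endian order, one byte at a time.
 * The result does not depend on the endianness the program was built for. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```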
  • -O0/-O1/-O2/-O3/-Otime/-Ospace: Compilation optimization options
-O0 Minimal optimization. Most optimizations are disabled. This level gives the best debug view because the structure of the generated code corresponds directly to the source code. All optimizations that interfere with the debug view are disabled.
  • Breakpoints can be set at any reachable point, including dead code.
  • The value of a variable is available everywhere within its scope, except where it is uninitialized.
  • Backtrace shows the stack of open function calls that you would expect from reading the source code.
  • Although -O0 produces the debug view closest to the source code, users may prefer the debug view produced by -O1, which improves code quality without changing the basic structure.
  • Dead code is reachable code that has no effect on the program's result, such as an assignment to a local variable that is never used. Unreachable code is code that no control flow path can reach, such as a statement immediately following a return.
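The two terms are easy to mix up, so here is a small sketch that labels one instance of each. The function is hypothetical and exists only to carry the comments.

```c
/* Hypothetical function used only to label the two kinds of code. */
int scale(int x)
{
    int tmp = x * 100;  /* dead code: reachable, but this value is never read */
    tmp = x * 2;
    return tmp;
    tmp = 0;            /* unreachable code: no control flow path gets here */
}
```

At -O0 a breakpoint can still be placed on the first assignment, while higher levels are free to delete both marked statements.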
-O1 Restricted optimization. The compiler performs only optimizations that debug information can describe. It removes unused inline functions and unused static functions, and turns off optimizations that seriously degrade the debug view. Used with --debug, this level gives a generally satisfactory debug view with good code density. The debug view differs from -O0 as follows:
  • Breakpoints cannot be set on dead code.
  • The value of a variable may be unavailable within its scope after initialization, for example if its assigned location has been reused.
  • Functions with no side effects may be called out of sequence, or omitted if the result is not needed.
  • Backtrace may not show the expected stack of open function calls, because call optimizations such as tail calls change how the stack is used.
  • -O1 gives a good correspondence between source code and object code, especially when the source code contains no dead code.
  • The generated code can be significantly smaller than at -O0, which can simplify analysis of the object code.
-O2 High optimization. Used with --debug, the debug view may be less satisfactory because the mapping from object code to source code is not always clear. The compiler may perform optimizations that debug information cannot describe. This is the default optimization level. The debug view differs from -O1 as follows:
  • The mapping from source code to object code may be many-to-one, because more aggressive instruction optimization lets several source locations map to one point in the object file.
  • Instruction scheduling is allowed across sequence points. As a result, the reported value of a variable at a particular point may not match the value expected from reading the source code.
  • The compiler automatically inlines functions.
-O3 Maximum optimization. With debugging enabled, this level usually gives a poor debug view. ARM recommends debugging at lower optimization levels. When -O3 and -Otime are used together, the compiler performs extra, more aggressive optimizations, such as:
  • High-level scalar optimizations, including loop unrolling. This can give significant performance benefits at a small code size cost, but at the risk of longer build times.
  • More aggressive inlining and automatic inlining.
  • These optimizations effectively rewrite the input source code, giving the lowest correspondence between object code and source code and the worst debug view. --loop_optimization_level=option controls the amount of loop optimization performed at -O3 -Otime. The higher the amount of loop optimization, the worse the correspondence between source code and object code.
  • The --vectorize option also lowers the correspondence between source code and object code. For more information on the high-level transformations applied to the source code, use the --remarks command-line option together with -O3 -Otime.
  • Because optimization affects the mapping from object code to source code, the choice of -Ospace or -Otime and of optimization level usually affects the debug view.
  • If a simple debug view is needed, -O0 is the best choice. Choosing -O0 typically increases the size of the ELF image by 7% to 15%. To reduce the size of the debug tables, use the --remove_unneeded_entities option.
  • --split_sections creates one section for each function in each source file, which makes it easier for the linker to remove unused functions from .o files. __attribute__((section(…))) can be applied to data and functions to place them in a named section instead of the default one.
  • --thumb compiles the .c file to Thumb instructions. The same choice can be made inside a source file with pragmas:
#pragma  arm         Compile to ARM instructions
#pragma  thumb       Compile to Thumb instructions
#pragma  push        Save the current #pragma state
#pragma  pop         Restore the saved #pragma state; can be used together with the pragmas above
#pragma  pack(n)     Set n-byte alignment for structures
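The effect of packing on structure layout can be checked with sizeof and offsetof. The sketch below uses the `#pragma pack(push, 1)` / `#pragma pack(pop)` spelling accepted by GCC and Clang; whether armcc accepts this exact form is an assumption, and with armcc the `#pragma push` / `#pragma pop` pair listed above would be the way to save and restore the state. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* Default layout: the compiler may insert padding after 'tag' to align 'value'. */
struct natural_rec {
    char tag;
    int  value;
};

/* 1-byte packing: no padding is inserted between the members. */
#pragma pack(push, 1)
struct packed_rec {
    char tag;
    int  value;
};
#pragma pack(pop)

size_t natural_size(void)        { return sizeof(struct natural_rec); }
size_t packed_size(void)         { return sizeof(struct packed_rec); }
size_t packed_value_offset(void) { return offsetof(struct packed_rec, value); }
```

Packed members may be unaligned, so accessing them can be slower, and on some cores an unaligned access through a plain pointer can fault; packing is best limited to structures that mirror an external data format.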
  • --use_frame_pointer sets up a frame pointer: each time a function is entered, the frame address is pushed onto the stack first, and then the other registers are pushed. The benefit is that the call chain is easy to recover in a backtrace. For details, see "Detailed Explanation of Several Common Registers in ARM Development".
  • --apcs=interwork supports switching between Thumb and ARM instructions, for example with BLX. It is commonly used when part of the code is compiled to Thumb instructions.

2. armasm

  • Embedded assembly
    • The parameter list may use named arguments, but the function body cannot refer to them by name; it must use registers, and the body is written in assembly language.
    • The assembly code must return to the caller itself; the compiler does not add a return instruction.
__asm return-type function-name(parameter-list)
{
// ARM/Thumb assembly code
instruction{;comment is optional}
...
instruction
}

/*Example 1*/
__asm int f(int i)
{
 ADD r0, r0, #1 ; i arrives in r0; the result is returned in r0
 BX lr          ; return to the caller
}

/*Example 2*/
#include <stdio.h>
__asm void my_strcpy(const char *src, char *dst)
{
loop
 LDRB r2, [r0], #1
 STRB r2, [r1], #1
 CMP r2, #0
 BNE loop
 BX lr
}
int main(void)
{
 const char *a = "Hello world!";
 char b[20];
 my_strcpy (a, b);
 printf("Original string: '%s'\n", a);
 printf("Copied string: '%s'\n", b);
 return 0;
}
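For readers less familiar with ARM assembly, the loop in Example 2 can be read as the following portable C. This is an illustrative equivalent, not code from the ARM documentation; the name my_strcpy_c is hypothetical, and the (src, dst) argument order follows the example rather than the standard strcpy.

```c
/* Portable C reading of the LDRB/STRB/CMP/BNE loop in Example 2. */
void my_strcpy_c(const char *src, char *dst)
{
    char c;
    do {
        c = *src++;    /* LDRB r2, [r0], #1 : load a byte, then advance src */
        *dst++ = c;    /* STRB r2, [r1], #1 : store it, then advance dst */
    } while (c != 0);  /* CMP r2, #0 / BNE loop : stop after copying the '\0' */
}
```

As in the assembly version, the terminating zero byte is copied before the loop exits, and the destination buffer must be large enough for the whole string.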

  • Inline assembly
    • If several instructions are on the same line, they must be separated by a semicolon (;).
    • If an instruction spans more than one line, each line to be continued must end with a backslash (\).
    • In the multi-line format, C and C++ comments can be used anywhere in the inline assembly block. However, comments cannot be embedded in a line that contains multiple instructions.
    • Assembly language uses the comma (,) as a separator, so a C expression that uses the comma operator must be enclosed in parentheses to distinguish it.
    • Labels must be followed by a colon (:), just like C and C++ labels.
    • asm statements must be located inside C++ functions. An asm statement can be used anywhere a C++ statement is expected.
    • Register names in inline assembly code are treated as C or C++ variables. They do not necessarily correspond to the physical registers of the same name. If a register is not declared as a C or C++ variable, the compiler generates a warning.
    • Do not save and restore registers in inline assembly code; the compiler does this for you. Inline assembly also gives no direct access to physical registers, although registers can be accessed indirectly through variables.
    • pc, lr, and sp cannot be accessed directly; their values can be read with __current_pc(), __current_sp(), and __return_address().
    • Do not change the processor mode or coprocessor state in inline assembly.
int f(int x)
{
 __asm
 {
  STMFD sp!, {r0} // save r0 - illegal: read before write
  ADD r0, x, 1
  EOR x, r0, x
  LDMFD sp!, {r0} // restore r0 - not needed.
 }
 return x;
}
The function above is invalid because it saves and restores r0 by hand and reads r0 before writing it. It must be written as:
int f(int x)
{
 int r0;
 __asm
 {
  ADD r0, x, 1
  EOR x, r0, x
 }
 return x;
}

int foo(int x, int y)
{
__asm
{
 SUBS x,x,y
 BEQ end
}
return 1;
end:
 return 0;
}
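To make the data flow of the inline assembly examples explicit, here is a plain C reading of the corrected f() and of foo(). The names f_c and foo_c are hypothetical, and the code is an illustrative equivalent rather than compiler output.

```c
/* C reading of the corrected f(): r0 = x + 1, then x = r0 ^ x. */
int f_c(int x)
{
    int r0 = x + 1;   /* ADD r0, x, 1 */
    x = r0 ^ x;       /* EOR x, r0, x */
    return x;
}

/* C reading of foo(): SUBS sets the Z flag when x == y, and BEQ then
 * branches to the C label 'end'. */
int foo_c(int x, int y)
{
    x = x - y;        /* SUBS x, x, y */
    if (x == 0)       /* BEQ end */
        goto end;
    return 1;
end:
    return 0;
}
```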
