Overview of Embedded Toolchain Tools

Introduction

This article summarizes and organizes some rules and principles of toolchains, providing a reference for everyone when using toolchains across platforms in the future.

It explains how to read a name such as arm-none-eabi-gcc and how to quickly identify which compiler to use.

Of course, if there are any inaccuracies in the writing, feel free to correct them in the comments section, and let’s discuss what the compilers we use mean.

Toolchain Naming Rules

First, let's work out the naming rules from the most familiar example, arm-none-eabi-gcc. The last part, gcc, is the tool; it could also be gdb or another tool. The preceding parts are as follows:

The naming rule is: arch [-vendor] [-os] [-(gnu)eabi]

  • arch – Architecture, such as ARM, MIPS
  • vendor – Toolchain provider (if there is no vendor name, ‘none’ and ‘unknown’ are commonly used)
  • os – Target operating system (this could be Linux or other OS platforms)
  • eabi – Embedded Application Binary Interface

Looking closely at this naming rule, the brackets show that both vendor and os are optional; arm-none-eabi, for example, has no os field.

The first field is arch, the architecture, which is introduced later.

The second field is vendor, which names whoever built the toolchain. We will not go into specific vendors, since many small companies define their own toolchains.

We will focus on the two common values, none and unknown.

  • none: Typically indicates that there is no specific vendor; this is a generic compiler.
  • unknown: The default placeholder that appears when a field was never set. It can stand for either no vendor or no OS; many people build a toolchain themselves without setting a specific value and end up with it.

For example: arm-linux-gnueabi-gcc

This name includes the linux field, indicating a compiler that targets the Linux platform; it is typically used to compile binaries that run on the Linux operating system.
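As a rough illustration of the naming rule above, here is a minimal Python sketch that splits a tool name into its fields. The function name split_tool_name and the rule "a single middle field is a vendor only if it is none or unknown" are assumptions made for this example; real triplets are more irregular, and tool names that themselves contain a hyphen (such as gcc-ar) are not handled.

```python
from typing import NamedTuple, Optional

# Placeholder values the article lists for the vendor field.
VENDOR_PLACEHOLDERS = {"none", "unknown"}


class ToolName(NamedTuple):
    arch: str
    vendor: Optional[str]
    os: Optional[str]
    abi: str
    tool: str


def split_tool_name(name: str) -> ToolName:
    """Split a name like 'arm-none-eabi-gcc' into fields (heuristic only)."""
    # Everything after the last '-' is taken to be the tool (gcc, gdb, ...).
    prefix, _, tool = name.rpartition("-")
    fields = prefix.split("-")
    if len(fields) < 2 or len(fields) > 4:
        raise ValueError(f"unexpected tool name: {name!r}")
    arch, *middle, abi = fields
    vendor = os_name = None
    if len(middle) == 2:
        vendor, os_name = middle
    elif len(middle) == 1:
        # Only one optional field is present: guess which one it is.
        if middle[0] in VENDOR_PLACEHOLDERS:
            vendor = middle[0]
        else:
            os_name = middle[0]
    return ToolName(arch, vendor, os_name, abi, tool)


if __name__ == "__main__":
    for example in ("arm-none-eabi-gcc", "arm-linux-gnueabi-gcc"):
        print(example, "->", split_tool_name(example))
```

With this heuristic, arm-none-eabi-gcc yields vendor none and no os, while arm-linux-gnueabi-gcc yields os linux and no vendor, matching the reading given above.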

GCC Source Code

All of this comes from the GCC source code:

https://gcc.gnu.org/git/gcc.git

Let’s take a look at arm-none-eabi-gcc.

Official website

https://developer.arm.com/downloads/-/gnu-rm

The official site has an introduction

Features: All GCC 10.3 features.


So it is likely compiled based on version 10.3.0.

Additionally, in the list of downloads, the last part of each file name identifies the platform the toolchain runs on, that is, whether you get an exe for Windows or a non-exe package for Linux. You need to know your own operating system: for Windows, just download win32; for Linux, choose according to whether your computer's CPU is ARM or x86; macOS packages are also available.

Let’s run the compiler with the -v option.

The output is long; which parts are we interested in? Mostly the "Configured with" line, which records the parameters used when this gcc itself was built.

Configured with: /mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/src/gcc/configure --build=x86_64-linux-gnu --host=i686-w64-mingw32 --target=arm-none-eabi --prefix=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/install-mingw --libexecdir=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/install-mingw/lib --infodir=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/install-mingw/share/doc/gcc-arm-none-eabi/info --mandir=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/install-mingw/share/doc/gcc-arm-none-eabi/man --htmldir=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/install-mingw/share/doc/gcc-arm-none-eabi/html --pdfdir=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/install-mingw/share/doc/gcc-arm-none-eabi/pdf --enable-languages=c,c++ --enable-mingw-wildcard --disable-decimal-float --disable-libffi --disable-libgomp --disable-libmudflap --disable-libquadmath --disable-libssp --disable-libstdcxx-pch --disable-nls --disable-shared --disable-threads --disable-tls --with-gnu-as --with-gnu-ld --with-headers=yes --with-newlib --with-python-dir=share/gcc-arm-none-eabi --with-sysroot=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/install-mingw/arm-none-eabi --with-libiconv-prefix=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/build-mingw/host-libs/usr --with-gmp=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/build-mingw/host-libs/usr --with-mpfr=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/build-mingw/host-libs/usr 
--with-mpc=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/build-mingw/host-libs/usr --with-isl=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/build-mingw/host-libs/usr --with-libelf=/mnt/workspace/workspace/GCC-10-pipeline/jenkins-GCC-10-pipeline-338_20211018_1634516203/build-mingw/host-libs/usr --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-pkgversion='GNU Arm Embedded Toolchain 10.3-2021.10' --with-multilib-list=rmprofile,aprofile
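To pick the interesting values out of such a long line, a small script can help. The following is a minimal sketch, assuming the -v output has been captured as text; the function name parse_configure_line is made up for this example. It uses shlex so that quoted values such as the --with-pkgversion string stay in one piece.

```python
import shlex


def parse_configure_line(text: str) -> dict:
    """Return the --options of a 'Configured with:' line as a dict."""
    _, _, rest = text.partition("Configured with:")
    tokens = shlex.split(rest)
    options = {}
    # tokens[0] is the path to the configure script itself, so skip it.
    for token in tokens[1:]:
        if not token.startswith("--"):
            continue
        name, has_value, value = token[2:].partition("=")
        options[name] = value if has_value else True
    return options


if __name__ == "__main__":
    # Shortened sample; paste the full line from the -v output instead.
    sample = (
        "Configured with: /src/gcc/configure --build=x86_64-linux-gnu "
        "--host=i686-w64-mingw32 --target=arm-none-eabi --with-newlib "
        "--with-pkgversion='GNU Arm Embedded Toolchain 10.3-2021.10'"
    )
    opts = parse_configure_line(sample)
    for key in ("build", "host", "target", "with-pkgversion"):
        print(key, "=", opts[key])
```

For the output above, the three values worth reading first are build (where gcc was compiled), host (where gcc runs, here 32-bit Windows via MinGW), and target (what gcc generates code for, here arm-none-eabi).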

We run the command arm-none-eabi-gcc -march=?

The result is as follows

arm-none-eabi-gcc: error: unrecognized -march target: ?
arm-none-eabi-gcc: note: valid arguments are: armv4 armv4t armv5t armv5te armv5tej armv6 armv6j armv6k armv6z armv6kz armv6zk armv6t2 armv6-m armv6s-m armv7 armv7-a armv7ve armv7-r armv7-m armv7e-m armv8-a armv8.1-a armv8.2-a armv8.3-a armv8.4-a armv8.5-a armv8.6-a armv8-m.base armv8-m.main armv8-r armv8.1-m.main iwmmxt iwmmxt2
arm-none-eabi-gcc: error: missing argument to '-march='
arm-none-eabi-gcc: fatal error: no input files
compilation terminated.
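If you want the list of supported architectures in a script, the note line can be extracted from the captured output. This is a minimal sketch; valid_march_values is a hypothetical helper name, and it assumes the output keeps gcc's usual one-message-per-line layout.

```python
import re
from typing import List


def valid_march_values(gcc_output: str) -> List[str]:
    """Extract the names listed after 'valid arguments are:'."""
    match = re.search(r"valid arguments are:\s*(.*)", gcc_output)
    if match is None:
        return []
    # Some gcc versions append '; did you mean ...?' to the same line.
    names = match.group(1).split(";")[0]
    return names.split()


if __name__ == "__main__":
    sample = (
        "arm-none-eabi-gcc: error: unrecognized -march target: ?\n"
        "arm-none-eabi-gcc: note: valid arguments are: armv6-m armv7-m armv7e-m\n"
        "arm-none-eabi-gcc: error: missing argument to '-march='\n"
    )
    print(valid_march_values(sample))
```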

It is understood that this supports so many architectures; of course, each architecture has many types of chips, and currently, most are

ARM

The previous page is a dedicated page for embedded systems; now let’s go deeper and provide more comprehensive content:

https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads

Source code: https://git.gitlab.arm.com/tooling/gnu-devtools-for-arm.git

There are many toolchains here; is it hard to choose? The following concepts help:

Architecture

  • AArch32:
    • The 32-bit ARM architecture, covering ARMv7 and earlier versions.
    • Mainly used for low-power, resource-constrained embedded devices.
    • Typical examples are microcontrollers such as the STM32F4 and F1 series.
  • AArch64:
    • The 64-bit ARM architecture, introduced with ARMv8.
    • Provides a larger address space and a more efficient instruction set.
    • Typical examples are 64-bit ARM SoCs such as the RK3588.

Executable Files

  • ELF (Executable and Linkable Format):
    • A general file format used for executable files, object files, and shared libraries.
    • Applicable to various architectures, both 32-bit and 64-bit.
  • EABI (Embedded Application Binary Interface):
    • A binary interface specification for embedded systems, not a file format.
    • Defines standards for data types, register usage, stack organization, etc.
    • Mainly used for the 32-bit ARM architecture.

Simply put: in these toolchain names, elf paired with AArch64 marks a 64-bit bare-metal target with 64-bit addressing; the required resources and the compiled binaries are larger.

eabi paired with AArch32 marks a small 32-bit embedded target that follows the EABI specification. Both kinds of toolchain emit ELF files; the smaller binaries on 32-bit targets come mostly from the 32-bit architecture and the small libc, which makes this combination suitable for small systems.

  • arm-none-eabi: 32-bit MCUs with no OS, such as embedded systems based on STM32.
  • arm-none-linux-gnueabihf: Generates executables that run on 32-bit ARM Linux systems. ‘none’ is the generic vendor field.
  • aarch64-none-elf: Targets 64-bit processors running without an OS, for example bare-metal programs executed directly on RK3588.
  • aarch64-none-linux-gnu: Generates binaries that run on 64-bit ARM Linux.
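The four targets above boil down to two questions: 32-bit or 64-bit, and bare metal or Linux. A minimal Python sketch of that decision table follows; the function name pick_target and the "none"/"linux" keywords are choices made for this example only.

```python
# (word size, target OS) -> target name, as listed above.
TARGETS = {
    (32, "none"): "arm-none-eabi",
    (32, "linux"): "arm-none-linux-gnueabihf",
    (64, "none"): "aarch64-none-elf",
    (64, "linux"): "aarch64-none-linux-gnu",
}


def pick_target(bits: int, target_os: str) -> str:
    """Return the toolchain target for a word size and 'none' or 'linux'."""
    try:
        return TARGETS[(bits, target_os)]
    except KeyError:
        raise ValueError(f"no target for {bits}-bit / {target_os!r}") from None


if __name__ == "__main__":
    print(pick_target(32, "none"))   # e.g. an STM32 project
    print(pick_target(64, "linux"))  # e.g. Linux on RK3588
```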

Host Machine

The host machine is where you want the compiler to run, which is the architecture of your current computer.

Host machines are divided into the following types:

  • i686: A 32-bit x86 architecture, mainly used for personal computers and servers, such as Pentium Pro, Pentium II, Pentium III, Pentium 4, etc. The i686 architecture has gradually been replaced by the 64-bit x86_64 architecture, so it is not a concern.
  • x86_64: The 64-bit x86 architecture, which runs 64-bit applications while remaining backward compatible with 32-bit ones; this is the common architecture for Intel and AMD CPUs.
  • aarch64: This could also be possible, for example, if you use RK3588 as a small computer.
  • linux/macos/windows: Next, choose your commonly used operating system, usually Windows.

So for common use, choose Windows, and for Ubuntu, select Linux.

Note: The term ‘unknown’ does not appear in any of these names. My understanding is that ‘unknown’ is a default value filled in when someone builds a toolchain themselves, so it does not show up in toolchains released by official or established organizations.

Next, we know that if it is Windows and we want to compile STM32 or other bare-metal systems, we will likely use the following compiler:

arm-gnu-toolchain-14.2.rel1-mingw-w64-x86_64-arm-none-eabi.zip

If encountering large Linux embedded systems, we will likely use the following:

arm-gnu-toolchain-14.2.rel1-mingw-w64-x86_64-arm-none-linux-gnueabihf.zip
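Each archive name packs three pieces of information: the release version, the host, and the target. The following minimal sketch takes such a name apart. The helper name parse_archive_name is made up for this example, the target list is limited to the four targets discussed above, and the .tar.xz extension is an assumption about the non-Windows packages.

```python
from typing import NamedTuple

PREFIX = "arm-gnu-toolchain-"
EXTENSIONS = (".zip", ".tar.xz")  # .tar.xz is assumed for Linux/macOS
KNOWN_TARGETS = (
    "arm-none-linux-gnueabihf",
    "arm-none-eabi",
    "aarch64-none-linux-gnu",
    "aarch64-none-elf",
)


class Archive(NamedTuple):
    version: str
    host: str
    target: str


def parse_archive_name(filename: str) -> Archive:
    """Split an archive name into version, host, and target."""
    stem = filename
    for ext in EXTENSIONS:
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break
    if not stem.startswith(PREFIX):
        raise ValueError(f"unexpected archive name: {filename!r}")
    stem = stem[len(PREFIX):]
    for target in KNOWN_TARGETS:
        if stem.endswith("-" + target):
            rest = stem[: -len(target) - 1]
            # The version has no '-', so the first '-' ends it.
            version, _, host = rest.partition("-")
            return Archive(version, host, target)
    raise ValueError(f"unknown target in: {filename!r}")


if __name__ == "__main__":
    name = "arm-gnu-toolchain-14.2.rel1-mingw-w64-x86_64-arm-none-eabi.zip"
    print(parse_archive_name(name))
```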

The mingw part of the name indicates that the toolchain was built with MinGW and therefore runs on Windows.

MinGW-w64 is an open-source compiler toolchain used for developing native C/C++ programs on the Windows platform. It is an extension of the MinGW (Minimalist GNU for Windows) project, providing support for both 64-bit and 32-bit programs.

macOS packages are likewise split into darwin-x86_64 (Intel CPUs) and darwin-arm64 (ARM CPUs such as the M1).
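To find out which host package matches the computer you are sitting at, Python's platform module reports the operating system and CPU. The sketch below maps those values to the host part of the file name. The function name host_tag is hypothetical, and the plain x86_64 and aarch64 tags for Linux hosts are an assumption, since the article only shows the Windows and macOS tags.

```python
import platform


def host_tag(system: str, machine: str) -> str:
    """Map platform.system()/platform.machine() values to a host tag."""
    machine = machine.lower()
    if machine in ("x86_64", "amd64"):
        cpu = "x86_64"
    elif machine in ("aarch64", "arm64"):
        cpu = "aarch64"
    else:
        raise ValueError(f"unsupported CPU: {machine!r}")

    if system == "Windows":
        if cpu != "x86_64":
            raise ValueError("only the x86_64 Windows package is covered here")
        return "mingw-w64-x86_64"
    if system == "Darwin":
        return "darwin-x86_64" if cpu == "x86_64" else "darwin-arm64"
    if system == "Linux":
        return cpu  # assumed tag for Linux hosts
    raise ValueError(f"unsupported OS: {system!r}")


if __name__ == "__main__":
    print(host_tag(platform.system(), platform.machine()))
```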

GNU

Additionally, for the Linux platform, here are some technical terms explained:

arm-none-linux-gnueabihf: Uses hardware floating point (Hard Float), with default floating-point operations using floating-point registers (-mfloat-abi=hard), resulting in better performance. This means that during compilation, floating-point operations will be converted to use the floating-point registers in the CPU.

aarch64-none-linux-gnu: By default, it also supports hardware floating point, but usually does not need to explicitly specify the floating-point ABI, as the AArch64 architecture itself supports floating-point operations.

The hf in gnueabihf indicates hardware floating point, which generally means the target has an FPU; 32-bit ARM Linux toolchains usually default to using the hardware floating-point registers.

What does GNU mean:

GNU stands for: GNU’s Not Unix. GNU provides a series of tools for software development and compilation, including GCC (GNU Compiler Collection), GDB (GNU Debugger), Binutils (Binary Utilities), etc.

Linux is usually followed by gnu, as in linux-gnu. My understanding is that this marks a Linux target with the more complete GNU environment, in particular glibc as the C library.

You can see that the toolchain below also has ‘gnu’ in its name, indicating that it bundles GNU tools such as gcc and gdb.

arm-gnu-toolchain-14.2.rel1-mingw-w64-x86_64-arm-none-eabi.zip

EABI

EABI (Embedded Application Binary Interface) is a standard developed by ARM and its partners. It defines how compilers, assemblers, linkers, and other tools generate object files and executable files on ARM architecture to ensure interoperability of code generated by different compilers on ARM systems.

RISCV

The official toolchain for RISCV is as follows:

https://github.com/riscv-collab/riscv-gnu-toolchain

From the repository, we can see the following sub-repositories:

binutils

https://sourceware.org/git/binutils-gdb.git

GCC:

https://gcc.gnu.org/git/gcc.git

glibc:

https://sourceware.org/git/glibc.git

gdb:

https://sourceware.org/git/binutils-gdb.git

From the website https://toolchains.bootlin.com/

We can roughly understand that a toolchain generally requires the following components:

binutils        2.41
gcc             13.3.0
gdb             14.2
glibc           2.39-74-g198632...
linux-headers   4.19.315

Next, let’s discuss the basic content of these components.

  • binutils: Tools other than gcc, such as addr2line, ar, objdump, and objcopy, are generated from this repository:
    • addr2line: Convert addresses to file and line.
    • ar: Create, modify, and extract from archives.
    • c++filt: Filter to demangle encoded C++ symbols.
    • cxxfilt: MS-DOS name for c++filt.
    • dlltool: Create files needed to build and use DLLs.
    • nm: List symbols from object files.
    • objcopy: Copy and translate object files.
    • objdump: Display information from object files.
    • ranlib: Generate index to archive contents.
    • readelf: Display the contents of ELF format files.
    • size: List section sizes and total size.
    • strings: List printable strings from files.
    • strip: Discard symbols.
    • elfedit: Update ELF header and property of ELF files.
    • windmc: Generator for Windows message resources.
    • windres: Manipulate Windows resources.
  • gdb: gdb lives in the same repository as binutils (binutils-gdb), in a separate folder alongside the other tools.
  • gcc: Discussed above; this is the source that produces the gcc executable (gcc.exe on Windows) in bin, which compiles source files into .o files. Besides GCC, there is also LLVM.
  • glibc: This is the libc part; on Linux, glibc is basically what is used as libc.
  • Reference for what a libc contains: https://www.rt-thread.org/document/site/#/rt-thread-version/rt-thread-standard/programming-manual/libc/introduction

    Basically, it includes header files and implementations for common library functions like memcpy, strcpy, etc. The header files are generally similar, but the implementations vary among the following mainstream options:

    • Newlib: https://sourceware.org/git/newlib-cygwin.git This is the default libc for ARM embedded 32-bit systems.
    • Picolibc: https://github.com/picolibc/picolibc This is also a small libc suitable for embedded STM32, promoted and used by Zephyr.
    • glibc: https://sourceware.org/git/glibc.git This is the official GNU libc used on Linux, comprehensive and widely used on Linux platforms.
    • uClibc-ng: https://github.com/wbx-github/uclibc-ng.git This is compatible with glibc and supports multiple architectures; it is licensed under LGPLv2.1.
    • musl: https://git.musl-libc.org/cgit/musl This is highly standardized, lightweight, and efficient, with a friendly license (MIT).
  • linux-headers: This will not be introduced, as it depends on the version of the Linux system, usually some header files for Linux systems.

From the following path, we can see that ELF uses newlib, glibc is glibc, and musl is musl libc.
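The same observation can be written as a rule of thumb on the last field of the target name. This is only a heuristic sketch with a made-up function name, guess_libc. A bare-metal toolchain could just as well ship Picolibc instead of newlib, and the musl and uclibc spellings are taken from commonly seen target names rather than from this article.

```python
def guess_libc(target: str) -> str:
    """Guess the C library from the last field of a target name."""
    env = target.rsplit("-", 1)[-1]
    # Check uclibc first: names like 'uclibcgnueabihf' also contain 'gnu'.
    if env.startswith("uclibc"):
        return "uClibc-ng"
    if env.startswith("musl"):
        return "musl"
    if env.startswith("gnu"):
        return "glibc"
    if env in ("elf", "eabi", "eabihf"):
        return "newlib"  # typical default for bare-metal toolchains
    return "unknown"


if __name__ == "__main__":
    for name in ("arm-none-eabi", "aarch64-none-elf",
                 "arm-none-linux-gnueabihf", "aarch64-none-linux-gnu"):
        print(name, "->", guess_libc(name))
```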

LLVM

LLVM is a parallel system to GCC; you can think of it as a competitor to GCC. The source code is as follows:

https://github.com/llvm/llvm-project/tree/main

LLVM has been adopted by major companies such as Apple, Microsoft, Google, and Facebook.

LLVM is a framework for building compilers, written in C++. It is designed to optimize programs written in any programming language at compile time, link time, run time, and idle time, while staying open to developers and compatible with existing scripts.

Builds based on LLVM mainly use clang as the compiler, not gcc.

Compilation and assembly are both done with clang, while the remaining tools carry the prefix "llvm-".

PREFIX = 'llvm-'
CC = 'clang'
AS = 'clang'
AR = PREFIX + 'ar'
CXX = 'clang++'
LINK = 'clang'
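To make the difference from a GCC setup visible, the sketch below returns both variants from one function. The LLVM values are the ones shown above; the GCC values (prefix plus gcc, g++, ar) follow the usual cross-toolchain naming and are an assumption for comparison, as is the function name tool_settings.

```python
def tool_settings(kind: str, gcc_prefix: str = "arm-none-eabi-") -> dict:
    """Return tool names for an 'llvm' or a 'gcc' based build."""
    if kind == "llvm":
        prefix = "llvm-"
        return {
            "PREFIX": prefix,
            "CC": "clang",
            "AS": "clang",
            "AR": prefix + "ar",
            "CXX": "clang++",
            "LINK": "clang",
        }
    if kind == "gcc":
        # Assumed conventional layout: every tool carries the target prefix.
        prefix = gcc_prefix
        return {
            "PREFIX": prefix,
            "CC": prefix + "gcc",
            "AS": prefix + "gcc",
            "AR": prefix + "ar",
            "CXX": prefix + "g++",
            "LINK": prefix + "gcc",
        }
    raise ValueError(f"unknown toolchain kind: {kind!r}")


if __name__ == "__main__":
    for kind in ("gcc", "llvm"):
        print(kind, tool_settings(kind))
```

Note the design difference: with GCC the target is baked into every tool name, whereas clang is a single binary and only the helper tools such as ar carry the llvm- prefix.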

Used this way, clang takes the place of gcc in the build.

LIBC

Below are some common libc usage scenarios and characteristics.

glibc

  • Characteristics: Comprehensive functionality, includes a large number of standard and extended features; good compatibility, adheres to POSIX, ISO C standards; supports multiple hardware architectures and operating systems; well-optimized performance, suitable for general computing scenarios.
  • Usage Scenarios: Widely used in various mainstream Linux distributions, such as desktops, servers, etc. When rich standard library functions and good compatibility are needed, glibc is a common choice.

musl

  • Characteristics: Lightweight, small code size, low resource usage; strictly adheres to POSIX and ISO C standards, focusing on code correctness and simplicity; license-friendly, using the MIT license, suitable for commercial applications.
  • Usage Scenarios: Suitable for embedded systems with high standardization requirements and limited resources, such as IoT devices, small embedded devices, etc. Also suitable for scenarios that require concise and efficient code, with strict requirements for startup speed and execution efficiency.

uClibc-ng

  • Characteristics: Strong portability, supports various architectures including processors without memory management units; highly configurable, can trim functional modules according to needs; good compatibility with glibc, most applications that support glibc can run on uClibc-ng with just recompilation.
  • Usage Scenarios: Commonly used in resource-constrained embedded Linux systems, such as IoT devices, routers, smart appliances, etc., effectively reducing system footprint, optimizing startup time and power consumption.

dietlibc

  • Characteristics: Extremely small size, the minimum program size for static linking can reach very low levels; focuses on performance optimization, achieving high efficiency in certain operations; mainly focuses on basic C library functions, with fewer complex extended features.
  • Usage Scenarios: Suitable for embedded systems or specific optimization scenarios that require extremely small program sizes, such as small embedded devices, applications with extreme requirements for startup speed and program size.

newlib

  • Characteristics: Relatively comprehensive functionality aimed at embedded systems; it relies on a small set of system-call stubs that the board support code provides, which gives it good portability across embedded architectures; stable and reliable.
  • Usage Scenarios: Commonly used in the development of embedded systems, such as bare-metal and RTOS firmware for small routers, industrial control devices, etc.

Picolibc

  • Characteristics: Designed for small embedded systems, lightweight and efficient; developed based on Newlib and AVR Libc, inheriting its advantages and optimizing them; supports various architectures, including ARM, RISC-V, MIPS, etc.
  • Usage Scenarios: Suitable for resource-limited embedded devices, such as IoT devices, smart home products, sensor nodes, microcontroller applications, etc.

Cosmopolitan Libc

  • Characteristics: Lightweight, highly configurable; achieves cross-platform compatibility through modular and hierarchical design, allowing seamless switching across different operating systems and hardware platforms.
  • Usage Scenarios: Suitable for C language applications that require cross-platform deployment, such as embedded system development, cloud computing, and distributed systems.

Companies Producing and Releasing Toolchains

  • Arm: https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads
  • Sourcery (Siemens): http://www.codesourcery.com/ This site no longer exists.
  • Linaro: https://releases.linaro.org/components/toolchain/binaries/ No longer updated.


Conclusion

This article has covered some basic parameters of toolchains; there are certainly gaps. If you have anything to add, feel free to suggest it.

What other questions do you have regarding toolchains? Please feel free to ask, and I can answer them one by one.
