Introduction
I have worked in embedded software development for nearly five years, mainly on microcontrollers built around ARM Cortex-M cores. Thanks to C compilers, I was able to develop all that time without touching assembly language, but I felt I had missed some of the scenery and never experienced the beauty of compilers and CPUs. So I decided to spend my weekends exploring the ARM CPU architecture and the mysteries of C compilers by gathering information, running experiments, and drawing conclusions (partly because I do not agree with how microcomputer principles courses are taught in schools).
Exploring ARM | 1. Getting to Know the ARM Cortex-M Family
1. ARM Instruction Set Architecture
The ARM instruction set architecture (ISA) supports three instruction sets: A64, A32, and T32.
- The A64 instruction set is used in Armv8-A to support the 64-bit architecture.
- The A32 instruction set is known as the ARM instruction set in the Armv6 and Armv7 architectures.
- The T32 instruction set is known as the Thumb instruction set in the Armv6 and Armv7 architectures.

A32 Instruction Set
The A32 instruction set, known before Armv8 as the ARM instruction set, has a fixed instruction length of 32 bits, and its instructions are aligned on 4-byte boundaries.
T32 Instruction Set
The T32 instruction set was known before Armv8 as the Thumb instruction set.
Initially, every instruction in the ARM instruction set had a fixed length of 32 bits. To improve code density, the Thumb instruction set was designed as a 16-bit instruction set, so developers could combine the ARM and Thumb instruction sets to reduce code size. However, these are two separate instruction sets, and the processor has to switch back and forth between ARM state and Thumb state, which is cumbersome.
Later, with the introduction of Thumb-2 technology, most of the features of the ARM instruction set were incorporated into Thumb as a supplement to the original Thumb instructions. This turned Thumb into a mixed-length instruction set of 16-bit and 32-bit instructions, known as the Thumb-2 instruction set.
With Thumb-2, the compiler can balance performance and code size within a single instruction set. The result is excellent code density, which reduces the memory a system needs and therefore its cost.
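To make the mixed-length idea concrete, here is a minimal sketch in C of how a tool could tell 16-bit and 32-bit Thumb-2 encodings apart. It relies on the architectural rule that a halfword whose bits [15:11] are 0b11101, 0b11110, or 0b11111 is the first half of a 32-bit instruction. The function names `thumb_insn_length` and `thumb_count_insns` are made up for this illustration and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes (2 or 4) of the Thumb instruction that starts with hw1.
 * Bits [15:11] equal to 0b11101, 0b11110 or 0b11111 mark the first halfword
 * of a 32-bit encoding; any other value is a complete 16-bit instruction. */
static unsigned thumb_insn_length(uint16_t hw1)
{
    unsigned top5 = (unsigned)hw1 >> 11;   /* value in the range 0..31 */
    return (top5 >= 0x1Du) ? 4u : 2u;
}

/* Counts the complete instructions in a stream of Thumb halfwords.
 * A 32-bit instruction whose second halfword is missing is not counted. */
static size_t thumb_count_insns(const uint16_t *code, size_t halfwords)
{
    assert(code != NULL || halfwords == 0);

    size_t i = 0;
    size_t count = 0;
    while (i < halfwords) {
        size_t step = thumb_insn_length(code[i]) / 2u;  /* 1 or 2 halfwords */
        if (step > halfwords - i) {
            break;                                      /* truncated stream */
        }
        i += step;
        count++;
    }
    return count;
}
```

For example, the halfword `0x4770` (`BX LR`) is a 16-bit instruction, while `0xF3AF` is the first half of the 32-bit `NOP.W` encoding `0xF3AF 0x8000`. No state switch is involved: both sizes live in the same instruction stream.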
2. ARM Architecture Extensions
ARM also provides a series of architecture extensions that add new functionality to ARM processors and meet the needs of the next generation of designs.
DSP Extensions
The DSP extension for Cortex-M gives ARM Cortex-M processors high-performance signal processing capabilities for scenarios such as sound, audio, sensor hubs, and machine learning, so signal processing tasks can be completed without an additional DSP device.
Processors with the DSP extension include the Cortex-M4, Cortex-M7, Cortex-M33, Cortex-M35P, and Cortex-M55.
The DSP extension instructions are added on top of the Thumb instruction set and the optional floating-point unit. They keep the ease of use of the original Cortex-M programming model while adding digital signal processing capabilities to Cortex-M processors.
SIMD Instructions
Cortex-M processors with the DSP extension also provide SIMD instructions that operate on 8-bit or 16-bit integers.
SIMD stands for single instruction, multiple data. The registers are still 32 bits wide, so a single SIMD instruction can operate on two 16-bit values or four 8-bit values at the same time. Instructions that operate on 8-bit or 16-bit data are very useful for processing video or audio, because such data does not need the full 32-bit width, and SIMD instructions make it possible to process it in parallel.
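The packing described above can be modelled in portable C. The sketch below is not how the hardware is programmed; it only emulates what a lane-wise add does to a 32-bit register that holds two 16-bit values or four 8-bit values. The helper names `add16x2` and `add8x4` are hypothetical.

```c
#include <stdint.h>

/* Adds two 32-bit words as two independent 16-bit lanes. Each lane wraps
 * around on its own, so a carry out of the low lane never reaches the
 * high lane. */
static uint32_t add16x2(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0x0000FFFFu;          /* low lane, carry dropped */
    uint32_t hi = ((a >> 16) + (b >> 16)) << 16;  /* high lane, carry shifted out */
    return hi | lo;
}

/* Adds two 32-bit words as four independent 8-bit lanes. The low 7 bits of
 * every byte are added first (this cannot overflow into the next byte),
 * then the top bit of each byte is folded in with XOR. */
static uint32_t add8x4(uint32_t a, uint32_t b)
{
    uint32_t low7 = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    return low7 ^ ((a ^ b) & 0x80808080u);
}
```

On a core with the DSP extension, one instruction does this work; CMSIS-Core exposes intrinsics such as `__SADD16` and `__UADD8` for that purpose. The emulation above takes several ordinary operations per call, which is the gap the SIMD instructions close.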
Floating Point Units
ARM floating point unit technology provides high-performance, efficient hardware support for half precision, single precision, and double precision floating point operations.
The ARM floating point unit is backed by complete software library support and is fully compliant with the IEEE 754 standard, which makes it particularly suitable for applications that require high-precision floating point calculations.
Application scenarios for floating point data types include:
- Automotive control programs
- 3D graphics
- Industrial control systems
- Motion control systems
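As a small illustration of what IEEE 754 compliance means for the single precision format, the sketch below splits a `float` into its sign, exponent, and fraction fields. It assumes that `float` is the 32-bit IEEE 754 binary32 type, which holds for ARM toolchains and for common desktop compilers; the type `f32_fields` and the function `f32_decode` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is the 32-bit IEEE 754 binary32 format. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

typedef struct {
    uint32_t sign;      /* 1 bit: 0 for positive, 1 for negative */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127     */
    uint32_t fraction;  /* 23 bits after the implicit leading 1  */
} f32_fields;

/* Splits a single precision value into its three bit fields. */
static f32_fields f32_decode(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* well-defined way to read the bits */

    f32_fields f;
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x007FFFFFu;
    return f;
}
```

For `1.0f` the fields are sign 0, exponent 127, and fraction 0. For `-2.5f`, which is -1.25 × 2^1, they are sign 1, exponent 128, and fraction 0x200000.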
Helium
ARM Helium technology is the M-Profile Vector Extension (MVE) for the ARM Cortex-M processor series. It is an extension of the Armv8.1-M architecture and provides significant performance improvements for machine learning applications in small embedded devices.
The Cortex-M55 is the first processor with this extension.
The Helium technology adds more than 150 new scalar and vector instructions, enabling efficient computation of 8-bit, 16-bit, and 32-bit fixed-point data. The 16-bit and 32-bit fixed-point formats are widely used in traditional signal processing, such as audio processing, while the 8-bit fixed-point format is important in machine learning processing, such as neural network computing and image processing.
Helium also supports floating point data types, including single precision (32 bits) and half precision (16 bits).
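For readers who have not worked with fixed-point data, here is a minimal scalar sketch of a 16-bit fixed-point multiply. It assumes the Q15 format (1 sign bit and 15 fraction bits, covering -1.0 up to just below 1.0), which is one common choice for audio work; the section above does not say which formats are meant. The typedef `q15_t` and the function `q15_mul` are defined locally for this illustration. Helium's contribution is to perform operations of this kind on several lanes at once.

```c
#include <stdint.h>

/* Local typedef for this sketch: a Q15 value is the integer n that
 * represents the real number n / 32768. */
typedef int16_t q15_t;

/* Multiplies two Q15 values and saturates the result.
 * Assumes that >> on a negative int32_t is an arithmetic shift, which is
 * implementation-defined in C but is what GCC and Clang provide. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;  /* Q30 intermediate result  */
    int32_t result = product >> 15;             /* scale back down to Q15   */
    if (result > INT16_MAX) {
        result = INT16_MAX;                     /* only (-1.0) * (-1.0) hits this */
    }
    return (q15_t)result;
}
```

With this encoding, 0.5 is stored as 0x4000, and `q15_mul(0x4000, 0x4000)` yields 0x2000, which is 0.25. The one overflow case, -1.0 times -1.0, saturates to the largest representable value instead of wrapping around to -1.0.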
3. What Instruction Set Does Cortex-M Use?
Having discussed the basic instruction set and extended instruction set of ARM, it is time to answer the question we raised: What instruction set does ARM Cortex-M use?
The entire ARM Cortex-M series supports only one instruction set, the Thumb/Thumb-2 instruction set, that is, T32.
The members of the Cortex-M family differ in how much of the Thumb/Thumb-2 instruction set they support; most processors implement only a subset of it.
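One way to see which subset and which extensions a given build targets is to look at the feature-test macros that the Arm C Language Extensions (ACLE) define in the compiler. The sketch below is an illustration, not an official detection routine: the helper names `thumb_isa_name`, `target_extensions`, and `detect_extensions` are hypothetical, and on a non-ARM host every branch falls through to the defaults.

```c
#include <stdbool.h>

/* Name of the Thumb level the compiler targets. ACLE defines
 * __ARM_ARCH_ISA_THUMB as 1 for the original Thumb instruction set
 * (including Armv6-M) and as 2 when Thumb-2 is available. */
static const char *thumb_isa_name(void)
{
#if defined(__ARM_ARCH_ISA_THUMB) && (__ARM_ARCH_ISA_THUMB >= 2)
    return "Thumb-2";
#elif defined(__ARM_ARCH_ISA_THUMB)
    return "Thumb";
#else
    return "no Thumb support reported";
#endif
}

/* Hypothetical helper type: which optional extensions the target has. */
typedef struct {
    bool dsp;     /* DSP extension            */
    bool fpu;     /* hardware floating point  */
    bool helium;  /* M-Profile Vector Extension */
} target_extensions;

static target_extensions detect_extensions(void)
{
    target_extensions ext = { false, false, false };
#if defined(__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP != 0)
    ext.dsp = true;
#endif
#if defined(__ARM_FP) && (__ARM_FP != 0)
    ext.fpu = true;
#endif
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE != 0)
    ext.helium = true;
#endif
    return ext;
}
```

Built with a GCC option such as `-mcpu=cortex-m4`, this would be expected to report Thumb-2 with the DSP extension, while a Cortex-M0 build would be expected to report plain Thumb with no extensions. Checking the macros on your own toolchain is the reliable way to confirm this.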
The Thumb instructions supported by the Cortex-M0, Cortex-M3, Cortex-M4, and Cortex-M7 are shown in the figure. The Thumb instructions supported by the Cortex-M23 and Cortex-M33 cores are shown in the following figure, where the yellow part marks the instructions newly added in the Armv8-M series:

Thus, the second stop of the ARM exploration journey comes to an end! See you at the next stop!