Understanding the Instruction Set of Cortex-M Series Processors

In most cases, application code can be written in C or other high-level languages. However, a basic understanding of the instruction set supported by Cortex-M processors helps developers choose the appropriate Cortex-M processor for a specific application. The instruction set architecture (ISA) is part of the processor architecture, and Cortex-M processors fall under several architecture specifications. In this issue, Joseph walks us through the instruction sets of the Cortex-M processors.

Cortex-M Processor ARM Architecture Specifications

All Cortex-M processors support the Thumb instruction set. With the Thumb-2 extensions, the full Thumb instruction set is quite large, and each Cortex-M processor supports a different subset of it, as shown in the figure below.

Instruction Set of Cortex-M Processors

Cortex-M0/M0+/M1 Instruction Set

Cortex-M0/M0+/M1 processors are based on the ARMv6-M architecture. This is a small instruction set of only 56 instructions, most of them 16 bits wide, which occupies only a small part of the instruction set figure above. The registers and the data these processors operate on are nevertheless 32 bits wide. For most simple I/O control tasks and general data processing, these instructions are sufficient. Such a small instruction set can be implemented with very few logic gates: the minimum configuration of Cortex-M0 and Cortex-M0+ is only 12K gates. The trade-off is that many of the instructions cannot use the high registers (R8 to R12), and the ability to generate immediate values is limited. This reflects a balance between ultra-low power consumption and performance needs.

Cortex-M3 Instruction Set

Cortex-M3 processors are based on the ARMv7-M architecture and support a richer instruction set, including many 32-bit instructions that can efficiently utilize high registers. Additionally, M3 also supports:

Table branch instructions (TBB and TBH) and conditional execution (using IT instructions)
Hardware division instructions
Multiply-Accumulate instructions (MAC)
Various bit manipulation instructions
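To make two of the items above more concrete, here is a minimal sketch in portable C of the work that software has to do when a hardware divide or a bit-field instruction is not available. The function names (soft_udiv, bitfield_extract, bitfield_insert) are made up for this illustration. They are reference models that loosely mirror what instructions such as UDIV, UBFX and BFI compute, not the way the processor or any particular runtime library implements them.

```c
#include <stdint.h>

/* Shift-and-subtract unsigned division, one quotient bit per iteration.
 * This is roughly the kind of loop a software divide routine performs
 * when there is no hardware divide instruction.
 * Assumption: d != 0 (division by zero is not handled in this sketch). */
static uint32_t soft_udiv(uint32_t n, uint32_t d)
{
    uint32_t q = 0;
    uint32_t r = 0;

    for (int i = 31; i >= 0; --i) {
        r = (r << 1) | ((n >> i) & 1u);   /* bring down the next bit of n */
        if (r >= d) {
            r -= d;
            q |= (1u << i);
        }
    }
    return q;
}

/* Build a mask of `width` low bits; width is expected to be 1..32. */
static uint32_t low_mask(unsigned width)
{
    return (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* Extract `width` bits starting at bit `lsb`, zero-extended
 * (similar in effect to an unsigned bit-field extract). */
static uint32_t bitfield_extract(uint32_t x, unsigned lsb, unsigned width)
{
    return (x >> lsb) & low_mask(width);
}

/* Replace `width` bits of dst starting at bit `lsb` with the low bits
 * of src (similar in effect to a bit-field insert). */
static uint32_t bitfield_insert(uint32_t dst, uint32_t src,
                                unsigned lsb, unsigned width)
{
    uint32_t mask = low_mask(width) << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}
```

On a processor with hardware division and bit-field instructions, a compiler can typically replace each of these helpers with one or two instructions. Without them, the loop and the mask arithmetic run as ordinary code.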

A richer instruction set improves performance in several ways. For example, 32-bit Thumb instructions support a larger range of immediate values, branch offsets, and address offsets for memory accesses. They also support basic DSP operations (e.g., several MAC instructions, which take multiple clock cycles to execute, along with saturation instructions). Finally, these 32-bit instructions allow a barrel shift to be combined with a data processing operation in a single instruction.
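The basic DSP operations mentioned above can be sketched in plain C. The example below shows what "saturation" and "multiply-accumulate" mean at the C level. The names saturate_signed and dot_product_i16 are hypothetical, and whether a given compiler maps this code to saturation or MAC instructions depends on the toolchain and its options.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp x to the range of a signed `bits`-bit integer (bits in 1..32).
 * A saturation instruction performs this kind of clamp in hardware;
 * here it is written out with comparisons. */
static int32_t saturate_signed(int32_t x, unsigned bits)
{
    const int32_t max = (int32_t)((1u << (bits - 1u)) - 1u);
    const int32_t min = -max - 1;

    if (x > max) return max;
    if (x < min) return min;
    return x;
}

/* Multiply-accumulate over two 16-bit arrays with a 64-bit accumulator.
 * Each loop iteration is one multiply plus one add, which is the
 * pattern a MAC instruction combines into a single operation. */
static int64_t dot_product_i16(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;

    for (size_t i = 0; i < n; ++i) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

For instance, saturate_signed(40000, 16) clamps to 32767 instead of wrapping around, which is usually the desired behaviour for audio or sensor samples.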

Supporting a richer instruction set comes at the cost of greater area and higher power consumption. In a typical microcontroller design, the gate count of Cortex-M3 is more than double that of Cortex-M0 or Cortex-M0+. However, the processor occupies only a small part of the silicon area of most modern microcontrollers, so the extra area and power consumption are often not significant.

Cortex-M4 Instruction Set

Cortex-M4 is similar to Cortex-M3 in many ways, including the pipeline and the programming model. Cortex-M4 supports all features of Cortex-M3 and adds various DSP-oriented instructions, such as SIMD instructions, saturation instructions, a series of single-cycle MAC instructions (Cortex-M3 supports only a limited number of MAC instructions, and they are multi-cycle), and optional single-precision floating-point instructions.

Cortex-M4’s SIMD operations can process two 16-bit values or four 8-bit values in parallel. For example, the QADD8 and QADD16 operations shown in the diagram below:

Examples of SIMD instructions: QADD8 and QADD16
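The lane-wise behaviour of QADD8 and QADD16 can also be written down as a small reference model in portable C. This is only a sketch for illustration: the function names (qadd_lanes, qadd8_model, qadd16_model) are invented here, and the code models the result of a signed saturating add on each 8-bit or 16-bit lane of a 32-bit word rather than how the hardware computes it.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of v (bits is 8 or 16 in this sketch). */
static int32_t sign_extend(uint32_t v, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1u);

    v &= (sign << 1) - 1u;                 /* keep only the lane bits */
    return (int32_t)(v ^ sign) - (int32_t)sign;
}

/* Signed saturating add applied independently to each lane of a and b.
 * lane_bits = 8 gives four lanes, lane_bits = 16 gives two lanes. */
static uint32_t qadd_lanes(uint32_t a, uint32_t b, unsigned lane_bits)
{
    const int32_t max = (int32_t)((1u << (lane_bits - 1u)) - 1u);
    const int32_t min = -max - 1;
    const uint32_t mask = (1u << lane_bits) - 1u;
    uint32_t result = 0;

    for (unsigned shift = 0; shift < 32u; shift += lane_bits) {
        int32_t sum = sign_extend(a >> shift, lane_bits)
                    + sign_extend(b >> shift, lane_bits);

        if (sum > max) sum = max;          /* saturate instead of wrapping */
        if (sum < min) sum = min;
        result |= ((uint32_t)sum & mask) << shift;
    }
    return result;
}

/* Four 8-bit lanes per 32-bit word. */
static uint32_t qadd8_model(uint32_t a, uint32_t b)
{
    return qadd_lanes(a, b, 8u);
}

/* Two 16-bit lanes per 32-bit word. */
static uint32_t qadd16_model(uint32_t a, uint32_t b)
{
    return qadd_lanes(a, b, 16u);
}
```

In this model, qadd16_model(0x7FFF0001, 0x00010001) yields 0x7FFF0002: the upper lane saturates at 0x7FFF while the lower lane adds normally. The loop handles the lanes one at a time, whereas the SIMD instruction produces all lanes in a single operation.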

In some DSP operations, SIMD speeds up computation on 16-bit and 8-bit data because the operations are processed in parallel. In general-purpose code, however, C compilers cannot fully exploit the SIMD capabilities, which is why the typical benchmark scores of Cortex-M3 and Cortex-M4 are similar. Even so, the internal data path of Cortex-M4 differs from that of Cortex-M3, and in some cases Cortex-M4 is faster (for example, a single-cycle MAC can write back to two registers in one cycle).

Cortex-M7 Instruction Set

The instruction set supported by Cortex-M7 is similar to that of Cortex-M4, with the following additions:

A floating-point architecture based on FPv5 rather than the FPv4 of Cortex-M4, which brings additional floating-point instructions
Optional double-precision floating-point instructions
Cache data preload instruction (PLD)

The pipeline of Cortex-M7 is very different from that of Cortex-M4: it is a 6-stage, dual-issue pipeline that delivers higher performance. Most software written for Cortex-M4 can run directly on Cortex-M7. However, to take full advantage of the pipeline differences, the software should be recompiled, and in many cases it needs minor updates to make use of new features such as the cache.

Cortex-M23 Instruction Set

The instruction set of Cortex-M23 is based on the ARMv8-M Baseline sub-specification, which is a superset of ARMv6-M. The extended instructions include:

Hardware division instructions
Compare-and-branch instructions and 32-bit branch instructions
Instructions supporting TrustZone security extensions
Exclusive access instructions (commonly used for semaphore operations)
16-bit immediate value generation instructions
Load-acquire and store-release instructions (supporting C11 atomics)

In some cases, these additional instructions improve processor performance, and they are useful in SoC designs that contain multiple processors (for example, exclusive accesses help with semaphore handling in multiprocessor systems).
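As a rough illustration of how these features surface in C11 code, the sketch below implements a small counting semaphore with <stdatomic.h>. The names demo_sem_count, demo_sem_try_take and demo_sem_give are hypothetical. How a compiler lowers the atomic operations (for example, to exclusive access and load-acquire/store-release instructions on a target that has them) depends on the toolchain and the target, so treat this as an assumption rather than a guarantee.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of free slots; initialised to 2 for this example. */
static atomic_int demo_sem_count = 2;

/* Try to take one slot without blocking.
 * The compare-exchange is a read-modify-write that must not be torn by
 * another processor, which is the job exclusive accesses do in hardware.
 * Acquire ordering on success keeps later accesses to the protected
 * resource from being reordered before the take. */
static bool demo_sem_try_take(atomic_int *count)
{
    int current = atomic_load_explicit(count, memory_order_relaxed);

    while (current > 0) {
        /* On failure, `current` is refreshed with the latest value. */
        if (atomic_compare_exchange_weak_explicit(
                count, &current, current - 1,
                memory_order_acquire, memory_order_relaxed)) {
            return true;
        }
    }
    return false;
}

/* Give one slot back. Release ordering makes earlier writes to the
 * protected resource visible before the slot is seen as free. */
static void demo_sem_give(atomic_int *count)
{
    atomic_fetch_add_explicit(count, 1, memory_order_release);
}
```

A caller would wrap access to a shared resource between a successful demo_sem_try_take(&demo_sem_count) and a matching demo_sem_give(&demo_sem_count).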

Cortex-M33 Instruction Set

Because the Cortex-M33 design is highly configurable, some instructions are also optional. For example:

DSP instructions (supported by Cortex-M4 and Cortex-M7) are optional.
Single-precision floating-point instructions are optional. They are based on FPv5, so there are a few more of them than on Cortex-M4.

Cortex-M33 also supports new instructions introduced by ARMv8-M Mainline:

Instructions supporting TrustZone security extensions
Load-acquire and store-release instructions (supporting C11 atomics)

Summary of Instruction Set Features Comparison

The ARMv6-M, ARMv7-M, and ARMv8-M architectures have many instruction set features, and it is difficult to cover every detail here. The table below summarizes the key differences.

Summary of Instruction Set Features

The most important feature of the Cortex-M instruction sets is upward compatibility. The instructions of Cortex-M3 are a superset of those of Cortex-M0/M0+/M1. In theory, therefore, if the memory map is the same, a binary built for Cortex-M0/M0+/M1 can run directly on Cortex-M3. The same principle applies to Cortex-M4/M7 and other Cortex-M processors: code that uses the instructions supported by Cortex-M0/M0+/M1/M3 can also run on Cortex-M4/M7.

Although Cortex-M0/M0+/M1/M3/M23 processors do not have a floating-point unit configuration option, they can still perform floating-point operations in software. The same applies to products based on Cortex-M4/M7/M33 that are configured without a floating-point unit. In this case, when floating-point numbers are used in the program, the compiler toolchain inserts the necessary runtime library routines at the linking stage. Software floating-point operations take longer to run and slightly increase code size. However, if floating-point operations are used infrequently, this approach is adequate.
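The point that the source code stays the same can be shown with a very small example. The function below (scale_reading is a made-up name) uses float in ordinary C. The assumption here is a typical ARM EABI toolchain: when built for a target without a floating-point unit, the multiply and add are usually turned into calls to runtime helper routines such as __aeabi_fmul and __aeabi_fadd, whereas with a floating-point unit they become floating-point instructions. The exact behaviour depends on the toolchain and its floating-point options.

```c
/* Convert a raw sensor value to engineering units.
 * The C source is identical with or without an FPU; only the code the
 * toolchain generates (library calls versus FPU instructions) differs. */
static float scale_reading(float raw, float gain, float offset)
{
    return raw * gain + offset;
}
```

If a function like this is called only occasionally, for example once per slow sensor sample, the extra cycles spent in the software routines are often acceptable.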
