Comprehensive Guide to ARM Cortex-M3

Chapter 1 Introduction

1. Introduction to ARM Cortex-M3 Processor

The Cortex-M3 (CM3) processor core is the central processing unit (CPU) of a microcontroller. A complete CM3-based MCU needs many other components. Once a chip manufacturer has licensed the CM3 core, it can build the core into its own silicon design and add memory, peripherals, I/O, and other functional blocks. Different manufacturers therefore produce microcontrollers with different configurations, including different memory sizes, memory types, and peripherals. This book focuses on the processor core itself. To understand a specific chip, consult the documentation provided by its manufacturer.

2. Various ARM Architecture Versions

Starting with ARMv7, the architecture splits from a single design into three profiles:

Profile A: Designed for high-performance “open application platforms”, approaching the capabilities of general-purpose computers.

Profile R: Used in high-end embedded systems, especially those with real-time requirements, where both speed and timely response matter.

Profile M: Used in deeply embedded, microcontroller-style systems.

Let’s take a closer look at these three profiles:

Profile A (ARMv7-A): An “application processor” for running complex applications. It supports large embedded operating systems (not necessarily real-time), such as Symbian (used in Nokia smartphones), Linux, Microsoft’s Windows CE, and the smartphone operating system Windows Mobile. These applications require high processing performance and a complete, robust virtual memory mechanism backed by a hardware MMU. They typically also include Java support and sometimes require a secure program execution environment (for e-commerce, for example). Typical products include high-end mobile phones and handheld devices, electronic wallets, and financial transaction processors.

Profile R (ARMv7-R): A high-performance, hard real-time processor targeted at the high-end real-time market. Applications such as luxury car components, large generator controllers, and robotic arm controllers need processors that are not only powerful but also extremely reliable and very fast to respond to events.

Profile M (ARMv7-M): Aimed at applications traditionally served by older-generation microcontrollers. In these applications, especially real-time control systems, low cost, low power consumption, fast interrupt response, and high processing efficiency are crucial. The Cortex series is the first to implement the v7 architecture, and the Cortex-M3 is designed according to Profile M.

3. Development of Instruction Sets

For historical reasons (starting with the ARM7TDMI), ARM processors have long supported two relatively independent instruction sets:

32-bit ARM instruction set. Corresponding processor state: ARM state

16-bit Thumb instruction set. Corresponding processor state: Thumb state

These two instruction sets correspond to two processor execution states, and the processor can switch between them dynamically while a program runs. Functionally, the Thumb instruction set is a subset of the ARM instruction set, but it offers higher code density and so reduces the size of the object code.
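As a rough illustration of how a state switch is requested on classic ARM cores: an interworking branch such as BX takes the target state from bit 0 of the branch address (1 selects Thumb state, 0 selects ARM state). The sketch below models only that rule; the type and function names are invented for this example and do not come from any ARM header.

```c
#include <stdint.h>

/* Hypothetical names for this sketch. */
typedef enum { STATE_ARM = 0, STATE_THUMB = 1 } cpu_state_t;

/* State selected by an interworking branch (e.g. BX Rm): bit 0 of the target. */
cpu_state_t state_after_bx(uint32_t target)
{
    return (target & 1u) ? STATE_THUMB : STATE_ARM;
}

/* Address execution continues from: bit 0 is a state flag, not an address bit. */
uint32_t fetch_address_after_bx(uint32_t target)
{
    return target & ~1u;
}
```

For example, a branch to 0x08000101 would continue at 0x08000100 in Thumb state under this model.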

Thumb-2, introduced in the summer of 2003, is a superset of Thumb that supports both 16-bit and 32-bit instructions.

4. Thumb-2 Instruction Set Architecture (ISA)

Chapter 2 Overview of Cortex-M3

1. Introduction

The CM3 is a 32-bit processor core. The internal data path is 32 bits, the registers are 32 bits, and the memory interface is also 32 bits.

The CM3 adopts a Harvard architecture, with separate instruction and data buses. However, the two buses share the same memory space (a unified memory system). In other words, having two buses does not double the addressable space to 8 GB; it remains 4 GB.

The CM3 provides an optional MPU, and an external cache can be used if needed.

The CM3 supports both big-endian and little-endian modes.
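To make the difference between the two modes concrete, the following sketch shows which byte of a 32-bit word ends up at each byte address in either byte order. The function name and parameters are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Byte found at address (base + offset) when `word` is stored at `base`.
   offset must be 0..3; a nonzero big_endian selects big-endian order. */
uint8_t byte_at(uint32_t word, unsigned offset, int big_endian)
{
    assert(offset < 4u);
    /* Little-endian puts the least significant byte at the lowest address;
       big-endian puts the most significant byte there. */
    unsigned shift = big_endian ? (3u - offset) * 8u : offset * 8u;
    return (uint8_t)(word >> shift);
}
```

With the word 0x12345678, the lowest address holds 0x78 in little-endian mode and 0x12 in big-endian mode.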

The CM3 also comes with many debugging components to support debugging operations at the hardware level, such as instruction breakpoints and data watchpoints. Additionally, to support more advanced debugging, there are other optional components, including instruction tracing and various types of debugging interfaces.

2. Register Group

The CM3 processor has a register group consisting of R0 to R15. Of these, R13 serves as the stack pointer (SP). There are actually two SPs, but only one is visible at any time; this arrangement is known as a “banked” register.

R0-R12: General Purpose Registers

R0-R12 are all 32-bit general-purpose registers used for data operations. However, note that most 16-bit Thumb instructions can only access R0-R7, while 32-bit Thumb-2 instructions can access all registers.
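The R0-R7 restriction follows from the encoding: a 16-bit instruction typically has only 3 bits per register field. As a sketch, the function below encodes the 16-bit form of ADDS Rd, Rn, Rm, whose layout is 0001100, then Rm, Rn, and Rd in three bits each. The function name and the -1 error value are choices made for this example.

```c
#include <stdint.h>

/* Encode the 16-bit Thumb instruction ADDS Rd, Rn, Rm.
   Bit layout: 0001100 | Rm[2:0] | Rn[2:0] | Rd[2:0].
   Returns -1 when a register number does not fit in a 3-bit field,
   which is exactly the case for R8 and above. */
int encode_adds_16bit(unsigned rd, unsigned rn, unsigned rm)
{
    if (rd > 7u || rn > 7u || rm > 7u)
        return -1;
    return (int)(0x1800u | (rm << 6) | (rn << 3) | rd);
}
```

Under this layout, ADDS R0, R1, R2 encodes as 0x1888, while a request involving R8 is rejected.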

Banked R13: Two Stack Pointers

The Cortex-M3 has two stack pointers, but they are banked, so only one can be used at any time.

Main Stack Pointer (MSP): The stack pointer used by default after reset, used for operating system kernels and exception handling routines (including interrupt service routines).

Process Stack Pointer (PSP): Used by user application code.

The lowest two bits of the stack pointer are always 0, meaning the stack is always 4-byte aligned.
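The banking and alignment rules can be modeled in a few lines. This is a simplified sketch, not the hardware logic: it assumes, per the ARMv7-M architecture rather than the text above, that bit 1 of the CONTROL register selects the PSP in thread mode and that handler mode always uses the MSP. All names are illustrative.

```c
#include <stdint.h>

/* Which banked stack pointer is visible as R13.
   control:    value of the CONTROL register (bit 1 = use PSP in thread mode)
   in_handler: nonzero while the processor is in handler mode */
uint32_t active_sp(uint32_t msp, uint32_t psp, uint32_t control, int in_handler)
{
    int use_psp = !in_handler && (control & 2u) != 0u;
    return use_psp ? psp : msp;
}

/* Value that ends up in SP: the lowest two bits always read as zero,
   so the stack stays 4-byte aligned. */
uint32_t sp_write_value(uint32_t requested)
{
    return requested & ~3u;
}
```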

In ARM programming, any event that interrupts the sequential execution of a program is called an exception. Besides external interrupts, this includes an instruction performing an “illegal operation” or accessing a forbidden memory area, the various errors that raise faults, and non-maskable interrupts. In a loose context, the terms exception and interrupt are used interchangeably. Program code can also deliberately request an exception (commonly used for system calls).

R14: Link Register

When a subroutine is called, the return address is stored in R14.

Unlike most other processors, ARM reduces the number of memory accesses (which often take more than 3 instruction cycles, especially with an MMU and cache) by storing the return address directly in a register. Code with only one level of subroutine calls can therefore avoid touching stack memory at all, which makes subroutine calls more efficient. With more than one level of calls, the previous level’s R14 value must be pushed onto the stack first. When programming on ARM, keep intermediate results in registers as much as possible and access memory only when necessary. RISC processors even have a dedicated term, “spill”, for a value that has to leave the registers and go to memory, to emphasize that such an access crosses the processor boundary and hurts performance.

R15: Program Counter Register

Points to the current program address. Modifying its value changes the program’s execution flow (many advanced techniques rely on this).

Special Function Registers

The Cortex-M3 also features several special function registers at the core level, including:

Program Status Registers (PSRs)

Interrupt Mask Registers (PRIMASK, FAULTMASK, BASEPRI)

Control Register (CONTROL)
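As an illustration of what the program status registers hold, the sketch below decodes a few fields of the combined xPSR. The bit positions follow the ARMv7-M architecture (flags N, Z, C, V, Q in bits 31 to 27, the Thumb bit in bit 24, and the current exception number in bits 8 to 0); the macro and function names are invented for this example.

```c
#include <stdint.h>

/* Selected xPSR bit masks (names are specific to this sketch). */
#define XPSR_N (1u << 31)  /* negative */
#define XPSR_Z (1u << 30)  /* zero */
#define XPSR_C (1u << 29)  /* carry */
#define XPSR_V (1u << 28)  /* overflow */
#define XPSR_Q (1u << 27)  /* saturation */
#define XPSR_T (1u << 24)  /* Thumb state */

/* Returns 1 if the flag selected by mask is set, otherwise 0. */
int xpsr_flag(uint32_t xpsr, uint32_t mask)
{
    return (xpsr & mask) != 0u;
}

/* Number of the exception being serviced; 0 means thread mode. */
unsigned xpsr_exception_number(uint32_t xpsr)
{
    return (unsigned)(xpsr & 0x1FFu);
}
```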

Special Function Register Functions

3. Operating Modes and Privilege Levels

The CM3 processor supports two operating modes and two privilege levels.

Two operating modes: handler mode and thread mode.

Two privilege levels: privileged level and user level.

While the CM3 runs the main application (thread mode), it can use either the privileged or the user level; exception service routines, however, always execute at the privileged level. After reset, the processor defaults to thread mode with privileged access. At the privileged level, a program can access the entire memory range (except for any areas prohibited by the MPU, if one is present) and can execute all instructions.

Privileged programs can do whatever they want, including painting themselves into a corner by switching to user level. Once at user level, getting back requires a “legal procedure”: a user-level program cannot return to the privileged level simply by rewriting the CONTROL register. It must first “appeal” by executing a system call instruction (SVC). This triggers an SVC exception, which is handled by an exception service routine (usually part of the operating system). If the request is approved, the service routine modifies the CONTROL register so that thread mode runs at the privileged level again.

In fact, the only way to go from user level to privileged level is through an exception: when an exception is triggered during program execution, the processor always switches to the privileged level first, and when the exception service routine finishes, it returns to the previous state (the service routine can also specify a different return state).
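The transitions described in this section can be captured in a small state model. This is a sketch under simplifying assumptions: only bit 0 of CONTROL is modeled (0 for privileged thread mode, 1 for user level, as defined by ARMv7-M), exceptions do not nest, and every name is hypothetical.

```c
#include <stdint.h>

typedef struct {
    int      in_handler;  /* 1 = handler mode, 0 = thread mode */
    uint32_t control;     /* bit 0: 0 = privileged thread, 1 = user thread */
} cm3_priv_model_t;

/* Handler mode is always privileged; thread mode depends on CONTROL bit 0. */
int is_privileged(const cm3_priv_model_t *c)
{
    return c->in_handler || (c->control & 1u) == 0u;
}

/* Write to CONTROL: takes effect only when executed at privileged level. */
void write_control(cm3_priv_model_t *c, uint32_t value)
{
    if (is_privileged(c))
        c->control = value & 1u;
}

void exception_entry(cm3_priv_model_t *c)  { c->in_handler = 1; }
void exception_return(cm3_priv_model_t *c) { c->in_handler = 0; }

/* Replays the sequence from the text; returns 1 if each step behaves as described. */
int demo_privilege_round_trip(void)
{
    cm3_priv_model_t c = { 0, 0u };   /* reset: privileged thread mode */
    write_control(&c, 1u);            /* drop to user level */
    if (is_privileged(&c)) return 0;
    write_control(&c, 0u);            /* user-level write is ignored */
    if (is_privileged(&c)) return 0;
    exception_entry(&c);              /* SVC taken: the handler is privileged */
    write_control(&c, 0u);            /* handler approves the request */
    exception_return(&c);
    return is_privileged(&c);         /* thread mode is privileged again */
}
```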

4. Built-in Nested Vectored Interrupt Controller

The CM3 features a built-in interrupt controller – the Nested Vectored Interrupt Controller (NVIC). NVIC provides the following functionalities:

Support for Nested Interrupts

Nested interrupt support is broad in scope, covering all external interrupts and most system exceptions, each of which can be assigned its own priority. The xPSR records which exception is currently being serviced, and from that the current priority is known. When an exception occurs, the hardware automatically compares its priority with the current priority. If the new exception has a higher priority, the processor suspends the current interrupt service routine (or normal program) and serves the new exception; that is, it preempts immediately.

Vectored Interrupt Support

When starting to respond to an interrupt, the CM3 automatically locates a vector table and finds the entry address of the ISR from the table based on the interrupt number, then jumps to execute it.
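The lookup itself is simple address arithmetic. The sketch below assumes the ARMv7-M table layout, which the text above does not spell out: each entry is one 32-bit word, entry 0 holds the initial main stack pointer rather than a handler, and external interrupt n has exception number n + 16. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Exception number of external interrupt `irq` (IRQ0 is exception 16). */
uint32_t irq_to_exception(uint32_t irq)
{
    assert(irq < 240u);
    return irq + 16u;
}

/* Address of the table word that holds the handler's entry address. */
uint32_t vector_entry_address(uint32_t table_base, uint32_t exception_number)
{
    assert(exception_number < 256u);
    return table_base + 4u * exception_number;
}
```

With a table at address 0, the entry for exception 15 is at 0x3C and the entry for IRQ0 is at 0x40.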

Dynamic Priority Adjustment Support

Software can change interrupt priorities at runtime. If an ISR changes the priority of its own interrupt while new instances of that interrupt are pending, the interrupt will not preempt itself, which avoids the risk of re-entry.

Significantly Shortened Interrupt Latency

To shorten interrupt latency, the CM3 introduces several new features, including automatic context saving and restoration, and other measures to shorten ISR-to-ISR delays during nested interrupts.
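To give a sense of what automatic context saving involves: on exception entry the hardware pushes eight words (R0-R3, R12, LR, PC, and xPSR) onto the current stack, so the ISR can be an ordinary C function. That register list comes from the ARMv7-M architecture, not from the text above. The sketch below models the frame layout only, ignores the optional alignment padding word, and uses made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* The eight words stacked by hardware, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} cm3_stack_frame_t;

/* Size of the hardware-stacked frame in bytes. */
uint32_t stack_frame_size(void)
{
    return (uint32_t)sizeof(cm3_stack_frame_t);
}

/* SP after stacking: the stack grows downward by one frame. */
uint32_t sp_after_stacking(uint32_t sp_before)
{
    return sp_before - stack_frame_size();
}

/* Address of the saved return address (PC) inside the frame. */
uint32_t stacked_pc_address(uint32_t sp_after)
{
    return sp_after + (uint32_t)offsetof(cm3_stack_frame_t, pc);
}
```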

Interrupt Masking

Interrupts and exceptions at or below a given priority level can be masked by setting the BASEPRI register. Alternatively, PRIMASK masks all of them except NMI and hard fault, and FAULTMASK masks everything except NMI.
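The combined effect of the three mask registers can be sketched as a single threshold check. The model assumes the ARMv7-M conventions that a lower number means a more urgent priority, that NMI has the fixed priority -2 and hard fault -1, and that a BASEPRI value of 0 disables BASEPRI masking. Priority grouping is ignored, and the function names are illustrative.

```c
/* Returns 1 if a pending exception with the given priority gets past
   the mask registers, 0 if it is held back. */
int passes_masks(int priority, int primask, int faultmask, unsigned basepri)
{
    int threshold = 256;            /* nothing masked: all priorities 0..255 pass */
    if (basepri != 0u)
        threshold = (int)basepri;   /* blocks priority values >= BASEPRI */
    if (primask)
        threshold = 0;              /* only NMI and hard fault remain */
    if (faultmask)
        threshold = -1;             /* only NMI remains */
    return priority < threshold;
}

/* Preemption additionally requires being more urgent than the running code. */
int preempts(int new_priority, int current_priority)
{
    return new_priority < current_priority;
}
```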

5. Memory Mapping

Unlike other ARM architectures, whose memory mapping is determined by semiconductor manufacturers, CM3 has predefined a “coarse” memory mapping.
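For reference, the coarse map can be written as a simple address classifier. The region boundaries below are those of the ARMv7-M memory model and are not listed in the text above; the enum and function names are invented for this sketch.

```c
#include <stdint.h>

typedef enum {
    REGION_CODE,             /* 0x00000000 - 0x1FFFFFFF */
    REGION_SRAM,             /* 0x20000000 - 0x3FFFFFFF */
    REGION_PERIPHERAL,       /* 0x40000000 - 0x5FFFFFFF */
    REGION_EXTERNAL_RAM,     /* 0x60000000 - 0x9FFFFFFF */
    REGION_EXTERNAL_DEVICE,  /* 0xA0000000 - 0xDFFFFFFF */
    REGION_SYSTEM            /* 0xE0000000 - 0xFFFFFFFF */
} cm3_region_t;

/* Classify an address into one of the predefined regions. */
cm3_region_t region_of(uint32_t addr)
{
    if (addr < 0x20000000u) return REGION_CODE;
    if (addr < 0x40000000u) return REGION_SRAM;
    if (addr < 0x60000000u) return REGION_PERIPHERAL;
    if (addr < 0xA0000000u) return REGION_EXTERNAL_RAM;
    if (addr < 0xE0000000u) return REGION_EXTERNAL_DEVICE;
    return REGION_SYSTEM;
}
```

Where a particular chip places its flash, SRAM, and peripherals inside these regions is still up to the manufacturer.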

6. Bus Interface

The CM3 has several bus interfaces so that it can fetch instructions and access data at the same time:

Code memory buses (two: the I-Code bus and the D-Code bus): The former fetches instructions, while the latter handles data accesses in the code region, such as table lookups. Both are optimized for execution speed.

System Bus: Used to access memory and peripherals. It covers SRAM, on-chip peripherals, off-chip RAM, off-chip expansion devices, and part of the system-level memory region.

Private Peripheral Bus: Accesses a set of private peripherals, mainly the debugging components. These also reside in the system-level memory region.

7. Memory Protection Unit

The CM3 has an optional Memory Protection Unit. With it, different access restrictions can be imposed on privileged and user-level accesses. When a violation is detected, the MPU will generate a fault exception, which can be analyzed by the fault exception service routine and corrected if possible.

The MPU has many applications. The most common is for the operating system to use the MPU to prevent user programs from corrupting the data of privileged level code, including the operating system itself. The MPU manages memory protection by “regions.” It can set certain memory regions to read-only, thus preventing accidental changes to their contents; it can also isolate data areas between different tasks in multitasking systems. In short, it makes embedded systems more robust and reliable.
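The region idea can be illustrated with a deliberately simplified software model. It is not the real MPU register interface: region sizes are arbitrary here, overlaps are resolved by letting the higher-numbered region win, and an address outside every region is assumed to be accessible to privileged code only. The addresses in the demo table are made up.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { ACC_NONE, ACC_RO, ACC_RW } access_t;

typedef struct {
    uint32_t base;   /* first address of the region */
    uint32_t size;   /* length in bytes */
    access_t priv;   /* rights of privileged code */
    access_t user;   /* rights of user-level code */
} mpu_region_t;

/* Returns 1 if the access is allowed, 0 if it would raise a fault. */
int mpu_allows(const mpu_region_t *r, size_t n,
               uint32_t addr, int is_write, int is_privileged)
{
    for (size_t i = n; i-- > 0; ) {          /* highest-numbered region first */
        if (addr - r[i].base < r[i].size) {  /* unsigned wrap rejects addr < base */
            access_t a = is_privileged ? r[i].priv : r[i].user;
            return a == ACC_RW || (a == ACC_RO && !is_write);
        }
    }
    return is_privileged;                    /* assumed background map */
}

/* Hypothetical layout: task data, a kernel-only carve-out, read-only flash. */
static const mpu_region_t demo_regions[] = {
    { 0x20000000u, 0x8000u,  ACC_RW, ACC_RW   },
    { 0x20000000u, 0x1000u,  ACC_RW, ACC_NONE },
    { 0x08000000u, 0x10000u, ACC_RO, ACC_RO   },
};

int demo_mpu_check(uint32_t addr, int is_write, int is_privileged)
{
    return mpu_allows(demo_regions, 3, addr, is_write, is_privileged);
}
```

In this layout a user-level write to the kernel carve-out at 0x20000010 is rejected, while the same write from privileged code is allowed.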

8. Instruction System

The CM3 uses only the Thumb-2 instruction set, which gives it several advantages over traditional ARM processors:

Eliminates the overhead of state switching, saving execution time and instruction space.

Source files no longer need to be split into ARM-compiled and Thumb-compiled groups, which greatly reduces the management burden of software development.

There is no need to repeatedly test when and where to switch states to get the most efficient program, which makes software development much easier.

Please note: The CM3 does not support all Thumb-2 instructions; the ARMv7-M specification only requires the implementation of a subset of Thumb-2.
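Since 16-bit and 32-bit instructions are mixed freely in a single state, a decoder has to tell them apart from the first halfword alone. In the Thumb-2 encoding, a halfword whose top five bits are 0b11101, 0b11110, or 0b11111 starts a 32-bit instruction, and anything else is a complete 16-bit instruction. The helper below is a sketch of that rule with an invented name.

```c
#include <stdint.h>

/* Size in bytes of the Thumb instruction whose first halfword is hw1. */
unsigned thumb_instruction_size(uint16_t hw1)
{
    unsigned top5 = (unsigned)hw1 >> 11;   /* bits [15:11] */
    return (top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu) ? 4u : 2u;
}
```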

9. Interrupts and Exceptions

All interrupt mechanisms of the CM3 are implemented by the NVIC. In addition to up to 240 external interrupts, the NVIC supports 16 - 4 - 1 = 11 internal exception sources (of the 16 system slots, 4 + 1 are reserved), which enable the fault management mechanisms. Together, the 240 interrupts and the 16 system slots give the CM3 256 predefined exception types.

MCU Development Station

WeChat ID: mcugeek
