Beginner-Friendly Guide to Understanding ARM Architecture

1. Address Space

1.1 Accessing Registers in ARM

How do you access a register on ARM? The same way you access memory, with a pointer:

unsigned int a;
unsigned int *p = &a;   // p holds the address of a
*p = val;               // Writing to this address means writing to a
val = *p;               // Reading from this address means reading a

volatile unsigned int *reg = (volatile unsigned int *)0x40010800; // reg holds the address of a certain register
*reg = val;             // Writing to this address means writing to this register
val = *reg;             // Reading from this address means reading the register
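The same pattern can be wrapped in two small helpers. This is only a sketch: the names reg_write and reg_read are made up for illustration and do not come from any vendor library, and volatile is used on the assumption that the compiler must not optimize the accesses away.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration, not part of any vendor library. */

/* Write a 32-bit value to the register (or memory word) at addr. */
static void reg_write(volatile uint32_t *addr, uint32_t val)
{
    assert(addr != 0);   /* a null pointer is never a valid register address here */
    *addr = val;
}

/* Read the 32-bit value of the register (or memory word) at addr. */
static uint32_t reg_read(volatile uint32_t *addr)
{
    assert(addr != 0);
    return *addr;
}
```

On a real chip the call would look like reg_write((volatile uint32_t *)0x40010800, val). On a PC there is no register at that address, so to try the helpers out, point them at an ordinary uint32_t variable instead.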

1.2 Differences Between ARM and X86

From the ARM CPU's point of view, memory and I/O are accessed in the same way:

[Figure: ARM address space shared by memory and peripherals]

An address issued by the CPU reaches the corresponding peripheral directly, because the peripherals' addresses belong to the CPU's address space. In the figure above, however, an address issued by the CPU cannot reach the Flash. To access the Flash, the CPU must go through the EMMC controller, because the Flash belongs to a different address space. The relationship is like a family tree: the CPU can reach only its direct children and cannot skip a generation.

In the x86 architecture, memory and I/O use separate address spaces:

[Figure: separate memory and I/O address spaces on x86]

1.3 RISC vs. CISC

1.3.1 RISC

ARM chips belong to RISC (Reduced Instruction Set Computing), which uses simpler instructions with the following characteristics:

① Memory is accessed only through read and write instructions

② Data operations are carried out inside the CPU

③ CPUs using RISC instructions have slightly lower complexity, making them easier to design

[Figure: computing a = a * b on a RISC CPU]

For the multiplication a = a * b shown in the figure above, RISC requires four assembly instructions:

① Read memory a

② Read memory b

③ Calculate a*b

④ Write the result back to memory
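The four steps above can be mimicked in C with a toy model. This is a rough sketch, not how any real CPU is built: the ToyCpu type, its two registers, and the 16-word memory are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 16u

/* Toy model for illustration only: a small data memory plus two registers. */
typedef struct {
    uint32_t mem[MEM_WORDS];  /* data memory, one entry per 32-bit word */
    uint32_t r0, r1;          /* two CPU registers */
} ToyCpu;

/* a = a * b, where a and b live in memory at word indices a_idx and b_idx. */
static void risc_multiply(ToyCpu *cpu, unsigned a_idx, unsigned b_idx)
{
    assert(a_idx < MEM_WORDS && b_idx < MEM_WORDS);
    cpu->r0 = cpu->mem[a_idx];     /* 1. read a from memory into a register */
    cpu->r1 = cpu->mem[b_idx];     /* 2. read b from memory into a register */
    cpu->r0 = cpu->r0 * cpu->r1;   /* 3. multiply, using registers only     */
    cpu->mem[a_idx] = cpu->r0;     /* 4. write the result back to memory    */
}
```

Only steps 1 and 4 touch memory; the multiplication itself works purely on registers, which is the load/store idea in miniature.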

1.3.2 CISC

X86 belongs to CISC (Complex Instruction Set Computing), which uses more complex instructions. For instance, some complex instructions are implemented via “microprograms”.

For example, when executing a multiplication instruction, it actually executes a “microprogram” that performs the same four operations:

① Read memory a

② Read memory b

③ Calculate a*b

④ Write the result back to memory

To the programmer this looks simple: a single instruction appears to do everything!

1.3.3 Comparison of RISC and CISC

  • CISC instructions are powerful, but most of them are rarely used, which adds to CPU complexity; instruction length is variable;

  • RISC instructions are mostly single-cycle and fixed-length, and operate on registers; only Load/Store instructions access memory;

  • CISC supports many addressing modes; RISC supports only a few;

  • CISC is implemented with microprogrammed control;

  • RISC provides more general-purpose registers, relies mainly on hardwired control logic, and uses pipelining;

  • CISC has a long development cycle;

  • RISC relies on optimizing compilers and supports high-level languages effectively.

2. ARM Internal Registers

2.1 Internal Registers of the CPU

Whether it is Cortex-M3/M4 or Cortex-A7, the CPU has registers R0, R1, …, R15, which are used to “temporarily store” data.

  • Registers R0–R12 are general-purpose registers, with the first eight (R0–R7) also known as low registers. Due to limited space in instructions, many 16-bit instructions can only access low registers. High registers (R8–R12) can be used for 32-bit instructions and several 16-bit instructions, such as MOV (move). The initial values of R0–R12 are undefined.

  • R13 is the stack pointer (SP), which gives access to stack memory through PUSH and POP operations. Physically there are two stack pointers. The main stack pointer (MSP, called SP_main in some ARM literature) is the default; the processor uses it after reset and in Handler mode. The process stack pointer (PSP, called SP_process in some ARM literature) can be used only in Thread mode. Which stack pointer is in use is determined by the special register CONTROL. In ordinary programs only one of the two is visible at a time. Both MSP and PSP are 32 bits wide, but their lowest two bits are always 0, and writes to those two bits have no effect. On ARM Cortex-M processors, PUSH and POP always operate on 32-bit words, so stack addresses must be aligned to 32-bit word boundaries.

  • R14 is also known as the link register (LR). It holds the return address during a function or subroutine call. At the end of the function or subroutine, control returns to the caller by loading the value of LR into the program counter (PC). LR is updated automatically whenever a function or subroutine is called. If a function needs to call another function or subroutine, it must first save LR on the stack; otherwise the current value of LR is lost when the call is made.

  • R15 is the program counter (PC), which is readable and writable. Reading it returns the current instruction address plus 4 (a consequence of the pipeline design and of compatibility with the ARM7TDMI processor). Writing to PC (for example, with a data transfer or data processing instruction) causes a jump.
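The PUSH and POP behaviour described for R13 can be sketched in C. The model below assumes the full-descending stack used on Cortex-M (SP moves down before the store). The ToyStack type and the stack_* function names are invented for illustration, and sp is kept as a byte offset into a small array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64u

/* Toy full-descending stack; the type and function names are illustrative. */
typedef struct {
    uint32_t ram[STACK_WORDS];  /* memory reserved for the stack */
    uint32_t sp;                /* byte offset into ram; low two bits always 0 */
} ToyStack;

static void stack_init(ToyStack *s)
{
    s->sp = (STACK_WORDS * 4u) & ~3u;  /* start just past the top, word aligned */
}

/* PUSH: move SP down by one word, then store the value there. */
static void stack_push(ToyStack *s, uint32_t value)
{
    assert(s->sp >= 4u);               /* no room left would be a stack overflow */
    s->sp -= 4u;
    s->ram[s->sp / 4u] = value;
}

/* POP: load the word at SP, then move SP up by one word. */
static uint32_t stack_pop(ToyStack *s)
{
    assert(s->sp < STACK_WORDS * 4u);  /* popping an empty stack is a bug */
    uint32_t value = s->ram[s->sp / 4u];
    s->sp += 4u;
    return value;
}
```

In this model, a function that calls another function would stack_push its LR value on entry and stack_pop it again before returning, which is the save-and-restore step described for R14.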

2.2 Classification of Internal CPU Registers

Cortex-M3/M4:

[Figure: Cortex-M3/M4 register set]

For comparison, Cortex-A7 is similar:

[Figure: Cortex-A7 register set]

2.3 Example

When two numbers are compared, where is the result kept? Cortex-M3/M4 has one more register for this, the Program Status Register:

[Figure: Cortex-M3/M4 Program Status Register]

On Cortex-M3/M4, xPSR actually corresponds to three registers:

① APSR: Application PSR

② IPSR: Interrupt PSR

③ EPSR: Execution PSR

The meanings of these three registers are shown in the following image:

[Figure: bit fields of APSR, IPSR, and EPSR]

These three registers can be accessed individually:

MRS  R0, APSR  ; Read APSR
MRS  R0, IPSR    ; Read IPSR
MSR  APSR, R0   ; Write APSR

These three registers can also be accessed all at once:

MRS  R0,  PSR  ; Read combined program status
MSR  PSR, R0   ; Write combined program status

The so-called combined program status is shown in the following image:

[Figure: combined program status register (xPSR)]
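Once a combined status value has been read with MRS, its individual fields can be picked apart in C. The sketch below assumes the Cortex-M3/M4 layout as I understand it from ARM's documentation (N, Z, C, V in bits 31 to 28, T in bit 24, exception number in bits 8 to 0); check the positions against the reference manual for your chip. The names xpsr_bit and xpsr_exception are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions in the combined xPSR on Cortex-M3/M4. */
enum {
    XPSR_N_BIT = 31,  /* APSR: negative    */
    XPSR_Z_BIT = 30,  /* APSR: zero        */
    XPSR_C_BIT = 29,  /* APSR: carry       */
    XPSR_V_BIT = 28,  /* APSR: overflow    */
    XPSR_T_BIT = 24   /* EPSR: Thumb state */
};

/* Return bit `pos` of an xPSR value as 0 or 1. */
static unsigned xpsr_bit(uint32_t xpsr, unsigned pos)
{
    assert(pos < 32u);
    return (xpsr >> pos) & 1u;
}

/* Return the exception number held in the IPSR field (bits 8:0). */
static unsigned xpsr_exception(uint32_t xpsr)
{
    return xpsr & 0x1FFu;
}
```

For example, with the value 0x6100000F the Z, C, and T bits are set, N and V are clear, and the exception number is 15.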

For Cortex-A7, there is likewise a Current Program Status Register:

[Figure: Cortex-A7 Current Program Status Register]

3. ARM Assembly

3.1 Overview

Initially, ARM released two types of instruction sets:

① The ARM instruction set: every instruction is 32 bits wide; efficient, but the code takes up more space

② The Thumb instruction set: every instruction is 16 bits wide; saves space

Use Thumb instructions when code size matters and ARM instructions when efficiency matters.

A single CPU can run both Thumb and ARM instructions.

How does the CPU tell whether the current instruction is a Thumb or an ARM instruction?

The program status register contains a bit called “T”; when T is 1, the instruction being executed is a Thumb instruction.


We can write the address of function A or B into the PC register to call A or B, but how do we make the CPU enter Thumb state when executing function A and ARM state when executing function B?

When calling function A, set BIT0 of the value written to PC to 1, that is: PC = address of function A + (1<<0);

When calling function B, set BIT0 of the value written to PC to 0, that is: PC = address of function B.
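The BIT0 rule can be written down as a few one-line C helpers. This is a minimal sketch; thumb_target, arm_target, and selects_thumb are names invented here, and the alignment checks are my own assumption about what counts as a sensible function address.

```c
#include <assert.h>
#include <stdint.h>

/* Value to write to PC to call a Thumb function located at addr. */
static uint32_t thumb_target(uint32_t addr)
{
    assert((addr & 1u) == 0u);   /* Thumb code is at least 2-byte aligned */
    return addr + (1u << 0);     /* BIT0 = 1 selects Thumb state */
}

/* Value to write to PC to call an ARM function located at addr. */
static uint32_t arm_target(uint32_t addr)
{
    assert((addr & 3u) == 0u);   /* ARM instructions are 4-byte aligned */
    return addr;                 /* BIT0 = 0 selects ARM state */
}

/* 1 if the target value selects Thumb state, 0 if it selects ARM state. */
static int selects_thumb(uint32_t target)
{
    return (int)(target & 1u);
}
```

For a Thumb function at 0x08000100, thumb_target gives 0x08000101. BIT0 is only a state selector; the instruction is still fetched from 0x08000100.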

In daily work only a few assembly instructions are needed, and their meanings can be guessed from their names:

MOV, LDR/STR, LDM/STM, AND/OR, ADD/SUB, B/BL, DCD, ADR/LDR, CMP

3.2 Assembly Instruction Format

Refer to “DEN0013D_cortex_a_series_PG.pdf” P70 and “ARM Cortex-M3 and Cortex-M4 Authority Guide.pdf” Chapter 5


Assembly instructions can be divided into several categories: data processing, memory access, jumps, saturation operations, and other instructions. For example, the UAL assembly format for “data processing” instructions is:

Operation{cond}{S} Rd, Rn, Operand2

  • Operation is the instruction mnemonic, such as ADD or MOV;

  • cond is the condition under which the instruction executes;

  • S indicates that the instruction updates the program status register;

  • Rd is the destination register, which receives the result of the operation;

  • Rn and Operand2 are the two source operands.

The available operations, such as ADD and MOV, are shown in the following image:

[Figure: data processing operations]

cond can take various values, as shown below:

[Figure: values of cond]
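To make cond concrete, the sketch below evaluates a handful of common condition codes against the N, Z, C, V flags, and derives those flags the way a CMP (a subtraction that discards its result) would. It covers only a subset of the condition codes, and the Flags type and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { unsigned n, z, c, v; } Flags;   /* each field is 0 or 1 */

typedef enum { COND_EQ, COND_NE, COND_GE, COND_LT, COND_GT, COND_LE, COND_AL } Cond;

/* Flags as "CMP a, b" would set them: computed from a - b. */
static Flags cmp_flags(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r = ua - ub;
    Flags f;
    f.n = r >> 31;                            /* result is negative          */
    f.z = (r == 0u);                          /* result is zero              */
    f.c = (ua >= ub);                         /* no borrow occurred          */
    f.v = ((ua ^ ub) & (ua ^ r)) >> 31;       /* signed overflow occurred    */
    return f;
}

/* 1 if an instruction with this condition would execute, else 0. */
static int cond_passes(Cond cond, Flags f)
{
    switch (cond) {
    case COND_EQ: return f.z == 1u;                   /* equal                  */
    case COND_NE: return f.z == 0u;                   /* not equal              */
    case COND_GE: return f.n == f.v;                  /* signed greater or equal */
    case COND_LT: return f.n != f.v;                  /* signed less than       */
    case COND_GT: return f.z == 0u && f.n == f.v;     /* signed greater than    */
    case COND_LE: return f.z == 1u || f.n != f.v;     /* signed less or equal   */
    case COND_AL: return 1;                           /* always                 */
    }
    assert(0 && "unknown condition");
    return 0;
}
```

For instance, after cmp_flags(3, 5) the LT and NE conditions pass and GE does not, so an ADDLT following that comparison would execute while an ADDGE would be skipped.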

3.3 Branch/Jump Instructions

Refer to “DEN0013D_cortex_a_series_PG.pdf” P327, P328, P329

The core instructions are B and BL:

  • B: Branch, jump;

  • BL: Branch with Link, save the return address in LR before jumping;

  • BX: Branch and eXchange, switch to ARM or Thumb state based on BIT0 of the jump address (0: ARM state, 1: Thumb state);

  • BLX: Branch with Link and eXchange, switch to ARM or Thumb state based on BIT0 of the jump address (0: ARM state, 1: Thumb state).
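How BL and BX cooperate in a call and return can be sketched with a toy core state. This is an illustration under simplifying assumptions: ToyCore, exec_bl, and exec_bx are invented names, the caller passes the size of the BL instruction in bytes, and the pc field holds the address of the instruction being executed (not the pipeline-adjusted value that reading R15 returns).

```c
#include <assert.h>
#include <stdint.h>

/* Toy core state for illustration only. */
typedef struct {
    uint32_t pc;     /* address of the instruction being executed */
    uint32_t lr;     /* link register */
    int      thumb;  /* 1 = Thumb state, 0 = ARM state */
} ToyCore;

/* BL: save the return address in LR, then jump to target. */
static void exec_bl(ToyCore *c, uint32_t target, uint32_t size)
{
    assert(size == 2u || size == 4u);
    /* Return address is the next instruction; BIT0 records the caller's state. */
    c->lr = (c->pc + size) | (uint32_t)c->thumb;
    c->pc = target;
}

/* BX: jump to the address held in a register; BIT0 selects the state. */
static void exec_bx(ToyCore *c, uint32_t reg_value)
{
    c->thumb = (int)(reg_value & 1u);
    c->pc = reg_value & ~1u;
}
```

A subroutine call is then exec_bl(&core, func, 4), and the return is exec_bx(&core, core.lr), which lands on the instruction after the BL and restores the caller's state.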

3.4 Immediate Values

This instruction: MOV R0, #VAL intends to store the value of VAL in register R0.

Question: Can VAL be any value?

Answer: No, it must be an immediate value.

Question: Why?

Answer: The instruction “MOV R0, #VAL” is itself only 16 or 32 bits long. If VAL could be any number, where would there be room to store it?

Therefore, VAL must follow certain rules:

[Figure: rules for a valid immediate value]
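As one concrete rule, the classic 32-bit ARM instruction encoding allows an immediate that is an 8-bit value rotated right by an even number of bits. The check below is a sketch of that rule only. It is an assumption that this is the rule of interest here: Thumb-2 encodings accept a different set of constants, and the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    assert(n < 32u);
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/* Returns 1 if v equals some 8-bit value rotated right by an even amount. */
static int is_arm_immediate(uint32_t v)
{
    /* Undo each possible even rotation and see whether 8 bits are enough. */
    for (unsigned rot = 0; rot < 32u; rot += 2u) {
        if (rotl32(v, rot) <= 0xFFu)
            return 1;
    }
    return 0;
}
```

Under this rule 0x12 and 0xFF000000 are valid immediates, while 0x12345678 is not, because its set bits do not fit inside any 8-bit window.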

3.5 LDR Pseudo Instruction

It can be cumbersome to determine whether a VAL is an immediate value!

And if I just want to assign any value to R0, what should I do?

I can use the pseudo-instruction: LDR R0, =VAL

“Pseudo-instruction” means it’s a fake, non-existent instruction.

Note that LDR as a “pseudo-instruction” has an “=”, otherwise it’s the real LDR (load register) instruction.

The assembler replaces the “pseudo-instruction” with a real instruction. There are two cases, depending on whether VAL is an immediate value.

Case 1:

LDR  R0, =0x12

0x12 is an immediate value, so it is replaced with:

MOV  R0, #0x12

Case 2:

LDR  R0, =0x12345678

0x12345678 is not an immediate value, so it is replaced with:

LDR  R0, [PC, #offset]      ; 2. A real LDR (load register) instruction reads the value from memory; offset is computed by the assembler
……
Label  DCD  0x12345678      ; 1. The assembler saves the value somewhere in the program
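The offset in the PC-relative load is plain address arithmetic. The sketch below assumes Thumb state, as on Cortex-M, where PC reads as the instruction address plus 4 and is rounded down to a word boundary before the offset is added. It also assumes the stored value sits after the instruction. In ARM state PC reads as the instruction address plus 8, so the numbers differ. The function name is invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offset for a Thumb-state "LDR Rt, [PC, #offset]" that loads from literal_addr. */
static uint32_t thumb_literal_offset(uint32_t instr_addr, uint32_t literal_addr)
{
    uint32_t pc   = instr_addr + 4u;   /* PC reads as instruction address + 4  */
    uint32_t base = pc & ~3u;          /* rounded down to a word boundary      */
    assert(literal_addr >= base);      /* this sketch handles forward loads only */
    return literal_addr - base;
}
```

With the LDR at 0x08000002 and the DCD at 0x08000010, the base is 0x08000004 and the offset is 0xC.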

3.6 ADR Pseudo Instruction

ADR stands for address; it is used to get the address of a label.


Example:

ADR  R0,  Loop
Loop    ADD  R0, R0, #1

It is a “pseudo-instruction” that is converted into a real instruction, for example:

ADD R0, PC, #val   ; val is computed by the assembler
Loop    ADD  R0, R0, #1

3.7 Differences Between ARM Compiler and GCC Compiler Syntax

[Figure: syntax differences between the ARM compiler and the GCC compiler]

4. ARM Assembly Simulator

4.1 Simulator VisUAL

VisUAL is an ARM assembly simulator.

Download link:

https://salmanarif.bitbucket.io/visual/downloads.html

User guide:

https://salmanarif.bitbucket.io/visual/user_guide/index.html

The ARM board simulated by VisUAL is shown in the image below. It does not simulate peripherals, only the CPU, ROM, and RAM.

  • The red area is ROM. It cannot be written, and in VisUAL it can only be used to run the program stored in it;

  • On a real board the ROM area would also be readable; not being able to read it is a limitation of VisUAL;

  • The RAM area is readable and writable.

[Figure: the ARM board simulated by VisUAL]

4.2 Supported Assembly Instructions in VisUAL

Note: DCD, FILL, END, etc., in the image below are ARM assembler syntax, which differs slightly from GCC assembly syntax, as will be introduced later.

[Figure: assembly instructions supported by VisUAL]

4.3 VisUAL Settings

Generally no settings need to be changed. I only set the font size and background color. After changing the font size, you must press Enter for it to take effect.

4.4 Writing Assembly Instructions

Enter assembly code in the left code column:

[Figure: assembly code entered in the code column]

Run:

[Figure: result of running the program]

However, there are certain issues in the above assembly program:

① The 0x20000 in the first instruction is not an immediate value; modify it to be an immediate value;

② The 0x21234 in the second instruction is not an immediate value; modify it to be an immediate value;


To fix the errors, click “Reset” and change the code to:

MOV R0, #0x20000
MOV R1, #0x1234

Check pointers and memory:

[Figure: pointer and memory views]

Tools can be used to view the values stored at memory addresses:

[Figure: memory values shown by the tools]

Modify the assembly code above so that R1 is stored at address R0+4, then check the pointer view and the memory view:

[Figure: pointer and memory views after the change]

We can also write slightly more complex assembly programs:

[Figure: a slightly more complex assembly program]

