Course 4: Detailed Explanation and Practical Application of ARMv8 Assembly Pseudo Instructions
Pseudo instructions (directives) are auxiliary commands interpreted by the assembler to control code generation, data allocation, segment structure, and so on. They are not CPU instructions: a directive either emits no bytes at all or emits data rather than an opcode. Below is a classification of the commonly used pseudo instructions in ARMv8 (AArch64) assembly under the GNU assembler (GAS), with examples:
1. Segment Definition and Code Organization Pseudo Instructions
1. Segment Declaration Pseudo Instructions
- `.section`: Defines a code or data segment and specifies its name and attributes:
      .section .text      // Code segment (executable)
      .section .data      // Data segment (read/write)
      .section .rodata    // Read-only data segment
- `.text` / `.data` / `.bss`: Quickly switch segments (equivalent to `.section .text`, etc.):
      .text    // Subsequent code goes into the .text segment
      .data    // Subsequent data goes into the .data segment
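For a segment name the assembler does not recognize, the attributes can be given explicitly. The following is a minimal sketch, assuming GNU as targeting AArch64 ELF; the names `.mydata`, `counter`, and `early_init` are made up for illustration.

```asm
        // "a" = allocatable, "w" = writable; %progbits = the section holds data
        .section .mydata, "aw", %progbits
counter:
        .word   0

        // "a" + "x" = allocatable and executable
        .section .text.startup, "ax", %progbits
early_init:
        ret

        .text                   // back to the default code section
        .data                   // back to the default data section
```

Well-known names such as `.text`, `.data`, `.rodata`, and `.bss` already carry sensible default attributes, which is why the shorter forms above work without a flag string.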
2. Alignment and Padding
- `.align`: Aligns the current address. On AArch64 the argument is the exponent, so `.align n` aligns to 2^n bytes:
      .align 3    // 8-byte alignment (2^3 = 8)
- `.skip` / `.space`: Reserve the given number of bytes, filled with zeros unless a fill value is supplied:
      buffer: .space 64    // Reserve 64 zero-filled bytes
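Because the meaning of the `.align` argument differs between targets, GAS also offers `.balign` (byte count) and `.p2align` (power of two), which mean the same thing everywhere. A small sketch with illustrative labels, assuming GNU as for AArch64:

```asm
        .data
flag:   .byte   1               // leaves the location counter at an odd offset

        .balign 8               // byte-count form: align to 8 bytes
value:  .quad   0x1122334455667788

        .p2align 4              // power-of-two form: align to 2^4 = 16 bytes
table:  .space  32, 0xAA        // 32 bytes, each initialised to 0xAA

        .bss
        .balign 16
scratch:
        .skip   256             // 256 zero bytes; .bss occupies no space in the object file
```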
2. Data Definition Pseudo Instructions
1. Basic Data Types
- `.byte` / `.hword` / `.word` / `.quad`: Define 8-bit, 16-bit, 32-bit, and 64-bit data:
      values:
          .byte  0x12                  // 1 byte: 0x12
          .hword 0x1234                // 2 bytes: 0x1234
          .word  0x12345678            // 4 bytes: 0x12345678
          .quad  0x123456789ABCDEF0    // 8 bytes
- `.asciz` / `.string`: Define a string ending with `\0`:
      msg: .asciz "Hello, ARMv8!"    // Automatically adds the terminating 0x00
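To make the sizes and byte order concrete, here is a short sketch. It assumes the default little-endian AArch64 configuration, and all labels are illustrative. `.ascii` is the unterminated counterpart of `.asciz`.

```asm
        .section .rodata
greeting:
        .ascii  "Hi"                    // 2 bytes: 0x48 0x69 (no terminator)
greeting_z:
        .asciz  "Hi"                    // 3 bytes: 0x48 0x69 0x00
        .equ    GREETING_Z_LEN, . - greeting_z   // 3, terminator included

        .balign 4
magic:  .word   0x12345678              // stored in memory as 78 56 34 12
pair:   .hword  0x1234, 0x5678          // several values in one directive
```

Computing the length from the location counter keeps it correct when the string is edited later.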
2. Arrays and Repeated Data
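- `.fill`: Emits repeated copies of a value, in the form `.fill repeat, size, value`:
      array: .fill 10, 4, 0xFFFFFFFF    // 10 four-byte words, each 0xFFFFFFFF
  The value occupies the full `size` bytes, so `.fill 10, 4, 0xFF` would emit ten words of 0x000000FF rather than 0xFFFFFFFF.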
- `.rept` / `.endr`: Repeat a block of code or data:
      .rept 3
          nop    // Generates 3 nop instructions
      .endr
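`.rept` becomes more useful when combined with an assembly-time counter, so that each repetition emits a different value. A minimal sketch, where the symbol names `i`, `offsets`, and `OFFSETS_COUNT` are chosen only for illustration:

```asm
        .data
        .balign 4
offsets:
        .set    i, 0                    // assembly-time counter
        .rept   4
        .word   i * 16                  // emits 0, 16, 32, 48
        .set    i, i + 1
        .endr

        .equ    OFFSETS_COUNT, (. - offsets) / 4    // 4 entries
```

Unlike `.equ`, a symbol assigned with `.set` may be reassigned, which is what lets the counter advance.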
3. Symbol Management Pseudo Instructions
1. Symbol Visibility
- `.global` / `.globl`: Declares the symbol as globally visible (can be referenced by other files):
      .global _start    // _start symbol is globally visible
- `.extern`: Declares the symbol as externally defined (similar to C's `extern`). GAS accepts the directive for compatibility but ignores it, because every undefined symbol is already treated as external:
      .extern printf    // Documents that the printf function is provided externally
- `.local`: Declares the symbol as local to the current file, so other object files cannot reference it.
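Visibility directives are usually paired with `.type` and `.size`, so that the symbol table describes the symbol properly. The sketch below assumes an ELF target; `add_two` and `helper_count` are hypothetical names.

```asm
        .text
        .global add_two                 // exported: other files may call it
        .type   add_two, %function      // mark the symbol as a function
add_two:
        add     w0, w0, #2
        ret
        .size   add_two, . - add_two    // record the function's size in bytes

        .local  helper_count            // stays private to this object file
        .comm   helper_count, 4, 4      // 4 bytes, 4-byte aligned, zero-initialised
```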
2. Symbol Assignment
- `.equ`: Defines a symbolic constant (similar to C's `#define`):
      .equ UART_BASE, 0x9000000    // Define UART base address constant
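Constants defined with `.equ` can be combined in expressions and used wherever a number is expected. A brief sketch; `BUF_SIZE`, `BUF_WORDS`, `buffer`, and `setup` are illustrative names:

```asm
        .equ    UART_BASE, 0x9000000    // value taken from the text above
        .equ    BUF_SIZE,  64
        .equ    BUF_WORDS, BUF_SIZE / 4 // a constant derived from another constant

        .bss
        .balign 4
buffer: .space  BUF_SIZE                // constant used as a size

        .text
setup:
        mov     w2, #BUF_WORDS          // constant used as an immediate (16)
        ldr     x1, =UART_BASE          // constant loaded through the literal pool
        ret
```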
3. Other Key Operators
Symbols and expressions supported by the assembler.
Operator | Function | Example |
---|---|---|
`//` | Line comment (content from `//` to the end of the line is ignored) | `mov x0, #1 // Set return value` |
`#` | Comments out the whole line, but only as the first character of the line; elsewhere `#` marks an immediate operand | `# Set return value` |
`/* */` | Multi-line comment (GAS supported) | `/* This is a multi-line comment */` |
`.` | Represents the current address (the location counter); GAS does not use `$` for this | `.equ msg_len, . - msg` |
`:lo12:<symbol>` | Gets the low 12 bits of the symbol address (used for the `ADRP` + `ADD` combination) | `add x0, x0, :lo12:my_data` |
`:got:<symbol>` | Gets the page address of the symbol's global offset table (GOT) entry; used with `ADRP` and paired with `:got_lo12:` | `adrp x0, :got:printf` |
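The relocation operators in the table are normally used in pairs. The sketch below shows both pairings, assuming GNU as for AArch64; `my_data` is defined locally, while `ext_value` is a hypothetical symbol defined in another object or shared library.

```asm
        .data
my_data:
        .word   42

        .text
        .global load_my_data
load_my_data:
        adrp    x0, my_data                     // x0 = address of the 4 KiB page holding my_data
        add     x0, x0, :lo12:my_data           // add the offset within that page
        ldr     w0, [x0]                        // load the word stored at my_data
        ret

        .global load_via_got
load_via_got:
        adrp    x0, :got:ext_value              // page of the GOT entry for ext_value
        ldr     x0, [x0, :got_lo12:ext_value]   // x0 = address of ext_value, read from the GOT
        ldr     w0, [x0]
        ret
```

`ADRP` supplies bits 12 and above relative to the current page, and the `:lo12:` or `:got_lo12:` operand supplies the remaining low 12 bits.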
4. Conditional Assembly and Macros
1. Conditional Assembly
- `.if` / `.else` / `.endif`: Decide at assembly time whether a block of code is assembled:
      .if DEBUG_MODE == 1
          mov x0, #1    // Enable additional code in debug mode.
      .else
          mov x0, #0
      .endif
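An `.if` expression needs every symbol in it to have a value by the time it is evaluated, so it is common to give the switch a default first. A minimal sketch, where `get_mode` is an illustrative label:

```asm
        // Give DEBUG_MODE a default so that ".if" always sees a defined value.
        .ifndef DEBUG_MODE
        .equ    DEBUG_MODE, 0
        .endif

        .text
        .global get_mode
get_mode:
        .if DEBUG_MODE == 1
        mov     x0, #1                  // assembled only when DEBUG_MODE is 1
        .else
        mov     x0, #0                  // assembled otherwise
        .endif
        ret
```

With GNU as the value can then be supplied on the command line through the `--defsym` option, for example `--defsym DEBUG_MODE=1`, without editing the source.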
2. Macro Definition
- `.macro` / `.endm`: Define reusable code blocks. Unlike a function, a macro is expanded inline at every use:
      .macro PRINT_STRING str_addr
          ldr x0, =\str_addr
          bl  uart_print
      .endm
      // Use macro
      PRINT_STRING msg    // Expands to ldr x0, =msg; bl uart_print
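Macros can take several parameters, and a parameter can have a default value. The following sketch uses made-up macro names (`PUSH_PAIR`, `POP_PAIR`, `LOAD_IMM`) and a hypothetical external routine `helper`:

```asm
        // Save/restore a register pair on the stack (sp stays 16-byte aligned).
        .macro  PUSH_PAIR ra, rb
        stp     \ra, \rb, [sp, #-16]!
        .endm

        .macro  POP_PAIR ra, rb
        ldp     \ra, \rb, [sp], #16
        .endm

        // "val=0" gives the parameter a default used when the argument is omitted.
        .macro  LOAD_IMM reg, val=0
        mov     \reg, #\val
        .endm

        .text
        .global call_helper
call_helper:
        PUSH_PAIR x29, x30              // expands to: stp x29, x30, [sp, #-16]!
        LOAD_IMM  x0, 5                 // expands to: mov x0, #5
        LOAD_IMM  x1                    // expands to: mov x1, #0
        bl        helper                // hypothetical external routine
        POP_PAIR  x29, x30              // expands to: ldp x29, x30, [sp], #16
        ret
```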
5. Practical Example: Comprehensive Application of Pseudo Instructions
Scenario: Initialize the data segment and print a formatted message
.section .data
// Define initialized data (aligned to 4 bytes)
.align 2
welcome_msg: .asciz "System Initialized!\n"
.align 2                     // Realign: the string above is 21 bytes long
version: .word 0x01020304    // Version number: 1.2.3.4
.section .text
.global _start
_start:
    // Set stack pointer
    ldr x0, =stack_top
    mov sp, x0
    // Print welcome message (uart_print is assumed to be defined elsewhere)
    ldr x0, =welcome_msg
    bl uart_print
    // Read version number and print
    ldr x1, =version
    ldr w2, [x1]
    bl print_hex
halt: b halt
// Print hexadecimal function
print_hex:
    // Implementation omitted (convert w2 value to ASCII and output)
    ret
// Stack space definition
.section .bss
.align 12                    // 4096-byte alignment (2^12)
stack_bottom: .space 4096
stack_top:
Pseudo Instruction Analysis:
- `.section .data`: Switches to the data segment.
- `.align 2`: Aligns data to 4 bytes (2^2 = 4).
- `.asciz`: Defines a string ending with `\0`.
- `.word`: Defines a 32-bit integer.
- `.global _start`: Declares the entry symbol as globally visible.
- `.section .bss`: Defines an uninitialized segment for stack space.
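The example calls `uart_print` without defining it. One possible implementation is sketched below, assuming that the UART at `0x9000000` is a PL011 (as on QEMU's `virt` machine) and that the string address is passed in `x0`. The register offsets, the bit number, and the routine itself are assumptions for illustration, not part of the course material.

```asm
        .equ    UART_BASE, 0x9000000    // assumed PL011 base address
        .equ    UART_DR,   0x00         // assumed: data register offset
        .equ    UART_FR,   0x18         // assumed: flag register offset
        .equ    FR_TXFF,   5            // assumed: "transmit FIFO full" bit in UART_FR

        .text
        .global uart_print
// x0 = address of a NUL-terminated string; clobbers x0-x3
uart_print:
        ldr     x1, =UART_BASE
1:      ldrb    w2, [x0], #1            // load next character, advance pointer
        cbz     w2, 3f                  // stop at the terminating NUL
2:      ldr     w3, [x1, #UART_FR]
        tbnz    w3, #FR_TXFF, 2b        // wait while the transmit FIFO is full
        str     w2, [x1, #UART_DR]      // send the character
        b       1b
3:      ret
```

The numeric labels `1:`, `2:`, and `3:` are GAS local labels; `2b` refers to the nearest `2:` backwards and `3f` to the nearest `3:` forwards.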
6. Advanced Pseudo Instruction Techniques
1. Linker Script Collaboration
By using pseudo instructions in conjunction with linker scripts, precise control of memory layout can be achieved:
// Define symbols pointing to specific addresses
// "aw" = allocatable and writable; without flags, an unrecognized section name is not loaded into memory
.section .special_section, "aw"
.global __my_symbol__
__my_symbol__ = .        // Assign current address to symbol
.word 0xDEADBEEF
Corresponding linker script fragment (`link.ld`), placed inside the `SECTIONS` command:
.special_section 0x8000 : { *(.special_section) }
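Symbols can also flow the other way: a linker script may define symbols that the assembly code consumes. The sketch below zeroes `.bss` at startup, assuming the linker script defines `__bss_start` and `__bss_end` on 8-byte boundaries. Both names are hypothetical and do not appear in the script above.

```asm
        .text
        .global clear_bss
clear_bss:
        ldr     x0, =__bss_start        // hypothetical symbol from the linker script
        ldr     x1, =__bss_end          // hypothetical symbol from the linker script
1:      cmp     x0, x1
        b.hs    2f                      // done once x0 >= x1
        str     xzr, [x0], #8           // store 8 zero bytes, advance pointer
        b       1b
2:      ret
```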
2. Debug Information Embedding
Use pseudo instructions to emit source-line debug information (requires debugger support):
.file 1 "main.s"    // Register the source file as file number 1
.loc 1 5            // Map the following code to file 1, line 5
nop
7. Summary and Exercises
Core Pseudo Instruction Quick Reference Table
Pseudo Instruction | Function | Example |
---|---|---|
`.text` | Switches to the code segment | `.text` |
`.data` | Switches to the initialized data segment | `.data` |
`.asciz` | Defines a string with a trailing 0 | `.asciz "Hello"` |
`.global` | Declares a global symbol | `.global main` |
`.equ` | Defines a constant | `.equ SIZE, 64` |
`.align` | Address alignment | `.align 3` (8-byte alignment) |
`.macro` | Defines a macro | `.macro ADD a, b` |
Exercises:
- Use `.rept` to generate an array containing 10 instances of 0x1234.
- Write a macro `DELAY_MS` that generates a delay loop for the number of milliseconds given as its parameter.
- Combine with the linker script to fix key data at memory address `0x10000`.
References:
https://sourceware.org/binutils/docs/as/AArch64-Directives.html
https://sourceware.org/binutils/docs/as/Pseudo-Ops.html