Understanding ARM Assembly and ARM GNU Assembly

1. Two Ways of ARM Assembly Development

ARM assembly development means writing programs for ARM processors directly in the assembly instructions defined by ARM.

There are two ways to do ARM assembly development: ARM assembly and ARM GNU assembly. The assembly instructions are exactly the same in both; what differs is the macro instructions, pseudo-instructions, and pseudo-operations. These differences exist because the two methods use different build tools.

ARM assembly is built with the tools developed by ARM (the armasm assembler), while ARM GNU assembly is built with the GNU toolchain for the ARM instruction set, often referred to as arm-gcc.

2. ARM Compilation Development Environment

Two ARM development environments are commonly used:

  • DS-5: An integrated development environment provided by ARM. It builds programs with the toolchain provided by ARM.

  • GNU development environment: Composed of the GNU assembler (as), the cross-compiler (gcc), the linker (ld), and related tools.

3. Pseudo-operations, Macro Instructions, and Pseudo-instructions

Pseudo-operations (directives): Special mnemonics in ARM assembly language programs that prepare for the assembly of the program, for example by defining segments or reserving data. The assembler processes them while assembling the source program; the processor does not execute them at run time. For example, defining a program segment is a pseudo-operation.

Macro instructions: A self-contained block of code, defined through pseudo-operations, that the assembler expands in place wherever the macro is invoked in the source program.

Pseudo-instructions: Special mnemonics that are not real processor instructions. During assembly they are replaced by appropriate ARM machine instructions that carry out the intended operation.
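
The three categories can be seen side by side in one short source file. The following is a minimal sketch in armasm syntax; the names Demo, ClearReg, and start are hypothetical and chosen only for illustration.

```asm
; armasm syntax; Demo, ClearReg and start are illustrative names
        AREA    Demo, CODE, READONLY    ; pseudo-operation: defines a code segment

        MACRO                           ; pseudo-operation: begins a macro definition
        ClearReg $reg                   ; macro prototype with one parameter
        MOV     $reg, #0                ; body, copied in wherever ClearReg is used
        MEND                            ; pseudo-operation: ends the macro definition

start
        ClearReg r1                     ; macro instruction: expands to MOV r1, #0
        LDR     r0, =0x12345678         ; pseudo-instruction: typically becomes a
                                        ; PC-relative load from a literal pool
        B       start                   ; ordinary ARM instruction

        END                             ; pseudo-operation: end of the source file
```

Only the MOV produced by the macro, the load generated for LDR, and the B instruction end up in the output as machine code; AREA, MACRO, MEND, and END are consumed by the assembler.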

4. ARM Assembly Pseudo-operations

| Pseudo-operation | Syntax Format | Function |
| --- | --- | --- |
| GBLA | GBLA Variable | Declare a global arithmetic variable and initialize it to 0 |
| GBLL | GBLL Variable | Declare a global logical variable and initialize it to {FALSE} |
| GBLS | GBLS Variable | Declare a global string variable and initialize it to an empty string |
| LCLA | LCLA Variable | Declare a local arithmetic variable and initialize it to 0 |
| LCLL | LCLL Variable | Declare a local logical variable and initialize it to {FALSE} |
| LCLS | LCLS Variable | Declare a local string variable and initialize it to an empty string |
| SETA | Variable SETA expr | Assign a value to a global or local arithmetic variable |
| SETL | Variable SETL expr | Assign a value to a global or local logical variable |
| SETS | Variable SETS expr | Assign a value to a global or local string variable |
| RLIST | name RLIST {list of registers} | Define a name for a list of general-purpose registers |
| CN | name CN expr | Define a name for a coprocessor register |
| CP | name CP expr | Define a name for a coprocessor |
| DN/SN | name DN/SN expr | Define a name for a double/single precision VFP register |
| FN | name FN expr | Define a name for an FPA floating-point register |
| LTORG | LTORG | Tell the assembler to place the current data buffer pool (literal pool) at this point |
| MAP | MAP expr{,base-register} | Define the starting address of a structured memory table (storage map) |
| FIELD | {label} FIELD expr | Define a data field in a structured memory table |
| SPACE | {label} SPACE expr | Allocate a block of contiguous memory units and initialize it with 0 |
| DCB | {label} DCB expr{,expr}… | Allocate a block of byte memory units and initialize it with expr |
| DCD/DCDU | {label} DCD{U} expr{,expr}… | Allocate a block of word memory units and initialize it with expr (DCD is word-aligned; DCDU does not enforce alignment) |
| DCDO | {label} DCDO expr{,expr}… | Allocate a block of word-aligned word memory units and initialize it with the offset of expr from the static base register |
| DCFD/DCFDU | {label} DCFD{U} fpliteral{,fpliteral}… | Allocate word-aligned memory units for double precision floating-point numbers (DCFDU does not enforce alignment) |
| DCFS/DCFSU | {label} DCFS{U} fpliteral{,fpliteral}… | Allocate word-aligned memory units for single precision floating-point numbers (DCFSU does not enforce alignment) |
| DCI | {label} DCI expr{,expr}… | In ARM code, allocate word-aligned word memory units filled with expr (binary instruction code); in THUMB code, allocate half-word aligned half-word memory units |
| DCQ/DCQU | {label} DCQ{U} {-}literal{,{-}literal}… | Allocate memory in double words of 8 bytes (DCQ is word-aligned; DCQU does not enforce alignment) |
| DCW/DCWU | {label} DCW{U} expr{,expr}… | Allocate half-word memory units (DCW is half-word aligned; DCWU does not enforce alignment) |

1. AREA

Creates a new code or data segment.

Format: AREA name{,attr}{,attr}…

Where name is the segment name and each attr is an attribute of the segment.

The attributes include the following:

  • CODE: Used to define a code segment, default is READONLY.

  • DATA: Used to define a data segment, default is READWRITE.

  • READONLY: Specifies that the contents of this segment are read-only.

  • READWRITE: Specifies that the contents of this segment are readable and writable.

  • ALIGN: Specifies the segment alignment as a power of 2. It is written ALIGN=expr, and the segment is aligned on a 2^expr-byte boundary.

  • COMMON: Defines a common segment. It does not contain any user code and data. COMMON segments with the same name in different source files share the same storage unit.
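
As an illustration of these attributes, here is a minimal sketch in armasm syntax. The segment names Init and Vars and the labels are hypothetical.

```asm
; armasm syntax; Init, Vars, start, stop and counter are illustrative names
        AREA    Init, CODE, READONLY, ALIGN=3   ; code segment on a 2^3 = 8-byte boundary
start
        MOV     r0, #0
stop    B       stop                            ; loop forever

        AREA    Vars, DATA, READWRITE           ; writable data segment
counter DCD     0                               ; one word, initialized to 0

        END
```

Each AREA line ends the previous segment and starts a new one, so a single source file can contribute to several segments.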

2. ALIGN

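Pads the current location up to the specified alignment boundary.

ALIGN 4 indicates 4-byte address alignment.

ALIGN 8 indicates 8-byte address alignment.

Note: ALIGN inside AREA and ALIGN on its own are written and calculated differently. As an AREA attribute it is written ALIGN=n and means alignment to 2^n bytes; as a standalone pseudo-operation it is written ALIGN n and means alignment to n bytes, where n is a power of 2.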
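
The two forms can be compared in a short sketch in armasm syntax. The names AlignDemo, flag, and buffer are hypothetical. The segment is declared with ALIGN=3 because a standalone ALIGN is relative to the start of the segment, so the segment itself needs to be at least as strictly aligned.

```asm
; armasm syntax; AlignDemo, flag and buffer are illustrative names
        AREA    AlignDemo, DATA, READWRITE, ALIGN=3 ; attribute form: 2^3 = 8-byte boundary
flag    DCB     1               ; one byte, so the next location is not 8-byte aligned
        ALIGN   8               ; standalone form: pad to the next 8-byte boundary
buffer  SPACE   16              ; buffer therefore starts on an 8-byte boundary
        END
```

Both ALIGN=3 and ALIGN 8 describe the same 8-byte alignment; only the notation differs.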

3. ENTRY

Specifies the entry point of the assembly program.

A program must have at least one entry point and may have several, but a single source file can contain at most one ENTRY. When several source files contain ENTRY, the actual entry point of the program is chosen at link time.

4. END

Indicates the end of the source program.

Assembly language source files must therefore end with END; when the assembler encounters END, it stops assembling the file.

5. EXPORT

Format: EXPORT label [,WEAK]

Declares a global label that can be used by other source files. WEAK indicates that if another label with the same name exists, the other label takes priority.

6. IMPORT

Format: IMPORT label [,WEAK]

Declares that the label is defined in another source file and is only referenced in the current file. WEAK indicates that no error is reported if the label cannot be found; in that case the label is generally set to 0, and a B or BL instruction that uses the label is turned into a NOP.

The label is added to the symbol table of the current source file, whether or not the file actually references it.

7. EXTERN

Similar to IMPORT, except that if the current file does not reference the label, the label is not added to the symbol table of the current source file.
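
A minimal sketch of how EXPORT and IMPORT pair up across two source files, in armasm syntax. The file names lib.s and main.s and the label add_one are hypothetical, and the WEAK option is left out to keep the example small.

```asm
; lib.s (hypothetical file): defines and exports a routine
        AREA    Lib, CODE, READONLY
        EXPORT  add_one             ; make add_one visible to other source files
add_one
        ADD     r0, r0, #1          ; r0 = r0 + 1
        BX      lr                  ; return to the caller
        END
```

```asm
; main.s (hypothetical file): imports and calls the routine
        AREA    Main, CODE, READONLY
        IMPORT  add_one             ; add_one is defined in another source file
        ENTRY                       ; entry point of the program
start
        MOV     r0, #41
        BL      add_one             ; resolved by the linker
stop    B       stop                ; loop forever
        END
```

The assembler leaves the BL target unresolved in main.s; the linker fills it in when both object files are linked together.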

8. GET (or INCLUDE)

Includes a source file into the current source file.

9. EQU

Gives a symbolic name to a constant value.

Format: name EQU expression

Where name is the symbol name, and expression is a register-relative address, a program-relative address, an absolute address, or a 32-bit integer constant.

For example:

num EQU 2 ; assigns the value 2 to the symbol num.

EQU is similar to using #define in C to define a constant.

10. SPACE

Allocates a block of contiguous memory units and initializes it with 0. SPACE can be replaced with %.

Format: {label} SPACE expr

label: an optional label.

expr: the number of bytes to allocate.

For example:

stack SPACE 100 ; allocates 100 bytes of memory and initializes them with 0. The label stack is the starting address of this space.

11. DCB

Allocates a run of byte memory units and initializes them with the expr values given in the pseudo-operation.

Format: {label} DCB expr{,expr}…

label: an optional label.

expr: a value from -128 to 255, or a string.

For example:

string DCB "HELLO" ; allocates space for the string HELLO; the label string is the starting address of this space.

12. DCD and DCDU

Allocates a run of word memory units and initializes them with the expr values given in the pseudo-operation. Memory allocated with DCD is word-aligned; DCDU does not enforce word alignment. DCD can be replaced with &.

Format: {label} DCD{U} expr{,expr}…

label: an optional label, indicating the starting address of the memory units.

expr: a numeric expression or a label in the program.

For example:

data DCD 1,2,3,4 ; allocates word-aligned word units, initialized to 1, 2, 3, 4.
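
The data-definition pseudo-operations above are usually combined inside one data segment. The following is a small sketch in armasm syntax; the segment name Tables and the symbol COUNT are hypothetical, and the explicit 0 after "HELLO" is an addition made here to show that DCB does not terminate strings on its own.

```asm
; armasm syntax; Tables and COUNT are illustrative names
        AREA    Tables, DATA, READWRITE
COUNT   EQU     4                   ; assembly-time constant, occupies no memory
stack   SPACE   100                 ; 100 bytes, initialized with 0
string  DCB     "HELLO", 0          ; 5 characters plus an explicit zero terminator
data    DCD     1, 2, 3, COUNT      ; four word-aligned words: 1, 2, 3, 4
        END
```

Because DCD is word-aligned, the assembler inserts padding after the 6 bytes of string so that data starts on a 4-byte boundary.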

5. ARM Assembly Pseudo-instructions

ARM pseudo-instructions include: ADR, ADRL, LDR, NOP.

THUMB pseudo-instructions include: ADR, LDR, NOP.

| Pseudo-instruction | Syntax Format | Function |
| --- | --- | --- |
| ADR | ADR{cond} register, expr | Loads a PC-relative or register-relative address into a register. Short-range address load. |
| ADRL | ADRL{cond} register, expr | Loads a PC-relative or register-relative address into a register. Medium-range address load. |
| LDR | LDR{cond} register, =expr or LDR{cond} register, =label | Loads a 32-bit immediate value or an address into a register. Long-range address load. |
| NOP | NOP | Replaced during assembly by an instruction that has no effect (for example MOV R0, R0 in ARM code). |
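
The following sketch, in armasm syntax for ARM (not THUMB) code, shows the four pseudo-instructions in use. The labels table and far_data and the 0x1000-byte gap are hypothetical; the comments describe what an assembler typically generates, which can vary with the target architecture.

```asm
; armasm syntax, ARM state; all labels are illustrative names
        AREA    PseudoDemo, CODE, READONLY
start
        ADR     r0, table           ; short range: a single ADD or SUB based on pc
        ADRL    r1, far_data        ; medium range: always two instructions
        LDR     r2, =0xFF           ; fits an immediate: typically MOV r2, #0xFF
        LDR     r3, =0x12345678     ; does not fit: loaded from the literal pool
        LDR     r4, =table          ; address of a label, also via the literal pool
        NOP                         ; instruction with no effect
stop    B       stop                ; loop forever

        LTORG                       ; place the literal pool here, after the branch
table   DCD     1, 2, 3
        SPACE   0x1000              ; gap so far_data is beyond the short range
far_data
        DCD     4
        END
```

Placing LTORG after an unconditional branch keeps the pool out of the instruction stream, so the processor never tries to execute the constants.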

6. ARM GNU Compilation Environment

| Pseudo-operation | Syntax Format | Function |
| --- | --- | --- |
| .byte | .byte expr{,expr}… | Allocates byte memory units and initializes them with expr. |
| .hword/.short | .hword expr{,expr}… | Allocates half-word memory units and initializes them with expr. |
| .ascii | .ascii expr{,expr}… | Defines the string expr (no terminator is added). |
| .asciz/.string | .asciz expr{,expr}… | Defines the string expr (adds \0 as a terminator). |
| .float/.single | .float expr{,expr}… | Defines the 32-bit IEEE floating-point number expr. |
| .double | .double expr{,expr}… | Defines the 64-bit IEEE floating-point number expr. |
| .word/.long/.int | .word expr{,expr}… | Allocates word memory units and initializes them with expr. |
| .fill | .fill repeat{,size}{,value} | Allocates memory filled with repeat copies of value, each size bytes long. |
| .zero | .zero size | Allocates size bytes of memory and fills them with 0. |
| .space/.skip | .space size{,value} | Allocates size bytes of memory and fills each byte with value. |
| .section | .section name | Defines a section. |
| .text | .text {subsection} | Code segment. |
| .data | .data {subsection} | Data segment. |
| .bss | .bss {subsection} | bss segment. |
| .code 16/.thumb | .code 16/.thumb | Indicates that subsequent assembly instructions use the THUMB instruction set. |
| .code 32/.arm | .code 32/.arm | Indicates that subsequent assembly instructions use the ARM instruction set. |
| .end | .end | Marks the end of the assembly file. |
| .include | .include "filename" | Includes a source file into the current source file. |
| .align/.balign | .align {alignment}{,fill}{,max} | Inserts fill bytes until the required alignment is reached. On ARM, .align n aligns to 2^n bytes, while .balign n aligns to n bytes. |
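
A minimal sketch of these pseudo-operations in GNU as syntax for 32-bit ARM. All symbol names are hypothetical, and _start is used only because it is the conventional default entry symbol for the GNU linker.

```asm
@ GNU as syntax for 32-bit ARM; all symbol names are illustrative
        .section .rodata
msg:    .asciz  "HELLO"         @ 6 bytes: the string plus a \0 terminator
        .balign 4               @ pad to the next 4-byte boundary
table:  .word   1, 2, 3, 4      @ four 32-bit words

        .data
count:  .word   0               @ one writable word
pad:    .fill   3, 2, 0xFFFF    @ three 2-byte units, each holding 0xFFFF

        .bss
buffer: .space  64              @ 64 bytes of uninitialized storage

        .text
        .arm                    @ same as .code 32
        .global _start
_start:
        ldr     r0, =table      @ pseudo-instruction: address of table
        ldr     r1, [r0]        @ load the first word of table
1:      b       1b              @ loop forever
        .end
```

Labels end with a colon and comments start with @, which are two of the symbol-level differences from ARM assembly.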

7. Differences Between the Two Development Environments

The assembly code under the two development environments has many differences, mainly in symbols and pseudo-operations.

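| Pseudo-operations of ARM Assembly | Pseudo-operations of GNU Assembly |
| --- | --- |
| INCLUDE | .include |
| NUM EQU 25 | .equ NUM, 25 |
| EXPORT | .global |
| IMPORT | .extern |
| DCD | .long |
| IF :DEF: | .ifdef |
| ELSE | .else |
| ENDIF | .endif |
| OR | \| |
| SHL | << |
| RN | .req |
| GBLA | .global |
| NUM SETA 16 | .equ NUM, 16 |
| MACRO | .macro |
| MEND | .endm |
| END | .end |
| AREA WORD, CODE, READONLY | .text |
| AREA BLOCK, DATA, READWRITE | .data |
| CODE32 | .arm |
| CODE16 | .thumb |
| LTORG | .ltorg |
| % | .fill |
| ENTRY | ENTRY: |
| ldr x0,=0xff | ldr x0,=0xff |

Note: GBLA and .global are only a loose match. GBLA declares an assembly-time variable, whereas .global makes a symbol visible to the linker; GNU as needs no separate declaration before .equ or .set defines a symbol.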
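
To make the correspondence concrete, here is the same small fragment written for both environments. This is a sketch, not a complete program; the names NUM, get_num, values, and buf are hypothetical, and SPACE is used in place of its synonym %.

```asm
; ARM assembly (armasm syntax)
NUM     EQU     25
        AREA    WORD, CODE, READONLY
        EXPORT  get_num
        CODE32
get_num
        LDR     r0, =NUM            ; r0 = 25
        BX      lr
        LTORG

        AREA    BLOCK, DATA, READWRITE
values  DCD     1, 2
buf     SPACE   16                  ; 16 zeroed bytes
        END
```

```asm
@ ARM GNU assembly (GNU as syntax)
        .equ    NUM, 25
        .text
        .global get_num
        .arm
get_num:
        ldr     r0, =NUM            @ r0 = 25
        bx      lr
        .ltorg

        .data
values: .long   1, 2
buf:    .fill   16, 1, 0            @ 16 bytes, each set to 0
        .end
```

The LDR and BX instructions are identical apart from letter case; every other line changes because it is a pseudo-operation handled by the assembler rather than the processor.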
