Fundamentals of Machine Language and Assembly Language

1. Basics of Machine Language

Composition of Machine Instructions

Machine instructions are binary codes that the CPU can execute directly, consisting of two parts:

  • Opcode: Specifies the operation to be performed (such as addition, subtraction, transfer, etc.)
  • Operand: Specifies the data to operate on and where the result is stored (registers, memory addresses, etc.)

Characteristics of Machine Language

  1. CPU Dependency: Different CPU architectures have different machine instruction sets
  2. Binary Format: Represented directly in binary code
  3. High Execution Efficiency: Can be executed directly by the CPU without translation
  4. Poor Readability: Hard for humans to read and write, so it is usually displayed in hexadecimal

Example of Machine Language

Below is a short Intel 8086 machine language snippet that adds two bytes. Each 16-bit address is stored low byte first (little-endian), so 2200H is encoded as 00 22:

A0 00 22   ; MOV AL, [2200H]  - Load the content at address 2200H into AL register
02 06 01 22 ; ADD AL, [2201H]  - Add the content at address 2201H to AL
A2 02 22   ; MOV [2202H], AL  - Store the value of AL at address 2202H
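To make the opcode/operand split concrete, here is a minimal Python sketch that executes the three instructions above. It models only these three encodings and a flat 64 KiB memory with no segments. The function name `run` and the input values 25 and 30 are assumptions chosen for illustration.

```python
# Toy interpreter for the three 8086 instructions in the listing above.
# Only opcodes A0, 02 06 and A2 are modelled; this is not a general decoder.

def run(code: bytes, memory: bytearray) -> int:
    """Execute the byte sequence and return the final value of AL."""
    al = 0
    pc = 0  # offset of the next instruction byte
    while pc < len(code):
        opcode = code[pc]
        if opcode == 0xA0:                              # MOV AL, [addr16]
            addr = code[pc + 1] | (code[pc + 2] << 8)   # little-endian operand
            al = memory[addr]
            pc += 3
        elif opcode == 0x02 and code[pc + 1] == 0x06:   # ADD AL, [addr16]
            addr = code[pc + 2] | (code[pc + 3] << 8)
            al = (al + memory[addr]) & 0xFF             # AL is 8 bits wide
            pc += 4
        elif opcode == 0xA2:                            # MOV [addr16], AL
            addr = code[pc + 1] | (code[pc + 2] << 8)
            memory[addr] = al
            pc += 3
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x} at offset {pc}")
    return al


program = bytes.fromhex("A00022" "02060122" "A20222")
memory = bytearray(0x10000)   # 64 KiB of zeroed memory
memory[0x2200] = 25           # hypothetical input values
memory[0x2201] = 30
final_al = run(program, memory)
print(final_al, memory[0x2202])   # prints: 55 55
```

Because AL holds only 8 bits, sums above 255 wrap around in this model, just as they would in the register.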

2. Basics of Assembly Language

Relationship Between Assembly Language and Machine Language

Assembly language is a symbolic representation of machine language, using mnemonics to replace binary opcodes and symbols to replace operand addresses. Assembly language needs to be converted to machine language by an assembler before it can be executed.
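The translation step can be sketched as a lookup from mnemonic and operand forms to opcode bytes. The toy "assembler" below handles only the three instruction forms from the 8086 example in section 1. The names `ENCODINGS` and `assemble_line` are hypothetical, and a real assembler such as NASM supports far more forms plus labels and directives.

```python
import re

# Encodings for the three instruction forms used in the 8086 example.
# The table is deliberately tiny; it is an illustration, not a real assembler.
ENCODINGS = {
    ("MOV", "AL", "MEM"): bytes([0xA0]),        # MOV AL, [addr16]
    ("ADD", "AL", "MEM"): bytes([0x02, 0x06]),  # ADD AL, [addr16]
    ("MOV", "MEM", "AL"): bytes([0xA2]),        # MOV [addr16], AL
}

MEM_OPERAND = re.compile(r"^\[([0-9A-F]+)H\]$")


def assemble_line(line: str) -> bytes:
    """Translate one instruction such as 'MOV AL, [2200H]' into machine code."""
    mnemonic, operands = line.strip().upper().split(None, 1)
    kinds, address = [], None
    for operand in (part.strip() for part in operands.split(",")):
        match = MEM_OPERAND.match(operand)
        if match:
            kinds.append("MEM")
            address = int(match.group(1), 16)
        else:
            kinds.append(operand)
    prefix = ENCODINGS[(mnemonic, *kinds)]        # KeyError for unsupported forms
    return prefix + address.to_bytes(2, "little")  # address stored low byte first


print(assemble_line("MOV AL, [2200H]").hex())   # prints: a00022
```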

Basic Structure of Assembly Language

section .data       ; Data segment definition
    num1 db 25      ; Define byte variable num1 with value 25
    num2 db 30      ; Define byte variable num2 with value 30
    result db 0     ; Define result variable, initialized to 0

section .text       ; Code segment
    global _start   ; Export _start (the program entry point) to the linker

_start:
    mov al, [num1]  ; Load the value of num1 into AL register
    add al, [num2]  ; Add the value of num2 to AL register
    mov [result], al ; Store the result in result variable

    ; Exit program
    mov eax, 1      ; System call number 1 indicates exit
    mov ebx, 0      ; Return code 0
    int 0x80        ; Call kernel

Common Assembly Instructions

Instruction   Function             Example
MOV           Data transfer        MOV AX, BX
ADD           Addition             ADD AX, 10
SUB           Subtraction          SUB CX, DX
INC           Increment by 1       INC AL
DEC           Decrement by 1       DEC BL
JMP           Unconditional jump   JMP label
CMP           Comparison           CMP AX, BX
INT           Interrupt call       INT 0x80

3. Assembly Language Programming Examples

Example 1: Adding Two Numbers

; Example of adding two numbers
section .data
    num1 db 15      ; First number
    num2 db 27      ; Second number
    sum db 0        ; Store sum

section .text
    global _start

_start:
    mov al, [num1]  ; Load num1 into AL
    add al, [num2]  ; Add num2
    mov [sum], al   ; Store result

    ; Exit program
    mov eax, 1      ; System call number 1 (exit)
    mov ebx, 0      ; Return code 0
    int 0x80        ; Call kernel

Example 2: Summation Loop

; Calculate the sum from 1 to 10
section .data
    count equ 10    ; Number of iterations
    total dw 0      ; Store total sum

section .text
    global _start

_start:
    mov ecx, count  ; Set loop counter (LOOP uses all of ECX in 32-bit code)
    mov ax, 0       ; Initialize total sum to 0

sum_loop:
    add ax, cx      ; Add CX (low 16 bits of the counter) to AX
    loop sum_loop   ; Decrement ECX, loop if not zero

    mov [total], ax ; Store result

    ; Exit program
    mov eax, 1
    mov ebx, 0
    int 0x80
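The behaviour of LOOP is easy to misread: it decrements the counter register first and jumps only if the result is nonzero. The Python model below is a sketch of that rule. The function name `run_loop` and its `bits` parameter are hypothetical. The parameter stands for the counter width: 16 for CX in 16-bit code, 32 for ECX in 32-bit code.

```python
def run_loop(count: int, bits: int = 32) -> tuple:
    """Model `add ax, cx` followed by `loop`; return (AX, iterations)."""
    mask = (1 << bits) - 1
    counter = count & mask
    ax = 0
    iterations = 0
    while True:
        ax = (ax + (counter & 0xFFFF)) & 0xFFFF   # add ax, cx (AX is 16 bits)
        iterations += 1
        counter = (counter - 1) & mask            # LOOP decrements first...
        if counter == 0:                          # ...and falls through at zero
            break
    return ax, iterations


print(run_loop(10))   # prints: (55, 10)
```

One consequence of decrementing before testing: a counter that starts at 0 wraps around instead of skipping the loop, so `run_loop(0, bits=16)` reports 65536 iterations.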

Example 3: Conditional Judgment

; Compare two numbers
section .data
    numA db 50
    numB db 30
    msg1 db 'A >= B', 0xa
    len1 equ $ - msg1
    msg2 db 'A < B', 0xa
    len2 equ $ - msg2

section .text
    global _start

_start:
    mov al, [numA]
    cmp al, [numB]  ; Compare numA and numB
    jge greater     ; Jump if numA >= numB (signed comparison)

    ; Output "A < B"
    mov eax, 4      ; System call number 4 (write)
    mov ebx, 1      ; File descriptor 1 (stdout)
    mov ecx, msg2
    mov edx, len2
    int 0x80
    jmp exit

greater:
    ; Output "A >= B"
    mov eax, 4
    mov ebx, 1
    mov ecx, msg1
    mov edx, len1
    int 0x80

exit:
    ; Exit program
    mov eax, 1
    mov ebx, 0
    int 0x80
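CMP does not branch by itself. It subtracts the second operand from the first, discards the result, and sets flags that a following conditional jump tests. The sketch below models this for 8-bit operands. The helper names `cmp8`, `jge_taken`, and `jae_taken` are hypothetical. JAE is included only to contrast the signed test with the unsigned one.

```python
def cmp8(a: int, b: int) -> dict:
    """Return the flags an 8-bit `cmp a, b` would set (it computes a - b)."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    return {
        "ZF": result == 0,                          # operands are equal
        "SF": bool(result & 0x80),                  # sign bit of the result
        "CF": a < b,                                # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & 0x80),  # signed overflow
    }


def jge_taken(a: int, b: int) -> bool:
    """JGE jumps when SF == OF, i.e. a >= b when both are read as signed bytes."""
    flags = cmp8(a, b)
    return flags["SF"] == flags["OF"]


def jae_taken(a: int, b: int) -> bool:
    """JAE jumps when CF == 0, i.e. a >= b when both are read as unsigned bytes."""
    return not cmp8(a, b)["CF"]


print(jge_taken(50, 30))    # prints: True
print(jge_taken(200, 30))   # prints: False (200 is -56 as a signed byte)
print(jae_taken(200, 30))   # prints: True
```

For the values 50 and 30 used in the example, both interpretations agree. They diverge only once a byte value reaches 128 or more.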

4. Assembly Language Development Process

  1. Write Assembly Code: Use a text editor to write .asm files
  2. Assemble: Use an assembler (here NASM) to convert the source code into an object file
    nasm -f elf32 program.asm -o program.o
    
  3. Link: Use a linker (ld) to turn the object file into an executable file
    ld -m elf_i386 program.o -o program
    
  4. Run: Execute the generated executable file
    ./program
    

5. Basics of Registers

General Purpose Registers

32-bit   16-bit   8-bit High   8-bit Low   Usage
EAX      AX       AH           AL          Accumulator
EBX      BX       BH           BL          Base Register
ECX      CX       CH           CL          Counter
EDX      DX       DH           DL          Data Register
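The names in each row of the table are views of the same storage, not separate registers. AX is the low 16 bits of EAX, and AH and AL are the high and low bytes of AX. A small Python model, with a hypothetical class `Register32` and attribute names chosen for illustration, shows how a write through one view changes the others.

```python
class Register32:
    """A 32-bit register with 16-bit and 8-bit views (e.g. EAX / AX / AH / AL)."""

    def __init__(self, value: int = 0):
        self.full = value & 0xFFFFFFFF          # all 32 bits (EAX)

    @property
    def word(self) -> int:                      # bits 0-15 (AX)
        return self.full & 0xFFFF

    @word.setter
    def word(self, value: int) -> None:
        self.full = (self.full & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def high(self) -> int:                      # bits 8-15 (AH)
        return (self.full >> 8) & 0xFF

    @high.setter
    def high(self, value: int) -> None:
        self.full = (self.full & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def low(self) -> int:                       # bits 0-7 (AL)
        return self.full & 0xFF

    @low.setter
    def low(self, value: int) -> None:
        self.full = (self.full & 0xFFFFFF00) | (value & 0xFF)


eax = Register32(0x12345678)
eax.low = 0xFF     # like `mov al, 0xFF`
eax.high = 0x00    # like `mov ah, 0`
print(hex(eax.full), hex(eax.word))   # prints: 0x123400ff 0xff
```

Note that the upper 16 bits of the register are untouched by 8-bit and 16-bit writes in this model, which matches 32-bit x86 behaviour.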

Segment Registers

Register   Usage
CS         Code Segment
DS         Data Segment
SS         Stack Segment
ES         Extra Segment

6. Addressing Modes

  1. Immediate Addressing: The operand is a constant

    mov ax, 1234h
    
  2. Register Addressing: The operand is in a register

    mov bx, ax
    
  3. Direct Memory Addressing: The operand is in memory, with the address given directly

    mov al, [2000h]
    
  4. Register Indirect Addressing: The operand address is in a register

    mov ax, [bx]
    
  5. Base-Indexed Addressing: Operand address = base register + index register

    mov ax, [bx+si]
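The five modes differ only in where the source operand comes from. The Python sketch below resolves the source operand of each example instruction against a flat 64 KiB memory, ignoring segment registers. The register contents and memory values are hypothetical, and `read16` is a helper introduced for illustration.

```python
def read16(memory: bytearray, addr: int) -> int:
    """Read a little-endian 16-bit word from memory."""
    return memory[addr] | (memory[addr + 1] << 8)


# Hypothetical machine state, chosen only for this illustration.
regs = {"ax": 0x00AA, "bx": 0x2000, "si": 0x0004}
memory = bytearray(0x10000)
memory[0x2000:0x2002] = (0xBEEF).to_bytes(2, "little")
memory[0x2004:0x2006] = (0xCAFE).to_bytes(2, "little")

# Source operand of each example instruction:
immediate = 0x1234                        # mov ax, 1234h   -> constant in the instruction
register = regs["ax"]                     # mov bx, ax      -> value held in AX
direct = memory[0x2000]                   # mov al, [2000h] -> byte at a fixed address
indirect = read16(memory, regs["bx"])     # mov ax, [bx]    -> word at the address in BX
base_indexed = read16(                    # mov ax, [bx+si] -> word at BX + SI
    memory, (regs["bx"] + regs["si"]) & 0xFFFF
)

print(hex(direct), hex(indirect), hex(base_indexed))   # prints: 0xef 0xbeef 0xcafe
```

The direct read returns 0xEF rather than 0xBE because the word 0xBEEF is stored low byte first and `mov al, [2000h]` loads a single byte.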
    

7. Stack Operations

push ax   ; Push AX onto the stack
pop bx    ; Pop the top of the stack into BX

; Function call example
call function  ; Call function
...
function:
    push bp     ; Save base pointer
    mov bp, sp  ; Set new base pointer
    ; Function body
    pop bp      ; Restore base pointer
    ret         ; Return
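What PUSH, POP, CALL, and RET do to the stack pointer can be traced with a small model. The sketch below assumes a 16-bit stack that grows toward lower addresses, as on the 8086, and ignores the SS segment. The class `Stack16`, the starting SP, the return address, and the caller's BP are hypothetical values.

```python
class Stack16:
    """16-bit stack that grows downward (segment registers ignored)."""

    def __init__(self, sp: int = 0xFFFE, size: int = 0x10000):
        self.memory = bytearray(size)
        self.sp = sp

    def push(self, value: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF            # SP moves down first
        self.memory[self.sp:self.sp + 2] = (value & 0xFFFF).to_bytes(2, "little")

    def pop(self) -> int:
        value = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")
        self.sp = (self.sp + 2) & 0xFFFF            # then SP moves back up
        return value


stack = Stack16()
start_sp = stack.sp
return_address = 0x0103   # hypothetical address of the instruction after `call`
bp = 0x1111               # hypothetical BP value belonging to the caller

stack.push(return_address)   # call function: push the return address, then jump
stack.push(bp)               # push bp
bp = stack.sp                # mov bp, sp
# ... function body would run here ...
bp = stack.pop()             # pop bp
ip = stack.pop()             # ret: pop the return address into IP

print(hex(bp), hex(ip), stack.sp == start_sp)   # prints: 0x1111 0x103 True
```

The stack pointer ends where it started, which is why every push in a function needs a matching pop before `ret`.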

8. System Call Example

; String output example
section .data
    msg db 'Hello, Assembly!', 0xa
    len equ $ - msg

section .text
    global _start

_start:
    ; Output string
    mov eax, 4      ; System call number 4 (write)
    mov ebx, 1      ; File descriptor 1 (stdout)
    mov ecx, msg    ; String address
    mov edx, len    ; String length
    int 0x80        ; Call kernel

    ; Exit program
    mov eax, 1      ; System call number 1 (exit)
    mov ebx, 0      ; Return code 0
    int 0x80
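The register setup before each `int 0x80` follows a fixed pattern: EAX selects the system call, and EBX, ECX, and EDX carry its first three arguments. The Python sketch below mirrors that mapping for the two calls used here. The function `int80`, the dictionary-based register file, and the address 16 chosen for `msg` are assumptions for illustration. A real exit would terminate the process rather than record a code.

```python
import os

SYS_EXIT, SYS_WRITE = 1, 4   # 32-bit Linux system call numbers used in the listing


def int80(regs: dict, memory: bytearray) -> dict:
    """Dispatch on EAX the way the two system calls in the listing do."""
    if regs["eax"] == SYS_WRITE:
        fd, addr, length = regs["ebx"], regs["ecx"], regs["edx"]
        # write(fd, buf, count); the number of bytes written comes back in EAX.
        regs["eax"] = os.write(fd, bytes(memory[addr:addr + length]))
    elif regs["eax"] == SYS_EXIT:
        regs["exit_code"] = regs["ebx"]   # a real exit ends the process here
    else:
        raise ValueError(f"unsupported system call {regs['eax']}")
    return regs


msg = b"Hello, Assembly!\n"
memory = bytearray(64)
memory[16:16 + len(msg)] = msg   # place msg at hypothetical address 16

regs = int80({"eax": SYS_WRITE, "ebx": 1, "ecx": 16, "edx": len(msg)}, memory)
regs = int80({"eax": SYS_EXIT, "ebx": 0}, memory)
```

Running the sketch writes the greeting to standard output through file descriptor 1, the same descriptor the assembly version passes in EBX.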

Assembly language is harder to write than high-level languages, but understanding it shows developers how a computer works underneath, which matters for system programming, performance optimization, and embedded development.
