Part1 Definition of Content
1.1 Definition of Data Segment
Assembly language programs are written in segments, generally defining data in the data segment and the program in the code segment. The syntax for defining a segment is as follows:
segment_name SEGMENT
...(content of the segment)...
segment_name ENDS
Notes:
Assembly language does not distinguish between uppercase and lowercase letters;
In assembly language, a line can only contain one statement;
The name of the segment must start with a letter or underscore, be meaningful, and not conflict with reserved words;
In assembly language, a comment begins with a semicolon (the half-width ASCII character) and runs to the end of the line;
One segment cannot define another segment within it, meaning segments are independent of each other.
1.2 Definition of Data
Defining data means allocating storage units for the given data and storing the data in the data segment in a standard format. The pseudo-instructions used to define data include DB, DW, DD, DQ, and DT.
1.2.1 Define Byte Data DB
Below is a segment of assembly code:
DATA SEGMENT
X DB -1,255,'A',3+2,?
DB "ABC",0FFH,11001010B
Y DB 3 DUP(?)
DATA ENDS
The following explains the above code segment:
Definition of Variables: X and Y are variable names, meaning the programmer has defined two variables, X and Y. Unlike in high-level languages, a variable in assembly language stands for the address of the first data item that follows it, and the items after that one are reached through the same name;
Definition of Byte Data: DB indicates that the defined data is of byte type. DB can define integers (positive or negative, written in decimal, hexadecimal, or binary) as well as characters;
Evaluated Expression: Simple expressions that the assembler can evaluate may appear in a data definition. For example, DB 3+2 is equivalent to DB 5;
Definition of Unknown Values: A question mark stands for a value that is not yet determined; the unit is generally filled with 0;
Definition of Multiple Characters: Several characters may be enclosed in one pair of double quotes; they are stored one per byte, in order;
Repeated Definition of Same Data: DUP defines the same data repeatedly. The syntax is Count DUP(Value), so 3 DUP(?) reserves three bytes of undetermined value;
Defining Across Lines: If the data does not fit on one line, the definition can continue on a new line. The variable name does not need to be repeated, but the DB pseudo-instruction does.
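As a worked illustration of the points above, the same data segment is annotated below with the bytes each line produces. The offsets assume the segment starts at offset 0, and the "?" units are shown as undetermined rather than as a guaranteed 0.

```asm
DATA SEGMENT
X   DB -1, 255, 'A', 3+2, ?     ; offsets 0-4:   0FFH, 0FFH, 41H, 05H, undetermined
    DB "ABC", 0FFH, 11001010B   ; offsets 5-9:   41H, 42H, 43H, 0FFH, 0CAH
Y   DB 3 DUP(?)                 ; offsets 10-12: three undetermined bytes
DATA ENDS
```

Under that assumption, X stands for offset 0 and Y for offset 0AH, and the unnamed second line is reached as X+5 through X+9.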
1.2.2 Define Word Data DW
Word data is 16 bits. To define it, simply change DB in the byte definition syntax above to DW.
1.2.3 Define Double Word Data DD
Double word data is 32 bits. To define it, simply change DB in the byte definition syntax above to DD. Note that the high byte of the data is stored in the higher address unit, and the low byte is stored in the lower address unit.
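The byte order described above can be seen in a small sketch. The variable names W and D and their values are illustrative only; the comments list the bytes from the lowest address upward.

```asm
DATA SEGMENT
W   DW 1234H        ; stored as 34H, 12H (low byte at the lower address)
D   DD 12345678H    ; stored as 78H, 56H, 34H, 12H
DATA ENDS
```

The same low-byte-first order applies to both word and double word data.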
1.2.4 Define Quad Word and Ten Byte Data DQ DT
Simply change the DB pseudo-instruction to DQ (quad word, 8 bytes) or DT (ten bytes).
Part2 Data Transfer
2.1 Format of Instruction Statements
Each instruction statement corresponds to one machine instruction. The general format is as follows:
[Label:] Opcode [Operands] [;Comment]
Syntax Explanation:
The label is the name the programmer gives to this instruction statement. Most instruction statements do not need a label; only some, such as the targets of jumps and loops, require one;
The opcode specifies the operation this instruction performs. All opcodes are reserved words;
An instruction can have 0 to 3 operands; multiple operands are separated by commas. In a two-operand instruction, the right operand is the source operand and the left operand is the destination operand.
2.2 Classification of Operands
Operands fall into three types: register operands, immediate operands, and memory operands. For register operands, note that the IP and FLAGS registers cannot appear as operands in instructions. For immediate operands, note that an immediate operand cannot be used as a destination operand. The rest of this section focuses on memory operands, starting with two basic concepts:
A memory operand means an access to a memory unit, which requires both a segment base address and an offset address;
In most cases, the instruction automatically uses the content of the DS register as the segment base address of the operand. Therefore, the first thing an assembly language source program should do is load the data segment base address into the DS register.
With the segment base address set, only the offset address is needed to locate the correct memory unit. There are two ways to provide the offset address: direct and indirect. The direct method writes the offset address of the memory unit in the instruction itself. The indirect method loads the offset address into a register in advance and uses the value in that register to locate the memory unit when needed.
(1) Direct Method Syntax:
MOV Destination_Register, Variable_Name[+Byte_Offset]
This statement uses the content of the DS register as the segment base address, and the variable's offset within the data segment plus the optional byte offset as the offset address. It then copies the value in that memory unit into the destination register.
(2) Indirect Method Syntax:
MOV Indirect_Address_Register, OFFSET Variable_Name
(The statement below is used wherever the memory unit is accessed)
MOV Destination_Register, [Indirect_Address_Register]
Syntax Explanation:
OFFSET is a reserved word that extracts the offset address of the variable that follows it;
The square brackets mean that the register holds an offset address rather than the data itself. Without them, the instruction copies the register's own value;
The indirect address register can only be one of BX, BP, SI, DI. Unless otherwise specified, using BX, SI, or DI automatically takes the content of DS as the segment base address, while using BP automatically takes the content of SS as the segment base address.
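The two addressing methods can be compared side by side in a minimal sketch. The variable X and its three values are made up for illustration, and the program frame around the MOV instructions follows the general format of a program segment.

```asm
DATA SEGMENT
X   DB 10H, 20H, 30H
DATA ENDS

CODE SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA
        MOV DS, AX          ; DS = segment base address of DATA

        MOV AL, X+2         ; direct: AL = byte at (offset of X) + 2 = 30H

        MOV BX, OFFSET X    ; indirect: BX = offset address of X
        MOV CL, [BX]        ; CL = byte at DS:BX = 10H

        MOV AX, 4C00H       ; return to the operating system
        INT 21H
CODE ENDS
        END START
```

The indirect form takes one more instruction, but the offset in BX can be changed at run time, which the direct form does not allow.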
2.3 Definition of Program Segment
The general format of a program segment is as follows:
CODE SEGMENT
ASSUME CS:CODE, DS:DATA
START: MOV AX, DATA
MOV DS, AX
...(other instruction parts)...
MOV AX, 4C00H
INT 21H
CODE ENDS
END START
Syntax Interpretation:
The first two instructions of the program load the data segment register DS. On entry to the program, the operating system has already set the code segment register CS to the segment base address of the code segment, but the programmer must load the data segment's base address into DS manually;
The ASSUME pseudo-instruction specifies which segment register corresponds to each segment. In the code above, the segment register for the CODE segment is CS, and for the DATA segment it is DS;
INT 21H calls the service program numbered 21H provided by the operating system. The type of service is determined by the function number in AH; in this example, 4CH means returning to the operating system. The code in AL is called the return code, and return code 00H indicates a normal return;
The END pseudo-instruction marks the end of the entire program. Any code written below the END statement is not assembled. The label after END gives the entry address of the program, which is where execution begins.
2.4 Basic Transfer Instructions
Basic transfer instructions are the most frequently used instructions and need to be mastered. The format is as follows:
MOV Destination_Operand, Source_Operand
Syntax Explanation:
The types of source operand and destination operand must be the same. If they are not the same, a forced type conversion must be used first. The syntax for forced type conversion can be seen below;
Source operand and destination operand cannot both be memory operands, nor can they both be segment registers;
Destination operand cannot be an immediate number;
The code segment base address register CS cannot be a destination operand;
When using an immediate number as a source operand, the immediate number will be extended according to the type of destination operand.
Forced Type Conversion Syntax (use with caution), where Data_Type is BYTE, WORD, or DWORD:
Data_Type PTR Variable_Name
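A minimal sketch of forced type conversion follows. The word variable W and its value are illustrative; the example reads its two bytes separately and then writes an immediate through a register, a case where neither operand carries a type of its own.

```asm
DATA SEGMENT
W   DW 1234H
DATA ENDS

CODE SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA
        MOV DS, AX

        MOV AL, BYTE PTR W      ; treat W as a byte: AL = 34H (low byte)
        MOV AH, BYTE PTR W+1    ; AH = 12H (high byte), so AX = 1234H

        MOV BX, OFFSET W
        MOV WORD PTR [BX], 5    ; [BX] and 5 have no type, so WORD PTR is required
                                ; W is now 0005H

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```

Without BYTE PTR, MOV AL, W would be rejected because an 8-bit register and a word variable have different types.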
Part3 Stack
3.1 Definition of Stack
The stack is also a region of memory available to the user, used to store temporary data and other information. The syntax for defining a stack segment is as follows:
Stack_Name SEGMENT STACK
(stack content)
Stack_Name ENDS
Syntax Explanation:
The only difference between a stack definition and a general segment definition is the STACK keyword after SEGMENT;
For a stack segment, the system automatically places the segment base address of the stack segment into the SS register when loading the program, and places the number of bytes in the stack into the SP register;
Space in the stack segment is allocated and used starting from the higher addresses;
For the 8086 CPU, only 2-byte data can be pushed or popped from the stack.
3.2 Methods for Using Stack
Common stack-related instructions include PUSH, POP, PUSHF, and POPF, with the following syntax:
PUSH Source_Operand ;Push specified operand onto stack for protection
POP Destination_Operand ;Restore top operand from stack to specified location
PUSHF ;Push flag register content onto stack for protection
POPF ;Pop flag register from stack for restoration
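The sketch below puts the stack definition and the PUSH/POP pair together. The segment name SSEG, the stack size of 64 words, and the register values are assumptions chosen for illustration.

```asm
SSEG SEGMENT STACK
        DW 64 DUP(?)        ; 128 bytes of stack space
SSEG ENDS

CODE SEGMENT
        ASSUME CS:CODE, SS:SSEG
START:  MOV AX, 1111H
        MOV BX, 2222H
        PUSH AX             ; SP decreases by 2, then AX is stored
        PUSH BX             ; SP decreases by 2, then BX is stored
        POP AX              ; AX = 2222H (last in, first out)
        POP BX              ; BX = 1111H

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```

Because the values come back in reverse order, popping into the opposite registers swaps AX and BX. To restore registers unchanged, pop them in the reverse of the order they were pushed.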
Part4 Common Operand Expressions
4.1 Symbol Definition Pseudo-Instruction
Symbol definition is similar to the #define preprocessor directive in C: it replaces a symbol with an equivalent expression. The syntax for symbol definition is as follows:
Symbol_Name EQU Expression
Syntax Explanation:
During assembly, the symbol name defined by EQU is replaced with the corresponding expression;
Symbol names defined by EQU cannot be redefined.
Another way to define a symbol is to use the “=” symbol, with the specific syntax as follows:
Symbol_Name = Constant_Expression
Syntax Explanation:
When defining a symbol with an equal sign, only constant expressions can be used. Unlike a symbol defined by EQU, a symbol defined with “=” can be redefined later.
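The two kinds of symbol definition can be contrasted in a short sketch. The names BUFLEN, N, BUF, A, and B are hypothetical and chosen only for this example.

```asm
BUFLEN  EQU 10              ; fixed symbol: cannot be given another value later
N       = 1                 ; "=" symbol: may be redefined

DATA SEGMENT
BUF     DB BUFLEN DUP(0)    ; same as DB 10 DUP(0)
A       DB N                ; stores 1
N       = N + 1             ; allowed for "=": N is now 2
B       DB N                ; stores 2
DATA ENDS
```

Neither BUFLEN nor N occupies memory; both exist only while the program is being assembled.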
4.2 Get Segment Base Address
The SEG operator obtains the segment base address of an address expression. It is used as follows:
SEG Address_Expression
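A minimal sketch of SEG used together with OFFSET is shown below; the variable X and its value are illustrative. Loading DS through SEG X has the same effect here as loading it with the segment name DATA.

```asm
DATA SEGMENT
X   DB 41H
DATA ENDS

CODE SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, SEG X       ; AX = segment base address of the segment holding X
        MOV DS, AX
        MOV BX, OFFSET X    ; BX = offset address of X
        MOV AL, [BX]        ; AL = 41H, read from DS:BX

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```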
Part5 Arithmetic Operations
5.1 Addition Instruction
To add two operands, the ADD instruction should be used, with the instruction syntax as follows:
ADD Destination_Operand, Source_Operand
Syntax Explanation:
This instruction adds the destination operand to the source operand, with the result stored in the original location of the destination operand;
After the ADD instruction is executed, the CPU’s status flags will be refreshed.
Additionally, there is an INC instruction that increments the operand, with the syntax as follows:
INC Operand
Syntax Explanation:
INC does not affect the carry flag CF, but it does update the other status flags;
Increment instructions are often used to modify counters and memory pointer values.
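The following sketch uses ADD for the arithmetic and INC to advance a memory pointer. The names ARR and SUM and the two byte values are assumptions made for the example.

```asm
DATA SEGMENT
ARR DB 12H, 34H             ; two bytes to add
SUM DB ?
DATA ENDS

CODE SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA
        MOV DS, AX

        MOV BX, OFFSET ARR  ; BX points at the first byte
        MOV AL, [BX]        ; AL = 12H
        INC BX              ; move the pointer to the second byte
        ADD AL, [BX]        ; AL = 12H + 34H = 46H, status flags refreshed
        MOV SUM, AL         ; store the result

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```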
5.2 Subtraction Instruction
The use of subtraction instructions is symmetrical to addition instructions. ADD in addition corresponds to SUB in subtraction; INC in addition corresponds to DEC in subtraction.
5.3 Multiplication and Division Instructions
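The multiplication instruction is MUL, and the division instruction is DIV. Unlike ADD and SUB, each takes a single source operand, and the other operand is implicit. MUL multiplies AL (or AX) by the operand and leaves the product in AX (or DX:AX). DIV divides AX (or DX:AX) by the operand and leaves the quotient in AL (or AX) and the remainder in AH (or DX). Both treat their operands as unsigned numbers. Since multiplication and division are used less frequently, they are not elaborated further here.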
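As a hedged illustration of byte-sized MUL and DIV, the sketch below multiplies and then divides small unsigned numbers. Both instructions take one explicit operand and use AL/AX implicitly; the values are arbitrary.

```asm
CODE SEGMENT
        ASSUME CS:CODE
START:  MOV AL, 12
        MOV BL, 10
        MUL BL              ; AX = AL * BL = 120

        MOV BL, 7
        DIV BL              ; AX / BL: AL = quotient 17, AH = remainder 1

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```

For word-sized operands the same pattern applies with AX as the implicit operand and DX:AX holding the product or the dividend.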
Part6 Looping
The syntax for loop instructions is as follows:
LOOP Label
Syntax Explanation:
The number of iterations is determined by the value in the CX register, which is why CX is also called the counter. Each time LOOP executes, CX decreases by 1; when CX reaches 0, the loop terminates;
CX must be loaded before the loop begins;
As long as CX is not 0 after the decrement, execution jumps back to the statement at the label.
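A short sketch of a counted loop follows: it sums the integers from 10 down to 1. The label NEXT and the choice of AX as the accumulator are illustrative.

```asm
CODE SEGMENT
        ASSUME CS:CODE
START:  MOV CX, 10          ; load the counter before the loop begins
        MOV AX, 0           ; running total
NEXT:   ADD AX, CX          ; add the current counter value
        LOOP NEXT           ; CX = CX - 1, jump to NEXT while CX is not 0
                            ; here AX = 10 + 9 + ... + 1 = 55

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```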
Part7 Logical Operations
Logical operations include AND, OR, XOR, and NOT. AND, OR, and XOR take two operands in the format below; NOT takes a single operand:
Logical_Operation_Opcode Destination_Operand, Source_Operand
Usage Conditions:
AND instruction is mainly used to selectively clear bits of the operand;
OR instruction is mainly used to selectively set bits of the operand;
XOR instruction is mainly used to selectively invert bits of the operand;
NOT instruction is mainly used to invert the entire operand.
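The four uses listed above can be traced on a single byte. The starting value and the masks are arbitrary; each comment shows AL after the instruction.

```asm
CODE SEGMENT
        ASSUME CS:CODE
START:  MOV AL, 10110101B
        AND AL, 00001111B   ; clear the high four bits: AL = 00000101B
        OR  AL, 10000000B   ; set bit 7:                AL = 10000101B
        XOR AL, 00000001B   ; invert bit 0:             AL = 10000100B
        NOT AL              ; invert every bit:         AL = 01111011B

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```

In each mask, a 1 marks a bit that OR sets or XOR inverts, and a 0 marks a bit that AND clears.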
Part8 Interrupt Calls
All DOS system function calls are made through the software interrupt instruction INT 21H. INT 21H is an interrupt service program with over 90 sub-functions. Each sub-function of INT 21H is numbered, and this number is known as the function number.
Method for DOS system function calls:
MOV AH, Function_Number ;Place function number into register AH
......
(Place the entry parameters required by the function in other registers)
......
INT 21H ;Call DOS system function
Common Functions:
8.1 Keyboard Input Single Character
Function number 1: reads one character from the keyboard, stores its ASCII code in AL, and displays the character on the screen at the same time.
MOV AH, 01H
INT 21H
8.2 Screen Display Single Character
Function number 2: displays the character stored in the DL register on the screen.
MOV AH, 02H
MOV DL, Character_To_Display
INT 21H
8.3 Screen Display String
Function number 9: displays a string on the screen. DX holds the offset address of the string, with its segment base address in DS, and the string must end with ‘$’.
MOV AH, 09H
MOV DX, Address_Of_String_To_Display
INT 21H
8.4 Return to DOS
A function to exit a program normally and return to DOS, with function number 4CH.
MOV AH, 4CH
INT 21H
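The four functions above can be combined into one small program, sketched here under the assumption of a DOS environment. The variable name MSG and its text are made up; the program prints the string, reads one key, echoes it once more with function 2, and returns to DOS.

```asm
DATA SEGMENT
MSG DB 'Hello, DOS!', 0DH, 0AH, '$'   ; string must end with '$'
DATA ENDS

CODE SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA
        MOV DS, AX

        MOV AH, 09H
        MOV DX, OFFSET MSG  ; DS:DX = address of the string
        INT 21H             ; display the string

        MOV AH, 01H
        INT 21H             ; wait for a key; AL = its ASCII code

        MOV DL, AL          ; character to display
        MOV AH, 02H
        INT 21H             ; display it again

        MOV AX, 4C00H       ; AH = 4CH, return code 00H
        INT 21H
CODE ENDS
        END START
```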
Part9 Definition and Calling of Subroutines
9.1 Define Subroutine
Subroutine_Name PROC
...
RET ;Indicate return from subroutine
Subroutine_Name ENDP ;Indicate end of subroutine definition
9.2 Call Subroutine
CALL Subroutine_Name
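A minimal sketch of defining and calling a subroutine is given below. The subroutine name PUTCH and the stack segment SSEG are hypothetical; a stack is defined because CALL and RET save and restore the return address on it.

```asm
SSEG SEGMENT STACK
        DW 64 DUP(?)
SSEG ENDS

CODE SEGMENT
        ASSUME CS:CODE, SS:SSEG
START:  MOV DL, 'A'
        CALL PUTCH          ; display 'A'
        MOV DL, 'B'
        CALL PUTCH          ; display 'B'

        MOV AX, 4C00H
        INT 21H

PUTCH   PROC                ; input: DL = character to display
        MOV AH, 02H
        INT 21H
        RET                 ; return to the instruction after CALL
PUTCH   ENDP
CODE ENDS
        END START
```

The subroutine is placed after the instructions that return to DOS, so execution cannot fall into it without a CALL.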
Part10 Read and Write Ports in Interfaces
MOV DX, Port_Address
......
(Initialize other registers)
......
OUT DX, AL ;Send the data in AL to the port (the data must be in AL, or in AX for a word)
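A short sketch of writing to and reading from a port follows. The port address 0378H and the data value are placeholders, not taken from the text; substitute the address of the actual interface.

```asm
CODE SEGMENT
        ASSUME CS:CODE
START:  MOV DX, 0378H       ; hypothetical port address
        MOV AL, 55H         ; data to send
        OUT DX, AL          ; write the byte in AL to the port in DX
        IN  AL, DX          ; read a byte from the same port into AL

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```

IN is the counterpart of OUT: it uses the same DX register for the port address and returns the data in AL.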
Part11 Empty Instruction Delay
Using NOP indicates executing an empty instruction, doing nothing. When a delay is needed between instructions, NOP can be inserted.
NOP
Part12 Selection Structure
12.1 CMP Instruction
The CMP instruction format is as follows:
CMP Destination_Operand, Source_Operand
Syntax Explanation:
CMP compares the values of two operands of the same type;
The instruction does not modify either operand; it only sets the flags;
CMP is often used together with the following conditional jump instructions. Here Before is the destination operand and After is the source operand, and the greater/less jumps treat both as signed numbers.
Instruction  Condition               Meaning
JGE          Before >= After         Jump if greater or equal
JG           Before > After          Jump if greater
JLE          Before <= After         Jump if less or equal
JL           Before < After          Jump if less
JNE          Before not equal After  Jump if not equal
JE           Before equal After      Jump if equal
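A selection structure built from CMP and a conditional jump can be sketched as follows. The example stores the larger of two signed words; the names A, B, MAX, and SAVE and the values are illustrative.

```asm
DATA SEGMENT
A   DW -5
B   DW 3
MAX DW ?
DATA ENDS

CODE SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA
        MOV DS, AX

        MOV AX, A
        CMP AX, B           ; compare A (before) with B (after)
        JGE SAVE            ; A >= B as signed numbers: keep A
        MOV AX, B           ; otherwise take B
SAVE:   MOV MAX, AX         ; MAX = 3 for these values

        MOV AX, 4C00H
        INT 21H
CODE ENDS
        END START
```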
