0x00
This series shares daily study notes to help everyone learn assembly language. Why learn assembly? In red team versus blue team exercises, our tools are often detected and removed by AV/EDR products, so we need evasion techniques to counter them, and learning evasion means starting from the basics. Later I may also share notes on C++, the PE file structure, and reverse engineering.
0x01
6. More Flexible Methods for Locating Memory Addresses
Previously, we used the methods [0] and [bx] to locate the addresses of memory units in memory access instructions.
AND and OR Instructions
AND instruction: logical AND instruction, performs bitwise AND operation.
For example, the instruction:
mov al,01100011B
and al,00111011B
After execution, al=00100011B. This instruction can set the corresponding bits of the operand to 0 while keeping the other bits unchanged (bits are numbered from 0, starting at the lowest bit). For example:
- The instruction to set bit 6 of al to 0 is and al,10111111B
- The instruction to set bit 7 of al to 0 is and al,01111111B
- The instruction to set bit 0 of al to 0 is and al,11111110B
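The masks above can be checked with a few lines of Python. This is only a model of the 8-bit arithmetic, not 8086 code; the helper name clear_bit is made up for illustration, and bits are numbered from 0 at the lowest bit.

```python
def clear_bit(value, n):
    """Clear bit n (0 = lowest bit) of an 8-bit value, as AND with a mask does."""
    mask = 0xFF & ~(1 << n)      # all ones except bit n
    return value & mask

al = 0b01100011                  # models: mov al,01100011B
al_after_and = al & 0b00111011   # models: and al,00111011B

print(format(al_after_and, "08b"))        # 00100011
print(format(0xFF & ~(1 << 6), "08b"))    # 10111111, the mask that clears bit 6
```

Only the bits that are 0 in the mask are forced to 0; every bit that is 1 in the mask passes through unchanged.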
OR instruction: logical OR instruction, performs bitwise OR operation.
For example, the instruction:
mov al,01100011B
or al,00111011B
After execution, al=01111011B. This instruction can set the corresponding bits of the operand to 1 while keeping the other bits unchanged (bits are numbered from 0, starting at the lowest bit). For example:
- The instruction to set bit 6 of al to 1 is or al,01000000B
- The instruction to set bit 7 of al to 1 is or al,10000000B
- The instruction to set bit 0 of al to 1 is or al,00000001B
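The same kind of check works for OR. Again this is a Python model of the 8-bit arithmetic rather than 8086 code, and set_bit is a hypothetical helper name.

```python
def set_bit(value, n):
    """Set bit n (0 = lowest bit) of an 8-bit value, as OR with a mask does."""
    return (value | (1 << n)) & 0xFF

al = 0b01100011                 # models: mov al,01100011B
al_after_or = al | 0b00111011   # models: or al,00111011B

print(format(al_after_or, "08b"))     # 01111011
print(format(set_bit(0, 6), "08b"))   # 01000000, the mask that sets bit 6
```

Only the bits that are 1 in the mask are forced to 1; every bit that is 0 in the mask passes through unchanged.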
About ASCII Code
Text editing follows the ASCII encoding rules for both encoding and decoding. When we press the 'a' key on the keyboard, the keystroke is sent to the computer, encoded according to ASCII, and stored as 61H in a designated area of memory. The text editing software reads 61H from memory and writes it to video memory; the graphics card decodes the contents of video memory according to ASCII and draws the character 'a' on the screen.
Data Given in Character Form
In an assembly program, we can give data in character form by enclosing it in single quotes ('...'); the assembler converts the characters into their ASCII codes, as shown in the following program:
assume cs:code,ds:data
data segment
db 'unIX'
db 'foRK'
data ends
code segment
start:
mov al,'a'
mov bl,'b'
mov ax,4c00h
int 21h
code ends
end start
In the above source program, db 'unIX' is equivalent to db 75h,6eh,49h,58h, and mov bl,'b' is equivalent to mov bl,62h.
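To see which bytes the assembler would store for these strings, we can print their ASCII codes. This is a small Python sketch, not part of the assembly program; the variable name data is only for illustration.

```python
# Models the two db lines of the data segment: db 'unIX' / db 'foRK'
data = b"unIX" + b"foRK"

# Print each byte in the two-digit hex form used in the listing
print(" ".join(format(b, "02x") + "h" for b in data))
# 75h 6eh 49h 58h 66h 6fh 52h 4bh

print(format(ord("b"), "02x") + "h")   # 62h, the immediate in mov bl,'b'
```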
When the program is loaded in debug, ds=0B2DH. The 256 bytes (10H paragraphs) starting at ds hold the PSP, so the program itself starts at segment 0B3DH; since the data segment is the first segment in the program, its segment address is 0B3DH.
Viewing the data segment with the d command, debug displays its contents in both hexadecimal and ASCII character form.
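The step from 0B2DH to 0B3DH is just an addition of 10H paragraphs. The sketch below assumes the usual DOS layout, in which the program image is loaded 256 bytes after the segment that ds points to at load time; the function and constant names are made up for illustration.

```python
PARAGRAPH_SIZE = 16                      # one segment step is 16 bytes
PSP_PARAGRAPHS = 256 // PARAGRAPH_SIZE   # 256-byte area before the program = 10H paragraphs

def first_program_segment(ds_at_load):
    """Segment address of the first segment of the program, given ds at load time."""
    return (ds_at_load + PSP_PARAGRAPHS) & 0xFFFF

print(format(first_program_segment(0x0B2D), "04X"))   # 0B3D
```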
Case Conversion Issues
Consider the following problem: fill in the code in codesg to convert the first string in datasg to uppercase and the second string to lowercase.
assume cs:codesg,ds:datasg
datasg segment
db 'BaSiC'
db 'iNf0rMaTi0n'
datasg ends
codesg segment
start:
mov ax,4c00h
int 21h
codesg ends
end start
The ASCII codes for the same letter in uppercase and lowercase are different, with the ASCII value of lowercase letters being 20H greater than that of uppercase letters.
To convert a character this way, however, the program first needs to determine whether it is uppercase or lowercase. Taking BaSiC as an example:
assume cs:codesg,ds:datasg
datasg segment
db 'BaSiC'
db 'iNf0rMaTi0n'
datasg ends
codesg segment
start:
mov ax,datasg
mov ds,ax
mov bx,0
mov cx,5
s:
mov al,[bx]
; if (al) >= 61h, it is the ASCII code of a lowercase letter, so: sub al,20h
mov [bx],al
inc bx
loop s
mov ax,4c00h
int 21h
codesg ends
end start
This check requires instructions that we have not yet learned, so we look for another method.
In binary, the ASCII codes of the uppercase and lowercase forms of a letter are identical except for bit 5 (counting from 0), which is 0 for uppercase and 1 for lowercase. We can therefore clear or set bit 5 directly, without checking the case first.
assume cs:codesg,ds:datasg
datasg segment
db 'BaSiC'
db 'iNf0rMaTi0n'
datasg ends
codesg segment
start:
mov ax,datasg
mov ds,ax
mov bx,0
mov cx,5
s:
mov al,[bx]
and al,11011111B ; Clear the fifth bit to convert to uppercase
mov [bx],al
inc bx
loop s
mov bx,5
mov cx,11 ; the second string is 11 bytes long; cx is 0 after the first loop
s0:
mov al,[bx]
or al,00100000B ; Set the fifth bit to 1 to convert to lowercase
mov [bx],al
inc bx
loop s0
mov ax,4c00h
int 21h
codesg ends
end start
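The effect of the two loops can be modeled in Python on a copy of the datasg bytes. This is a sketch of the bit manipulation only, assuming the second loop runs over all 11 bytes of the second string; the loop variable bx simply plays the role of the offset register.

```python
# Models the datasg bytes: db 'BaSiC' followed by db 'iNf0rMaTi0n'
data = bytearray(b"BaSiC" + b"iNf0rMaTi0n")

for bx in range(0, 5):            # first string: offsets 0..4
    data[bx] &= 0b11011111        # and al,11011111B clears bit 5 (uppercase)

for bx in range(5, 5 + 11):       # second string: offsets 5..15
    data[bx] |= 0b00100000        # or al,00100000B sets bit 5 (lowercase)

print(data.decode("ascii"))       # BASICinf0rmati0n
```

The digit '0' (30H) already has bit 5 set, so the OR leaves it unchanged. The trick is only meaningful for letters: clearing bit 5 of a digit or punctuation byte would turn it into a different character.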
[bx+idata]
Previously, we used [BX] to indicate a memory unit, and we can also use [bx+idata] to indicate a memory unit, where its offset address is (bx)+idata.
For example, the instruction mov ax,[bx+200] sends the content of a memory unit to ax. This memory unit is 2 bytes long and stores a word; its offset address is the value in bx plus 200, and its segment address is in ds. Described numerically: (ax) = ((ds)*16+(bx)+200).
This instruction can also be written as mov ax,[200+bx], mov ax,200[bx], or mov ax,[bx].200
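The numeric description can be worked through with concrete values. The sketch below models memory as a Python dict and uses made-up register values (ds=2000H, bx=1000H); physical_address and read_word are hypothetical helper names.

```python
def physical_address(segment, offset):
    """8086 real-mode address: segment*16 + offset (offset kept to 16 bits)."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF

def read_word(memory, address):
    """Read a 2-byte word, low byte first, as the 8086 stores it."""
    return memory[address] | (memory[address + 1] << 8)

ds, bx = 0x2000, 0x1000                 # assumed values, for illustration only
addr = physical_address(ds, bx + 200)   # address used by mov ax,[bx+200]

memory = {addr: 0x34, addr + 1: 0x12}   # place the word 1234H at that address
ax = read_word(memory, addr)

print(hex(addr), hex(ax))               # 0x210c8 0x1234
```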
Processing Arrays Using [bx+idata]
Requirement: Convert the first string defined in datasg to uppercase and the second string to lowercase.
assume cs:codesg,ds:datasg
datasg segment
db 'BaSiC'
db 'MinIX'
datasg ends
codesg segment
start:
mov ax,datasg
mov ds,ax
mov bx,0
mov cx,5
s:
mov al,[bx]
and al,11011111b
mov [bx],al
mov al,[5+bx]
or al,00100000b
mov [5+bx],al
inc bx
loop s
mov ax,4c00h ; return to DOS
int 21h
codesg ends
end start
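The point of [bx+idata] here is that one index walks two arrays at once: [bx] reaches the first string and [5+bx] reaches the second. A Python model of the loop, with bx standing in for the register:

```python
# Models the datasg bytes: db 'BaSiC' followed by db 'MinIX'
data = bytearray(b"BaSiC" + b"MinIX")

for bx in range(5):                 # cx=5 iterations, bx = 0..4
    data[bx] &= 0b11011111          # [bx]: character of the first string, to uppercase
    data[5 + bx] |= 0b00100000      # [5+bx]: matching character of the second string, to lowercase

print(data.decode("ascii"))         # BASICminix
```

This mirrors how a high-level language would write a[i] and b[i] with a shared index i, where 0 and 5 are the starting offsets of the two arrays.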
SI and DI
SI and DI are 8086 registers similar in function to bx, except that they cannot be split into two 8-bit registers. The following three sets of instructions achieve the same functionality:
; set 1: using bx
mov bx,0
mov ax,[bx]
; set 2: using si
mov si,0
mov ax,[si]
; set 3: using di
mov di,0
mov ax,[di]
The following three sets of instructions also achieve the same functionality:
; set 1: using bx
mov bx,0
mov ax,[bx+123]
; set 2: using si
mov si,0
mov ax,[si+123]
; set 3: using di
mov di,0
mov ax,[di+123]
[bx+si] and [bx+di]
Previously, we used [bx], [si], or [di], optionally with an added idata, to indicate a memory unit. A more flexible method is [bx+si] and [bx+di].
[bx+si] indicates a memory unit whose offset address is (bx)+(si).
mov ax,[bx+si] means (ax) = ((ds)*16+(bx)+(si)), that is, a word (2 bytes) is read from that address. This instruction can also be written as mov ax,[bx][si]
[bx+si+idata] and [bx+di+idata]
[bx+si+idata] indicates a memory unit with an offset address of (bx)+(si)+idata. The instruction mov ax,[bx+si+idata] means (ax) = ((ds)*16+(bx)+(si)+idata). With idata = 200, for example, this instruction can also be written as:
mov ax,[bx+200+si]
mov ax,[200+bx+si]
mov ax,200[bx][si]
mov ax,[bx].200[si]
mov ax,[bx][si].200
Flexible Application of Different Addressing Methods
Comparing the addressing methods covered so far, we find:
- [idata] uses a constant to represent an address, which can be used to directly locate a memory unit.
- [bx] uses a variable to represent a memory address, which can be used to indirectly locate a memory unit.
- [bx+idata] uses a variable and a constant to represent an address, which can indirectly locate a memory unit based on a starting address.
- [bx+si] uses two variables to represent an address.
- [bx+si+idata] uses two variables and a constant to represent an address.
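The five forms differ only in which terms contribute to the offset. The Python sketch below computes the offset for each form using made-up register values (bx=0100H, si=0020H) and idata=200; effective_address is a hypothetical helper, and the result is kept to 16 bits as an offset register would be.

```python
def effective_address(bx=0, si=0, idata=0):
    """Offset = (bx) + (si) + idata, kept to 16 bits."""
    return (bx + si + idata) & 0xFFFF

bx, si = 0x0100, 0x0020   # assumed register values, for illustration only

modes = {
    "[200]":       effective_address(idata=200),
    "[bx]":        effective_address(bx=bx),
    "[bx+200]":    effective_address(bx=bx, idata=200),
    "[bx+si]":     effective_address(bx=bx, si=si),
    "[bx+si+200]": effective_address(bx=bx, si=si, idata=200),
}

for form, offset in modes.items():
    print(form, format(offset, "04X"))
# [200] 00C8
# [bx] 0100
# [bx+200] 01C8
# [bx+si] 0120
# [bx+si+200] 01E8
```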
Programming: Capitalize the first letter of each word in the datasg segment.
assume cs:codesg,ds:datasg
datasg segment
; each string is padded with spaces to exactly 16 bytes
db '1. file         '
db '2. edit         '
db '3. search       '
db '4. view         '
db '5. options      '
db '6. help         '
datasg ends
codesg segment
start:
mov ax,datasg
mov ds,ax
mov bx,0
mov cx,6
s:
mov al,[bx+3]
and al,11011111b
mov [bx+3],al
add bx,16
loop s
mov ax,4c00h
int 21h
codesg ends
end start
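The program relies on every row occupying exactly 16 bytes, so that the first letter of each word sits at offset 3 of its row and add bx,16 moves to the next row. A Python model of the same loop, assuming each string is padded with spaces to 16 bytes (the rows list is only a convenient way to build the data):

```python
rows = ["1. file", "2. edit", "3. search", "4. view", "5. options", "6. help"]

# Build the data segment: each row padded with spaces to 16 bytes
data = bytearray("".join(r.ljust(16) for r in rows), "ascii")

bx = 0
for _ in range(6):                # cx=6 rows
    data[bx + 3] &= 0b11011111    # [bx+3]: first letter of the word, to uppercase
    bx += 16                      # add bx,16 moves to the next row

for i in range(0, len(data), 16):
    print(data[i:i + 16].decode("ascii").rstrip())
# 1. File
# 2. Edit
# 3. Search
# 4. View
# 5. Options
# 6. Help
```

If the rows were not all 16 bytes long, bx+3 would drift away from the first letters after the first row.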
0x02
Previous Notes:
Assembly Language Day 05
Assembly Language Day 04
Assembly Language Day 03
Assembly Language Day 02
Assembly Language Day 01
Basic Knowledge of Assembly Language