Introduction to Assembly Language Tutorial

Brief analysis of assembly language

Learning to program usually means learning a high-level language, a computer language designed for humans.

However, computers do not understand high-level languages; programs must be compiled into binary code before they can run. Knowing a high-level language does not mean you understand what the computer actually does step by step.

The only language a computer truly understands is low-level language, which controls the hardware directly. Assembly language is a low-level language that directly describes and controls the operation of the CPU. If you want to understand what the CPU actually does and how code is executed step by step, you must learn assembly language.

Assembly language is not easy to learn, and even concise introductions are hard to find. Below, I try to write the most approachable assembly language tutorial I can, explaining how the CPU executes code.

1. What is Assembly Language?

The CPU is responsible only for computation and has no intelligence of its own. Give it an instruction and it executes that instruction once, then stops and waits for the next one.

These instructions are all binary and are known as opcodes; for example, the addition instruction is 00000011. The job of a compiler is to translate a program written in a high-level language into a series of opcodes.

For humans, binary programs are unreadable, and it is impossible to see what the machine is being told to do. Assembly language was created to solve this readability problem and to make occasional hand edits possible.

Assembly language is the textual form of binary instructions, and each assembly instruction corresponds to one binary instruction. For example, the addition instruction 00000011 is written in assembly language as ADD. Once converted back to binary, an assembly program can be executed directly by the CPU, which makes assembly the lowest level of the low-level languages.
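The correspondence between binary opcodes and mnemonics can be sketched as a small lookup table in C. This is only an illustration: the table holds four one-byte x86 opcodes, the names `opcode_entry`, `mnemonic_for`, and `opcode_for` are made up for this example, and a real instruction set has several encodings per mnemonic plus operand bytes that are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of a toy opcode table: a one-byte opcode and its mnemonic. */
struct opcode_entry {
    unsigned char opcode;
    const char *mnemonic;
};

/* A handful of one-byte x86 opcodes; a real table is far larger. */
static const struct opcode_entry OPCODE_TABLE[] = {
    {0x03, "ADD"}, /* 00000011, the encoding quoted in the text */
    {0x2B, "SUB"},
    {0x90, "NOP"},
    {0xC3, "RET"},
};
static const size_t OPCODE_COUNT = sizeof OPCODE_TABLE / sizeof OPCODE_TABLE[0];

/* Binary to text: return the mnemonic for an opcode, or NULL if unknown. */
const char *mnemonic_for(unsigned char opcode) {
    for (size_t i = 0; i < OPCODE_COUNT; i++) {
        if (OPCODE_TABLE[i].opcode == opcode) {
            return OPCODE_TABLE[i].mnemonic;
        }
    }
    return NULL;
}

/* Text to binary: store the opcode for a mnemonic in *out.
   Returns 1 on success and 0 if the mnemonic is not in the toy table. */
int opcode_for(const char *mnemonic, unsigned char *out) {
    assert(mnemonic != NULL && out != NULL);
    for (size_t i = 0; i < OPCODE_COUNT; i++) {
        if (strcmp(OPCODE_TABLE[i].mnemonic, mnemonic) == 0) {
            *out = OPCODE_TABLE[i].opcode;
            return 1;
        }
    }
    return 0;
}
```

With this table, `mnemonic_for(0x03)` yields "ADD", and `opcode_for("ADD", &op)` stores 0x03 in `op`.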

2. Origin

In the earliest days, programmers wrote binary instructions by hand and entered them into the computer through banks of switches. To perform addition, for instance, one would press the switch for addition. Later, paper tape punches were invented, and instructions were fed into the computer automatically from holes punched in the tape.

To make binary instructions more readable, engineers wrote them in octal. Converting binary to octal is straightforward, but octal is still hard to read. Eventually instructions came to be written as text, with the addition instruction written as ADD. Memory addresses were no longer written out directly but were represented by labels.

This added an extra step: translating the textual instructions into binary. That step is called assembling, and the program that performs it is called an assembler. The text it processes is naturally called assembly code. After standardization it became known as assembly language, abbreviated asm (汇编语言 in Chinese).
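One part of what an assembler does, replacing labels with memory addresses, can be sketched in C as a lookup in a symbol table. Everything here is hypothetical: the label names, their addresses, and the function `resolve_label` are invented for illustration and do not come from any real assembler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A label and the memory address it stands for. */
struct label {
    const char *name;
    unsigned int address;
};

/* Hypothetical symbol table, as an assembler might build while scanning code. */
static const struct label LABELS[] = {
    {"start", 0x1000},
    {"loop",  0x1008},
    {"done",  0x1020},
};

/* Return the address bound to a label, or -1 if the label is undefined. */
long resolve_label(const char *name) {
    assert(name != NULL);
    size_t count = sizeof LABELS / sizeof LABELS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(LABELS[i].name, name) == 0) {
            return (long)LABELS[i].address;
        }
    }
    return -1;
}
```

The programmer writes `loop`; the assembler substitutes the address, so the address can change without the source text changing.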

Each CPU family has its own machine instructions, so its assembly language differs too. This article covers x86 assembly language, currently the most common, which is used by Intel CPUs and by compatible processors such as AMD's.

3. Registers

To learn assembly language, one must first understand two concepts: registers and memory models.

First, registers. The CPU itself only computes; it does not store data. Data is generally kept in memory, and the CPU reads from and writes to memory as needed. However, the CPU computes much faster than memory can be read or written. To avoid being held back, CPUs include Level 1 and Level 2 caches. In essence, CPU cache is a faster kind of memory.

Even CPU cache is not fast enough, though. The addresses of data in the cache are not fixed, so the CPU must perform addressing on every read or write, which also costs time. Therefore, in addition to cache, the CPU has registers that hold the most frequently used data. The data that is read and written most often (such as loop variables) is placed in registers; the CPU reads and writes registers first, and the registers then exchange data with memory.

Registers are distinguished by name rather than by address. Each register has its own name, and we tell the CPU exactly which register to take data from, which is the fastest way to access data. Some describe registers as the CPU's level-zero cache.
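As a rough illustration of "frequently used data", consider the loop below. The function name `sum_to` is made up for this example. Whether the counter and the accumulator really end up in registers is the compiler's decision, not something C guarantees; an optimizing compiler will typically keep them there because they are touched on every iteration.

```c
#include <assert.h>
#include <limits.h>

/* Sum the integers 1..n.
   `i` and `total` are read and written on every pass through the loop,
   which makes them natural candidates for registers. */
unsigned long sum_to(unsigned int n) {
    assert(n < UINT_MAX); /* with n == UINT_MAX the loop condition could never fail */
    unsigned long total = 0;
    for (unsigned int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Compiling such a function with optimization enabled and inspecting the generated assembly is one way to see which registers were chosen.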

4. Types of Registers

Early x86 CPUs had only 8 registers, each with its own purpose. Modern CPUs have over 100 registers, which have become general-purpose registers with no special designation, but the names of the early registers have been retained.

The 8 registers are EAX, EBX, ECX, EDX, EDI, ESI, EBP, and ESP. The first seven are general-purpose. The ESP register has a specific purpose: it holds the address of the current top of the Stack (see the Stack section below).

We often hear of 32-bit CPUs and 64-bit CPUs; these names refer to the size of the registers. A 32-bit CPU has registers that are 32 bits, or 4 bytes, wide.
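The relationship between bit width and byte size can be modeled in C with a fixed-width integer. This is a sketch, not a description of real hardware: the type name `reg32` and the helpers `register_bytes` and `add32` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Model a 32-bit register as a 32-bit unsigned integer (4 bytes). */
typedef uint32_t reg32;

/* Number of bytes in a register of the given width in bits. */
unsigned int register_bytes(unsigned int bits) {
    assert(bits % 8 == 0); /* widths are whole numbers of bytes */
    return bits / 8;
}

/* Add two values the way a 32-bit register would:
   the result keeps only the low 32 bits, so it wraps modulo 2^32. */
reg32 add32(reg32 a, reg32 b) {
    return (reg32)(a + b);
}
```

Because only 32 bits fit, `add32(0xFFFFFFFFu, 1u)` wraps around to 0.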

5. Memory Model: Heap

Registers can hold only a small amount of data, so most of the time the CPU directs registers to exchange data with memory. Besides registers, therefore, one must also understand how memory stores data.

When a program runs, the operating system allocates a region of memory to hold the program and the data it produces while running. This region has a starting address and an ending address, for example from 0x1000 to 0x8000, where the starting address is the lower address and the ending address is the higher one.

While the program runs, each dynamic memory request (such as creating a new object or calling malloc) is served by handing the program part of this pre-allocated region, allocating upward from the starting address according to set rules (in reality, the region begins with a segment of static data, which we ignore here). For example, if the program requests 10 bytes, it is given the bytes from address 0x1000 up to, but not including, 0x100A. If it then requests another 22 bytes, the allocation extends up to 0x1020.
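The address arithmetic in this example can be checked with a toy allocator that simply hands out consecutive addresses. It is a simplified model under the same assumptions as the text (no static data, no alignment, no bookkeeping); the names `toy_heap` and `toy_alloc` are invented here, and real allocators such as malloc are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* A toy heap: the next unused address and the end of the region. */
struct toy_heap {
    unsigned long next_free;
    unsigned long end;
};

/* Hand out `size` bytes starting at the next free address.
   Returns the start address of the block, or 0 if the region is exhausted. */
unsigned long toy_alloc(struct toy_heap *heap, unsigned long size) {
    assert(heap != NULL);
    if (size > heap->end - heap->next_free) {
        return 0;
    }
    unsigned long start = heap->next_free;
    heap->next_free += size; /* the heap grows toward higher addresses */
    return start;
}
```

Starting from `{0x1000, 0x8000}`, a 10-byte request returns 0x1000 and moves the next free address to 0x100A; a following 22-byte request returns 0x100A and moves it to 0x1020, since 0x100A + 22 = 0x1020.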

The memory area allocated in response to such requests is called the Heap. It grows upward from the starting address, toward higher addresses. An important characteristic of the Heap is that it is not released automatically: it must be freed manually or reclaimed by a garbage collector.
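In C, manual release means pairing each malloc with a free. The sketch below uses a made-up helper, `make_sequence`, to show that heap memory outlives the function that allocated it and stays allocated until the caller frees it.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate `count` ints on the heap and fill them with 0, 1, ..., count-1.
   The block survives after this function returns; the caller must free() it.
   Returns NULL if the allocation fails. */
int *make_sequence(size_t count) {
    assert(count > 0);
    int *values = malloc(count * sizeof *values);
    if (values == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        values[i] = (int)i;
    }
    return values;
}
```

A caller would write `int *seq = make_sequence(4);`, use `seq[0]` through `seq[3]`, and then call `free(seq);`. Forgetting the `free` leaves the block occupied for the rest of the program's run.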

6. Memory Model: Stack

Memory used for anything other than the Heap is called the Stack. Put simply, the Stack is the memory area that a function occupies temporarily while it runs.

Let’s look at the example below.

int main() {
   int a = 2;
   int b = 3;
}

When the system starts executing the main function, it creates a frame for it in memory, and all of main's local variables (such as a and b) are stored in that frame. When main finishes, the frame is reclaimed, which releases all of its local variables so that they no longer occupy space.

What happens if a function calls another function?

int add_a_and_b(int a, int b) {   /* assumed definition; the text only names it */
   return a + b;
}

int main() {
   int a = 2;
   int b = 3;
   return add_a_and_b(a, b);
}

In the code above, main calls add_a_and_b. When execution reaches that line, the system creates a new frame for add_a_and_b to hold its local variables. At this moment there are two frames: one for main and one for add_a_and_b. In general, there are as many frames as there are levels in the call stack.

Once add_a_and_b finishes, its frame is reclaimed and execution returns to the point in main where the call was made, then continues from there. This mechanism makes nested function calls possible, with each level free to use its own local variables.
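The creation and reclamation of frames can be made visible with a counter that is incremented when a function starts and decremented just before it returns. This is only a model: the counter `live_frames`, the variable `frames_during_add`, and the stand-in function `run_main` are invented for this sketch, and the body of add_a_and_b is assumed to return a + b. Real frames are managed by the compiler and the CPU, not by program variables.

```c
#include <assert.h>

static int live_frames = 0;       /* toy count of frames currently alive */
static int frames_during_add = 0; /* count observed inside add_a_and_b */

int add_a_and_b(int a, int b) {
    live_frames++;                  /* a frame for add_a_and_b is created */
    frames_during_add = live_frames;
    int sum = a + b;
    live_frames--;                  /* the frame is reclaimed on return */
    return sum;
}

/* Stands in for main so the sketch can be called from other code. */
int run_main(void) {
    live_frames++;                  /* a frame for the caller is created */
    int a = 2;
    int b = 3;
    int result = add_a_and_b(a, b);
    assert(live_frames == 1);       /* only the caller's frame remains */
    live_frames--;
    return result;
}
```

After `run_main()` returns 5, `frames_during_add` is 2 (both frames existed during the call) and `live_frames` is back to 0.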

All frames are stored in the Stack. Because frames are piled one on top of another, this area is called the Stack. Creating a new frame is called “pushing onto the stack” (push), and reclaiming a frame is called “popping off the stack” (pop). The defining characteristic of the Stack is that the last frame pushed is the first to be popped (because the innermost function call ends first), which is known as “last in, first out” (LIFO).
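Push and pop can be sketched in C with a small array-based stack of integers. This illustrates the ordering only; it stores plain numbers rather than real frames, and the names `int_stack`, `stack_push`, `stack_pop`, and `STACK_CAPACITY` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16

/* A fixed-size stack of ints; `count` is the number of items stored. */
struct int_stack {
    int items[STACK_CAPACITY];
    size_t count;
};

/* Push: place a value on top of the stack. */
void stack_push(struct int_stack *s, int value) {
    assert(s != NULL && s->count < STACK_CAPACITY);
    s->items[s->count++] = value;
}

/* Pop: remove and return the value most recently pushed. */
int stack_pop(struct int_stack *s) {
    assert(s != NULL && s->count > 0);
    return s->items[--s->count];
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1: the item pushed last comes out first.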
