Introduction to Assembly Language: Underlying Principles

Author: Ruan Yifeng

Link:http://www.ruanyifeng.com/blog/2018/01/

Code Farmer’s Way

High-Quality Technical Article Directory of Code Farmer’s Way (Click Me)

About Code Farmer’s Way (Click Me)

Learning programming is essentially learning high-level languages, which are designed for humans.

However, computers do not understand high-level languages; they must be converted into binary code through a compiler to run. Knowing a high-level language does not equate to understanding the actual steps a computer takes to execute code.

Introduction to Assembly Language: Underlying Principles

What computers truly understand is low-level languages, which are specifically used to control hardware. Assembly language is a low-level language that directly describes/controls the CPU’s operation. If you want to understand what the CPU is doing and the steps of code execution, you must learn assembly language.

Assembly language is not easy to learn, and even concise introductions are hard to find. Below, I will attempt to write the easiest-to-understand assembly language tutorial, explaining how the CPU executes code.

Introduction to Assembly Language: Underlying Principles

What Is Assembly Language

We know that the CPU is only responsible for computation and does not possess intelligence. When you input an instruction, it executes it once and then stops, waiting for the next instruction.

These instructions are binary and are called opcodes; for example, the addition instruction is 00000011. The role of the compiler is to translate the program written in high-level language into a series of opcodes.

For humans, binary programs are too difficult to read, and it is impossible to see what the machine is doing through code. To solve the readability problem and occasional editing needs, assembly language was born.

Introduction to Assembly Language: Underlying Principles

Assembly language is the textual form of binary instructions, corresponding one-to-one with binary instructions. For example, the addition instruction 00000011 is written as ADD in assembly language. As long as it is restored to binary, the assembly language can be directly executed by the CPU, so it is the lowest-level low-level language.

Origin

In the early days, programming involved manually writing binary instructions and inputting them into the computer via various switches; for example, to perform addition, you would press the addition switch. Later, the invention of the punched tape machine allowed binary instructions to be automatically input into the computer by punching holes in a tape.

To solve the readability problem of binary instructions, engineers wrote those instructions in octal. Converting binary to octal is easy, but octal is also not very readable. Naturally, it eventually returned to using text to express it, with the addition instruction written as ADD. Memory addresses were no longer directly referenced but were represented by labels.

However, this added a step; these textual instructions had to be translated into binary, a process called assembling, and the program that completes this step is called an assembler. The text it processes is naturally called assembly code. After standardization, it was called assembly language, abbreviated as asm, and translated into Chinese as 汇编语言.

Introduction to Assembly Language: Underlying Principles

Each CPU has different machine instructions, so the corresponding assembly language is also different. This article introduces the most common x86 assembly language, which is used by Intel CPUs.

Registers

To learn assembly language, you must first understand two concepts: registers and memory models.

First, let’s look at registers. The CPU itself is only responsible for calculations, not for storing data. Data is generally stored in memory, and the CPU reads and writes data from memory when needed. However, the CPU’s computation speed is much faster than the read and write speed of memory. To avoid being slowed down, the CPU has its own level one and level two caches. Basically, CPU cache can be seen as faster memory for reading and writing.

However, CPU cache is still not fast enough, and the addresses of data in the cache are not fixed. Each read and write operation requires addressing, which can also slow down speed. Therefore, in addition to the cache, the CPU also has registers, which are used to store the most frequently used data. This means that the most frequently read and written data (such as loop variables) will be placed in registers, and the CPU will prioritize reading and writing from registers, exchanging data with memory only when necessary.

Introduction to Assembly Language: Underlying Principles

Registers do not differentiate data by address, but by name. Each register has its own name, and we tell the CPU which register to fetch data from, which is the fastest way. Some compare registers to the CPU’s zero-level cache.

Types of Registers

Early x86 CPUs had only 8 registers, each with different purposes. Now, there are over 100 registers, which have become general-purpose registers, without specific purposes, but the names of early registers have been retained.

  • EAX

  • EBX

  • ECX

  • EDX

  • EDI

  • ESI

  • EBP

  • ESP

Among these 8 registers, the first seven are general-purpose. The ESP register has a specific purpose, which is to hold the address of the current stack (see next section).

Introduction to Assembly Language: Underlying Principles

We often see names like 32-bit CPU, 64-bit CPU; these actually refer to the size of the registers. A 32-bit CPU has a register size of 4 bytes.

Memory Model: Heap

Registers can only store a small amount of data, and most of the time, the CPU needs to direct registers to exchange data directly with memory. Therefore, in addition to registers, it is necessary to understand how memory stores data.

When a program runs, the operating system allocates a segment of memory for it to store the program and the data generated during execution. This segment of memory has a starting address and an ending address, for example, from 0x1000 to 0x8000, where the starting address is the smaller one and the ending address is the larger one.

Introduction to Assembly Language: Underlying Principles

During the program’s execution, for dynamic memory allocation requests (such as creating new objects or using the malloc command), the system will allocate a portion of the pre-allocated memory to the user, specifically from the starting address (in practice, there will be a segment of static data at the starting address, which we will ignore). For example, if a user requests 10 bytes of memory, it will be allocated starting from address 0x1000, up to address 0x100A. If they request another 22 bytes, it will be allocated up to 0x1020.

Introduction to Assembly Language: Underlying Principles

This memory area allocated due to user requests is called Heap (堆). It grows from the starting address upwards (in terms of address). One important feature of the heap is that it does not disappear automatically and must be manually freed or reclaimed by a garbage collection mechanism.

Memory Model: Stack

In addition to the heap, other memory usage is called Stack (栈). In simple terms, the stack is the memory area temporarily occupied during function execution.

Introduction to Assembly Language: Underlying Principles

Let’s look at the following example.

int main() {
   int a = 2;
   int b = 3;
}

In the above code, when the system starts executing the main function, it will create a frame in memory for it, storing all internal variables of main (such as a and b) in this frame. Once the main function execution is complete, this frame will be reclaimed, freeing all internal variables and no longer occupying space.

Introduction to Assembly Language: Underlying Principles

If a function calls another function, what happens?

int main() {
   int a = 2;
   int b = 3;
   return add_a_and_b(a, b);
}

In the above code, the main function calls the add_a_and_b function. When it reaches this line, the system will also create a new frame for add_a_and_b to store its internal variables. This means that at this point, there are two frames simultaneously: main and add_a_and_b. Generally speaking, the number of frames corresponds to the depth of the call stack.

Introduction to Assembly Language: Underlying Principles

Once add_a_and_b finishes running, its frame will be reclaimed, and the system will return to the point where the main function was interrupted, continuing execution from there. This mechanism allows for layered function calls, with each layer able to use its local variables.

All frames are stored in the stack, and since frames are stacked on top of each other, the stack is called a stack. Creating a new frame is called “pushing onto the stack”; in English, it’s called push; reclaiming the stack is called “popping from the stack”; in English, it’s called pop. The characteristic of the stack is that the last frame to be pushed is the first to be popped (because the innermost function call ends first), which is called a

Leave a Comment