Exploring The Magical Power Of Process Address Space In Linux


Address Bus

The address bus is typically 32 or 64 bits wide. Each address line is either energized or not (a strong or a weak signal); the computer interprets an energized line as 1 and a non-energized line as 0.

32 address lines can represent 2 to the power of 32 combinations of 0s and 1s. Each combination addresses 1 byte of memory, for a total of 4294967296 bytes, which is 4GB. This is the maximum space a program can address on a 32-bit system.

Read as decimal numbers, the 2 to the power of 32 combinations of 0s and 1s run from 0 to 4294967295. In other words, from the CPU's perspective, the memory space it sees is linear (contiguous).

Now we can introduce a concept: the so-called process address space is this segment of linear space.


Memory at the Language Level

[C++] C++ Memory Management – CSDN Blog

A program can access memory only while it is running on the CPU. The memory that a programming language manages is the process address space described above.

The process address space is divided into stack, heap, data segment, code segment, and memory-mapped segment, distributed as shown in the figure below.

Data with similar attributes is grouped into the same region, and each region has a defined start and end, so that this linear space can be maintained efficiently.

Memory at the System Level

The process address space is actually a virtual space. This means that the memory managed by the program is virtual memory.

The addresses of physical memory are absolute, with each storage unit having its own number. The system manages both virtual memory and physical memory.

Memory allocated in the process address space is mapped to physical memory through page tables: the system sets aside physical memory of the same size to actually store the data.

Process Overview

To explain what a process address space is, we have so far glossed over the concept of a process. Now let's look at the process itself to better understand the process address space.

Process vs. Program:

Compiling a text file that contains source code produces a binary file: the executable program. The CPU can only read data from memory, so an executable program must first be loaded into memory before it can run.

The Linux operating system creates a PCB (Process Control Block) for the executable program. The Linux kernel is written in C, and the PCB is actually a structure (task_struct). The attributes of the executable program are maintained in this structure.

Below is the classification of attributes saved by task_struct for the executable program.

Identifier: A unique identifier for the process, used to distinguish it from other processes.
State: Task state, exit code, exit signal, etc.
Priority: Priority relative to other processes.
Program Counter: Address of the next instruction to be executed in the program.
Memory Pointer: Pointers to the program code and process-related data, as well as pointers to memory blocks shared with other processes.
Context Data: Data in the processor's registers while the process is executing.
I/O State Information: Outstanding I/O requests, I/O devices allocated to the process, and the list of files used by the process.
Accounting Information: May include total processor time, total number of clock ticks used, time limits, accounting numbers, etc.
Other Information

Through task_struct, we can control when an executable program is scheduled by the CPU, how long the scheduling lasts, when it is blocked, and when it is reclaimed, etc.

Managing an executable program also requires managing its corresponding memory. As mentioned earlier, the memory of an executable program is the process address space.

The Linux kernel will create another structure (mm_struct) to describe the attributes of the process address space. By managing this structure, we can effectively manage the memory corresponding to the program.

In summary,

Process = task_struct + mm_struct + Executable Program

Advantages of Process Address Space

Why can’t a process directly access physical memory? The author summarizes three advantages of the process address space.

Permissions

[Linux] How Root Restricts Ordinary Users – Permission Management – CSDN Blog

Physical memory is just hardware, and hardware has no concept of permissions. By adding read-only, writable, and other permission fields to the page table, this physical memory gains a concept of permissions.

Through the process address space, the operating system can intercept a process's illegal operations on physical memory, such as out-of-bounds accesses or accesses that do not match the read/write permissions.

Since physical memory stores data from multiple processes, if one process could arbitrarily perform illegal operations on physical memory, other processes could easily crash.

When a process performs an illegal memory operation, the operating system usually terminates it; on Linux this typically happens through a SIGSEGV signal.

In the following example, str points to a string literal, which is stored in a read-only region. Modifying it is undefined behavior in C, and on Linux the process typically crashes with a segmentation fault.

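#include <stdio.h>

int main(void) {
    char *str = "abc";  /* string literal: stored in a read-only region */
    *str = 'H';         /* illegal write: typically a segmentation fault on Linux */
    return 0;
}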

Uniformity

In physical memory, the data of different processes may be scattered and out of order, but the memory each process manages appears linear and orderly.

Different processes perform different tasks: QQ sends messages, while a game draws game information on the screen. Even so, the process address space gives the operating system one unified template for describing each process's memory attributes. This makes those attributes easier to maintain and process management more efficient.

Computer memory generally ranges from 2GB to 16GB, while a system generally runs from a dozen to several dozen processes. Can the operating system give each process 4GB (on a 32-bit system) of physical memory? No, that is impossible. Physical memory is a scarce resource, and the process address space is best understood as a big pie the operating system draws for the process: a promise of memory, not the memory itself.

When memory is tight, the operating system still creates a task_struct and an mm_struct for a process, and may even lay out several GB of regions in its process address space. Most of the code and data can remain on disk, with only the part that is needed loaded into memory.

Why? Because the operating system does not waste time and space. Loading several GB of data into memory, most of which may not require computation, is unnecessary.

This behavior is called lazy loading.

In fact, lazy loading is not counterintuitive. Consider a real-life example: a large game takes up dozens of GB, so why can it run smoothly on a gaming laptop with 4GB of memory, even with WeChat, a music player, and other programs open at the same time? The reason is that the operating system uses lazy loading to control what occupies physical memory.

Decoupling

The operating system has four major modules: process management, memory management, file management, and driver management.

The decoupling mentioned here means reducing the coupling between process management and memory management.

With the process address space, a problem in one process does not affect memory management, and therefore does not affect the operation of other processes.
