Deep Optimization of Linux File System: Revitalize Your Disk



Components of the Disk
In older computers, the disk is essentially the only mechanical device, and it is also a peripheral.
Computers are electronic: inside the machine, data moves between devices as electrical or optical signals, so transmission is very fast. A mechanical disk, by contrast, depends on a high-speed motor and a moving arm. However fast the motor spins, physical movement is far slower than signal propagation, so mechanical devices are comparatively inefficient.
Appearance of the Disk
The surface of a disk platter resembles a CD, except that a platter is smooth on both sides while a CD is smooth on only one.
Below is a comparison image of the CD and the disk:
[Figure: a CD and a disk platter side by side]
Components of the Disk:
Below are the various components of the disk
[Figure: the components of the disk]
One point to note is that the read/write head does not touch the platter surface. The gap is so small that it is often compared to a Boeing 747 flying just one meter above the ground.
Disk Damage: A mechanical disk should not be moved or handled roughly while it is running. Because the gap between the head and the platter surface is so small, any vibration can cause damage. Picture that Boeing 747 flying one meter above the ground when a sudden gust or jolt hits it: an accident is likely. For a disk, the accident is the head scratching the platter surface. The platter stores sequences of binary data, so a scratch can lose bits or scramble them, which corrupts the data. This is why an older computer may work well one day and fail the next. The root cause is often disk damage, and it is also why mechanical disks are rarely used in laptops today.
When we store data, it is ultimately stored on the platter surface. When we save data, it flows from the CPU and memory across the motherboard to the disk. Inside the disk, the head magnetizes small regions of the platter, and this is how the binary data is written.
What is the essence of data transfer by hardware?
Computers only recognize binary. A computer contains various hardware components, such as disks, network cards, and sound cards. Since the computer only recognizes binary, do these components also only recognize binary? The answer is yes. Why?
Because every device is ultimately connected by wires. When signals travel on the data bus and the I/O bus, what the wire carries is light or no light, strong or weak, sparse or dense. However it is expressed, the signal fundamentally encodes two states, which is how binary is transmitted.
Disk: Permanent Storage Medium
The disk is called a permanent storage medium. (Memory is called a volatile storage medium.)
The relationship between binary data and how the disk stores it can be understood with a magnet. A magnet has a south pole (S) and a north pole (N), and an applied current can reverse its polarity. The two orientations can then stand for 0 and 1. The read/write head moves across the platter, and the platter can be imagined as countless tiny magnets. The spot under the head is one tiny magnet, and energizing the head sets that magnet to 0 or 1. As the head passes over many of these magnets, it lays down a series of 0s and 1s. This is the binary sequence. In other words, the head uses electrical energy to write a series of 0s and 1s onto the platter!
Disk Security Issues
Although disks are permanent storage media, they have a limited lifespan. After four to five years, sometimes longer, the data on a disk may be lost. Large internet companies may run thousands or even tens of thousands of disks and retire them regularly. If a retired disk still holds data, whoever obtains it may be able to read that data, which can cause real harm. Such disks must therefore have their data erased before they are discarded. Erasure can be physical, for example destroying the platters at high temperature, but this is costly. The alternative is software-level erasure, which overwrites the data on the disk with 0s or 1s. Some residual data may remain after one pass, but repeating the overwrite several times can meet the required standards. Manufacturers generally ship such erasure functions with the disk as a user-facing interface, so users can call it directly when they want to erase data. This is the security issue of disks.
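The software-level approach can be sketched in a few lines of Python. This is only an illustration that runs against an in-memory buffer, not a real device. The function name `overwrite_passes` and the three overwrite patterns are assumptions made for the example. They are not a manufacturer interface or a recognized sanitization standard.

```python
def overwrite_passes(buf: bytearray, patterns=(0x00, 0xFF, 0x00)) -> int:
    """Overwrite every byte of buf once per pattern; return the number of passes."""
    for pattern in patterns:
        # Slice assignment rewrites the buffer in place with the fill byte.
        buf[:] = bytes([pattern]) * len(buf)
    return len(patterns)


# A toy "disk" holding sensitive data (hypothetical content).
disk_image = bytearray(b"secret customer records")
passes = overwrite_passes(disk_image)
```

After the call the buffer has the same length as before, but every byte holds the last pattern written. On real hardware the erase function supplied with the drive is the safer route, since a plain overwrite may not reach every physical location.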
Composition of Disk Storage
As shown in the figure below:
[Figure: composition of disk storage]
Once we understand the concept of sectors, note that the sector is the smallest unit the disk reads or writes: to modify a single byte, we must load the entire sector containing that byte into memory.
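This read-modify-write cycle can be sketched as follows. The toy disk is a `bytearray`, the 512-byte sector size is an assumption (many drives use 512 or 4096 bytes), and `write_byte` is a hypothetical helper written for this example.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


def write_byte(disk: bytearray, offset: int, value: int) -> int:
    """Change one byte by reading, patching, and writing back its whole sector."""
    sector_no = offset // SECTOR_SIZE
    start = sector_no * SECTOR_SIZE
    # 1. Load the entire sector into memory.
    sector = bytearray(disk[start:start + SECTOR_SIZE])
    # 2. Modify the single byte inside the in-memory copy.
    sector[offset % SECTOR_SIZE] = value
    # 3. Write the entire sector back to the disk.
    disk[start:start + SECTOR_SIZE] = sector
    return sector_no


disk = bytearray(4 * SECTOR_SIZE)        # a toy "disk" of four sectors
touched = write_byte(disk, 1000, 0x41)   # byte 1000 lives in sector 1
```

Even though only one byte changes, a full sector travels in each direction.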
To save data to the disk, the first issue to resolve is locating the sector: first determine which side (which head to use), then which track, and finally which sector.
At the same time, the combination of tracks from multiple disk surfaces can form a cylinder, as shown in the figure:
[Figure: a cylinder formed by the tracks of multiple disk surfaces]
We know that the platter rotates and the read/write head swings left and right. But why does the disk move this way? The answer is straightforward: the swinging of the head and the rotation of the platter together locate a sector. The head's swing selects the track, while the platter's high-speed rotation brings the sector under the head.
So what determines the efficiency of the disk? It is determined by how long it takes to select the head (which side), position on the track, and reach the sector.
If data is placed randomly on the platter, the head must swing back and forth more often and wait through more rotations. So what makes a disk efficient or inefficient? The less mechanical movement, the higher the efficiency; the more movement, the lower the efficiency. (In software design, designers must therefore consciously group related data together.)
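A rough way to see the effect of grouping is to count how many tracks the head crosses for a given access order. The sketch below is a simplification: it ignores rotational delay and head switching, and the function `head_travel` and the track numbers are made up for illustration.

```python
def head_travel(tracks, start=0):
    """Total number of tracks the head crosses while visiting tracks in order."""
    travel, position = 0, start
    for track in tracks:
        travel += abs(track - position)
        position = track
    return travel


grouped = [10, 10, 11, 11, 12]       # related data on neighboring tracks
scattered = [10, 150, 11, 190, 12]   # the same number of accesses, spread out

grouped_cost = head_travel(grouped)      # 12 tracks crossed
scattered_cost = head_travel(scattered)  # 646 tracks crossed
```

Both sequences perform five accesses, yet the scattered layout makes the head travel more than fifty times as far.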
Logical Structure of the Disk
The logical structure of the disk is linear, in a long strip shape.
[Figure: the disk drawn as one long linear strip]
This strip is divided into segments. Each segment represents one side, and the segments are of equal length.
We know that each disk has tracks, with the innermost radius being the smallest, gradually increasing outward. This means that each segment can also be subdivided into smaller segments to represent individual tracks, with these segments gradually increasing or decreasing in size. As shown by the green line:
[Figure: a segment subdivided into tracks, marked by the green line]
Then, these tracks can be further divided into sectors. As shown by the three colored lines:
[Figure: tracks divided into sectors, marked by three colored lines]
We can abstract the above content into an array based on sectors.
CHS Addressing Method
We know that every track holds the same number of sectors, even though the sectors differ in physical size. So how is addressing done? Suppose each platter surface has 20,000 sectors: 200 tracks per surface and 100 sectors per track.
Addressing process (all divisions are integer divisions): if the logical sector number is 28888, first determine the surface: 28888 / 20000 = 1, so the surface (head) number is 1. Next take the offset within that surface: 28888 % 20000 = 8888. Then 8888 / 100 = 88 gives the track index, that is, the 89th track counting from one. Finally 8888 % 100 = 88 gives the sector offset within that track, that is, the 89th sector counting from one. So C (track) is 88, H (head) is 1, and S (sector) is 89 when sectors are numbered from 1, or 88 when numbered from 0. We have thus converted a logical sector address into a physical address. Going the other way, a CHS value can be converted back into a logical sector address.
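The arithmetic above can be written as a pair of small functions. This sketch follows the simplified layout used in this article (all sectors of one surface first, then the next surface) and counts every index from zero. The constant and function names are chosen for the example. Real drives conventionally order sectors cylinder by cylinder and number sectors from 1, so their formula differs.

```python
TRACKS_PER_SURFACE = 200
SECTORS_PER_TRACK = 100
SECTORS_PER_SURFACE = TRACKS_PER_SURFACE * SECTORS_PER_TRACK  # 20,000


def lba_to_chs(lba: int):
    """Return (track, head, sector), all counted from zero."""
    head, rest = divmod(lba, SECTORS_PER_SURFACE)
    track, sector = divmod(rest, SECTORS_PER_TRACK)
    return track, head, sector


def chs_to_lba(track: int, head: int, sector: int) -> int:
    """Inverse of lba_to_chs for the same layout."""
    return head * SECTORS_PER_SURFACE + track * SECTORS_PER_TRACK + sector


chs = lba_to_chs(28888)          # (88, 1, 88)
round_trip = chs_to_lba(*chs)    # 28888
```

With zero-based indices, sector 28888 maps to track 88 on surface 1 at sector offset 88, which is the 89th sector of the 89th track when counting from one.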
The logical sector address is what the operating system works with, which makes it convenient to locate content on the disk. This address is also called the LBA (logical block address). The disk itself identifies sectors by CHS address, while the system uses LBA, and a calculation converts between the two. As a result, the operating system does not need to understand the physical structure of the disk. To the operating system every disk is linear, and it can reach any sector with an LBA. Likewise, any LBA can be converted back to a CHS address with an algorithm.
Returning to hardware: the CPU is not the only component with registers. Other devices, including peripherals such as disks, have them too. The main registers in a disk include the control register, the data register, the status register, and the address register.
The entire process of data transmission in the disk is illustrated in the figure below:
[Figure: data transmission between the CPU and the disk registers]
The first step is for the CPU to write the read/write command into the disk's control register, which determines the direction of the I/O.
The second step is for the CPU to send the data to be written to the disk's data register.
The third step is for the CPU to write the target disk address into the address register. Once the address register holds this address, the disk writes the data to that location.
The fourth step is for the CPU to read the status register to check whether the operation succeeded.
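The four steps can be mimicked with a small simulation. Everything here is hypothetical: `SimulatedDisk`, its register attributes, and `cpu_write` are invented for illustration and do not correspond to any real controller interface, where the registers are accessed through port or memory-mapped I/O.

```python
class SimulatedDisk:
    """Toy model of a disk controller with four registers (hypothetical)."""

    WRITE, READ = "w", "r"
    OK, ERROR = 0, 1

    def __init__(self, sector_count: int):
        self.sectors = [b""] * sector_count
        self.control = None    # direction of the I/O
        self.data = b""        # payload in flight
        self.address = 0       # target sector (LBA)
        self.status = self.OK  # result of the last request

    def execute(self) -> None:
        """Carry out the request described by the registers and set status."""
        if not 0 <= self.address < len(self.sectors):
            self.status = self.ERROR
            return
        if self.control == self.WRITE:
            self.sectors[self.address] = self.data
        elif self.control == self.READ:
            self.data = self.sectors[self.address]
        else:
            self.status = self.ERROR
            return
        self.status = self.OK


def cpu_write(disk: SimulatedDisk, lba: int, payload: bytes) -> bool:
    disk.control = SimulatedDisk.WRITE      # step 1: direction of the I/O
    disk.data = payload                     # step 2: data to be written
    disk.address = lba                      # step 3: where to store it
    disk.execute()
    return disk.status == SimulatedDisk.OK  # step 4: check the result


disk = SimulatedDisk(sector_count=8)
ok = cpu_write(disk, 3, b"hello")    # valid address, succeeds
bad = cpu_write(disk, 99, b"oops")   # out-of-range address, fails
```

The second call shows why the fourth step matters: the only way the CPU learns that the address was invalid is by reading the status register.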
That concludes this section. I hope it helps you.
