Essential Linux Kernel Insights! From ‘Everything is a File’ to Buffer Traps, Learn and Get a Raise After the Interview!

1. Understanding Everything as a File

It is easy to see why a process treats ordinary files on disk as files. However, external devices such as keyboards, monitors, and network cards are also treated as files by processes. How should we understand this?

1.1 Describe First, Then Organize

The operating system (OS) must manage external devices and expose interfaces that let users operate them. These interfaces are system calls; language-level functions such as fopen and fclose are wrappers built on top of them.

1.1.1 How are External Devices Managed? Describe First, Then Organize

To describe an external device, the kernel defines a structure that holds the device's type, operational status, ID, and many other attributes. This is the "describe" step.

Every device gets its own instance of this structure. The instances share the same set of fields, such as a name, but hold different values. The instances are then linked into a list, so managing devices becomes a matter of adding, deleting, querying, and modifying list nodes. This is the "organize" step.

Each external device has its own driver program provided by the manufacturer.

1.1.2 The Descriptive Structure Contains Corresponding Operation Methods

The Linux kernel is written in C, and a C structure cannot contain member functions. So how does a descriptive structure control its device? A C structure can hold function pointers.

For example, both the keyboard and the monitor are described with a read (input) function and a write (output) function. For the keyboard, the write function is meaningless, so it is set to NULL. For the monitor, the read function is meaningless, so it is set to NULL. This lets every device share a single structure layout, so the operating system can invoke operations on any device in a uniform way.
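The idea of one shared structure with function-pointer slots can be sketched as follows. This is a minimal illustration, not the kernel's real definitions: struct device, dev_read, dev_write, and the pretend keyboard and monitor functions are all hypothetical names made up for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical device descriptor: illustrative only, not a kernel type. */
struct device {
    const char *name;
    /* Operation slots; NULL means "this device does not support it". */
    long (*read)(struct device *dev, char *buf, size_t len);
    long (*write)(struct device *dev, const char *buf, size_t len);
};

/* Pretend keyboard input: always delivers the same two bytes. */
static long keyboard_read(struct device *dev, char *buf, size_t len)
{
    const char *typed = "hi";
    size_t n = strlen(typed) < len ? strlen(typed) : len;

    (void)dev;
    memcpy(buf, typed, n);
    return (long)n;
}

/* Pretend monitor output: prints the bytes to stdout. */
static long monitor_write(struct device *dev, const char *buf, size_t len)
{
    (void)dev;
    return (long)fwrite(buf, 1, len, stdout);
}

/* Same structure for both devices; the unsupported slot is NULL. */
static struct device keyboard = { "keyboard", keyboard_read, NULL };
static struct device monitor  = { "monitor",  NULL, monitor_write };

/* Uniform entry points: the caller never needs to know the device type.
 * Both return -1 when the device has no such operation. */
long dev_read(struct device *dev, char *buf, size_t len)
{
    return dev->read ? dev->read(dev, buf, len) : -1;
}

long dev_write(struct device *dev, const char *buf, size_t len)
{
    return dev->write ? dev->write(dev, buf, len) : -1;
}
```

Calling dev_write(&monitor, "ok\n", 3) prints the text, while dev_write(&keyboard, "x", 1) simply reports -1 because that slot is NULL.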

1.2 From the Process Perspective, Everything is a File

VFS (Virtual File System, also called the Virtual Filesystem Switch) is the kernel layer that provides this uniform view.

For an ordinary file that has been opened, the kernel keeps a corresponding structure: struct file. It holds the file's attributes and its methods. As described above, once a device has been wrapped at the lower level, it is also represented by a struct file with the same kinds of attributes and methods, so different devices can be described in the same way as files. Therefore, external devices can also be understood as files.

Everything a user does is ultimately carried out by a process. Once the process has located the struct file it needs, it calls the methods stored there, and the matching device function runs.

Different devices supplying their own implementations behind the same set of methods is a form of polymorphism.
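From the process side, this uniformity shows up as the same system calls working on very different targets. The sketch below, with the hypothetical helper name write_to_path, sends a message through open and write without caring whether the path names a regular file or a device node such as /dev/null.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Open `path` for appending (creating it if needed) and send `msg`
 * through write(2). Returns the number of bytes written, or -1 on error.
 * The code is identical for regular files and device files. */
ssize_t write_to_path(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, msg, strlen(msg));
    close(fd);
    return n;
}
```

For example, write_to_path("/dev/null", "hello") and write_to_path("notes.txt", "hello") go through exactly the same calls; only the kernel-side methods behind the descriptor differ.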

2. Text Writing and Binary Files

The monitor is a character device. When we output 1234 (one thousand two hundred thirty-four), we are actually outputting the characters ‘1’, ‘2’, ‘3’, and ‘4’ on the monitor.

2.1 Why Do We Need Functions Like printf?

The write system call takes the data to be output as a const void * buffer plus a length, so it does not distinguish between text and binary. Why, then, does printf need a format specifier for each type, such as %d for integers and %f for floating-point numbers? Because the monitor displays characters: if we call write directly, we must first convert the integer to a string ourselves before printing it.

printf performs this conversion for us, which is what makes it convenient.
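To make the conversion step concrete, here is a small sketch of printing an integer with write alone. The helper name write_int is made up for this example; snprintf does the integer-to-text conversion that printf would otherwise hide.

```c
#include <stdio.h>
#include <unistd.h>

/* Format `value` as decimal text followed by a newline and send it to
 * `fd` with write(2). Returns the number of bytes written, or -1 on error. */
ssize_t write_int(int fd, int value)
{
    char text[32];
    int len = snprintf(text, sizeof text, "%d\n", value);

    if (len < 0 || (size_t)len >= sizeof text)
        return -1;
    /* write() only sees bytes: here '1', '2', '3', '4', '\n' for 1234. */
    return write(fd, text, (size_t)len);
}
```

Calling write_int(STDOUT_FILENO, 1234) sends five bytes to the terminal, which is roughly what printf("%d\n", 1234) arranges behind the scenes.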

At the lower level, there is no distinction between text files and binary files. Characters are stored as character codes (ASCII, for example), which are themselves binary numbers. So at that level every file is simply a sequence of bytes.

2.2 The Great C Language

Different operating systems provide different system calls, yet the code we write stays largely unchanged across platforms. The C library wraps each system's underlying interfaces, so we do not have to worry about those details. The library itself is implemented differently on each operating system.

This improves the portability of the code.

C language libraries: the same library interface is implemented separately on each operating system, on top of that system's own calls. This is what makes the language portable across operating systems, and portability in turn helps a language grow its user base.

3. Kernel-Level Buffers

There are kernel-level buffers in the operating system and language-level buffers in programming languages. They are different, but both are implemented to improve efficiency.

3.1 How Do They Improve Efficiency?

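If every output operation went straight to the device, each one would incur a significant cost. Instead, output is collected in a buffer and flushed in batches, typically when a newline is written (line buffering) or when the buffer is full (full buffering). A language-level buffer can also be flushed explicitly with fflush.

For special devices like monitors, line buffering is typically used, meaning the buffer is flushed upon encountering a newline. For other files, the buffer may be flushed only when it is full.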

3.2 How to View Kernel-Level Buffers?

When we call the write interface to write to a file, the data does not go directly to the file on disk but to the file's kernel-level buffer. When the data actually reaches the disk is decided by the operating system. Therefore, not every call to write results in a device I/O operation. Instead, several writes are accumulated, and the buffer is flushed once its content reaches a certain amount.

Language-level buffers are designed similarly. When we use printf, it does not pass our content straight to the kernel-level buffer. Instead, it copies the content into the buffer held by the FILE structure. Only when certain conditions are met does it call the system call interface to copy the content from the FILE structure to the kernel-level buffer.
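A language-level buffer of this kind can be sketched in a few lines. This is only a toy model, not how any real C library implements FILE: struct mini_file, mini_puts, mini_flush, and the 16-byte buffer size are all assumptions chosen to make the behavior easy to observe.

```c
#include <stddef.h>
#include <unistd.h>

#define MINI_BUF_SIZE 16  /* deliberately tiny so "buffer full" is easy to hit */

/* Hypothetical stand-in for FILE: a descriptor plus a user-space buffer. */
struct mini_file {
    int fd;
    size_t used;             /* bytes currently waiting in buf */
    int flush_count;         /* how many times data was handed to the kernel */
    char buf[MINI_BUF_SIZE];
};

/* Hand the buffered bytes to the kernel-level buffer via write(2).
 * The loop only repeats if the kernel accepts fewer bytes than asked. */
int mini_flush(struct mini_file *f)
{
    size_t off = 0;

    while (off < f->used) {
        ssize_t n = write(f->fd, f->buf + off, f->used - off);
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    if (f->used > 0)
        f->flush_count++;
    f->used = 0;
    return 0;
}

/* Copy a string into the buffer; flush on newline or when the buffer is full. */
int mini_puts(struct mini_file *f, const char *s)
{
    for (; *s; s++) {
        f->buf[f->used++] = *s;
        if (*s == '\n' || f->used == MINI_BUF_SIZE) {
            if (mini_flush(f) < 0)
                return -1;
        }
    }
    return 0;
}
```

With struct mini_file f = { .fd = STDOUT_FILENO };, a call such as mini_puts(&f, "abc") triggers no system call at all; the bytes only reach the kernel once a newline arrives or the 16-byte buffer fills up.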

How often the kernel-level buffer performs I/O is determined by the operating system.

3.3 How to Force Kernel-Level Buffers to Flush?

#include <unistd.h>
int fsync(int fd);
int fdatasync(int fd);

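fd is the file descriptor. fsync flushes the file's data and metadata from the kernel-level buffer to the storage device; fdatasync flushes the data and only the metadata needed to read it back. A file must be opened before it can be written, and opening it places an entry in the process's file descriptor table; fd is the index of that entry.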
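A typical use of fsync looks roughly like the sketch below. The helper name write_durably is hypothetical, and the error handling is kept minimal for readability.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write `msg` to `path`, then ask the kernel to flush the file's
 * kernel-level buffer to the storage device before returning.
 * Returns 0 on success, -1 on any failure. */
int write_durably(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    /* After write() returns, the data may still sit only in kernel memory. */
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    /* fsync() blocks until the kernel has pushed the data out. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

For example, write_durably("/tmp/fsync_demo.txt", "saved\n") returns 0 only after both the write and the flush have succeeded.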

3.4 How to Read and Modify Files?

The above describes writing; reading is the opposite process. Modification includes both reading and writing.

Reading involves three steps: opening the file, loading the file's content from disk into the buffer, and then copying it from the buffer to the process. (How the file is located and how files are stored on disk belong to the file system, which will be explained later.)

Modification involves first reading, then modifying the specified content, and finally writing it back.
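The read, modify, write-back cycle can be sketched as follows. The function name replace_first_byte is made up for this example, and it changes only a single byte to keep the three steps visible.

```c
#include <fcntl.h>
#include <unistd.h>

/* Replace the first occurrence of byte `from` with `to` in the file at
 * `path`. Returns 1 if a byte was replaced, 0 if `from` was not found,
 * and -1 on error. */
int replace_first_byte(const char *path, char from, char to)
{
    char buf[4096];
    off_t base = 0;          /* file offset of buf[0] */
    ssize_t n;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    /* Step 1: read the content into a buffer, one chunk at a time. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] != from)
                continue;
            /* Steps 2 and 3: seek back to the matching byte and
             * write the new value over it. */
            int ok = lseek(fd, base + i, SEEK_SET) >= 0 &&
                     write(fd, &to, 1) == 1;
            close(fd);
            return ok ? 1 : -1;
        }
        base += n;
    }
    close(fd);
    return n < 0 ? -1 : 0;
}
```

If a file contains the text cat, calling replace_first_byte on it with 'c' and 'b' leaves bat in the file.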

3.5 Memory Blocks

Kernel-level buffers are made up of fixed-size memory blocks (pages), which are typically 4 KB in size.
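The actual block size on a given machine can be queried rather than assumed. The sketch below wraps the POSIX sysconf call in a helper whose name, page_size_bytes, is made up for this example.

```c
#include <unistd.h>

/* Return the system's memory page size in bytes, or -1 on error. */
long page_size_bytes(void)
{
    return sysconf(_SC_PAGESIZE);
}
```

On many common Linux systems this returns 4096, but other values are possible, so code should not hard-code the number.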
