Introduction
In Linux system development, we often need to perform file operations, such as reading configuration files, saving logs, or recording data. The most common methods fall into two categories:
- System call I/O: `open()`, `read()`, `write()`, `lseek()`, etc.
- C standard library I/O: `fopen()`, `fread()`, `fwrite()`, `fclose()`, etc.
While these two seem to serve similar functions, they actually differ significantly in terms of operational mechanisms, buffering methods, performance, and use cases. This article will clarify the essential differences and relationships between these functions from the application layer to the system layer.
1. System Call I/O: `open` / `read` / `write` / `lseek`
These functions are thin wrappers around kernel system calls and form the lowest-level file I/O interface available to user-space programs.
Overview of Common Functions
int open(const char *pathname, int flags, mode_t mode);
ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);
off_t lseek(int fd, off_t offset, int whence);
int close(int fd);
- `open()`: Opens a file and returns a file descriptor (fd).
- `read()`: Reads data from a file into the user buffer.
- `write()`: Writes data from the user buffer to a file.
- `lseek()`: Moves the file read/write position.
- `close()`: Closes the file descriptor.
The core of these functions is the file descriptor: a small non-negative integer that the kernel returns for each open file and that indexes the process's table of open files.
Characteristics
- No user-space buffering, only kernel buffering: each `read()` or `write()` call copies data directly between the caller's buffer and the kernel.
- One system call per operation: every call switches from user mode to kernel mode, so issuing many small reads or writes is noticeably slower than batching them.
- Suitable for low-level operations: a good fit for embedded system programming, user-space driver code, and device file operations (e.g., `/dev/ttyS0`, `/dev/spidev0.0`).
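As a minimal sketch of how these calls fit together, the function below writes a message, rewinds with `lseek()`, and reads the data back through the same descriptor. The function name `syscall_roundtrip` and its parameters are hypothetical and chosen only for illustration; a short write is simply treated as an error here, whereas production code would normally retry.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes msg to path, seeks back to the start,
 * and reads it into out. Returns the bytes read back, or -1 on error. */
ssize_t syscall_roundtrip(const char *path, const char *msg,
                          char *out, size_t out_size)
{
    assert(path != NULL && msg != NULL && out != NULL);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = -1;

    /* write() may write fewer bytes than requested; this sketch
     * treats a short write as a failure instead of retrying. */
    if (write(fd, msg, len) == (ssize_t)len &&
        lseek(fd, 0, SEEK_SET) == 0)          /* rewind to offset 0 */
        n = read(fd, out, out_size);

    if (close(fd) == -1)
        return -1;
    return n;
}
```

Every step here is one system call, and every return value is checked, because the kernel reports errors only through those return values.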
2. C Standard Library I/O: `fopen` / `fread` / `fwrite` / `fseek`
This is a further encapsulation of system calls by the C standard library, providing a higher-level, more user-friendly interface.
Overview of Common Functions
FILE *fopen(const char *pathname, const char *mode);
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
int fseek(FILE *stream, long offset, int whence);
int fclose(FILE *stream);
The core of these functions is the FILE pointer, which is a structure defined by the C library that maintains:
- A file descriptor (from system calls)
- A user-space buffer
- File read/write position, status flags, etc.
Characteristics
- Buffered mechanism (user-space buffer): `fread()` and `fwrite()` do not always enter the kernel; they first operate on the user-space buffer, which reduces the number of system calls.
- Better support for text and formatting: functions like `fprintf()` and `fscanf()` format and parse data directly.
- Ultimately still relies on system calls: when the buffer has to be filled or flushed, the C library calls the underlying `read()` and `write()`.
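The formatting support can be sketched as follows: one setting is saved with `fprintf()` and parsed back with `fscanf()`. The function name `config_roundtrip` and the `baud=` key are assumptions made up for this example, not part of any real configuration format.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: saves "baud=<value>" to path with fprintf(),
 * then parses it back with fscanf(). Returns the parsed value or -1. */
int config_roundtrip(const char *path, int baud)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int ok = fprintf(fp, "baud=%d\n", baud) > 0;
    /* fclose() flushes the user-space buffer; if it fails, data was lost. */
    if (fclose(fp) == EOF || !ok)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int parsed = -1;
    if (fscanf(fp, "baud=%d", &parsed) != 1)
        parsed = -1;
    fclose(fp);
    return parsed;
}
```

Doing the same with `read()` and `write()` would require converting the number to and from text by hand.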
3. Differences and Relationships Between System Call I/O and C Library I/O
| Comparison Item | System Call I/O (`open`, etc.) | C Library I/O (`fopen`, etc.) |
|---|---|---|
| Level | Directly interacts with the kernel | Encapsulates system calls into a high-level interface |
| Operation Object | File descriptor (`int`) | `FILE` structure (including buffer) |
| Buffering Mechanism | Only kernel buffering | User-space + kernel double buffering |
| Performance | Slower for many small operations (one system call each) | Usually faster for small operations (fewer system calls) |
| Thread Safety | Calls are thread-safe, but there is no library-level locking and threads sharing a descriptor share its file offset | Each call locks the stream internally; a sequence of calls is not atomic |
| Use Case | Device files, low-level development | General file operations, application layer development |
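One way to see the buffering difference from the table is to mix both interfaces on the same file. In the sketch below (the name `mixed_write_order` is hypothetical), "A" is written through the `FILE*` first and "B" through the raw descriptor second, yet the file ends up containing "BA", because "A" stays in the user-space buffer until `fclose()` flushes it. Full buffering is requested explicitly with `setvbuf()` so the result does not depend on library defaults.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical demo: writes "A" via the stream and "B" via the raw fd of
 * the same file, then reads the file back into out (at least 3 bytes).
 * Returns 0 on success, -1 on error. */
int mixed_write_order(const char *path, char out[3])
{
    assert(path != NULL && out != NULL);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);            /* explicit full buffering */

    int ok = fputs("A", fp) != EOF;               /* stays in the user buffer */
    ok = ok && write(fileno(fp), "B", 1) == 1;    /* reaches the kernel now   */
    if (fclose(fp) == EOF || !ok)                 /* "A" is flushed here      */
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    size_t n = fread(out, 1, 2, fp);
    fclose(fp);
    out[n] = '\0';
    return n == 2 ? 0 : -1;
}
```

This is why mixing the two interfaces on one file without an intervening `fflush()` tends to produce out-of-order data.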
4. Multi-layer Buffering Structure of I/O
Understanding the Linux I/O buffering structure is crucial. A write typically passes through three levels:
┌─────────────────────┐
│ User Buffer (User Space) │ ← fread/fwrite operations
└──────────┬──────────┘
│ Triggers system call on flush
┌──────────▼──────────┐
│ Kernel Page Cache (Kernel Space) │ ← read/write operations
└──────────┬──────────┘
│ Asynchronously written by the kernel
┌──────────▼──────────┐
│ Disk Device (Physical Storage) │
└─────────────────────┘
- User Buffer: Managed by the C library (internal buffer of `FILE*`).
- Kernel Buffer (Page Cache): File page cache maintained by the kernel.
- Disk: The final physical storage medium.
Thus, data passed to a single `fwrite()` may sit first in the user-space buffer and then in the kernel page cache before it is truly written to disk.
5. Functions for Forcing Data to Disk: `fflush()` and `fsync()`
Due to the existence of multi-layer buffering, sometimes we want to write data to disk “immediately” rather than waiting for the buffer to fill or the program to exit. In this case, we need to use the following functions:
1. `fflush(FILE *stream)`
- Flushes the user buffer to the kernel buffer.
- Does not guarantee that data has been written to disk; it only ensures that data has moved from the `FILE*` buffer to the kernel.
fwrite(buf, 1, len, fp);
fflush(fp); // Ensure written to kernel buffer
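The effect of `fflush()` can be observed by asking the kernel for the file size before and after the call. The sketch below uses `fstat()` for that; the names `kernel_size` and `flush_visibility` are hypothetical, and full buffering is set explicitly so the five bytes are guaranteed to stay in the user buffer until the flush.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns the size the kernel reports for the file behind fp, or -1. */
static long kernel_size(FILE *fp)
{
    struct stat st;
    return fstat(fileno(fp), &st) == 0 ? (long)st.st_size : -1;
}

/* Hypothetical demo: writes 5 bytes with fwrite() and records the
 * kernel-visible size before and after fflush(). Returns 0 or -1. */
int flush_visibility(const char *path, long *before, long *after)
{
    assert(path != NULL && before != NULL && after != NULL);

    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);    /* explicit full buffering */

    int ok = fwrite("hello", 1, 5, fp) == 5;
    *before = kernel_size(fp);            /* data still in the FILE* buffer */
    ok = ok && fflush(fp) == 0;
    *after = kernel_size(fp);             /* data now handed to the kernel  */

    if (fclose(fp) == EOF || !ok)
        return -1;
    return 0;
}
```

With a freshly truncated file, `*before` is 0 and `*after` is 5: until the flush, the kernel has not seen the data at all.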
2. `fsync(int fd)`
- Flushes the kernel buffer to disk.
- Requires a file descriptor `fd`, not a `FILE*`.
- Can be combined with `fileno()` to obtain the `fd` corresponding to a `FILE*`.
int fd = fileno(fp);
fflush(fp); // Flush user buffer
fsync(fd); // Flush kernel buffer to disk
Comparison Summary
| Function | Flush Level | Parameter Type | Description |
|---|---|---|---|
| `fflush()` | User Buffer → Kernel Buffer | `FILE*` | Only guarantees that the data has been handed to the kernel |
| `fsync()` | Kernel Buffer → Disk | File descriptor | Blocks until the kernel reports the data written to the storage device, so it survives a crash or power failure |
6. Summary
| Function Category | Typical Functions | Characteristics |
|---|---|---|
| System Call I/O | `open`, `read`, `write`, `lseek`, `close` | Directly calls the kernel, no user buffering, suitable for low-level development |
| C Library I/O | `fopen`, `fread`, `fwrite`, `fseek`, `fclose` | Includes user buffering, easy to use, suitable for application layer |
| Flush Functions | `fflush`, `fsync` | Controls the timing of buffer writes |
| Buffering Levels | User Buffer → Kernel Page Cache → Disk | Each layer can be controlled independently |
7. Development Recommendations
- If you are writing embedded device drivers or system-level programs, please use the system call interface.
- If you are writing general applications, logs, or configuration file operations, the C standard library interface is sufficient.
- When you need to ensure data persistence (e.g., critical log data or database files), you should use:
fflush(fp);
fsync(fileno(fp));
Provided both calls return success, the data has been written to disk.
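A minimal sketch of that recommendation with error checking is shown below. The helper name `append_line_durably` is hypothetical; the point is that each step can fail independently and that the result is durable only if all of them succeed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: appends one line to path and pushes it through
 * both buffering layers. Returns 0 only if every step succeeded. */
int append_line_durably(const char *path, const char *line)
{
    assert(path != NULL && line != NULL);

    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    int ok = fputs(line, fp) != EOF
          && fputc('\n', fp) != EOF
          && fflush(fp) == 0             /* user buffer -> kernel page cache */
          && fsync(fileno(fp)) == 0;     /* kernel page cache -> storage     */

    if (fclose(fp) == EOF)
        ok = 0;
    return ok ? 0 : -1;
}
```

Note that `fsync()` covers the file's data and metadata. When the file has just been created, the Linux `fsync(2)` manual page points out that the directory entry may need a separate `fsync()` on the containing directory, which this sketch does not do.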
8. Example Code
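#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    // Using C Library I/O
    FILE *fp = fopen("test.txt", "w");
    if (fp == NULL) { perror("fopen"); return 1; }
    if (fputs("Hello, FILE I/O\n", fp) == EOF) perror("fputs");
    if (fflush(fp) == EOF) perror("fflush");        // Flush user buffer to the kernel
    if (fsync(fileno(fp)) == -1) perror("fsync");   // Flush kernel page cache to disk
    if (fclose(fp) == EOF) perror("fclose");

    // Using System Call I/O
    const char *msg = "Hello, SYS I/O\n";
    int fd = open("test_sys.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }
    if (write(fd, msg, strlen(msg)) == -1) perror("write");
    if (fsync(fd) == -1) perror("fsync");           // No user buffer to flush here
    if (close(fd) == -1) perror("close");
    return 0;
}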
Of course, this example is deliberately minimal and only demonstrates the general framework. If data still goes missing even though your program writes it, first check whether `fflush()` and `fsync()` are actually called on that code path and whether their return values are checked.
`fileno()` is specified by POSIX rather than by ISO C and is declared in `<stdio.h>`. It returns the underlying file descriptor (fd) of a file stream (`FILE *`), acting as a bridge between C library I/O functions and system call I/O functions.
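As a small illustration of that bridge, the sketch below writes through the descriptor behind a stream. The function name `write_via_fd` is hypothetical. It flushes the stream first, so bytes already sitting in the user-space buffer reach the file before the bytes written directly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: writes msg through the descriptor that backs
 * stream, bypassing the stream's own buffer.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_via_fd(FILE *stream, const char *msg, size_t len)
{
    assert(stream != NULL && msg != NULL);

    /* Flush first so bytes already buffered in the stream keep their order. */
    if (fflush(stream) == EOF)
        return -1;
    int fd = fileno(stream);
    if (fd == -1)
        return -1;
    return write(fd, msg, len);
}
```

The standard streams follow the same mapping: `fileno(stdin)`, `fileno(stdout)`, and `fileno(stderr)` normally return 0, 1, and 2.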
Conclusion
`open`/`read`/`write` and `fopen`/`fread`/`fwrite` differ fundamentally: the former perform unbuffered I/O at the system call level, while the latter perform buffered I/O in the C library. Understanding their operational mechanisms and buffering levels helps you write more efficient and reliable programs and avoid issues such as data not reaching the disk or file corruption in embedded development.