1. Overview of System Calls
A system call is an interface through which a user-space program requests a service from the operating system kernel. Taken together, system calls form a controlled communication mechanism between user space and kernel space.
Why Do We Need System Calls?
- Permission Isolation: User programs run in non-privileged mode and cannot directly access hardware or execute privileged instructions.
- Resource Management: The kernel needs to manage system resources uniformly.
- Abstract Interface: Provides a unified hardware abstraction for applications.
2. Implementation Principles of Linux System Calls
System Call Table (syscall table)
The Linux kernel maintains a system call table, where each system call has a unique number. On the x86-64 architecture, this table is generated at build time from <span>arch/x86/entry/syscalls/syscall_64.tbl</span>.
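To make the idea of "one number per system call" concrete, the following sketch looks up a few numbers through the `SYS_*` constants from `<sys/syscall.h>`. The struct, array, and function names (`syscall_entry`, `known_syscalls`, `lookup_syscall_number`, `print_syscall_numbers`) are hypothetical and exist only for this illustration; the numbers themselves are architecture-specific.

```c
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>

/* One row of a tiny, hand-picked excerpt of the syscall table (illustrative only). */
struct syscall_entry {
    const char *name;
    long number;
};

/* SYS_* constants come from <sys/syscall.h>; their values depend on the architecture. */
static const struct syscall_entry known_syscalls[] = {
    { "read",   SYS_read   },
    { "write",  SYS_write  },
    { "close",  SYS_close  },
    { "getpid", SYS_getpid },
};

/* Return the number for a name in the excerpt, or -1 if the name is not listed. */
long lookup_syscall_number(const char *name)
{
    size_t count = sizeof(known_syscalls) / sizeof(known_syscalls[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(known_syscalls[i].name, name) == 0)
            return known_syscalls[i].number;
    }
    return -1;
}

/* Print every entry of the excerpt as "name = number". */
void print_syscall_numbers(void)
{
    size_t count = sizeof(known_syscalls) / sizeof(known_syscalls[0]);
    for (size_t i = 0; i < count; i++)
        printf("%s = %ld\n", known_syscalls[i].name, known_syscalls[i].number);
}
```

Calling `print_syscall_numbers()` on two different architectures is a quick way to see that the same name can map to different numbers.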
System Call Execution Process
- The user program places the system call number into a specific register (rax).
- Parameters are placed into designated registers (rdi, rsi, rdx, r10, r8, r9).
- Execute the <span>syscall</span> instruction (or <span>int 0x80</span> on 32-bit systems).
- The CPU switches to kernel mode and jumps to the predefined system call entry.
- The kernel looks up the handler function based on the system call number.
- Execute the system call handler function.
- Return to user space and resume execution of the user program.
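The steps above can be sketched as a small three-argument wrapper. This is a minimal illustration, not how glibc implements its wrappers: the name `raw_syscall3` is hypothetical, the inline assembly applies only to x86-64, and other architectures fall back to glibc's `syscall()`. It also shows the kernel's return convention: a non-negative result on success, or a negated errno value on failure.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Issue a three-argument system call and return the kernel's raw result:
 * a non-negative value on success, or a negated errno value on failure.
 * (raw_syscall3 is a hypothetical helper name used for illustration.)
 */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__x86_64__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                            /* result comes back in rax */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)   /* rax, rdi, rsi, rdx */
        : "rcx", "r11", "memory"               /* syscall overwrites rcx and r11 */
    );
    return ret;
#else
    /* Fallback: glibc's syscall() reports errors as -1 plus errno,
     * so convert back to the raw kernel convention. */
    long ret = syscall(nr, a1, a2, a3);
    return (ret == -1) ? -errno : ret;
#endif
}
```

For example, `raw_syscall3(SYS_write, 1, (long)"hi\n", 3)` writes three bytes to standard output, while passing an invalid file descriptor yields `-EBADF` instead of setting `errno`.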
3. Example Analysis of System Calls
1. Using the C Library for System Calls
#include <unistd.h>
#include <sys/syscall.h>
#include <stdio.h>
int main(void) {
    // Using the write system call wrapped by glibc
    write(1, "Hello via libc\n", 15);
    // Directly using the syscall function
    syscall(SYS_write, 1, "Hello via syscall\n", 18);
    return 0;
}
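The example above ignores the return value of `write`. In practice the glibc wrapper returns -1 and sets `errno` on failure, and it may write fewer bytes than requested. A minimal sketch of a retry loop follows; the helper name `write_all` is hypothetical and not part of glibc.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write exactly len bytes to fd, retrying after short writes and EINTR.
 * Returns 0 on success, or -1 on error with errno left as set by write().
 * (write_all is a hypothetical helper, not a libc function.)
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* real error: errno describes it */
        }
        p += n;                 /* skip the bytes already written */
        len -= (size_t)n;
    }
    return 0;
}
```

A call such as `write_all(STDOUT_FILENO, "Hello via libc\n", 15)` behaves like the plain `write` call for small terminal output, but also copes with pipes and sockets that accept only part of the buffer at a time.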
2. Inline Assembly for System Calls
#include <unistd.h>
#include <sys/syscall.h>
int main(void) {
    const char msg[] = "Hello via inline assembly\n";
    long ret;
    // x86_64 inline assembly for the write system call.
    // Constraints place the arguments: "a" = rax, "D" = rdi, "S" = rsi, "d" = rdx.
    asm volatile (
        "syscall"                        // invoke syscall
        : "=a"(ret)                      // return value comes back in rax
        : "a"((long)SYS_write),          // syscall number for write (1)
          "D"((long)STDOUT_FILENO),      // fd = STDOUT_FILENO (1)
          "S"(msg),                      // buf = msg
          "d"(sizeof(msg) - 1)           // count, excluding the trailing NUL
        : "rcx", "r11", "memory"         // syscall overwrites rcx and r11
    );
    return ret < 0 ? 1 : 0;
}
4. Adding Custom System Calls
1. Adding System Call Number
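Edit <span>arch/x86/entry/syscalls/syscall_64.tbl</span> and add a line using a number that is not already taken in your kernel tree (449 is only an example and is already assigned in newer kernels; the expected entry-point name format may also differ between kernel versions):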
449 common my_syscall __x64_sys_my_syscall
2. Declaring System Call Prototype
Add in <span>include/linux/syscalls.h</span>:
asmlinkage long sys_my_syscall(int arg1, const char __user *arg2);
3. Implementing System Call
Add the implementation in an appropriate location (e.g., kernel/sys.c):
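SYSCALL_DEFINE2(my_syscall, int, arg1, const char __user *, arg2)
{
    char buf[256];
    long len;
    /* Copy a NUL-terminated string from user space. copy_from_user() with
     * sizeof(buf) would read past the end of shorter user strings. */
    len = strncpy_from_user(buf, arg2, sizeof(buf));
    if (len < 0)
        return -EFAULT;
    if (len == sizeof(buf))
        return -ENAMETOOLONG; /* no NUL terminator within the buffer */
    printk(KERN_INFO "my_syscall called with %d and %s\n", arg1, buf);
    return 0;
}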
4. User Space Testing
#include <unistd.h>
#include <sys/syscall.h>
#include <stdio.h>
#define MY_SYSCALL_NR 449
int main(void) {
    long ret = syscall(MY_SYSCALL_NR, 42, "test message");
    printf("syscall returned %ld\n", ret);
    return 0;
}
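The test program above prints only the return value. A slightly more defensive sketch also captures `errno`, so that a kernel without the new system call can be told apart from a genuine failure inside it. The helper name `call_custom_syscall` is hypothetical; the number is passed as a parameter so that the same helper works with whatever number was chosen in the table.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Invoke system call number nr with an (int, const char *) argument pair,
 * matching the signature of the example my_syscall.
 * Returns the value from syscall(); *err receives 0 on success or the
 * errno value on failure. (call_custom_syscall is a hypothetical helper.)
 */
long call_custom_syscall(long nr, int arg1, const char *arg2, int *err)
{
    errno = 0;
    long ret = syscall(nr, arg1, arg2);
    *err = (ret == -1) ? errno : 0;
    return ret;
}
```

With the article's number, the call would be `call_custom_syscall(449, 42, "test message", &err)`. On a kernel that has no system call at the chosen number, `err` is typically `ENOSYS`; if the number already belongs to another system call, that call runs instead, which is one more reason to pick an unused number.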
5. Performance Considerations
- Mode Switch Overhead: Each system call switches the CPU from user mode to kernel mode and back, which costs noticeably more than an ordinary function call.
- vDSO Mechanism: Linux uses the vDSO (virtual dynamic shared object) to serve certain frequently called system calls (e.g., gettimeofday, clock_gettime) in user space without entering the kernel.
- Fast System Call Instructions: Modern CPUs provide the <span>syscall/sysret</span> instructions to replace the traditional <span>int 0x80</span>, reducing overhead.
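To get a rough feel for these costs, the following sketch times a call that always enters the kernel, `syscall(SYS_getpid)`, against `clock_gettime`, which glibc usually serves through the vDSO. The function names (`elapsed_ns`, `avg_ns_raw_syscall`, `avg_ns_clock_gettime`) are hypothetical, and the numbers depend heavily on CPU, kernel version, and mitigations, so treat the result as an indication rather than a benchmark.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC readings. */
static double elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) * 1e9 +
           (double)(b->tv_nsec - a->tv_nsec);
}

/* Average cost of a call that enters the kernel on every iteration. */
double avg_ns_raw_syscall(long iterations)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getpid);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return elapsed_ns(&start, &end) / (double)iterations;
}

/* Average cost of clock_gettime(), which is usually handled by the vDSO. */
double avg_ns_clock_gettime(long iterations)
{
    struct timespec start, end, scratch;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        clock_gettime(CLOCK_MONOTONIC, &scratch);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return elapsed_ns(&start, &end) / (double)iterations;
}
```

Printing both averages for, say, one million iterations usually shows the vDSO path to be considerably cheaper, although the gap shrinks or disappears on systems where the clock source forces a fallback to a real system call.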
6. System Call Tracing
1. Using strace
strace -o trace.log ./my_program
2. Using perf
perf trace ./my_program
3. Kernel ftrace
cd /sys/kernel/debug/tracing
echo 1 > events/syscalls/enable
cat trace_pipe
7. Security Considerations
- Parameter Validation: The kernel must strictly validate all parameters coming from user space.
- Permission Checks: Check whether the calling process has permission to perform the operation.
- Boundary Checks: Prevent buffer overflow and other attacks.
- Pointer Checks: User space pointers must be validated before dereferencing.
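The effect of these checks can be observed from user space. In the sketch below, `write` is handed a buffer that the process is not allowed to read; instead of crashing, the kernel rejects the pointer and the call fails with `EFAULT`. The function name `errno_for_bad_user_pointer` is hypothetical, and a pipe is used as the target because its write path has to read the user buffer.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Pass an inaccessible buffer to write() and report what happens.
 * Returns the errno observed (EFAULT is expected on Linux), 0 if the write
 * unexpectedly succeeded, or -1 if the test setup itself failed.
 * (errno_for_bad_user_pointer is a hypothetical helper name.)
 */
int errno_for_bad_user_pointer(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Map one region with no access rights, so any read of it must fail. */
    void *bad = mmap(NULL, 4096, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bad == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    errno = 0;
    ssize_t n = write(fds[1], bad, 16);
    int observed = (n == -1) ? errno : 0;   /* save before cleanup changes errno */

    munmap(bad, 4096);
    close(fds[0]);
    close(fds[1]);
    return observed;
}
```

The same mechanism is what a custom system call relies on when it uses helpers such as `copy_from_user` or `strncpy_from_user` rather than dereferencing a user pointer directly.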
Conclusion
Linux system calls are the core mechanism for interaction between user space and kernel space. Understanding their working principles is crucial for system programming and kernel development. Through this article’s explanations and code examples, you should have a deeper understanding of the implementation and usage of system calls. In actual development, prefer using the encapsulated interfaces provided by the standard library, and only consider directly using system calls when necessary.