Hello everyone, I am Bug Jun~
Our project recently had strict real-time requirements, so I spent some time studying the options for achieving strong real-time performance on Linux. There are three main architectural approaches: Preemption Real-Time Systems, Xenomai Real-Time Systems, and AMP Dual Systems. I introduce each of them below so that colleagues can make an informed decision when using Linux in their projects.
1. Preemption Real-Time System (Based on PREEMPT_RT)

This method modifies the standard Linux kernel itself to improve its real-time capabilities. Historically this meant applying the PREEMPT_RT (Real-Time) patch set to the kernel source. Recent kernels (Linux 6.12 and later, on architectures such as x86, arm64, and RISC-V) have merged CONFIG_PREEMPT_RT into the mainline, so on those versions you no longer need to apply the patch yourself.
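As a quick sanity check before doing any real-time work, it can help to confirm that the running kernel was actually built with CONFIG_PREEMPT_RT. The sketch below is only one possible heuristic: it assumes the kernel exposes /sys/kernel/realtime (present on RT kernels I am aware of, but treat it as an assumption for your build) and falls back to looking for "PREEMPT_RT" or the older "PREEMPT RT" in the uname version string. The function names are made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Returns 1 if a kernel version string (the `uname -v` field) advertises
 * PREEMPT_RT. Older patched kernels print "PREEMPT RT", newer ones
 * print "PREEMPT_RT". */
int version_string_says_rt(const char *version)
{
    return strstr(version, "PREEMPT_RT") != NULL ||
           strstr(version, "PREEMPT RT") != NULL;
}

/* Returns 1 if the running kernel looks like a PREEMPT_RT kernel, else 0. */
int kernel_is_preempt_rt(void)
{
    /* Assumption: RT kernels expose this file and it contains "1". */
    FILE *f = fopen("/sys/kernel/realtime", "r");
    if (f) {
        int flag = 0;
        int ok = (fscanf(f, "%d", &flag) == 1 && flag == 1);
        fclose(f);
        if (ok)
            return 1;
    }

    /* Fallback: inspect the version string reported by uname(2). */
    struct utsname u;
    if (uname(&u) == 0)
        return version_string_says_rt(u.version);
    return 0;
}
```

An application could call kernel_is_preempt_rt() at startup and log a warning when it returns 0, since the latency figures discussed here only apply to an RT-enabled kernel.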
Main Implementations of the RT Patch:
Maximizing kernel preemptibility: Almost all kernel code paths, including system calls and the tail end of interrupt handling, can be preempted by a higher-priority real-time task. Long non-preemptible sections, of the kind caused by coarse-grained locks like the BKL (Big Kernel Lock), are eliminated.
Threaded interrupts: Most hardware interrupt handlers are converted into kernel threads (IRQ threads) that run under SCHED_FIFO with a configurable priority. A high-priority real-time task can preempt a running interrupt thread, so a long-running interrupt handler can no longer hog the CPU and block real-time tasks.
Spinlock transformation: Kernel spinlocks that may cause priority inversion are replaced with priority-inheritance mutexes (RT mutexes). When priority inversion occurs, for example when a low-priority task holds a lock that a high-priority task is waiting for, the low-priority task temporarily inherits the high-priority task's priority so that it can release the lock as quickly as possible.
High-resolution timers: These provide microsecond-level precision for timers and sleeps. The tick period can also be shortened to improve real-time response.
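The priority-inheritance idea described above is also available to user-space code through POSIX mutex attributes. The following is a minimal sketch of that user-space counterpart, not the kernel's internal RT mutex implementation; init_pi_mutex is a hypothetical helper name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>

/* Initialise a mutex that uses the priority-inheritance protocol.
 * While a low-priority thread holds it and a higher-priority thread
 * waits, the holder is boosted to the waiter's priority.
 * Returns 0 on success or an errno-style error code. */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

A mutex created this way is locked and unlocked with the usual pthread_mutex_lock and pthread_mutex_unlock calls. It is most useful for locks shared between SCHED_FIFO/SCHED_RR threads and lower-priority threads.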
With these changes in place, applications still run on what is essentially standard Linux and can use the normal Linux API (for example POSIX threads with SCHED_FIFO/SCHED_RR). Integration is very tight, community support is strong, and the code essentially follows the mainline version, which makes this approach simple and easy to use.
The drawback is that the theoretical worst-case latency is worse than with the other two solutions. Even so, latencies from around ten microseconds to a few hundred microseconds are generally achievable. I have stress-tested this solution and it performed well.
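To get a rough feel for the latency range mentioned above, a periodic loop in the style of the cyclictest tool can be written with nothing but the standard Linux API. This is a simplified sketch and not a replacement for a proper stress test: the function names, the period, and the priority are illustrative choices, and both sched_setscheduler and mlockall are treated as best effort because they normally need root or suitable rlimits.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute timespec by ns nanoseconds, keeping tv_nsec normalised. */
void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Sleep until `cycles` consecutive absolute deadlines spaced `period_ns`
 * apart and return the worst observed wake-up lateness in nanoseconds,
 * or -1 on error. */
int64_t measure_max_wakeup_latency_ns(long period_ns, int cycles, int rt_priority)
{
    struct sched_param sp = { .sched_priority = rt_priority };

    /* Best effort: without privileges the loop simply runs as a normal task. */
    (void)sched_setscheduler(0, SCHED_FIFO, &sp);
    /* Best effort: lock memory so page faults do not distort the result. */
    (void)mlockall(MCL_CURRENT | MCL_FUTURE);

    struct timespec next, now;
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    int64_t worst = 0;
    for (int i = 0; i < cycles; i++) {
        timespec_add_ns(&next, period_ns);
        /* Absolute deadlines avoid accumulating drift across cycles. */
        if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0)
            return -1;
        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        int64_t late = (int64_t)(now.tv_sec - next.tv_sec) * NSEC_PER_SEC
                     + (now.tv_nsec - next.tv_nsec);
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

For example, measure_max_wakeup_latency_ns(1000000L, 10000, 80) runs a 1 ms loop for about ten seconds. The number only means something when the system is under realistic load at the same time.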
2. Xenomai Real-Time System

This solution runs a lightweight hard real-time kernel (Nanokernel / Cobalt) alongside the Linux kernel. It ensures that the real-time kernel always executes first by giving it priority over Linux through an interrupt pipeline mechanism.
It mainly adopts a dual-kernel architecture:
Real-Time Domain: A microsecond-level hard real-time core that runs Xenomai real-time tasks. It has its own scheduler (usually based on priority/deadline).
Non-Real-Time Domain: The standard Linux kernel and all its user processes.
Critical hardware interrupts that require a hard real-time response are intercepted by the underlying interrupt pipeline and routed to the real-time kernel first. Interrupts are passed on to the Linux kernel only when the real-time kernel has nothing to do, and when real-time tasks block, the system quickly switches to the Linux non-real-time domain.
Because of the dual-kernel architecture, communication between the two kernels needs a dedicated mechanism, such as message or signal passing, so that real-time domain tasks can communicate with non-real-time Linux processes. The real-time kernel also has its own API, which is geared toward porting traditional RTOS code and developing applications in that style.
If extremely low, deterministic latency is required (for example below 10 us), or real-time tasks need stronger isolation from the Linux kernel, this solution is worth considering. However, patching, configuration, and maintenance can be quite challenging, and you also have to learn a separate set of APIs, so it is less straightforward than PREEMPT_RT.
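The interrupt routing described above can be pictured with a small toy model. This is purely illustrative C and does not use any Xenomai API: the struct, the function names, and the bitmask bookkeeping are all invented for the example. Interrupts owned by the real-time domain are handled at once, while interrupts meant for Linux are only logged if the real-time domain is busy and are replayed once it goes idle.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_IRQS 32u

/* Toy state of a two-stage interrupt pipeline (hypothetical model). */
struct pipeline {
    uint32_t rt_owned;       /* bitmask: IRQs claimed by the real-time domain */
    uint32_t linux_pending;  /* bitmask: IRQs logged for later delivery to Linux */
    bool     rt_busy;        /* true while a real-time task is runnable */
    unsigned rt_handled;     /* count of IRQs handled by the real-time domain */
    unsigned linux_handled;  /* count of IRQs handled by Linux */
};

/* Called for every hardware interrupt that arrives. */
void pipeline_irq(struct pipeline *p, unsigned irq)
{
    if (irq >= MAX_IRQS)
        return;                      /* out of range: ignored in this model */

    uint32_t bit = UINT32_C(1) << irq;
    if (p->rt_owned & bit) {
        p->rt_handled++;             /* real-time handler runs immediately */
    } else if (p->rt_busy) {
        p->linux_pending |= bit;     /* defer: Linux must not delay RT work */
    } else {
        p->linux_handled++;          /* RT domain idle: Linux handles it now */
    }
}

/* Called when the real-time domain has nothing left to run. */
void pipeline_rt_idle(struct pipeline *p)
{
    p->rt_busy = false;
    for (unsigned irq = 0; irq < MAX_IRQS; irq++) {
        uint32_t bit = UINT32_C(1) << irq;
        if (p->linux_pending & bit) {
            p->linux_pending &= ~bit;
            p->linux_handled++;      /* replay the deferred interrupt */
        }
    }
}
```

The point of the model is the ordering guarantee: no amount of Linux interrupt traffic can delay a real-time handler, because Linux only ever sees its interrupts after the real-time side has yielded.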
3. AMP Dual System (Linux + HAL)

Manufacturers are gradually introducing this solution on heterogeneous platforms such as the STM32MP157 and STM32MP257, TI AM6253, NXP's 9352, and RK3562. It uses Asymmetric Multi-Processing (AMP) hardware, usually a heterogeneous multi-core SoC, to run Linux and a dedicated hard real-time operating system (RTOS) or a bare-metal real-time HAL layer on physically isolated CPU cores, which allows for diverse applications. On SMP multi-core systems it is also common to isolate CPUs to run the HAL. Many module manufacturers do this, effectively combining an MPU and an MCU.
Compared with the software isolation of Xenomai, AMP achieves hardware isolation: the CPU cores typically interact only through a limited set of low-latency hardware communication channels (such as Mailbox, RPMSG, and shared memory).
This solution gives the real-time side exclusive use of its physical cores, which brings interference from Linux close to zero and enables ultra-low latency at the nanosecond-to-microsecond level. A crash, freeze, or attack on the Linux kernel typically does not affect real-time tasks running on independent cores. Moreover, the RIF (Resource Isolation Framework) of the STM32MP2 provides strong guarantees for secure resource allocation.
However, the development complexity of this solution is also significant. Communication latency and bandwidth between Linux and the real-time kernel may become system bottlenecks, and development and debugging require two toolchains and two development models (Linux vs. RTOS/bare-metal).
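On the SMP side, the usual building block for running real-time code on an isolated CPU is pinning a thread to that core. The sketch below shows only the pinning step and assumes the core has already been kept free of other work (for example with the isolcpus kernel boot parameter). The helper names are hypothetical, and choosing the first allowed CPU is just a placeholder for whichever core you actually reserve.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for sched_setaffinity and the CPU_* macros */
#endif
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set). */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Return the lowest-numbered CPU the calling thread may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

In a real deployment the real-time thread would call pin_to_cpu() with the reserved core number before entering its control loop, typically combined with a SCHED_FIFO priority.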
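Much of the communication cost comes down to how messages cross between the two sides. A common pattern for a shared-memory channel is a single-producer, single-consumer ring buffer, sketched below with C11 atomics. This is not the RPMsg API, and the names and the slot count are assumptions made for the example. On real AMP hardware the buffer would live in a shared, suitably cached or uncached region, and a mailbox interrupt would typically notify the other core. Both are left out here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u   /* must be a power of two */

/* One direction of a channel: exactly one producer and one consumer. */
struct ring {
    _Atomic uint32_t head;          /* next slot to write; producer only */
    _Atomic uint32_t tail;          /* next slot to read; consumer only */
    uint32_t slot[RING_SLOTS];
};

/* Producer side. Returns false if the ring is full. */
bool ring_push(struct ring *r, uint32_t msg)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return false;
    r->slot[head % RING_SLOTS] = msg;
    /* Release: the payload must be visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false if the ring is empty. */
bool ring_pop(struct ring *r, uint32_t *msg)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;
    *msg = r->slot[tail % RING_SLOTS];
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Because neither side ever takes a lock, the real-time core cannot be blocked by Linux holding one. Throughput is then bounded by the slot count and by how quickly each side is notified.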
4. How to Choose?
- Need to balance performance and compatibility, with real-time requirements in the range of tens to hundreds of microseconds? → PREEMPT_RT.
- Need to stay close to Linux but with stricter hard real-time requirements, microsecond-level determinism, or traditional RTOS code to port? → Xenomai.
- Need the highest level of real-time performance, isolation, or security, with latency at the nanosecond/microsecond level, hardware-level fault isolation, or specific AMP hardware? → AMP (Linux + RTOS/HAL).
Finally
That’s all for today. If you found this helpful, please remember to give a like~
