Choosing the Right Real-Time Operating System (RTOS)

(1) Introduction

Many friends and colleagues have asked me how to choose an RTOS in practice. It is a difficult, complex question with no simple answer. In practice, I see at least three situations:

1. An RTOS is simply not needed, but the system designer is an RTOS fan and forces one into the design.
2. An RTOS is needed, but for various reasons none is used.
3. Worst of all, the wrong RTOS is chosen for development, which can be fatal for the development team…

(2) Is an RTOS Necessary?

Before making a choice, ask yourself the following questions:

1. Does the system have a response-latency requirement for certain events, with a limit measured in microseconds?
2. Does the system have a time limit for processing certain events, one that is close to the time the CPU needs to process the event at full speed (a difference of no more than milliseconds)?
3. Is the code for handling these events complex (on average, no more than 100 lines of standard C code per event, with no function calls)? Are there more than 5 such events?
4. Does the system have RAM and ROM limitations that keep most operating systems, such as Linux, uClinux, or WinCE, from running properly?
5. Is the system of a certain scale, exceeding 20,000 lines of standard C/C++ code? Are there multiple logical transactions in the system that require synchronization or data exchange?
6. Does the product or system have a long lifecycle, with requirements for later upgrades and further development?
7. Does the team understand the chosen RTOS? Does it include experts in RTOS implementation?

If you answer yes to more than 2 of these questions, take note: you may well need an RTOS for your system development. If you answer yes to more than 4, you must use an RTOS.
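The rule of thumb above can be written down as a small helper. This is only a sketch: the function name and the result strings are made up for illustration, and the wording for the "2 or fewer" case is an assumption, since the text only states the two thresholds.

```python
def rtos_recommendation(answers):
    """Map seven yes/no answers (one per question above) to the rule of thumb."""
    if len(answers) != 7:
        raise ValueError("expected one answer per question (7 in total)")
    yes_count = sum(1 for answer in answers if answer)
    if yes_count > 4:
        return "must use an RTOS"
    if yes_count > 2:
        return "may well need an RTOS"
    # Assumption: the text gives no verdict for 2 or fewer "yes" answers.
    return "no clear need indicated"


example = [True, False, True, False, True, False, False]  # three "yes" answers
print(rtos_recommendation(example))
```

With three positive answers, the example falls into the "may well need an RTOS" band.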

(3) How to Choose an RTOS?

Once you decide to use an RTOS, the next question is which one to choose. There are many RTOSs on the market, of all kinds. When choosing, you may need to weigh the following factors:

1. Cost
2. Reliability
3. Real-time performance
4. Toolchain
5. Available modules
6. Kernel RAM and ROM usage
7. Support

Cost mainly includes RTOS licensing fees and learning costs. The difference can be significant. Commercial operating systems such as VxWorks, QNX, Lynx, and uC/OS are expensive, but once you pay, the vendor will certainly teach you how to get started. Many others, such as FreeRTOS, RTEMS, eCos, and RT-Thread, cost almost nothing for commercial use and raise no copyright issues. Leaving the commercial RTOSs aside, the cost of the free, open-source ones is mainly the learning cost. RTEMS, for example, is not easy to learn: little information is available and its complexity is high. FreeRTOS, on the other hand, is compact and widely studied, its code is not complex, and its learning curve is gentle.

Reliability is built over time. The market has its share of latecomers, such as RT-Thread, which seems somewhat immature next to a veteran like RTEMS. This does not mean we should choose RTEMS for everything; otherwise, how could RT-Thread ever develop? For small projects, it is worth a try. For large projects, it is better to be cautious in order to reduce technical risk.

Real-time performance is the hallmark of an RTOS. When I first learned about RTOSs, I loved looking at the comparison charts made by experts: context switch times, interrupt response latencies… I always favored whichever had the smallest numbers. Later I learned that a few comparison charts cannot settle the issue. These issues will be discussed in detail later.

The toolchain often determines development efficiency and the quality of the delivered product. Some RTOSs give you no choice of toolchain, or give you one only at a high cost. RTEMS, for example, uses the GNU toolchain, which is not easy to use; I once tried to port RTEMS to IAR EWARM and gave up halfway because the effort exceeded my capacity. Small RTOSs like FreeRTOS and uC/OS, however, build with basically any C compiler, as long as it can compile reentrant code.

Available modules: does the RTOS come with a TCP/IP stack, file systems, a CAN protocol stack, a graphical interface, and so on? These are not a must; simple products may not need them. For complex systems, integrated modules can save a great deal of development time. You can also port the modules yourself, but this may raise practical problems that are hard to solve: a module may not fit the design philosophy of the RTOS and so damage overall real-time performance, or the libraries it uses may conflict with the libraries used by the RTOS…

Kernel RAM and ROM usage comes down to how customizable the RTOS is. Not every kernel can be trimmed enough to meet the requirements: each RTOS has a minimum RAM and ROM footprint that leaves only the basic services, and every added feature increases resource usage. Consult the relevant documentation to determine whether your system’s resources are sufficient to run that RTOS smoothly.

Support: with a commercial system there is little to worry about. You have paid, so the vendor will make sure the implementation goes smoothly. With an open-source system, the development team must include capable RTOS experts. RTOSs are broadly similar, so learning another one is usually quick, but not always. If you understand an RTOS as complex as RTEMS, moving to FreeRTOS, uC/OS, or RT-Thread will naturally be no problem; moving to QNX, VxWorks, or Lynx will still take some effort. Many problems come up when developing with an RTOS, such as stack estimation, task priority design, memory design, and real-time design, and all of them are quite challenging. It is best if the team has the relevant RTOS experts. For learning purposes their absence doesn’t matter, but for product and system development it is a big problem.

(4) Analysis and Comparison of Several Embedded RTOSs

1) Introduction to the Operating Systems

1.1 VxWorks

VxWorks is a product of Wind River in the United States. It is widely used in embedded systems and has a relatively high market share. The VxWorks real-time operating system consists of more than 400 relatively independent, compact object modules, so users can select the modules they need to tailor and configure the system. It provides priority-based task scheduling, inter-task synchronization and communication, interrupt handling, timers, and memory management; it has built-in memory management that complies with the POSIX (Portable Operating System Interface) specification, as well as multiprocessor control programs. It also has a user-friendly interface, and its core can be scaled down to as little as 8 KB.

1.2 μC/OS-II

μC/OS-II is a compact, preemptive, multitasking real-time kernel written in C by embedded systems expert Jean J. Labrosse. It can manage 64 tasks and provides task scheduling and management, memory management, inter-task synchronization and communication, time management, and interrupt services. It is characterized by high execution efficiency, a small footprint, excellent real-time performance, and strong scalability.

1.3 μClinux

μClinux is an excellent embedded version of Linux; its name stands for “micro-controller Linux”. Compared to standard Linux, μClinux has a very small kernel, but it still inherits the main features of the Linux operating system, including good stability and portability, powerful networking capabilities, excellent file system support, a rich set of APIs, and the TCP/IP network protocols. Because it targets processors without an MMU (Memory Management Unit), implementing multitasking requires certain techniques.

1.4 eCos

eCos (embedded Configurable Operating System) is an open-source, configurable, portable real-time operating system aimed at deeply embedded applications. Its main feature is flexible configuration: it has a modular design whose core consists of small components, including the kernel, the C library, and the underlying runtime packages. Each component offers a large number of configuration options (even the real-time kernel is optional), and the configuration tools provided by eCos make it easy to tailor the system to the requirements of different embedded applications.

1.5 FreeRTOS

My earlier impression of FreeRTOS was good because it is free, but after a closer look at the official website recently, I found that it uses a modified GPLv2 license. Commercial use is indeed free, but you must tell users that your product uses FreeRTOS, and you must provide the source code if users request it.

If you don’t want to disclose which system you are using and don’t want to provide source code, you have to pay to switch to the OpenRTOS license.

There is a further tier as well. If you want more features, a more complete protocol stack, and more complete safety assurance, you pay more for SafeRTOS. Even viewing the API documentation costs money, and other modules (FS, TCP) cost money too; otherwise, you can spend some effort porting them yourself. In addition, the functionality of FreeRTOS is relatively simple: it supports only queues, semaphores, and mutexes. The paid SafeRTOS is presumably good, but you cannot see it without paying (some Luminary CM3 parts ship from the factory with SafeRTOS built in, so its APIs can be used directly, which is nice).

Minimum system: 6 KB ROM, 2 KB RAM

/* Supplementary content */

FreeRTOS and OpenRTOS share the same source code; OpenRTOS is simply FreeRTOS with a ‘commercial and legal wrapper’ added. Users upgrade from FreeRTOS to OpenRTOS mainly for two reasons: (1) to remove the restrictions of FreeRTOS’s modified GPL license; (2) to obtain additional services, such as professional technical support, high-quality middleware, training, consulting, and corresponding tools.

Developers may find the modified GPL license of FreeRTOS restrictive for several reasons: (1) their company may have a blanket ban on GPL-licensed software in its projects; (2) they may require IP indemnification; (3) they may prefer not to acknowledge their use of FreeRTOS in their products, as the FreeRTOS license requires. An OpenRTOS license removes the restrictions of the modified GPL, provides IP protection, and allows developers to remain anonymous.

FreeRTOS and SafeRTOS

SafeRTOS is also based on FreeRTOS, but unlike FreeRTOS it has been redesigned by safety experts. SafeRTOS was initially certified in 2007 by TÜV SÜD to IEC 61508-3 SIL 3, the highest level possible for a software-only component. Today SafeRTOS has grown into a leading safety-critical RTOS solution supporting a wide range of international design safety standards, including:

Industrial: IEC 61508 (2010)
Railway: EN 50128
Medical: IEC 62304 / FDA 510(k)
Nuclear: IEC 61513, IEC 62138, ASME NQA-1 2008
Process: IEC 61511
Automotive: ISO 26262
Aerospace: DO-178B

2) Performance Analysis and Comparison

Task management, synchronization and communication mechanisms between tasks and interrupts, memory management, interrupt management, file systems, hardware support, and system portability are the main aspects by which real-time operating systems are evaluated. The following analyzes and compares four of the operating systems above (VxWorks, μC/OS-II, μClinux, and eCos) on these aspects.

2.1 Task Management

Task management is the core and soul of embedded real-time operating systems, determining the real-time performance of the operating system. It usually includes priority settings, multitasking scheduling mechanisms, and time determinism.

Priority Settings

Embedded operating systems support multitasking, and each task has a priority; the more important the task, the higher its priority. Priorities are either static or dynamic. Static priority means that each task is assigned a priority before it runs and the scheduler does not change it during system operation, although a task’s priority may still be changed explicitly through system calls. Dynamic priority means that each task’s priority (especially that of application programs) changes during system operation as determined by the scheduling algorithm, rather than being changed manually through system calls.

Multitasking Scheduling Mechanism

Task scheduling mainly coordinates the competition for CPU resources among tasks. For embedded systems that are very resource-constrained, task scheduling is particularly important; it directly affects the system’s real-time performance. Generally, multitasking scheduling mechanisms are divided into priority-based preemptive scheduling and time-slice round-robin scheduling.

Priority-Based Preemptive Scheduling (PBP): In the system, each task has a priority, and the kernel always allocates the CPU to the task with the highest priority that is in the ready state. If the system finds a higher priority task in the ready queue than the currently running task, it places the currently running task in the ready queue and switches to the high-priority task. The system adopts priority preemptive scheduling to ensure that important burst events can be processed in a timely manner.

Time-Slice Round-Robin Scheduling (RR): Allows tasks with the same priority in the ready state to use the CPU in a time-slice manner to prevent any one task of the same priority from monopolizing the CPU for too long.

In general, embedded real-time operating systems adopt a scheduling mechanism that combines priority-based preemptive scheduling and time-slice round-robin scheduling.
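The combined policy can be illustrated with a tick-by-tick simulation. This is a minimal sketch, not any particular kernel’s scheduler: the function name `simulate`, the task tuple layout, and the convention that a larger number means a higher priority are all assumptions made for the example, and a preempted task simply gets a fresh time slice when it resumes.

```python
from collections import deque


def simulate(tasks, time_slice=2):
    """Simulate preemptive priority scheduling with round-robin among equals.

    tasks: list of (name, priority, arrival_tick, work_ticks); a larger
    priority number means a more important task (an assumption of this sketch).
    Returns the name of the task that ran in each tick.
    """
    pending = sorted(tasks, key=lambda t: t[2])  # not yet arrived
    ready = {}        # priority -> deque of [name, remaining_ticks]
    timeline = []
    current = None    # task that ran in the previous tick
    used = 0          # ticks the current task has used in its slice
    tick = 0
    while pending or any(ready.values()):
        # Admit tasks whose arrival time has come.
        while pending and pending[0][2] <= tick:
            name, prio, _, work = pending.pop(0)
            ready.setdefault(prio, deque()).append([name, work])
        runnable = [prio for prio, queue in ready.items() if queue]
        if not runnable:
            timeline.append("idle")
            tick += 1
            continue
        # Preemption: always run the head of the highest-priority queue.
        queue = ready[max(runnable)]
        task = queue[0]
        if task is not current:
            current, used = task, 0
        task[1] -= 1
        used += 1
        timeline.append(task[0])
        if task[1] == 0:
            queue.popleft()      # task finished
            current = None
        elif used == time_slice:
            queue.rotate(-1)     # slice expired: go to the back of the queue
            current = None
        tick += 1
    return timeline


demo = [("A", 1, 0, 4), ("B", 1, 0, 3), ("C", 2, 1, 2)]
print(simulate(demo))
# ['A', 'C', 'C', 'A', 'A', 'B', 'B', 'A', 'B']
```

In the demo, C arrives at tick 1 with a higher priority and preempts A immediately; once C finishes, A and B, which share a priority, alternate in two-tick slices.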

Time Determinism

The execution time of function calls and services in an embedded real-time operating system should be deterministic: the execution time of a system service does not depend on the number of application tasks. Because of this, the time needed to complete a specific task in the system is predictable. Table 1 lists the scheduling mechanisms of the four operating systems.

All four embedded real-time operating systems support multitasking, but they differ in the number of tasks supported and in their scheduling mechanisms. VxWorks has efficient task management: it supports multitasking with 256 priority levels, offers both priority-based preemptive scheduling and time-slice round-robin scheduling, and has the best real-time performance. The μC/OS-II kernel is designed for real-time systems and supports only fixed-priority preemptive scheduling; its scheduling method is simple and can meet fairly demanding real-time requirements. μClinux inherits the multitasking implementation of standard Linux, which distinguishes real-time processes from ordinary processes and uses first-come-first-served and time-slice round-robin scheduling; it is adapted only for mid-to-low-end embedded CPUs and does not support kernel preemption. eCos offers a choice of scheduling methods, providing two priority-based schedulers (a bitmap scheduler and a multi-level queue scheduler) from which the user selects one at configuration time, which makes it adaptable.

2.2 Synchronization and Communication Mechanisms Between Tasks and Interrupts

The functions of a real-time system are generally carried out by several tasks and interrupt service routines. Tasks must coordinate with each other and with interrupts, which raises synchronization and communication issues. Embedded real-time operating systems typically implement synchronization through semaphores, mutexes, event flags, and asynchronous signals, and provide communication through mailboxes, message queues, pipes, and shared memory. The use of mutexes often leads to priority inversion, a common problem in real-time operating systems. Priority inversion is a form of unpredictable delay: when a high-priority task tries to access a shared resource held by a low-priority task, it must wait for the low-priority task to release the resource; if the low-priority task is in turn preempted by one or more medium-priority tasks, the high-priority task is delayed even longer, and real-time performance becomes hard to guarantee. Measures should therefore be taken to limit priority inversion. Real-time systems generally use priority inheritance and priority ceiling mechanisms for this.

Priority inheritance means that a task holding a mutex is elevated to the same priority as the highest priority task waiting for that mutex; priority ceiling means that the task obtaining the mutex will elevate its priority to a predetermined value. Table 2 compares the synchronization and communication mechanisms of the four operating systems.
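The two mechanisms can be compared with a small calculation. This is an illustrative sketch rather than a real kernel API: the function names and the numeric priorities (larger means higher) are assumptions chosen for the example.

```python
def inherited_priority(holder_base, waiter_priorities):
    """Priority inheritance: the holder runs at the highest priority
    among itself and the tasks waiting for its mutex."""
    return max([holder_base, *waiter_priorities])


def ceiling_priority(holder_base, ceiling):
    """Priority ceiling: the holder is raised to the mutex's
    predetermined ceiling for as long as it holds the mutex."""
    return max(holder_base, ceiling)


def preempts(ready_priority, running_priority):
    """A ready task preempts the running one only if it is strictly higher."""
    return ready_priority > running_priority


LOW, MEDIUM, HIGH = 1, 5, 10  # hypothetical priorities, larger = higher

# LOW holds the mutex and HIGH is waiting for it; MEDIUM becomes ready.
no_protocol = preempts(MEDIUM, LOW)
with_inheritance = preempts(MEDIUM, inherited_priority(LOW, [HIGH]))
with_ceiling = preempts(MEDIUM, ceiling_priority(LOW, ceiling=HIGH))
print(no_protocol, with_inheritance, with_ceiling)  # True False False
```

Without either mechanism the medium-priority task preempts the mutex holder and extends the high-priority task’s wait; with either one, the holder cannot be preempted by the medium-priority task until it releases the mutex.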

All four systems have flexible inter-task synchronization and communication mechanisms, and all support semaphores and message queues. VxWorks and μClinux, however, do not support mailboxes or event flags. With the exception of μClinux and the bitmap scheduler in eCos, the systems provide measures to suppress priority inversion.

2.3 Memory Management

Memory management mainly includes: memory allocation principles, storage protection, and memory allocation methods.

Memory Allocation Principles

Memory allocation principles include speed, reliability, and efficiency. Speed requires the allocation process to be as fast as possible, so simple, fast allocation algorithms are generally used. Reliability means that memory allocation requests must be satisfied. Efficiency matters not only because of system cost but also because the memory that can be fitted to the system is limited, so waste should be avoided as much as possible. Embedded systems usually plan their memory allocation schemes around specific needs in order to avoid memory fragmentation.

Storage Protection

In many systems, both system programs and user programs reside in memory at the same time; to ensure that both run correctly without interfering with each other, the programs and data in memory must be protected. Storage protection usually requires hardware support, and many systems use an MMU combined with software. Because of cost constraints in embedded systems, however, the kernel and user programs usually share the same memory space. Whether storage protection is available therefore depends, first, on whether the CPU provides an MMU and distinct privilege levels: the ARM7TDMI core, for example, has no MMU, and most DSPs have neither an MMU nor privilege levels. Second, it depends on whether the operating system supports it in software: uC/OS, eCos, and others do not support virtual memory management, and VxWorks varies by version, with versions below 6.0 not supporting the MMU.

Memory Allocation Methods

Memory allocation methods can be divided into static allocation and dynamic allocation. Static allocation refers to allocating memory to the corresponding program before running it, and it does not allow further requests or movements in memory during program execution; dynamic allocation allows memory allocation throughout the program’s execution. Static allocation makes the system lose flexibility, but it is necessary for systems with high real-time requirements; usually, these systems have limited memory, and users’ global data are carefully planned, with only the kernel itself using some dynamic memory. Dynamic allocation gives system designers more autonomy, allowing them to flexibly adjust the system’s functions.

VxWorks uses a flat memory model, and code can be linked statically or dynamically. VxWorks provides users with two types of memory areas: Region and Partition. A Region is a variable-length memory area from which users allocate Segments; it is prone to fragmentation but is flexible and does not waste memory. A Partition is a fixed-length memory area from which users allocate Buffers; it does not fragment and is efficient, but it is prone to waste. VxWorks uses the first-fit algorithm for memory allocation.
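First-fit itself is easy to sketch: scan the free list in address order and take the first block that is large enough. The code below is a generic illustration, not VxWorks code; the `(start, length)` free-list representation and the function name are assumptions.

```python
def first_fit(free_list, size):
    """Allocate `size` units from the first free block that is large enough.

    free_list: list of (start, length) tuples sorted by start address.
    Returns (start, new_free_list), or (None, free_list) if nothing fits.
    """
    for i, (start, length) in enumerate(free_list):
        if length >= size:
            # Keep whatever is left of the block as a smaller free block.
            remainder = [(start + size, length - size)] if length > size else []
            return start, free_list[:i] + remainder + free_list[i + 1:]
    return None, free_list


free = [(0, 16), (32, 64), (128, 32)]
addr, free_after = first_fit(free, 24)
print(addr, free_after)  # 32 [(0, 16), (56, 40), (128, 32)]
```

The 16-unit block at address 0 is skipped because it is too small, and the leftover 40 units of the second block stay on the free list. Repeated splits like this are how variable-length allocation ends up fragmented.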

μC/OS-II manages large blocks of contiguous memory by dividing them into partitions. Each partition contains an integer number of equally sized memory blocks, but different partitions can use different block sizes. To allocate memory dynamically, the user picks a suitable partition and allocates whole blocks from it; when a block is released, it is returned to the partition it came from. This eliminates the fragmentation caused by repeated dynamic allocation and release.
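A fixed-size block partition of this kind can be modeled in a few lines. This is a sketch of the idea only, not the μC/OS-II API: the class and method names are invented, and blocks are represented by their index within the partition.

```python
class Partition:
    """A pool of equally sized memory blocks (hypothetical model)."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        self.block_count = block_count
        self.free_blocks = list(range(block_count))  # indices of free blocks

    def get(self):
        """Take one free block, or return None if the partition is exhausted."""
        return self.free_blocks.pop() if self.free_blocks else None

    def put(self, block):
        """Return a block to the partition it was taken from."""
        if not 0 <= block < self.block_count or block in self.free_blocks:
            raise ValueError("block is not from this partition or is already free")
        self.free_blocks.append(block)


small = Partition(block_size=32, block_count=2)
first = small.get()
second = small.get()
exhausted = small.get()   # None: both blocks are in use
small.put(first)
reused = small.get()      # the released block is handed out again
print(first, second, exhausted, reused)  # 1 0 None 1
```

Because every block in a partition has the same size, any released block can satisfy the next request, so allocation and release take constant time and never fragment the partition. The price is wasted space when a request is smaller than the block size.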

μClinux is designed for processors without an MMU, so it cannot use the processor’s virtual memory management and must manage real memory directly. The system uses paged memory allocation and divides the physical memory into pages at startup. Memory is accessed directly, the operating system does not protect memory spaces, and multiple processes can share one address space, so even a non-privileged process that dereferences an invalid pointer can trigger an address error and crash the program or even the whole system.

eCos does not use segmentation or paging for memory allocation but employs a dynamic memory allocation mechanism based on memory pools. Two types of memory pools are used to implement two memory management methods: one is a variable-length memory pool; the other is a fixed-length memory pool, similar to VxWorks’ management scheme. Table 3 compares the memory management of the four operating systems.

2.4 Interrupt Management

Interrupt management is a very important part of a real-time system, because the system often interacts with external events through interrupts. The main considerations are whether interrupt nesting is supported, how interrupts are handled, and interrupt latency.

Interrupt Management in VxWorks

VxWorks handles interrupts on a stack separate from that of ordinary tasks, so an interrupt only causes a few critical registers to be saved and does not trigger a task context switch, which greatly shortens interrupt latency. In addition, VxWorks interrupt handlers generally just signal the occurrence of the interrupt as quickly as possible and leave non-real-time processing to a task triggered by the interrupt, which also shortens interrupt handling time and reduces latency. All interrupt handlers share the same interrupt stack, so enough interrupt stack space must be allocated to cover the worst case of nested interrupts.

Interrupt Management in μC/OS-II

μC/OS-II’s interrupt handling is relatively simple. Only one interrupt service routine (ISR) can be attached to an interrupt vector, and the user code must complete within the ISR. The more work an ISR has to do, the longer the interrupt latency. Whether interrupt nesting is supported depends on the specific port; on ARM processors, for example, nesting requires switching between processor modes, which is relatively complicated to implement.

Interrupt Management in μClinux

μClinux divides interrupt handling into two parts: the top half and the bottom half. The top half runs with interrupts disabled and performs only the small amount of necessary, fast processing, leaving everything else to the bottom half. The bottom half performs the complex, time-consuming processing and runs with interrupts enabled. Bottom-half soft interrupts are typically executed on return from the hardware interrupt; if too many soft interrupts are pending, they are handed to a dedicated kernel task and the interrupt returns, so that long interrupt run times do not affect other processes. As a result, top halves do not nest, and all bottom-half handlers execute in sequence. All interrupt handlers share the system stack.

Interrupt Management in eCos

eCos uses a layered interrupt handling mechanism that divides interrupt handling into traditional ISRs and deferred service routines (DSRs). As in μClinux, DSRs run with interrupts enabled, so higher-priority interrupts can be serviced while a lower-priority interrupt is being processed. To keep interrupt latency short, ISRs should run quickly. If an interrupt requires little servicing, the ISR can handle it alone; if the servicing is complex, the ISR only masks the interrupt source and hands the rest to a DSR.
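The split between a short ISR and deferred work can be modeled as follows. This is a conceptual sketch, not the eCos API: the class, its methods, and the source names are hypothetical, and "masking" is reduced to membership in a set.

```python
from collections import deque


class InterruptModel:
    """Toy model of two-level interrupt handling (ISR plus deferred routine)."""

    def __init__(self):
        self.pending_dsrs = deque()  # deferred work, in arrival order
        self.masked = set()          # sources masked until their DSR has run
        self.log = []

    def isr(self, source, needs_deferred_work):
        """Short handler: do the minimum, optionally defer the rest."""
        self.log.append(f"ISR {source}")
        if needs_deferred_work:
            self.masked.add(source)
            self.pending_dsrs.append(source)

    def run_dsrs(self):
        """Run the deferred work later, when interrupts are enabled again."""
        while self.pending_dsrs:
            source = self.pending_dsrs.popleft()
            self.log.append(f"DSR {source}")
            self.masked.discard(source)  # unmask once the work is done


model = InterruptModel()
model.isr("uart", needs_deferred_work=True)    # complex servicing: defer it
model.isr("timer", needs_deferred_work=False)  # simple servicing: ISR only
model.run_dsrs()
print(model.log)  # ['ISR uart', 'ISR timer', 'DSR uart']
```

The log shows the point of the design: the timer interrupt is serviced before the lengthy UART processing runs, because that processing was pushed out of the ISR.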

2.5 File System

A “file system” is the part of the operating system responsible for storing and managing file information; that is, it handles creating, deleting, organizing, reading, writing, modifying, and copying files, and manages the other resources needed for file management.

The VxWorks operating system uses a standard I/O interface between the file system and device drivers, and supports MS-DOS, RT-11, RFS, CD-ROM, RAW, and other file systems. In this way, multiple identical or different types of file systems can run within a single VxWorks operating system.

μC/OS-II is aimed at small to medium-sized embedded systems, and even with all features included, the compiled kernel is less than 10 KB, so the system itself does not provide support for file systems. However, μC/OS-II has good extensibility, and if needed, file system content can be added by the user.

μClinux inherits the excellent file system support of Linux. It uses the VFS mechanism and supports ROMFS, NFS, ext2, MS-DOS, JFFS, and other file systems. It generally uses ROMFS, which occupies less space than general-purpose file systems such as ext2. ROMFS is read-only, however, so data that must be saved at run time has to be stored on a virtual RAM disk or in JFFS.

The eCos operating system has very strong configurability, and users can add the required file system themselves.

2.6 Hardware Support

VxWorks, μC/OS-II, μClinux, and eCos all support most popular embedded CPUs. μC/OS-II supports CPUs from 8 bits to 32 bits, and VxWorks, μClinux, and eCos can be ported across different architectures, including 16-bit, 32-bit, and 64-bit.

Because μClinux inherits most of Linux’s features, it requires at least 512 KB of RAM and 1 MB of ROM/Flash. μC/OS-II and eCos are inherently small kernels: after trimming, their minimum code sizes are 2 KB and 10 KB, respectively, and their minimum data RAM requirements are 4 KB and 10 KB.

Overall, the hardware requirements of the four systems are relatively low and economical. Specific comparisons are listed in Table 4.

2.7 System Portability

The purpose of porting an embedded operating system is to make it run on a specific microprocessor or microcontroller. Among the four systems, VxWorks is a commercial operating system with many API functions and vendor technical support, which makes porting and secondary development relatively easy, but the cost of porting is high. The other three systems have structured designs that separate out the processor-dependent parts, which makes it possible to port them to new processors. Porting μC/OS-II is relatively simple; only the processor-dependent code has to be modified. μClinux is an adaptation of Linux for embedded systems, and its structure is more complex. Porting μClinux requires the target processor to meet the same conditions as for porting μC/OS-II and, in addition, to have sufficient external ROM and RAM. eCos’s portability is significantly better than that of μC/OS-II and μClinux. In eCos, each hardware platform has its own directory holding the hardware abstraction layer code and configuration information for that platform, whereas μClinux’s hardware abstraction layer code is spread across several directories and is more cumbersome to modify. Modifying eCos code is therefore relatively simple, and porting it is relatively easy.

2.8 Conclusion

These four embedded real-time operating systems are widely used in embedded systems but have their own characteristics. Based on the comparisons above, we summarize their applicable fields.

VxWorks is a Unix-like real-time operating system. It has built-in memory management compliant with the POSIX specification, multiprocessor control programs, and a user-friendly interface, and its core can be scaled down to as little as 8 KB. It consists of more than 400 relatively independent, compact object modules, so users can select the modules they need to tailor and configure the system, which helps ensure the system’s security and reliability. It is widely used in communications, military, aviation, aerospace, and other high-tech fields with extremely demanding real-time requirements, and it remains a standout in many critical applications. For example, Boeing in the United States has adopted this operating system in its latest 787 aircraft, and in space exploration VxWorks has always been a favorite of NASA.

μC/OS-II is a structurally simple, fully functional embedded operating system kernel with strong real-time performance. It is well suited for embedded developers and enthusiasts to learn from, and for teaching and research in universities. It is also very suitable for developing small embedded systems with limited RAM and ROM whose requirements are not especially stringent.

μClinux’s main feature is that it is designed for processors without an MMU while still drawing on the powerful resources of Linux. It is therefore suitable for developing low-cost, small-capacity products without stringent timing requirements, especially embedded devices closely tied to network applications, or PDA devices. For example, μClinux has been ported to Cisco’s 2500/3000/4000 series routers.

eCos’s main feature is flexible configuration and its focus on deeply embedded applications, making it suitable for some cost-sensitive embedded systems in commercial or industrial-grade applications, such as consumer electronics.
