For real-time operating system (RTOS) kernels, the speed of quickly finding the highest priority task from the task ready list is an important metric for measuring real-time performance.This article analyzes the highest priority task lookup algorithms of μCOS-II, μCOS-III, and FreeRTOS, providing reference for selecting RTOS for embedded development applications.First, the conclusions are given:1) μCOS-II is not limited by CPU hardware instructions, but since it mainly considers 8-bit microcontroller scenarios and uses a pure software algorithm to find the highest priority task, it is relatively slower compared to μCOS-III, which relies on the hardware CLZ instruction. For 32-bit CPUs, μCOS-II is clearly not suitable.2) When the number of task priorities is less than 32, μCOS-III and FreeRTOS (configured with configUSE_PORT_OPTIMISED_TASK_SELECTION == 1) have the same efficiency, both relying on the CPU hardware CLZ instruction.When the number of task priorities exceeds 32, μCOS-III’s lookup efficiency is higher than that of FreeRTOS.3) In commercial software, if the number of system task priorities does not exceed 32, FreeRTOS is clearly more cost-effective.1. uCOS-IIuCOS-II was released in 1998 to meet various requirements: 8-bit CPU, supporting a maximum of 255 tasks, with each priority corresponding to only one task (or in other words, tasks and priorities are one-to-one), and allowing priority changes during execution. It designed a seemingly complex algorithm that efficiently finds the “highest priority task in the ready state”.First, a priority group variable OSRdyGrp is defined, which divides 64 priorities into 8 groups, each with 8 priorities. If there is at least one task in group x that is in the Ready state, then OSRdyGrp.bitx=1; otherwise, OSRdyGrp.bitx=0.Next, a priority resolution table OSUnMapTbl[256] is defined, which can be seen as having 256 values for the 0-7 bits of OSRdyGrp. Each specific value can determine the highest priority group number, which obviously has 8 possibilities (the value of each element in the table is between 0-7).The highest priority task index is calculated as Group number × 8 + highest priority number within the group (between 0-7).
/*********************************************************************************************************** PRIORITY RESOLUTION TABLE** Note: Index into table is bit pattern to resolve highest priority* Indexed value corresponds to highest priority bit position (i.e. 0..7)**********************************************************************************************************/INT8U const OSUnMapTbl[256] = { 0, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x00 to 0x0F */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x10 to 0x1F */ 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x20 to 0x2F */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x30 to 0x3F */ 6, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x40 to 0x4F */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x50 to 0x5F */ 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x60 to 0x6F */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x70 to 0x7F */ 7, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x80 to 0x8F */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0x90 to 0x9F */ 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0xA0 to 0xAF */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0xB0 to 0xBF */ 6, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0xC0 to 0xCF */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0xD0 to 0xDF */ 5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, /* 0xE0 to 0xEF */ 4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0 /* 0xF0 to 0xFF */};static void OS_SchedNew (void){#if OS_LOWEST_PRIO <= 63u /* See if we support up to 64 tasks */ INT8U y; y = OSUnMapTbl[OSRdyGrp]; OSPrioHighRdy = (INT8U)((y << 3u) + OSUnMapTbl[OSRdyTbl[y]]);#else /* We support up to 256 tasks */ ......#endif}
When the number of tasks is less than 64, the lookup efficiency of uCOS-II is still relatively high. However, it requires 3 table lookup commands, 1 left shift, and 1 addition command.For detailed algorithm description, please refer to:https://blog.csdn.net/zhuyong006/article/details/807296162. UCOS-IIIuCOS-III was released in 2009. As mentioned in the introduction, uCOS-III is a completely new RTOS kernel, not an upgraded version of uCOS-II.It has discarded the methods of uCOS-II and directly relies on the CPU hardware CLZ instruction.The CLZ instruction returns the number of leading zeros before the first 1 in the binary encoding of the operand.If the operand is 0, the instruction returns 32; if the most significant bit of the operand is 1, the instruction returns 0.The pseudo code for the instruction operation is shown in the following code segment.
If Rm==0 Rd=32Else Rd=31-(bit position of most significant “1” in Rm)

In the currently popular 32-bit CPUs, unlike uCOS-II’s OSPrioHighRdy, each specific value represents a task index.uCOS-III directly defines a uint32 array OSPrioTb[OS_PRIO_TBL_SIZE], where each bit of each element represents the Ready state of a task. If we think of all bits of OSPrioTb as a two-dimensional bit table of OS_PRIO_TBL_SIZE*32, the bits closer to the top left represent higher task priorities.

Algorithm for finding the highest priority:In order of priority, every 32 is a group, starting from OSPrioTb[0]. If OSPrioTb[x] is non-zero, the index of the first non-zero bit is returned, plus the offset (32 × group number), which gives the index of the highest priority ready task.
/************************************************************************************************************************** GET HIGHEST PRIORITY TASK WAITING* Description: This function is called by other uC/OS-III services to determine the highest priority task* waiting on the event.* Arguments : none* Returns : The priority of the Highest Priority Task (HPT) waiting for the event* Note(s) : 1) This function is INTERNAL to uC/OS-III and your application MUST NOT call it.*************************************************************************************************************************/OS_PRIO OS_PrioGetHighest (void){ CPU_DATA *p_tbl; OS_PRIO prio; prio = (OS_PRIO)0; p_tbl = &OSPrioTbl[0]; while (*p_tbl == (CPU_DATA)0) { /* Search the bitmap table for the highest priority */ prio += DEF_INT_CPU_NBR_BITS; /* Compute the step of each CPU_DATA entry */ p_tbl++; } prio += (OS_PRIO)CPU_CntLeadZeros(*p_tbl); /* Find the position of the first bit set at the entry */ return (prio);}
From this algorithm, it can be seen that the efficiency of uCOS-III strictly depends on whether the CPU has the CPU_CntLeadZeros assembly instruction. If not, it would take 31 software shifts to locate the task with priority index mod 32 equal to 31, greatly reducing efficiency.Additionally, as the number of tasks increases (for example, over 256), the lowest priority task requires 8 iterations to define.Therefore, although theoretically uCOS-III has no upper limit on the number of task priorities it supports, it is generally recommended that the maximum application number does not exceed 256, and preferably not exceed 64.In fact, if a system engineer designs 256 tasks for their software, it indicates poor system design, and unnecessary tasks should be optimized.

3. Free-RTOSFreeRTOS, like μCOS-II and μCOS-III, is open source.However, unlike μCOS-II and μCOS-III, which require a license fee for commercial use, FreeRTOS is completely free, making it very attractive for commercial companies and widely used.The following is a global survey from 2017 showing the market share of various RTOS.

FreeRTOS has two scheduling algorithms:First, when configUSE_PORT_OPTIMISED_TASK_SELECTION == 0the maximum number of tasks is configMAX_PRIORITIES, and uxNumberOfItems is usually of long type (uint32), meaning it can define 2^32 task priorities.Its lookup algorithm, although simple, has a lookup count proportional to the task priority number.For tasks with 32 priorities, the lowest priority task requires 32 iterations to determine, which takes much longer than μCOS-II and μCOS-III.
PRIVILEGED_DATA static List_t pxReadyTasksLists[ configMAX_PRIORITIES ]; #define listLIST_IS_EMPTY( pxList ) ( ( ( pxList )->uxNumberOfItems == ( UBaseType_t ) 0 ) ? pdTRUE : pdFALSE )/*< Prioritised ready tasks. */ #define taskSELECT_HIGHEST_PRIORITY_TASK() \ { \ UBaseType_t uxTopPriority = uxTopReadyPriority; \ \ /* Find the highest priority queue that contains ready tasks. */ \ while( listLIST_IS_EMPTY( &( pxReadyTasksLists[ uxTopPriority ] ) ) ) \ { \ configASSERT( uxTopPriority ); \ --uxTopPriority; \ } \ \ /* listGET_OWNER_OF_NEXT_ENTRY indexes through the list, so the tasks of \ * the same priority get an equal share of the processor time. */ \ listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopPriority ] ) ); \ uxTopReadyPriority = uxTopPriority; \ } /* taskSELECT_HIGHEST_PRIORITY_TASK */
Second, when configUSE_PORT_OPTIMISED_TASK_SELECTION == 1,the maximum configMAX_PRIORITIES is limited to no more than 32, otherwise an error will occur.At the same time, the task priority algorithm is the same as μCOS-III, equivalent to defining only OSPrioTb[0], still requiring the use of the CLZ instruction.
#if configUSE_PORT_OPTIMISED_TASK_SELECTION == 1/* Check the configuration. */#if( configMAX_PRIORITIES > 32 )#error configUSE_PORT_OPTIMISED_TASK_SELECTION can only be set to 1 when configMAX_PRIORITIES is less than or equal to 32. It is very rare that a system requires more than 10 to 15 difference priorities as tasks that share a priority will time slice.#endif#endif /* configUSE_PORT_OPTIMISED_TASK_SELECTION */#define portGET_HIGHEST_PRIORITY( uxTopPriority, uxReadyPriorities ) uxTopPriority = ( 31UL - _clz( ( uxReadyPriorities ) ) ) #define taskSELECT_HIGHEST_PRIORITY_TASK() \ { \ UBaseType_t uxTopPriority; \ \ /* Find the highest priority list that contains ready tasks. */ \ portGET_HIGHEST_PRIORITY( uxTopPriority, uxTopReadyPriority ); \ configASSERT( listCURRENT_LIST_LENGTH( &( pxReadyTasksLists[ uxTopPriority ] ) ) > 0 ); \ listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopPriority ] ) ); \ } /* taskSELECT_HIGHEST_PRIORITY_TASK() */
Summary:1) μCOS-II is not limited by CPU hardware instructions, but since it mainly considers 8-bit microcontroller scenarios and uses a pure software algorithm to find the highest priority task, it is relatively slower compared to μCOS-III, which relies on the hardware CLZ instruction. For 32-bit CPUs, μCOS-II is clearly not suitable.2) When the number of task priorities is less than 32, μCOS-III and FreeRTOS (configured with configUSE_PORT_OPTIMISED_TASK_SELECTION == 1) have the same efficiency, both relying on the CPU hardware CLZ instruction.3) When the number of task priorities exceeds 32, μCOS-III’s lookup efficiency is higher than that of FreeRTOS.4) In commercial software, if the number of system task priorities does not exceed 32, FreeRTOS is clearly more cost-effective. Original link:https://blog.csdn.net/weixin_40137252/article/details/109514434