FreeRTOS Source Code Analysis: Task Startup and Switching

In the previous article, “Embedded Operating Systems: FreeRTOS Source Code Analysis Part Two – Task Creation,” we analyzed how tasks are created and added to the linked list. We now return to svc 0: once this instruction executes, the first task officially starts. This chapter also introduces how task switching is implemented. Before that, let’s review the role of the Cortex-M general-purpose registers and how they change during a FreeRTOS task switch.

General Purpose Register Role Table

| Register | Alias | Calling-convention role | Common uses | Saved by |
|---|---|---|---|---|
| R0 | - | Parameter / return value 1 | Holds the task function parameter pvParameters when a task first runs | Hardware |
| R1 | - | Parameter / return value 2 | Local variables, temporary results | Hardware |
| R2 | - | Parameter / return value 3 | Local variables, temporary results | Hardware |
| R3 | - | Parameter / return value 4 | Local variables, temporary results | Hardware |
| R4 | - | Preserved across calls | Long-lived variables; the compiler may keep global/static pointers here | Software |
| R5 | - | Preserved across calls | Long-lived variables | Software |
| R6 | - | Preserved across calls | Long-lived variables | Software |
| R7 | - | Preserved across calls | Sometimes used as the frame pointer (Thumb code) | Software |
| R8 | - | Preserved across calls | Long-lived variables | Software |
| R9 | - | Preserved across calls | Platform-specific; may be reserved as the static base (SB) | Software |
| R10 | - | Preserved across calls | Long-lived variables | Software |
| R11 | - | Preserved across calls | Often used as the frame pointer (FP) | Software |
| R12 | IP | Intra-procedure-call scratch | Used by linker glue code (veneers) | Hardware |
| R13 | SP | Stack pointer, banked as MSP/PSP | PSP while a task runs, MSP for exceptions | Software (the PSP value is stored in the TCB) |
| R14 | LR | Link / return | 1. Subroutine return address 2. Exception return code EXC_RETURN | Hardware |
| R15 | PC | Program counter | In Thumb state, reads as “current instruction + 4” | Hardware |

First Task Startup – vPortSVCHandler()

When svc 0 executes, the hardware does the following:

  1. Pushes R0-R3, R12, LR, PC, xPSR onto the MSP stack (no task has started yet, so Thread mode is still using MSP);
  2. Writes the exception return code 0xFFFFFFF9 (return to Thread mode using MSP, assuming no floating-point context is active) into LR;
  3. Fetches the SVC entry from the vector table and jumps to vPortSVCHandler.

__asm void vPortSVCHandler( void )
{
    PRESERVE8

    /* Get the location of the current TCB. */
    ldr r3, =pxCurrentTCB     /* r3 = &pxCurrentTCB (address of the variable) */
    ldr r1, [r3]              /* r1 = pxCurrentTCB (its value, i.e. the TCB address) */
    ldr r0, [r1]              /* r0 = first TCB member = pxTopOfStack */
    /* Pop the core registers. */
    ldmia r0!, {r4-r11, r14}  /* Pop R4-R11 and the EXC_RETURN value stored in the task's initial stack frame */
    msr psp, r0               /* r0 now points to the lowest address of the hardware frame */
    isb                       /* Instruction synchronization barrier */
    mov r0, #0                /* Prepare to clear the interrupt mask */
    msr    basepri, r0        /* BASEPRI = 0 → no interrupts masked */
    bx r14                    /* r14 = the task's EXC_RETURN → return to Thread mode using PSP */
}

This restores the registers from the current task's stack to the CPU, sets the PSP, unmasks interrupts, and then enters the task, never to return.

Task Switching – xPortPendSVHandler()

PendSV is configured as the lowest-priority exception, so it never preempts an IRQ; it runs only after the CPU has left all other interrupt handlers, which makes it a safe point for the context switch. The whole process follows the AAPCS (ARM Architecture Procedure Call Standard) rules for the general-purpose registers r0-r12, and is split into an automatic hardware save plus a manual software save.
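As a rough illustration of how PendSV is given the lowest priority and then requested, here is a minimal sketch. The register addresses and bit positions are those of the ARMv7-M System Control Block; the helper names are hypothetical and are not FreeRTOS functions. The registers are passed as pointers so the sketch can also be exercised on a host with ordinary variables.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7-M System Control Block registers. */
#define ICSR_ADDR       0xE000ED04u  /* Interrupt Control and State Register */
#define SHPR3_ADDR      0xE000ED20u  /* System Handler Priority Register 3   */
#define PENDSVSET_BIT   (1u << 28)   /* ICSR: writing 1 pends PendSV         */
#define PENDSV_PRI_POS  16u          /* SHPR3[23:16] holds PendSV priority   */

/* Hypothetical helper: give PendSV the numerically largest (lowest) priority. */
static void pendsv_set_lowest_priority(volatile uint32_t *shpr3)
{
    assert(shpr3 != 0);
    *shpr3 |= (0xFFu << PENDSV_PRI_POS);
}

/* Hypothetical helper: request a context switch by pending PendSV.
 * The handler runs once no higher-priority exception is active. */
static void pendsv_request(volatile uint32_t *icsr)
{
    assert(icsr != 0);
    *icsr = PENDSVSET_BIT;
}
```

On a real target the calls would look like pendsv_request((volatile uint32_t *)ICSR_ADDR); the port itself uses its own macros for this, so treat the helpers above only as a model.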

__asm void xPortPendSVHandler( void )
{
    extern uxCriticalNesting;  /* Not used in this section, can be ignored */
    extern pxCurrentTCB;       /* Current task TCB pointer variable */
    extern vTaskSwitchContext; /* Select the next ready task */

    PRESERVE8                  /* Tell the assembler to maintain 8-byte stack alignment */

    mrs r0, psp                /* r0 = current task PSP (lowest address of the hardware frame, i.e. the stacked R0) */
    isb                        /* Synchronize pipeline */
    /* Get the location of the current TCB. */
    ldr    r3, =pxCurrentTCB      /* r3 points to global variable pxCurrentTCB */
    ldr    r2, [r3]               /* r2 = pxCurrentTCB, i.e., current TCB address */

    /* EXC_RETURN bit 4 = 0 → an extended (floating-point) frame was stacked → the task has used the FPU → s16-s31 (callee-saved) must be saved too; r0 keeps moving down, ready for the unified save of r4-r11, r14. */
    /* Is the task using the FPU context?  If so, push high vfp registers. */
    tst r14, #0x10             /* Test EXC_RETURN bit 4 (0 = floating-point frame in use) */
    it eq                      /* EQ: bit 4 is clear, so the task used the FPU */
    vstmdbeq r0!, {s16-s31}    /* Push S16-S31 (address is decremented before each store) */

    /* Save the core registers. */
    stmdb r0!, {r4-r11, r14}   /* Manually save R4-R11 plus r14, which holds this task's EXC_RETURN */

    /* Save the new top of stack into the first member of the TCB. */
    str r0, [r2]               /* Write the new stack top back to TCB, the first member of TCB is pxTopOfStack */

    stmdb   sp!, {r0, r3}      /* Temporarily save r0, r3 to MSP (Handler stack)*/
    mov     r0, #configMAX_SYSCALL_INTERRUPT_PRIORITY
    msr     basepri, r0        /* Mask interrupts at or below the configMAX_SYSCALL_INTERRUPT_PRIORITY level */
    dsb
    isb
    bl      vTaskSwitchContext /* C function, updates pxCurrentTCB */
    mov     r0, #0
    msr     basepri, r0        /* Enable interrupts */
    ldmia   sp!, {r0, r3}      /* Restore temporaries */

    /* The first item in pxCurrentTCB is the task top of stack. */
    ldr     r1, [r3]          /* r1 = new TCB */
    ldr     r0, [r1]          /* r0 = new task's pxTopOfStack */

    /* Pop the core registers. */
    ldmia   r0!, {r4-r11, r14} /* Pop r4-r11 and new task's LR */

    /* Is the task using the FPU context?  If so, pop the high vfp registers too. */
    tst     r14, #0x10        /* Test bit 4 of the new task's EXC_RETURN */
    it      eq
    vldmiaeq r0!, {s16-s31}   /* Bit 4 clear → the task used the FPU, so pop S16-S31 */

    msr     psp, r0           /* New PSP points to "hardware frame" lowest address */
    isb
    #ifdef WORKAROUND_PMU_CM001 /* XMC4000 specific errata */
        #if WORKAROUND_PMU_CM001 == 1
            push { r14 }
            pop { pc }
            nop
        #endif
    #endif

    bx      r14               /* r14 = the new task's EXC_RETURN, reloaded from its stack above */
    /* At this point, hardware pops R0-R3, R12, LR, PC, xPSR from the new PSP → the task continues running */
}
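The two tst r14, #0x10 checks in the handler depend on how EXC_RETURN is encoded. The following is a minimal sketch of that decoding in C, using the ARMv7-M bit meanings; the function and macro names are made up for illustration and do not appear in FreeRTOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXC_RETURN_USE_PSP     (1u << 2) /* 1 = unstack from PSP, 0 = from MSP      */
#define EXC_RETURN_THREAD_MODE (1u << 3) /* 1 = return to Thread mode               */
#define EXC_RETURN_BASIC_FRAME (1u << 4) /* 1 = basic frame, 0 = floating-point one */

/* Sanity check on the value FreeRTOS places in a new task's stack. */
static_assert((0xFFFFFFFDu & EXC_RETURN_USE_PSP) != 0u, "0xFFFFFFFD selects PSP");

/* True when the hardware stacked an extended (FPU) frame, i.e. bit 4 is clear.
 * This is the condition under which the handler also saves s16-s31. */
static bool exc_return_has_fp_frame(uint32_t lr)
{
    return (lr & EXC_RETURN_BASIC_FRAME) == 0u;
}

static bool exc_return_uses_psp(uint32_t lr)
{
    return (lr & EXC_RETURN_USE_PSP) != 0u;
}

static bool exc_return_to_thread(uint32_t lr)
{
    return (lr & EXC_RETURN_THREAD_MODE) != 0u;
}
```

For example, 0xFFFFFFFD means "Thread mode, PSP, no FPU context", while 0xFFFFFFED is the same return with a floating-point frame on the stack.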

1. Trigger Moment

  • The CPU is running a task in Thread mode (using PSP).
  • PendSV becomes pending; once no higher-priority exception is active, the CPU enters Handler mode (using MSP) to service it.

2. Hardware Automatic Stack Push (8 words, 32 bytes)

| xPSR | ← Highest address (old PSP - 0x04)
|  PC  | ← Return address (where the task resumes)
|  LR  | ← The task's own LR (a normal return address, not EXC_RETURN)
| r12  |
| r3   |
| r2   |
| r1   |
| r0   | ← New PSP = old PSP - 0x20
  • Stack pointer: PSP (task stack)
  • Frame size: 32 bytes when no floating-point context is active (the hardware may add one padding word to keep the stack 8-byte aligned)
  • These registers are handled by hardware; the assembly code does not need to save them.
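The frame that the hardware pushes can be described as a C structure, lowest address first. This is only a sketch of the basic (no-FPU) frame; the type and function names are hypothetical, and the optional alignment padding word is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic exception frame as laid out in memory, lowest address first. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* the task's own link register      */
    uint32_t pc;    /* address at which the task resumes */
    uint32_t xpsr;
} HwFrame;

static_assert(sizeof(HwFrame) == 32, "basic frame is 8 words");
static_assert(offsetof(HwFrame, pc) == 0x18, "PC sits at offset 0x18");

/* PSP value after stacking, assuming no padding word was inserted. */
static uint32_t psp_after_stacking(uint32_t psp_before)
{
    return psp_before - (uint32_t)sizeof(HwFrame);
}
```

A debugger-style use would be to cast the PSP value to const HwFrame * and read frame->pc to see where the task was interrupted.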

3. Software Manual Stack Push (Remaining “Callee Saved” Registers)

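The PendSV handler then saves the remaining registers itself. The walk-through below is simplified: it leaves out the EXC_RETURN value and the FPU registers that the full listing also saves.

mrs   r0, psp            ; r0 = task stack pointer (points to the 8 words pushed by hardware)
stmdb r0!, {r4-r11}      ; Store r4-r11 below them; r0 moves down another 32 bytes

At this point, the saved context on the task stack is 16×4 = 64 bytes: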

+--------+
| xPSR   |  ← Hardware (highest address)
|  PC    |
|  LR    |
| r12    |
| r3     |
| r2     |
| r1     |
| r0     |  ← PSP (old PSP - 0x20)
+--------+
| r11    |  ← Software
| r10    |
| r9     |
| r8     |
| r7     |
| r6     |
| r5     |
| r4     |  ← r0 final value (PSP - 0x20), stored as pxTopOfStack
+--------+
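The ordering of the software-saved registers follows from how stmdb and ldmia work: the lowest-numbered register always ends up at the lowest address. Here is a small host-side model of the two instructions, with hypothetical function names and an ordinary array standing in for the task stack.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `stmdb r0!, {r4-r11}`: regs[0] is r4 ... regs[7] is r11.
 * Stores below p and returns the updated pointer (the new r0). */
static uint32_t *stmdb_r4_r11(uint32_t *p, const uint32_t regs[8])
{
    assert(p != 0);
    for (int i = 7; i >= 0; --i) {
        *--p = regs[i];          /* r11 is written first, r4 last (lowest) */
    }
    return p;
}

/* Model of `ldmia r0!, {r4-r11}`: reads upward and returns the new r0. */
static uint32_t *ldmia_r4_r11(uint32_t *p, uint32_t regs[8])
{
    assert(p != 0);
    for (int i = 0; i < 8; ++i) {
        regs[i] = *p++;          /* r4 is read first, from the lowest address */
    }
    return p;
}
```

Saving with the first function and restoring with the second returns the pointer to its starting value, which is why the handler can hand the result straight to msr psp.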

4. Save “Old Task” Stack Pointer

ldr   r1, =pxCurrentTCB
ldr   r2, [r1]      ; r2 = pxCurrentTCB
str   r0, [r2]      ; TCB->pxTopOfStack = r0 (new top of the saved context)

The final value of r0 is written into the current task’s TCB so that the context can be found again when the task is restored.
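The assembly can use a plain str r0, [r2] because pxTopOfStack is the first member of the TCB, so the TCB address and the address of that field are the same. A reduced sketch of the idea follows; MiniTCB and its helper functions are hypothetical and contain only the fields needed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the real TCB. */
typedef struct {
    uint32_t *pxTopOfStack;   /* must remain the first member    */
    uint32_t *pxStack;        /* lowest address of the stack     */
    char      pcTaskName[16];
} MiniTCB;

static_assert(offsetof(MiniTCB, pxTopOfStack) == 0,
              "the assembly assumes pxTopOfStack is at offset 0");

/* What `str r0, [r2]` does, with r2 = TCB address. */
static void save_top_of_stack(MiniTCB *tcb, uint32_t *r0)
{
    tcb->pxTopOfStack = r0;
}

/* What `ldr r0, [r1]` does, with r1 = TCB address. */
static uint32_t *load_top_of_stack(const MiniTCB *tcb)
{
    return tcb->pxTopOfStack;
}
```

If a field were ever inserted before pxTopOfStack, the static assertion would fail at compile time, which mirrors why the real structure keeps this member first.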

5. Select the Next Task

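bl    vTaskSwitchContext   ; C function; only modifies the global pxCurrentTCB

The handler runs on the MSP, and the task's r0-r3 and r12 are already saved on the task stack, so the C function can use them freely as arguments, return values, and scratch registers without corrupting the task context. The handler's own temporaries in those registers must be preserved around the call, which is why the full listing pushes r0 and r3 onto the MSP first.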

6. Restore “New Task” Stack Pointer

ldr   r1, =pxCurrentTCB
ldr   r2, [r1]
ldr   r0, [r2]      ; r0 = new task's TCB->pxTopOfStack

7. Software Pop r4-r11

ldmia r0!, {r4-r11}

8. Update PSP and Exit Exception

msr   psp, r0        ; PSP now points to the lowest address of the hardware-saved frame
orr   lr, lr, #0x04  ; Force EXC_RETURN bit 2 = 1: return using PSP (already set when PendSV interrupted a task)
bx    lr             ; Hardware automatically pops the remaining 8 words → new task runs

Register Role Comparison Table

| Register | Saved by | Saved where | Reason |
|---|---|---|---|
| r0-r3 | Hardware | Task stack | Caller-saved; may change across any call |
| r4-r11 | Software (PendSV) | Task stack | Callee-saved; must be preserved across function calls |
| r12 | Hardware | Task stack | Scratch register ip; caller-saved |
| LR (task) | Hardware | Task stack | The task's return address; inside the handler, LR holds EXC_RETURN instead |
| PC & xPSR | Hardware | Task stack | Resume address and flags after exception return |
| MSP | Hardware (banked register) | Not saved per task | Used by the kernel and handlers in Handler mode |
| PSP | Software | TCB (pxTopOfStack) | Thread-mode stack pointer; the core of task switching |

Key Points Summary

  1. Hardware saves only “half” of the context: r0-r3, r12, LR, PC, xPSR.
  2. The PendSV assembly saves the other half, r4-r11 (plus EXC_RETURN and, when the FPU was used, s16-s31 in the full listing).
  3. The resulting stack frame (64 bytes in the simplified case) is private to each task and is restored unchanged the next time the task runs.
  4. The handler runs on the MSP, and the task's r0-r3 are already stacked, so the handler can use them as temporaries without corrupting task data.
  5. On exit, PSP is updated and bit 2 of the EXC_RETURN value in LR is set, telling the CPU to return to Thread mode using PSP.

First Startup vs Subsequent Switching

First Startup (prvStartFirstTask → svc 0)

Fake Stack Frame

+-----------------------+
| xPSR                  |  (Thumb bit set)
| PC = pxCode           |  ← Task function entry
| LR = prvTaskExitError |  ← Task return trap
| R12, R3-R1            |
| R0 = pvParameters     |  ← r0 points here after the software pop (becomes PSP)
+-----------------------+

vPortSVCHandler pops only R4-R11 and the saved EXC_RETURN (the R4-R11 values are meaningless because the task has never run), points PSP at the frame above, and executes bx r14; the hardware then pops those 8 words, and the task function runs for the first time.
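To make the fake frame concrete, here is a host-side sketch of how such an initial stack could be built. It follows the layout used by the Cortex-M4F port's stack initialisation as I understand it (xPSR, PC, LR, R12, R3-R1, R0, then EXC_RETURN, then R11-R4), but the function name, the macro names, and the use of plain integers for addresses are simplifications for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define INITIAL_XPSR        0x01000000u  /* Thumb bit set                       */
#define INITIAL_EXC_RETURN  0xFFFFFFFDu  /* Thread mode, PSP, no FPU context    */
#define START_ADDRESS_MASK  0xFFFFFFFEu  /* clear bit 0 of the entry address    */

/* Build the initial frame below `top` and return the value that would be
 * stored in the TCB as pxTopOfStack. Addresses are passed as plain integers. */
static uint32_t *init_task_stack(uint32_t *top, uint32_t entry,
                                 uint32_t exit_trap, uint32_t param)
{
    assert(top != 0);
    *--top = INITIAL_XPSR;                /* xPSR                              */
    *--top = entry & START_ADDRESS_MASK;  /* PC = task function entry          */
    *--top = exit_trap;                   /* LR = trap if the task returns     */
    top -= 5;                             /* skip R12, R3, R2, R1              */
    *top = param;                         /* R0 = task parameter               */
    *--top = INITIAL_EXC_RETURN;          /* popped into r14 by the handler    */
    top -= 8;                             /* room for R11..R4 (values unused)  */
    return top;
}
```

After the handler pops nine words from the returned pointer (R4-R11 and EXC_RETURN), the pointer lands on the R0 slot, which is exactly the value written to PSP.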

Subsequent Switching (PendSV → vTaskSwitchContext → New Task)

New Task Stack

+------------------+
| xPSR             |  ← Task's program status when it was interrupted
| PC               |  ← Location where it was interrupted
| LR (Task)        |  ← Subfunction return address
| R12, R3-R1       |
| R0               |  ← r0 points here after the software pop (becomes PSP)
+------------------+

xPortPendSVHandler pops R4-R11 and the task's saved EXC_RETURN, updates PSP, and executes bx r14; the hardware then pops the 8 words, and the task continues running from where it was interrupted.

Decision – vTaskSwitchContext()

vTaskSwitchContext() is the core “decision maker” of FreeRTOS task switching and is called from the PendSV handler (in assembly). Its main job is to select the next task to run and point pxCurrentTCB to that task's TCB; the actual CPU register switching is done by the PendSV assembly.

void vTaskSwitchContext( void )
{
    if( uxSchedulerSuspended != ( UBaseType_t ) pdFALSE )
    {
        /* The scheduler is currently suspended - do not allow a context switch. */
        xYieldPending = pdTRUE;
    }
    else
    {
        xYieldPending = pdFALSE;
        traceTASK_SWITCHED_OUT();

        #if ( configGENERATE_RUN_TIME_STATS == 1 )
        {
            #ifdef portALT_GET_RUN_TIME_COUNTER_VALUE
                portALT_GET_RUN_TIME_COUNTER_VALUE( ulTotalRunTime );
            #else
                ulTotalRunTime = portGET_RUN_TIME_COUNTER_VALUE();
            #endif

            /* Add the amount of time the task has been running to the accumulated time so far.  The time the task started running was stored in ulTaskSwitchedInTime.  Note that there is no overflow protection here so count values are only valid until the timer overflows.  The guard against negative values is to protect against suspect run time stat counter implementations - which are provided by the application, not the kernel. */
            if( ulTotalRunTime > ulTaskSwitchedInTime )
            {
                pxCurrentTCB->ulRunTimeCounter += ( ulTotalRunTime - ulTaskSwitchedInTime );
            }
            else
            {
                mtCOVERAGE_TEST_MARKER();
            }
            ulTaskSwitchedInTime = ulTotalRunTime;
        }
        #endif /* configGENERATE_RUN_TIME_STATS */

        /* Check for stack overflow, if configured. */
        taskCHECK_FOR_STACK_OVERFLOW();

        /* Before the currently running task is switched out, save its errno. */
        #if( configUSE_POSIX_ERRNO == 1 )
        {
            pxCurrentTCB->iTaskErrno = FreeRTOS_errno;
        }
        #endif

        /* Select a new task to run using either the generic C or port optimized asm code. */
        taskSELECT_HIGHEST_PRIORITY_TASK(); /*lint !e9079 void * is used as this macro is used with timers and co-routines too.  Alignment is known to be fine as the type of the pointer stored and retrieved is the same. */
        traceTASK_SWITCHED_IN();

        /* After the new task is switched in, update the global errno. */
        #if( configUSE_POSIX_ERRNO == 1 )
        {
            FreeRTOS_errno = pxCurrentTCB->iTaskErrno;
        }
        #endif

        #if ( configUSE_NEWLIB_REENTRANT == 1 )
        {
            /* Switch Newlib's _impure_ptr variable to point to the _reent structure specific to this task.
            See the third party link http://www.nadler.com/embedded/newlibAndFreeRTOS.html for additional information. */
            _impure_ptr = &( pxCurrentTCB->xNewLib_reent );
        }
        #endif /* configUSE_NEWLIB_REENTRANT */
    }
}

1. Check if the Scheduler is Suspended

if (uxSchedulerSuspended != pdFALSE)
{
    xYieldPending = pdTRUE;   // The switch is performed later, when xTaskResumeAll() is called
    return;                   // Exit directly, do not modify pxCurrentTCB
}
  • vTaskSuspendAll() increments uxSchedulerSuspended;
  • While the scheduler is suspended, an ISR may still request a yield; the switch cannot happen immediately, so the request is recorded in xYieldPending.
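The interaction between the suspend counter and the pending-yield flag can be modelled in a few lines. This is a deliberately reduced sketch: the Sched structure and its functions are hypothetical, and the real xTaskResumeAll() does considerably more (for example, it also moves tasks that became ready while the scheduler was suspended).

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    unsigned suspended;      /* nesting depth, like uxSchedulerSuspended */
    bool     yield_pending;  /* like xYieldPending                       */
    int      current;        /* stand-in for pxCurrentTCB                */
} Sched;

/* Returns true if the switch actually happened. */
static bool switch_context(Sched *s, int next_task)
{
    if (s->suspended != 0u) {
        s->yield_pending = true;   /* remember the request for later */
        return false;
    }
    s->yield_pending = false;
    s->current = next_task;
    return true;
}

static void suspend_all(Sched *s)
{
    s->suspended++;
}

/* Returns true if the caller should now trigger the deferred switch. */
static bool resume_all(Sched *s)
{
    assert(s->suspended > 0u);
    s->suspended--;
    return (s->suspended == 0u) && s->yield_pending;
}
```

A switch requested while suspended leaves current untouched; the request is honoured only after the matching resume_all() brings the counter back to zero.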

2. Run Time Statistics (Optional)

#if configGENERATE_RUN_TIME_STATS
    ulTotalRunTime = portGET_RUN_TIME_COUNTER_VALUE();
    pxCurrentTCB->ulRunTimeCounter += (ulTotalRunTime - ulTaskSwitchedInTime);
    ulTaskSwitchedInTime = ulTotalRunTime;
#endif
  • A high-resolution counter (usually a hardware timer) accumulates the actual CPU time used by each task.
  • vTaskGetRunTimeStats() uses these counters to print per-task percentages.
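The accounting step and the percentage calculation can be sketched as follows. The names are hypothetical; the guard against a counter that did not advance mirrors the check in vTaskSwitchContext(), and the percentage uses integer division, so small shares round down.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t run_time;   /* accumulated counter ticks for this task */
} TaskStats;

/* Called when a task is switched out: charge it for the time since it was
 * switched in, then restart the measurement for the next task. */
static void account_run_time(TaskStats *t, uint32_t *switched_in, uint32_t now)
{
    assert(t != 0 && switched_in != 0);
    if (now > *switched_in) {
        t->run_time += now - *switched_in;
    }
    *switched_in = now;
}

/* Share of total time as a whole-number percentage. */
static uint32_t run_time_percent(uint32_t task_time, uint32_t total_time)
{
    uint32_t per_percent = total_time / 100u;
    return (per_percent > 0u) ? (task_time / per_percent) : 0u;
}
```

As in the kernel code, there is no overflow protection here, so the figures are meaningful only until the counter wraps.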

3. Stack Overflow Detection (Optional)

  • If configCHECK_FOR_STACK_OVERFLOW is enabled, taskCHECK_FOR_STACK_OVERFLOW() checks whether the outgoing task has overrun its stack.
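Two common ways of detecting an overrun on a downward-growing stack are comparing the saved stack pointer with the stack's lowest address, and checking whether a fill pattern at the bottom of the stack is still intact. The sketch below shows both ideas with hypothetical names; the 0xA5 fill value and the four guard words are assumptions chosen to resemble what FreeRTOS does, not a copy of its macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_WORD 0xA5A5A5A5u
#define GUARD_WORDS     4u

/* Fill a freshly allocated stack with the known pattern. */
static void fill_stack(uint32_t *stack, size_t words)
{
    assert(stack != 0);
    for (size_t i = 0; i < words; ++i) {
        stack[i] = STACK_FILL_WORD;
    }
}

/* Pointer check: the saved top of stack has reached or passed the base. */
static bool sp_out_of_bounds(const uint32_t *top_of_stack,
                             const uint32_t *stack_base)
{
    return top_of_stack <= stack_base;
}

/* Pattern check: any guard word at the base has been overwritten. */
static bool guard_words_damaged(const uint32_t *stack_base)
{
    for (size_t i = 0; i < GUARD_WORDS; ++i) {
        if (stack_base[i] != STACK_FILL_WORD) {
            return true;
        }
    }
    return false;
}
```

The pointer check is cheap but only sees the stack depth at the moment of the switch; the pattern check also catches an overrun that happened earlier and has since unwound.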

4. Save/Restore POSIX errno

#if configUSE_POSIX_ERRNO
 pxCurrentTCB->iTaskErrno = FreeRTOS_errno;        // Save before switching out
 ...
 FreeRTOS_errno = pxCurrentTCB->iTaskErrno;        // Restore after switching in
#endif
  • Each task has its own errno, so an error code set by a library call in one task is not overwritten by another task.

5. Select the Highest Priority Ready Task (Core)

taskSELECT_HIGHEST_PRIORITY_TASK();

In the generic C version, the search starts at uxTopReadyPriority and moves downward until it finds a non-empty ready list, then sets pxCurrentTCB to the next entry in that list, so tasks of equal priority take turns.

This only changes the pointer; the register context is still saved and restored in bulk by the PendSV assembly.
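The selection logic can be modelled with fixed-size arrays in place of the kernel's linked lists. This is a sketch under that simplifying assumption; Kernel, ReadyList, and the function name are hypothetical, and the real macro walks list nodes rather than array indices.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRIORITIES 5
#define MAX_PER_LIST   4

typedef struct {
    int    tasks[MAX_PER_LIST];  /* task ids ready at this priority     */
    size_t count;                /* number of valid entries             */
    size_t index;                /* entry handed out most recently      */
} ReadyList;

typedef struct {
    ReadyList ready[MAX_PRIORITIES];
    unsigned  top_ready_priority; /* highest priority that may be ready */
    int       current;            /* stand-in for pxCurrentTCB          */
} Kernel;

static void select_highest_priority_task(Kernel *k)
{
    unsigned prio = k->top_ready_priority;

    /* Walk down until a non-empty list is found. */
    while (k->ready[prio].count == 0u) {
        assert(prio > 0u);       /* the idle task keeps priority 0 occupied */
        --prio;
    }

    /* Advance to the next entry so equal-priority tasks share the CPU. */
    ReadyList *list = &k->ready[prio];
    list->index = (list->index + 1u) % list->count;
    k->current = list->tasks[list->index];
    k->top_ready_priority = prio;
}
```

With two tasks at the same priority, successive calls alternate between them; once that list empties, the search falls through to the next lower priority that has a ready task.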

6. Newlib Reentrant Structure Switch (Optional)

#if configUSE_NEWLIB_REENTRANT
 _impure_ptr = &(pxCurrentTCB->xNewLib_reent);
#endif

newlib's _impure_ptr then points to the current task’s struct _reent, which lets functions such as printf, scanf, and strtok keep their state per task.

High priority task ready
        ↓
PendSV triggered (assembly)
        ↓
Save current task registers → call vTaskSwitchContext()
        ↓
Select new value for pxCurrentTCB
        ↓
Assembly restores new task registers → exit PendSV
        ↓
New task continues running

vTaskSwitchContext() acts as the judge: it checks the scheduler state, records run-time statistics, checks the stack, picks the highest-priority ready task, and points pxCurrentTCB to it; the actual register save/restore is handed back to the PendSV assembly.

At this point, the first task officially starts. If multiple tasks have been created, the system begins switching between them based on priority and time slices. We have covered the general process, but several important functions are not analyzed in this article, such as task suspension and resumption and voluntary yielding. If you are interested, you can read that code yourself.

In FreeRTOS, linked lists and queues are very important data structures. Tasks in different states are kept on different linked lists. We will analyze these two data structures separately later. My knowledge is limited, so if there are any errors or omissions, please feel free to correct me. I hope the above content is useful to you.
