What to Do When You Encounter Stack Overflow While Porting RTOS?


In embedded systems, an RTOS meets strict timing requirements by scheduling multiple tasks. Task stack management is a critical aspect of RTOS development, especially when porting an RTOS to a new hardware platform. Stack overflow is a common error in embedded development that can lead to memory corruption, unpredictable system behavior, or even complete crashes.


In an RTOS, each task is allocated its own stack, which stores the following:

  • Local Variables: Variables defined within functions.
  • Function Call Information: Including return addresses and parameters.
  • Context Data: Register states saved during task switching.

The stack is typically allocated with a fixed size and stored in RAM. Depending on the CPU architecture, the stack may grow from high addresses to low addresses (e.g., ARM Cortex-M) or vice versa. The stack pointer (SP) always points to the current top of the stack.

Stack overflow occurs when the stack space used by a task exceeds the allocated size. Common causes include:

  • Deep Recursion: Functions repeatedly calling themselves without proper termination conditions, leading to rapid stack growth.
  • Large Local Variables: Declaring large arrays or structures within functions that consume significant stack space.
  • Insufficient Allocation: The stack size allocated during task creation is insufficient to handle worst-case demands.
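  • Interrupt Nesting: Nested interrupts and function calls made inside interrupt handlers add further stack usage. On ports without a separate interrupt stack, this usage lands on the stack of whichever task was interrupted.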

Detecting stack overflow is an important step when porting an RTOS. Detection methods fall into two categories, hardware and software, and the choice depends on hardware support and application requirements.

1

Hardware Detection Methods

Hardware detection utilizes dedicated features of the CPU, providing fast and reliable detection.

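Some CPU architectures (e.g., ARMv8-M) provide stack limit registers (MSPLIM and PSPLIM on ARMv8-M). During task switching, the RTOS sets the limit register to the lowest valid address of the incoming task's stack. If the stack pointer (SP) would move below this limit, the CPU raises an exception (a UsageFault with the STKOF flag set on ARMv8-M Mainline).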

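The memory protection unit (MPU) can monitor memory access: the RTOS sets protection regions for each task's stack so that illegal writes outside the stack raise a fault. The number of available regions is implementation-defined. ARMv7-M devices such as Cortex-M3 and Cortex-M4 typically provide 8 regions, while ARMv8-M devices such as Cortex-M33 can provide up to 16.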

Alternatively, a protected guard area (typically 128-256 bytes) can be placed at the bottom of the stack, that is, at the end the stack grows toward. Any attempt to write to this area triggers an exception before the overflow reaches neighboring memory.

2

Software Detection Methods

Software detection is performed by the RTOS at runtime and is suitable for platforms that do not support hardware detection.

The RTOS writes a known pattern (e.g., 0xABCDEF01) at the bottom of the task stack, which is the end the stack grows toward. During task switching, it checks whether this pattern has been modified. An overwritten pattern indicates that the stack has overflowed.
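The pattern check can be sketched in portable C as follows. This is a minimal illustration, not code from any particular RTOS: the function names and the choice of four guard words are assumptions, and a descending stack is assumed so that the pattern sits at the lowest addresses of the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_CANARY_WORD  0xABCDEF01u  /* pattern value taken from the text */
#define STACK_CANARY_WORDS 4u           /* assumed guard size: 4 words */

/* Write the pattern into the lowest words of a descending stack.
 * stack_low is the lowest address of the stack buffer. */
void stack_canary_init(uint32_t *stack_low)
{
    for (size_t i = 0; i < STACK_CANARY_WORDS; i++) {
        stack_low[i] = STACK_CANARY_WORD;
    }
}

/* Return true while every guard word still holds the pattern. */
bool stack_canary_intact(const uint32_t *stack_low)
{
    for (size_t i = 0; i < STACK_CANARY_WORDS; i++) {
        if (stack_low[i] != STACK_CANARY_WORD) {
            return false;
        }
    }
    return true;
}
```

A port would call the init function when the task stack is created and the check function each time the task is switched out. Checking several words rather than one makes it less likely that an overflow skips past the guard unnoticed.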

During task switching, the RTOS checks whether the stack pointer is within the allocated stack range. If the SP is out of range, a stack overflow is considered to have occurred.
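A range check of this kind might look like the sketch below. The function name and parameters are hypothetical, and a descending stack is assumed, so an empty stack has its pointer at the highest address of the range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when sp lies inside [stack_low, stack_low + size_bytes].
 * On a descending stack, an empty stack has sp == stack_low + size_bytes. */
bool stack_pointer_in_range(uintptr_t sp, uintptr_t stack_low, size_t size_bytes)
{
    return sp >= stack_low && (sp - stack_low) <= size_bytes;
}
```

This check is cheap, but it only sees the stack pointer at the moment of the switch. An overflow that happened earlier and has since unwound goes unnoticed, which is why it is often combined with a pattern check.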

FreeRTOS provides a built-in stack overflow detection mechanism, which is enabled by setting configCHECK_FOR_STACK_OVERFLOW to 1 or 2 in FreeRTOSConfig.h. The value selects the detection method:

  • Method 1: Check whether the stack pointer is within the stack range when a task is switched out.
  • Method 2: Fill the stack with a known pattern when the task is created, and check whether the last 16 bytes of the stack have been modified when the task is switched out.
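As a configuration sketch, the relevant lines in FreeRTOSConfig.h could look like this. Choosing method 2 here is only an example, and the second macro is included on the assumption that the high water mark query will also be used during development.

```c
/* FreeRTOSConfig.h (fragment) */
#define configCHECK_FOR_STACK_OVERFLOW      2  /* 1 = stack pointer check, 2 = fill-pattern check */
#define INCLUDE_uxTaskGetStackHighWaterMark 1  /* makes uxTaskGetStackHighWaterMark() available */
```

Both checks add work to every context switch, so some projects enable them in development and test builds only.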

When an overflow is detected, FreeRTOS calls the user-defined hook function vApplicationStackOverflowHook, which has the following prototype:

void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName);

Here is an example implementation:

void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName) {
    // Log the name of the overflowing task
    printf("Stack overflow in task: %s\n", pcTaskName);
    // Optionally restart the system or terminate the task
    for (;;) {
        // Enter an infinite loop, waiting for the watchdog to restart
    }
}

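Additionally, FreeRTOS provides the uxTaskGetStackHighWaterMark function, which returns the minimum amount of stack space, in words, that has remained free since the task started. It is available when INCLUDE_uxTaskGetStackHighWaterMark is set to 1 in FreeRTOSConfig.h: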

UBaseType_t uxTaskGetStackHighWaterMark(TaskHandle_t xTask);

Here is an example:

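#include <stdio.h>
#include "FreeRTOS.h"
#include "task.h"

void monitorStackUsage(void *pvParameters) {
    (void)pvParameters;  // unused
    TaskHandle_t xTask = xTaskGetCurrentTaskHandle();
    for (;;) {
        // Minimum free stack seen so far, in words (not bytes)
        UBaseType_t uxHighWaterMark = uxTaskGetStackHighWaterMark(xTask);
        // UBaseType_t is port-specific, so cast to match the format specifier
        printf("Task stack high water mark: %lu words\n", (unsigned long)uxHighWaterMark);
        vTaskDelay(pdMS_TO_TICKS(1000));
    }
}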

By calling this function periodically, developers can see how much headroom each task has left and tune the stack sizes so that every task has sufficient space.
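To show how a high water mark can be derived at all, here is a host-runnable sketch of the underlying idea: the stack is pre-filled with a known byte, and the untouched bytes are counted from the end the stack grows toward. FreeRTOS fills task stacks with 0xA5 for this purpose; the function name and the descending-stack assumption are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_BYTE 0xA5u  /* same value FreeRTOS uses to fill task stacks */

/* Count untouched bytes from the low end of a descending stack.
 * Stops at the first byte that no longer holds the fill value. */
size_t stack_unused_bytes(const uint8_t *stack_low, size_t size_bytes)
{
    size_t unused = 0;
    while (unused < size_bytes && stack_low[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

The result is a lower bound on what was really free: if the task happens to write the fill value itself, those bytes are counted as unused.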

3

Preventing Stack Overflow

Start with a generous stack (e.g., 1 KB), run the application under worst-case conditions, and monitor stack usage. For example, FreeRTOS's uxTaskGetStackHighWaterMark reports the smallest amount of free stack a task has had since it started.

Adjust the stack size based on the monitoring results, retaining a safety margin (typically around 20% of the allocation). For instance, setting the stack size to 1.25 times the measured peak usage means the peak occupies 80% of the stack and 20% remains as headroom.
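The arithmetic can be captured in a small helper. This is a sketch with hypothetical names; it assumes sizes are counted in words, as FreeRTOS reports them, and that the high water mark never exceeds the allocated size.

```c
#include <stdint.h>

/* Suggest a stack size in words from a measured high water mark.
 * allocated_words:  size the task was created with
 * high_water_words: minimum free words ever observed (<= allocated_words)
 * margin_percent:   headroom added on top of peak usage, e.g. 25 */
uint32_t suggested_stack_words(uint32_t allocated_words,
                               uint32_t high_water_words,
                               uint32_t margin_percent)
{
    uint32_t peak_used = allocated_words - high_water_words;
    /* Round up so the margin is never short by a fraction of a word. */
    return (peak_used * (100u + margin_percent) + 99u) / 100u;
}
```

For a task created with 256 words whose high water mark is 56 words, peak usage is 200 words, and a 25% margin gives a suggested size of 250 words.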

In safety-critical applications, calculate the precise stack requirements by analyzing the call graph and local variable sizes. This requires considering function call depth, interrupt nesting, and RTOS context saving (e.g., the FreeRTOS Cortex-M ports save 16 core registers, about 64 bytes, and more when floating-point context is also saved).
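The call-graph part of that analysis amounts to finding the deepest path through the graph. The sketch below shows the idea on a hand-built graph. The structure, the names, and the frame sizes used with it are invented for illustration, and the code is meant for a host-side tool rather than the target, since it is itself recursive and assumes the analyzed graph contains no recursion.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLEES 4

/* One node of a hypothetical call graph: its own frame size and callees. */
struct func_node {
    uint32_t frame_bytes;
    size_t   callee_count;
    size_t   callees[MAX_CALLEES];  /* indices into the graph array */
};

/* Worst-case stack depth in bytes starting at graph[index].
 * Assumes the graph is acyclic (no recursion in the analyzed code). */
uint32_t worst_case_stack(const struct func_node *graph, size_t index)
{
    uint32_t deepest_callee = 0;
    for (size_t i = 0; i < graph[index].callee_count; i++) {
        uint32_t depth = worst_case_stack(graph, graph[index].callees[i]);
        if (depth > deepest_callee) {
            deepest_callee = depth;
        }
    }
    return graph[index].frame_bytes + deepest_callee;
}
```

The value returned for the task's entry function covers only the call chain. The interrupt and context-save overhead mentioned above still has to be added on top.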

4

Handling Stack Overflow

When a stack overflow is detected, the RTOS typically calls a hook function, allowing the application to take appropriate actions. Handling strategies include:

  • Logging Errors: Log the name of the overflowing task and other debugging information. For example, FreeRTOS’s hook function can print the task name.
  • System Restart: In non-critical systems, the watchdog timer can be triggered to restart the system.
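  • Task Termination: In some cases, the overflowing task can be terminated and recreated. Memory next to the stack may already be corrupted, so this is only safe when the damage is known to be contained.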
  • Safe State: In safety-critical systems, place the system in a known safe state, such as stopping non-essential tasks.

Here is a complete example of a FreeRTOS hook function:

void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName) {
    // Disable interrupts to prevent further damage
    taskDISABLE_INTERRUPTS();
    // Log the error
    printf("Stack overflow detected in task: %s\n", pcTaskName);
    // Trigger system restart
    NVIC_SystemReset();
}

In safety-critical systems, handling stack overflow is a crucial part of ensuring system integrity. For example, an automotive electronic control unit (ECU) may need to switch the system to a fail-safe mode and log events for later analysis.
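One way to log the event for later analysis is to write a small record into RAM that survives a reset and read it back during the next boot. The sketch below shows only the record handling, in portable C. The structure, the function names, and the marker value are hypothetical, and placing the record in a memory section that startup code does not clear is toolchain-specific and not shown.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define OVERFLOW_MAGIC    0x53544B4Fu  /* arbitrary marker chosen for this sketch */
#define OVERFLOW_NAME_LEN 16u          /* assumed buffer size for the task name */

/* Record meant to live in RAM that is not cleared at reset. */
struct overflow_record {
    uint32_t magic;
    char     task_name[OVERFLOW_NAME_LEN];
};

/* Store the offending task's name; intended to be called from the hook. */
void overflow_record_store(struct overflow_record *rec, const char *task_name)
{
    strncpy(rec->task_name, task_name, OVERFLOW_NAME_LEN - 1u);
    rec->task_name[OVERFLOW_NAME_LEN - 1u] = '\0';
    rec->magic = OVERFLOW_MAGIC;
}

/* After reboot: true if a previous overflow was recorded. Clears the marker. */
bool overflow_record_take(struct overflow_record *rec)
{
    if (rec->magic != OVERFLOW_MAGIC) {
        return false;
    }
    rec->magic = 0u;
    return true;
}
```

Inside the hook, a store like this keeps the work short and avoids calls that need a healthy system. The boot code can then report the saved task name once normal logging is available again.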

In RTOS porting and application development, handling task stack overflow is a key aspect of ensuring system reliability and stability. By understanding the causes of stack overflow, implementing hardware and software detection methods, and following best practices for stack allocation and coding, developers can effectively reduce the risk of overflow.
