Embedded Development in C: When to Use the Volatile Keyword for Variables?

What is the volatile keyword?

In embedded C programming, the volatile keyword is essential. The word means "changeable", and the qualifier tells the compiler that the value of the variable may change during program execution in ways the compiler cannot see from the program text. The compiler therefore must not apply its usual access optimizations to this variable. Every read in the source code must fetch the real value from memory, and every write must actually store to memory, rather than working on a copy held in a CPU register.

Assume there is a variable flag. Under normal circumstances, the compiler might think that if there are no explicit modification operations on flag in a piece of code, its value will not change. Therefore, in subsequent code, the compiler might directly use the previously cached value of flag from the register, rather than reading from memory. However, if flag is a variable that can be modified by hardware interrupts or other threads, this optimization could lead to the program reading an outdated value of flag, resulting in errors. When the volatile keyword is used to modify flag, the compiler will know that the value of this variable may change at any time, and will read the latest value of flag from memory each time it is used, avoiding errors caused by optimization.

Usage Scenario 1: Preventing Compiler Optimization

To improve program execution efficiency, compilers perform various optimizations on the code. One common optimization is the optimization of variable access. During the optimization process, the compiler assumes that the value of a variable will not change unless it is explicitly modified. Based on this assumption, the compiler may cache frequently accessed variables in registers, allowing subsequent accesses to read directly from the register instead of memory, as register access is much faster than memory access, thus improving program execution efficiency. Additionally, the compiler may optimize some “redundant” accesses in the code. For example, if the same variable is read multiple times in a piece of code and has not been modified in the meantime, the compiler may read the variable’s value only once and use the previously read value in subsequent uses, rather than reading it again.
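The redundant-read optimization described above can be sketched as follows. This is a minimal illustration, not output from any particular compiler; the function names read_twice_plain and read_twice_volatile are hypothetical. With a plain pointer the compiler is allowed to load the value once and reuse it, while with a pointer to volatile it must perform both loads.

```c
// Two reads through a plain pointer: the compiler may load *p once
// and reuse that value for the second read.
int read_twice_plain(const int *p) {
    int first = *p;
    int second = *p;   // may be folded into the first load
    return first + second;
}

// Two reads through a pointer to volatile: each read written in the
// source must be performed as a separate load from memory.
int read_twice_volatile(const volatile int *p) {
    int first = *p;
    int second = *p;   // must be a second, real load
    return first + second;
}
```

On a host machine both functions return the same result, because nothing changes the variable between the two reads. The difference appears only in the generated code, or when hardware or an interrupt changes the value between the reads.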

Although these optimizations improve performance in most cases, they can cause problems in embedded development. In an embedded system, the value of a variable may change in ways the compiler does not expect: through a change in the state of a hardware device, through the execution of an interrupt service routine, or through concurrent access on systems that support multithreading. If the compiler applies its usual optimizations to such variables, the program may read stale values, resulting in logic errors.

Assume there is a simple embedded system with a hardware device’s status register, and we use a variable to read the value of this status register, waiting for the hardware device’s status to change to a specific value. Below is a simplified code example:

#include <stdio.h>
// Simulate the hardware device's status register, using a normal variable instead
int hardware_status = 0;
// Wait for the hardware device status to become the target value
void wait_for_device_ready() {
    while (hardware_status != 1) {
        // Loop waiting, doing nothing
    }
    printf("Device is ready.\n");
}
int main(void) {
    // On real hardware the device would change the status asynchronously.
    // Here it is set before the wait so that this simulation terminates.
    hardware_status = 1;
    wait_for_device_ready();
    return 0;
}

In this code, the hardware_status variable represents the hardware device’s status register. The wait_for_device_ready function waits for the hardware device’s status to become 1. In the main function, we simulate the hardware device’s status being changed to 1, and then call the wait_for_device_ready function.

If hardware_status is not modified with volatile, the compiler may optimize the wait_for_device_ready function. The compiler may think that in the while loop, the value of hardware_status has not been explicitly modified, so it will not change, and thus the compiler may cache the value of hardware_status in a register, only reading from the register in the loop instead of memory. As a result, even if the hardware device’s status has changed to 1, the while loop will never end because the compiler is using the cached value in the register, causing the program to enter an infinite loop.

Next, let’s look at the code after using volatile to modify the hardware_status variable:

#include <stdio.h>
// Use volatile to modify, simulating the hardware device's status register
volatile int hardware_status = 0;
// Wait for the hardware device status to become the target value
void wait_for_device_ready() {
    while (hardware_status != 1) {
        // Loop waiting, doing nothing
    }
    printf("Device is ready.\n");
}
int main(void) {
    // On real hardware the device would change the status asynchronously.
    // Here it is set before the wait so that this simulation terminates.
    hardware_status = 1;
    wait_for_device_ready();
    return 0;
}

When the volatile keyword is used to modify the hardware_status variable, the compiler will know that the value of this variable may change in unpredictable ways during program execution. Therefore, the compiler will read the latest value of hardware_status directly from memory each time it accesses it, rather than using the cached value in the register. This way, when the hardware device’s status is changed to 1, the while loop can promptly detect this change, allowing the loop to end normally, and the program can continue executing as expected.

Usage Scenario 2: Accessing Hardware Registers

In embedded systems, communication between hardware devices and processors is achieved through hardware registers. To facilitate the processor’s access to these hardware registers, they are usually mapped to fixed addresses in memory space, a method known as memory-mapped registers. Through memory mapping, the processor can access hardware registers as if they were ordinary memory units, thus controlling and reading the status of hardware devices.

For example, in a simple embedded system, there may be a serial device, and the serial device’s transmit and receive registers are mapped to memory addresses 0x4000C000 and 0x4000C004. When the processor needs to send data to the serial port, it only needs to write the data to address 0x4000C000, and the serial device will automatically read the data from that address and send it out; when the processor needs to read the data received by the serial port, it only needs to read from address 0x4000C004.
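One common way to express such a register map in C is a struct of volatile fields placed at the base address. The sketch below assumes the two 32-bit registers from the example sit at offsets 0 and 4. The names uart_regs_t, UART_BASE, UART, uart_write_tx and uart_read_rx are hypothetical and not taken from any vendor header.

```c
#include <stdint.h>

// Hypothetical register layout: TX at offset 0, RX at offset 4,
// matching the example addresses 0x4000C000 and 0x4000C004.
typedef struct {
    volatile uint32_t tx;  // transmit register
    volatile uint32_t rx;  // receive register
} uart_regs_t;

// On the target, the register block sits at the mapped base address.
#define UART_BASE 0x4000C000u
#define UART      ((uart_regs_t *)UART_BASE)

// Write one value to the transmit register.
void uart_write_tx(uart_regs_t *uart, uint32_t value) {
    uart->tx = value;
}

// Read the value currently held in the receive register.
uint32_t uart_read_rx(uart_regs_t *uart) {
    return uart->rx;
}
```

On the target, the calls would look like uart_write_tx(UART, 0x55). Passing the base pointer as a parameter also lets the same functions run against an ordinary struct in a host-side test, where address 0x4000C000 does not exist.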

The values of these hardware registers are changed in real-time by external hardware devices, rather than by ordinary assignment operations in the program. For example, when new data arrives at the serial port, the serial hardware automatically stores the data in the receive register, a process that is not directly controlled by the program. Therefore, the compiler cannot predict when these register values will change. If the variables corresponding to these registers are not modified with volatile, the compiler may optimize the access to these variables, leading to the program being unable to read the latest values of the hardware registers correctly or failing to write data correctly to the hardware registers.
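The write side of this problem can be illustrated with two back-to-back stores to the same register. The sketch below is an assumption-laden example: the function name uart_send_pair is hypothetical, and it ignores any ready/busy handshake a real serial device would require between writes.

```c
#include <stdint.h>

// Send two values back to back through a transmit register.
// Because tx_reg points to volatile data, both stores must be emitted,
// in this order. With a plain uint32_t pointer, the compiler could
// legally drop the first store, since it is overwritten immediately.
void uart_send_pair(volatile uint32_t *tx_reg, uint32_t first, uint32_t second) {
    *tx_reg = first;
    *tx_reg = second;
}
```

To ordinary memory the first store looks useless, which is exactly why an optimizer may remove it. To a hardware register each store is an event the device reacts to, so both must reach the bus.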

Below is an example of code that operates on the serial transmit register, showing both the code with and without the volatile modifier. Assume the address of the serial transmit register is 0x4000C000, and we access this register through a pointer.

Below is the code without using volatile:

#include <stdio.h>
// Address of the serial transmit register
#define UART_TX_REGISTER 0x4000C000
// Define a pointer to the serial transmit register
unsigned int *uart_tx = (unsigned int *)UART_TX_REGISTER;
// Function to send data to the serial port
void send_data_to_uart(int data) {
    *uart_tx = data;       // Without volatile, the compiler treats this as an ordinary store
    while (*uart_tx != 0); // It may assume *uart_tx still equals data and never re-read the register
}
int main() {
    int data_to_send = 0x55;
    send_data_to_uart(data_to_send);
    return 0;
}

In this code, the uart_tx pointer points to the serial transmit register. However, since volatile is not used, the compiler may optimize the line while (*uart_tx != 0). The compiler may think that after *uart_tx = data, the value of *uart_tx will not change (because there are no other explicit modifications to *uart_tx in this code), so it may cache the value of *uart_tx in a register and only read from the register in the while loop instead of memory. As a result, if the serial hardware sets the register value to 0 after sending the data, the while loop will never end because the compiler is using the cached value in the register, causing the program to enter an infinite loop.

Next, let’s look at the code using volatile:

#include <stdio.h>
// Assume this is the address of the serial transmit register
#define UART_TX_REGISTER 0x4000C000
// Pointer to a volatile register: volatile qualifies the pointed-to data, not the pointer itself
volatile unsigned int *uart_tx = (volatile unsigned int *)UART_TX_REGISTER;
// Function to send data to the serial port
void send_data_to_uart(int data) {
    *uart_tx = data;
    while (*uart_tx != 0); // Wait for data to be sent, assuming the register value is 0 after sending
}
int main() {
    int data_to_send = 0x55;
    send_data_to_uart(data_to_send);
    return 0;
}

Because uart_tx is declared as a pointer to volatile unsigned int, the compiler knows that the object it points to (the serial transmit register) may change at any time in ways it cannot predict. Note that volatile here qualifies the pointed-to register, not the pointer variable itself. Therefore, each time *uart_tx is accessed, the compiler emits a real read of the serial transmit register rather than reusing a value held in a CPU register. This way, when the serial hardware sets the register value to 0 after sending the data, the while loop detects the change, the loop ends normally, and the program continues executing as expected.

Usage Scenario 3: Shared Variables in Interrupt Service Routines (ISR)

In the interaction between the main loop and interrupts, it is common for the main program and the interrupt service routine to share the same variable. For example, in an embedded system based on timer interrupts, the main program may need to perform certain operations based on the number of timer interrupts that have occurred, while the timer interrupt service routine modifies a variable representing the interrupt count each time an interrupt occurs. Since interrupts are asynchronous, they can occur at any time during the execution of the main program, leading to the shared variable’s value being changed unexpectedly. If the main program accesses this shared variable while the compiler has optimized it, such as caching the variable’s value in a register, and the interrupt service routine modifies the variable’s value afterward, the main program will still use the old value from the register, resulting in logical errors in the program. Therefore, to ensure that the main program can obtain the latest value of the shared variable in real-time, the volatile keyword must be used to modify this shared variable, preventing the compiler from optimizing it and ensuring that the latest value is read from memory each time it is accessed.
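The timer scenario above can be sketched as follows. The names tick_count, timer_isr, ticks_now and ticks_elapsed are hypothetical, and the ISR is written as a plain function because the syntax for registering an interrupt handler depends on the toolchain.

```c
#include <stdint.h>

// Tick counter shared between the timer ISR and the main program.
static volatile uint32_t tick_count = 0;

// Hypothetical timer ISR body; on a real target it would be attached
// with toolchain-specific interrupt syntax or a vector table entry.
void timer_isr(void) {
    tick_count++;
}

// Returns the current tick count, for use as a starting point.
uint32_t ticks_now(void) {
    return tick_count;
}

// Called from the main program: returns 1 once at least `ticks`
// timer interrupts have occurred since `start`, otherwise 0.
int ticks_elapsed(uint32_t start, uint32_t ticks) {
    return (uint32_t)(tick_count - start) >= ticks;
}
```

The main program would record start = ticks_now() and then poll ticks_elapsed(start, n) in its loop. One caveat: on a CPU narrower than the counter, reading tick_count takes several instructions, and volatile alone does not stop the ISR from updating it in the middle of the read. This sketch does not address that.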

Assume there is a simple embedded system that uses a button to trigger an external interrupt. When the button is pressed, the interrupt service routine sets a flag, and the main program executes corresponding operations based on this flag, such as turning on an LED. Below is a code example implementing this functionality:

#include <reg51.h>  // Assume using 51 microcontroller
// Define the port connected to the LED
sbit LED = P1^0;
// Define the interrupt flag; note that volatile is used here
volatile bit flag = 0;
// External interrupt 0 service routine
void External0_ISR(void) interrupt 0 {
    flag = 1;  // Button pressed triggers interrupt, set flag
}
void main() {
    IT0 = 1;      // Configure external interrupt 0 to trigger on falling edge
    EX0 = 1;      // Enable external interrupt 0
    EA = 1;       // Enable global interrupts
    while (1) {
        if (flag) {
            LED = 0;  // Turn on LED
            flag = 0;  // Clear flag, prepare for next detection
        }
    }
}

In this code, if the flag variable is not modified with volatile, the compiler may optimize the condition check if (flag) in the while loop. The compiler may think that the value of flag has not been explicitly modified in the while loop, so it will cache the value of flag in a register and only read from the register in the loop instead of memory. As a result, even if the button is pressed and the interrupt service routine sets flag to 1, the main program will still use the cached value from the register, and the if (flag) condition will never be true, meaning the LED will never be turned on.

When the volatile keyword is used to modify the flag variable, the compiler will know that the value of this variable may change in unpredictable ways during program execution (because it will be modified by the interrupt service routine). Therefore, the compiler will read the latest value of flag directly from memory each time it accesses it. This way, when the button is pressed and the interrupt service routine sets flag to 1, the main program can promptly detect the change in flag and execute the operation to turn on the LED.
