In embedded development, especially in the development of microcontrollers like STM32, the <span>volatile</span> keyword is a very important concept.
1. Concept of the volatile Keyword
<span>volatile</span> is a type modifier in C/C++ that tells the compiler:
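- The variable may change outside the compiler's view of program flow (e.g., written by hardware or by an interrupt handler)
- The compiler must not optimize accesses to it away (e.g., by keeping the value in a CPU register or by merging repeated reads and writes)
- Every read or write in the source results in an actual memory access, in program order relative to other volatile accesses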
This feature is crucial in STM32 development for the following scenarios:
- Accessing peripheral registers
- Variables shared between interrupt service routines (ISRs) and the main program
- Shared variables in a multitasking environment (together with proper synchronization)
- Memory areas modified by DMA
2. Mechanism of volatile
1. Preventing Compiler Optimization
Dangerous scenario without using volatile:
uint32_t sensor_value = 0; // Not volatile: this is the bug
int main(void) {
    while (1) {
        if (sensor_value > 100) {
            // Trigger action
        }
    }
}
// Modified in interrupt
void ADC_IRQHandler(void) {
    sensor_value = ADC1->DR;
}
The compiler may keep <span>sensor_value</span> in a register, so the main loop may never detect the change.
After using volatile:
volatile uint32_t sensor_value = 0;
This forces every access to read the latest value from memory.
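The same flag pattern can be tried on a desktop machine, where a signal handler plays the role of the ISR. This is only a sketch under that assumption: <span>fake_adc_handler</span> and <span>wait_for_sample</span> are hypothetical names, and <span>raise(SIGINT)</span> stands in for the hardware interrupt.

```c
#include <signal.h>

/* Written by the handler, read by the main flow: must be volatile. */
static volatile sig_atomic_t sample_ready = 0;

/* Hypothetical stand-in for ADC_IRQHandler(); on the target this is the ISR. */
static void fake_adc_handler(int signum)
{
    (void)signum;
    sample_ready = 1;
}

/* Installs the handler, triggers it once, and polls the flag.
 * Returns 1 if the main flow observed the handler's write, 0 on setup failure. */
int wait_for_sample(void)
{
    sample_ready = 0;
    if (signal(SIGINT, fake_adc_handler) == SIG_ERR) {
        return 0;
    }
    raise(SIGINT);              /* plays the role of the hardware interrupt */
    while (!sample_ready) {
        /* the flag is re-read from memory on every iteration */
    }
    signal(SIGINT, SIG_DFL);    /* restore default handling */
    return 1;
}
```

On the target, the polling loop has the same shape; only the source of the asynchronous write differs.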
2. Preventing Instruction Reordering
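The compiler may reorder memory accesses when optimizing. <span>volatile</span> constrains this only partially:
- The compiler keeps accesses to volatile objects in program order relative to each other
- Accesses to non-volatile objects may still be moved across volatile accesses, and <span>volatile</span> places no constraint on the processor or bus; where that ordering matters, use a barrier such as the CMSIS <span>__DMB()</span> or <span>__DSB()</span> intrinsics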
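When an ordinary variable must be written before a volatile flag is raised, one option is an explicit compiler fence between the two stores. The sketch below uses the C11 <span>atomic_signal_fence</span>, which restricts only the compiler and suits an ISR and main loop on the same core; <span>publish</span> and <span>try_consume</span> are hypothetical names. If another bus master such as DMA is involved, a hardware barrier would be needed instead.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t payload;              /* ordinary data, not volatile */
static volatile bool payload_ready;   /* flag polled by the consumer */

/* Producer side (e.g. an ISR): write the data first, then raise the flag. */
void publish(uint32_t value)
{
    payload = value;
    /* Keeps the compiler from moving the payload store below the flag store. */
    atomic_signal_fence(memory_order_seq_cst);
    payload_ready = true;
}

/* Consumer side (e.g. the main loop): copies the data out if the flag is set. */
bool try_consume(uint32_t *out)
{
    if (!payload_ready) {
        return false;
    }
    /* Keeps the compiler from reading the payload before checking the flag. */
    atomic_signal_fence(memory_order_seq_cst);
    *out = payload;
    payload_ready = false;
    return true;
}
```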
3. Scenarios in STM32 that Must Use volatile
1. Accessing Peripheral Registers
All peripheral registers should be declared as volatile:
#define GPIOA_ODR (*(volatile uint32_t *)(0x40020014))
Reason: a register's contents can be changed by hardware independently of program flow, and each access may have a hardware side effect.
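Device headers usually describe a whole peripheral as a struct of volatile members rather than one macro per register. The sketch below models that style with a layout based on the STM32F4 GPIO block, which matches the address above (base 0x40020000 plus offset 0x14 for ODR). <span>GpioRegs</span>, <span>fake_gpioa</span>, and the helper functions are hypothetical names, and a RAM object replaces the real peripheral so the code runs on a desktop.

```c
#include <stddef.h>
#include <stdint.h>

/* Register layout modeled on the STM32F4 GPIO block; every member is volatile. */
typedef struct {
    volatile uint32_t MODER;    /* offset 0x00 */
    volatile uint32_t OTYPER;   /* offset 0x04 */
    volatile uint32_t OSPEEDR;  /* offset 0x08 */
    volatile uint32_t PUPDR;    /* offset 0x0C */
    volatile uint32_t IDR;      /* offset 0x10 */
    volatile uint32_t ODR;      /* offset 0x14 */
} GpioRegs;

/* Assumption: a RAM object stands in for the peripheral.
 * On hardware this would be ((GpioRegs *)0x40020000u). */
static GpioRegs fake_gpioa;
#define FAKE_GPIOA (&fake_gpioa)

/* Read-modify-write of the output register: one load and one store each time. */
void gpio_set_pin(GpioRegs *port, unsigned pin)
{
    port->ODR |= (1u << pin);
}

void gpio_clear_pin(GpioRegs *port, unsigned pin)
{
    port->ODR &= ~(1u << pin);
}

/* Returns 1 if the given output bit is set, 0 otherwise. */
int gpio_read_output(const GpioRegs *port, unsigned pin)
{
    return (int)((port->ODR >> pin) & 1u);
}
```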
2. Shared Variables in Interrupts
#include <stdbool.h>
volatile bool data_ready = false;
void USART1_IRQHandler(void) {
    data_ready = true; // Modified in interrupt
}
int main(void) {
    while (!data_ready); // Main loop polls the flag
}
3. DMA Buffers
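volatile uint8_t dma_buffer[256];
DMA transfers run independently of the CPU, so the compiler must not assume the buffer is unchanged between reads. On parts with a data cache (Cortex-M7, e.g. STM32F7/H7), volatile alone is not sufficient: the buffer also needs cache maintenance or a non-cacheable memory region.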
4. Shared Variables in Multitasking
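In an RTOS:
volatile int shared_counter;
volatile only makes the latest value visible; updates from more than one task still need a mutex or critical section.
5. Watchdog Feed Operations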
volatile uint32_t *const WDG_KR = (volatile uint32_t *)0x40003000;
*WDG_KR = 0xAAAA; // Feed the watchdog
4. Comparison of Using and Not Using volatile
| Scenario | Using volatile | Not Using volatile | 
|---|---|---|
| Reading Interrupt Flags | Always gets the latest value | May read a cached old value | 
| Accessing GPIO Input Registers | Correctly reflects pin state | May return historical values | 
| Detecting DMA Transfer Completion | Timely detection of completion status | May fail to detect status changes | 
| Shared Counters in Multitasking | Ensures value visibility (but not atomicity) | May lead to data inconsistency | 
| Configuring Peripheral Control Registers | Ensures configuration executes in order | May be optimized out of order by the compiler | 
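
Several rows above come down to polling a status bit that hardware sets. A small helper that takes a pointer to a volatile register and gives up after a bounded number of reads might look like the sketch below; <span>wait_for_flag</span> and its parameters are hypothetical names, not part of any vendor library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Polls *reg until any bit in mask is set, or gives up after max_polls reads.
 * The volatile qualifier on the pointee forces a real load on every pass. */
bool wait_for_flag(const volatile uint32_t *reg, uint32_t mask, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*reg & mask) != 0u) {
            return true;
        }
    }
    return false;   /* timed out without seeing the flag */
}
```

The bounded loop avoids hanging forever if the hardware never raises the flag, for example when a peripheral clock was not enabled.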
5. In-Depth Principles
1. Role of Memory Barriers
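On the STM32 Cortex-M core, <span>volatile</span> is not itself a memory barrier. It only affects which instructions the compiler emits:
- Every volatile access in the source produces its own <span>LDR</span> or <span>STR</span> instruction; none are merged, dropped, or replaced by a register copy
- The compiler does not reorder volatile accesses relative to each other
Actual barrier instructions (DMB, DSB, ISB) must be requested explicitly.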
2. Comparison of Compiler Optimization
volatile int *pReg = (volatile int *)0x1234;
int *pNormal = (int *)0x5678;
void func(void) {
    *pReg = 1;   // Generates STR instruction
    *pReg = 2;   // Generates STR again
    *pNormal = 3; // May be optimized away
    *pNormal = 4; // Only the last one is kept
}
3. Relationship with Caching
Most STM32 parts (Cortex-M0/M3/M4) have no CPU data cache, while Cortex-M7 parts such as the STM32F7/H7 do. Either way:
- Register caching still exists (due to compiler optimization)
- DMA and CPU memory access need synchronization
6. Practical Development Cases
Case 1: GPIO Input Detection
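// Incorrect: a plain pointer discards the volatile qualifier
uint32_t *pBad = (uint32_t *)&GPIOA->IDR;
while ((*pBad & GPIO_PIN_5) == 0) {
    // The read may be hoisted out of the loop and performed only once
}
// Correct: keep the volatile qualifier
volatile uint32_t *pIDR = &GPIOA->IDR;
while ((*pIDR & GPIO_PIN_5) == 0) {
    // The register is actually read on every iteration
}
With the CMSIS device headers, <span>GPIOA->IDR</span> is already volatile-qualified, so reading it directly is also correct.
Case 2: Delay Loop Optimization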
volatile uint32_t delay;
for (delay = 0; delay < 1000000; delay++); // Without volatile, the loop may be completely optimized away
7. Precautions
- Cannot replace atomic operations
  volatile int i = 0;
  i++; // This is not an atomic operation!
  Needs to be used with critical sections or atomic instructions
- Do not misuse
  Unnecessary volatile can degrade performance
- Use in conjunction with const (for read-only registers)
  volatile const uint32_t *pReg = (volatile const uint32_t *)0x40021000;
- Debugging tips
  Observe the disassembly in MDK/IAR:
  LDR R0, [R1] ; volatile access
  MOV R0, #5 ; non-volatile may directly use immediate value
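For the first precaution, one alternative to a hand-written critical section is a C11 atomic, assuming the toolchain provides <span>stdatomic.h</span> and the core supports lock-free 32-bit atomics (Cortex-M3 and later typically do; Cortex-M0 may need library support). The names <span>count_event</span> and <span>events_seen</span> below are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Static storage duration: zero-initialized to a valid atomic state. */
static atomic_uint_fast32_t event_count;

/* Atomic read-modify-write; returns the value before the increment.
 * Unlike i++ on a volatile int, this cannot lose an update to an interrupt. */
uint32_t count_event(void)
{
    return (uint32_t)atomic_fetch_add(&event_count, 1u);
}

/* Atomic read of the current count. */
uint32_t events_seen(void)
{
    return (uint32_t)atomic_load(&event_count);
}
```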
8. Performance Impact Analysis
Tested on STM32F103 (Cortex-M3):
| Operation | No volatile (cycles) | With volatile (cycles) | 
|---|---|---|
| Single variable read | 2 | 4 | 
| Loop 100 times access | 200 | 400 | 
| Continuous register write operations | May be merged into 1 write | Each write is independent | 
9. Best Practice Recommendations
- All peripheral register pointers must include volatile
- Variables shared between ISRs and the main program must be volatile
- Use CMSIS-defined registers (which already include volatile)
- Use volatile for memory areas that may be modified by hardware
- Shared variables in multitasking need to be used with mutex mechanisms
- Pay special attention in highly optimized code (-O2/-O3)


