How STM32 Combines Software and Hardware for Execution

Have you ever wondered why software can control hardware?

This article analyzes how the STM32 microcontroller combines software and hardware and how the microcontroller program is compiled and executed.

Software and Hardware Integration

Beginners often have a question: why can software control hardware? On the 51 microcontroller, for example, why does simply writing P1 = 0x55 make the IO port output high and low levels? To clarify this issue, we first need to understand a concept: address space.

Address Space

What is address space? The address space is the range of addresses the CPU can generate, for example when the PC pointer fetches instructions or when a pointer is used to read and write data. It is also called the addressing space.

We all know that our computers have 32-bit and 64-bit systems. Why? Because in a 32-bit system, the PC pointer is a 32-bit binary number whose largest value is 0xFFFFFFFF, so it can address only 4G of space. As memory becomes larger, 4G is simply not enough, so it needs to be expanded. To access memory beyond 4G, the 64-bit system was created. How many bits is the STM32? It is 32-bit, hence the PC pointer is also 32-bit, and the addressing space is 4G.
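The 4G figure is plain arithmetic, and a few lines of C that run on a PC are enough to check it. This is only a sketch; the function names are made up for illustration.

```c
#include <stdint.h>

/* Number of distinct byte addresses reachable with an address that is
   `bits` wide (this sketch is valid for 1..63 bits). */
uint64_t address_space_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Highest address a 32-bit pointer can hold. */
uint32_t highest_32bit_address(void)
{
    return UINT32_MAX; /* 0xFFFFFFFF */
}
```

Calling address_space_bytes(32) gives 4294967296 bytes, which is exactly 4G.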

Let’s take a look at how the STM32’s addressing space is allocated. In the data sheet “STM32F407_Data Sheet.pdf”, there is a diagram that shows the memory allocation of the STM32. All chips will have this diagram, usually called a Memory map. When using a new chip, first check this diagram.

On the far left, there are 8 blocks, each 512M, totaling 4G, which is the chip’s addressing space.
Block 0 contains a section called FLASH, which is the internal FLASH where our program is downloaded, starting at address 0x08000000. Note that this section only has 1M of space. STM32 now has chips with 2M of flash; where does the FLASH beyond 1M go? Please refer to the corresponding chip manual.
In block 1, there are two segments of SRAM, totaling 128K, which is the memory we mentioned earlier, used to store the variables used by the program. If needed, the program can also run from SRAM. Doesn’t the 407 have 192K?
Actually, the 407 does have 192K of memory, but 64K of it is not ordinary SRAM; it is the CCM, located in block 0. The two areas are not contiguous, and CCM can only be accessed by the core. Peripherals such as DMA cannot reach CCM memory, so a buffer that a peripheral has to read or write must not be placed there.
Block 2 is for Peripherals, which is the peripheral space. Looking to the right, it mainly involves APB1/APB2, AHB1/AHB2; what are these? We will discuss this later.
Blocks 3, 4, and 5 are for FSMC space, which can expand to external SRAM, NAND FLASH, LCD, and other peripherals.
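Because every block is exactly 512M (0x20000000 bytes), the top three bits of an address tell us which block it belongs to. The following is a small sketch of that idea; the function names are hypothetical, and the labels only repeat what the article says about blocks 0 to 5.

```c
#include <stdint.h>

/* 8 blocks of 512M each: the top three address bits select the block. */
unsigned memory_block_of(uint32_t address)
{
    return address >> 29;
}

/* First address of the block that contains `address`. */
uint32_t memory_block_base(uint32_t address)
{
    return address & 0xE0000000u;
}

/* Short label per block, following the description above. */
const char *memory_block_name(uint32_t address)
{
    static const char *const names[8] = {
        "Code (FLASH, CCM)", "SRAM", "Peripherals",
        "FSMC", "FSMC", "FSMC",
        "not covered in this article", "not covered in this article"
    };
    return names[memory_block_of(address)];
}
```

For example, the FLASH start address 0x08000000 falls in block 0, the SRAM start address 0x20000000 in block 1, and the peripheral base 0x40000000 in block 2.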

Now that we have analyzed the addressing space, let’s look back at how software controls hardware. For this confusion, you can also refer to this article: How Code Controls Hardware? In the example of outputting at the IO port, we configure the IO port by calling library functions. Let’s see how the library function does this.

For example:

GPIO_SetBits(GPIOG, GPIO_Pin_0 | GPIO_Pin_1 | GPIO_Pin_2 | GPIO_Pin_3);

This function actually assigns a value to a variable, specifically to the member BSRRL of the structure that GPIOx points to.

void GPIO_SetBits(GPIO_TypeDef* GPIOx, uint16_t GPIO_Pin)
{
  /* Check the parameters */
  assert_param(IS_GPIO_ALL_PERIPH(GPIOx));
  assert_param(IS_GPIO_PIN(GPIO_Pin));

  GPIOx->BSRRL = GPIO_Pin;
}

assert_param: This is an assertion used to check whether the input parameters meet the requirements. GPIOx is an input parameter, which is a pointer to a GPIO_TypeDef structure, so we use -> to access its members.

GPIOx is the parameter we passed in, GPIOG. What exactly is this? It is defined in stm32f4xx.h.

#define GPIOG               ((GPIO_TypeDef *) GPIOG_BASE)

GPIOG_BASE is also defined in the file as follows:

#define GPIOG_BASE           (AHB1PERIPH_BASE + 0x1800)

AHB1PERIPH_BASE, the AHB1 address, is starting to make sense, right? Let’s look further.

/*!< Peripheral memory map */
#define APB1PERIPH_BASE       PERIPH_BASE
#define APB2PERIPH_BASE       (PERIPH_BASE + 0x00010000)
#define AHB1PERIPH_BASE       (PERIPH_BASE + 0x00020000)
#define AHB2PERIPH_BASE       (PERIPH_BASE + 0x10000000)

Now let’s find the definition of PERIPH_BASE.

#define PERIPH_BASE           ((uint32_t)0x40000000)

At this point, we can see that operating on IO port G is actually manipulating a member of a structure located at the address 0x40000000+0x00020000+0x1800, that is, 0x40021800. In other words, it is manipulating a register at this location. Essentially, this is the same as operating on a normal variable, just like the two lines of code below; the difference is that the variable i sits at an address in the SRAM space, while 0x40021800 is an address in the peripheral space.

u32 i;
i = 0x55aa55aa;

The register at this peripheral space address is part of the IO port hardware. As shown in the figure below, the output data register on the left is the register (memory, variable) whose bits we end up changing. Its address is 0x40021800+0x14. The BSRRL member that GPIO_SetBits writes to sits at offset 0x18, and writing a 1 to one of its bits sets the corresponding bit of the output data register.
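To check these addresses without a board, we can rebuild the arithmetic on a PC. The sketch below mirrors the register layout of GPIO_TypeDef as it appears in the standard peripheral library header; it is reproduced from memory as an assumption, and the names GpioRegs and the *_ADDR macros are invented here so that nothing clashes with the real header.

```c
#include <stddef.h>
#include <stdint.h>

/* Register layout of one GPIO port (assumed to match GPIO_TypeDef). */
typedef struct {
    volatile uint32_t MODER;    /* 0x00 mode */
    volatile uint32_t OTYPER;   /* 0x04 output type */
    volatile uint32_t OSPEEDR;  /* 0x08 output speed */
    volatile uint32_t PUPDR;    /* 0x0C pull-up / pull-down */
    volatile uint32_t IDR;      /* 0x10 input data */
    volatile uint32_t ODR;      /* 0x14 output data */
    volatile uint16_t BSRRL;    /* 0x18 bit set */
    volatile uint16_t BSRRH;    /* 0x1A bit reset */
    volatile uint32_t LCKR;     /* 0x1C configuration lock */
    volatile uint32_t AFR[2];   /* 0x20 alternate function */
} GpioRegs;

/* Same chain of definitions as in stm32f4xx.h, under different names. */
#define PERIPH_BASE_ADDR      0x40000000u
#define AHB1PERIPH_BASE_ADDR  (PERIPH_BASE_ADDR + 0x00020000u)
#define GPIOG_BASE_ADDR       (AHB1PERIPH_BASE_ADDR + 0x1800u)

/* Address of the output data register of port G. */
uint32_t gpiog_odr_address(void)
{
    return GPIOG_BASE_ADDR + (uint32_t)offsetof(GpioRegs, ODR);
}

/* Address of the register that GPIO_SetBits writes to. */
uint32_t gpiog_bsrrl_address(void)
{
    return GPIOG_BASE_ADDR + (uint32_t)offsetof(GpioRegs, BSRRL);
}
```

On the chip itself, the library simply casts GPIOG_BASE to a structure pointer, so GPIOG->ODR and GPIOG->BSRRL are ordinary member accesses that happen to land on these hardware addresses.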

Controlling other peripherals is similar; it is just writing data to the peripheral register, just like operating on memory, and thus controlling the peripherals.

Registers should actually be considered a general term for memory; peripheral registers should be called special registers. Gradually, everyone started referring to peripheral registers simply as registers, while everything else is referred to as memory or RAM. Why can registers control hardware peripherals? Because, roughly speaking, a bit in a register is a switch: on is 1, off is 0. By using this electronic switch to control the circuit, we control the peripheral hardware.
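The "one bit, one switch" idea can be tried with an ordinary variable standing in for the output data register. This is only a simulation on a PC; FakePort and the fake_* functions are hypothetical names, not part of any ST library.

```c
#include <stdint.h>

/* Stand-in for a port's output data register. On the real chip this is
   a hardware register; here it is an ordinary variable. */
typedef struct {
    uint16_t odr;
} FakePort;

/* Same effect as GPIO_SetBits: every 1 bit in `pins` switches a pin on. */
void fake_set_bits(FakePort *port, uint16_t pins)
{
    port->odr |= pins;
}

/* Same effect as GPIO_ResetBits: every 1 bit in `pins` switches a pin off. */
void fake_reset_bits(FakePort *port, uint16_t pins)
{
    port->odr &= (uint16_t)~pins;
}

/* Returns 1 if the given pin (0..15) is currently switched on. */
int fake_pin_is_high(const FakePort *port, unsigned pin)
{
    return (port->odr >> pin) & 1u;
}
```

Setting pins 0 to 3 with the mask 0x000F and then resetting pins 0 and 2 with 0x0005 leaves 0x000A in the simulated register, which means pins 1 and 3 are still on.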

Pure Software – A Comprehensive Small Program

We have completed the control of serial ports and IO ports, but we only know how to use them, with no knowledge of the rest. How does a program run? Regarding how programs run on microcontrollers, you can also watch this video: Animation Demonstration of How Microcontrollers Run Programs. Where exactly is the code stored? How is memory preserved? Below, we will learn the basic elements of embedded software through a simple program.

Analyzing Startup Code

Where does the program start running?

Every chip has a reset function. After resetting, the PC pointer of the chip (a register that indicates the program running position; for multi-stage pipeline chips, the PC may not be consistent with the actual instruction execution position, but for now, let’s assume they are consistent) will reset to a fixed value, usually 0x00000000. The Cortex-M core in the STM32 behaves a little differently: at reset it reads the word stored at address 0x00000004 and loads that value into the PC. When booting from the internal FLASH, the FLASH is also mapped at address 0, so this word is the one stored at 0x08000004. Therefore, the first code executed after reset is the code whose address is stored at 0x08000004. Earlier, we copied a startup code file to the project, startup_stm32f40_41xxx.s. Why is this assembly file called startup code? Because the assembly program inside is the program executed after reset. In the file, there is a data table called the interrupt vector table, which stores the entry addresses of the various interrupt handlers. Reset is also an interrupt.

When the chip resets, it loads the value of Reset_Handler (a function pointer) from the interrupt table into the PC pointer, and the chip will execute the Reset_Handler function. (A function entry is a pointer)

; Vector Table Mapped to Address 0 at Reset
                AREA    RESET, DATA, READONLY
                EXPORT  __Vectors
                EXPORT  __Vectors_End
                EXPORT  __Vectors_Size

__Vectors       DCD     __initial_sp               ; Top of Stack
                DCD     Reset_Handler              ; Reset Handler
                DCD     NMI_Handler                ; NMI Handler
                DCD     HardFault_Handler          ; Hard Fault Handler
                DCD     MemManage_Handler          ; MPU Fault Handler
                DCD     BusFault_Handler           ; Bus Fault Handler
                DCD     UsageFault_Handler         ; Usage Fault Handler
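What the core does with the first two entries of this table can be modeled in a few lines of C. This is a simplified sketch for a PC, not real startup code; CoreRegs and simulate_reset are hypothetical names, and the table contents used to try it out are made-up values.

```c
#include <stdint.h>

/* The two core registers that are loaded from the vector table at reset. */
typedef struct {
    uint32_t sp;
    uint32_t pc;
} CoreRegs;

/* Models the reset sequence: word 0 of the table (__initial_sp) becomes
   the stack pointer, word 1 (Reset_Handler) gives the first code to run.
   Bit 0 of a vector only marks Thumb code, so it is cleared to obtain
   the address of the first instruction. */
CoreRegs simulate_reset(const uint32_t *vector_table)
{
    CoreRegs regs;
    regs.sp = vector_table[0];
    regs.pc = vector_table[1] & ~(uint32_t)1;
    return regs;
}
```

With a table that starts with the hypothetical values 0x20001000 and 0x080001C1, the simulated core ends up with SP = 0x20001000 and starts executing at 0x080001C0.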

The Reset_Handler function first executes the SystemInit function, which is in the standard library and mainly initializes the chip clock. Then it jumps to __main for execution. What is the __main function?

Is it the main function we define in main.c? We will discuss this later.

How does the chip know to start executing the startup code? Or how do we place this startup code at the reset position? This involves a file that generally does not attract attention, wujique.sct. This file is located in the wujique\prj\Objects directory and is usually referred to as a scatter-loading file. The compiler tool uses this file to place various code segments and variables during linking.

In the MDK software, under the Options menu, there are settings for this file.

Uncheck the box for Use Memory Layout from Target Dialog, and the previously unmodifiable boxes can now be modified. Click Edit to edit.

The code editing box displays the contents of the scatter-loading file, and the current file only contains basic content.

This file is actually very powerful; by modifying this file, many program features can be configured, such as:
1. Specifying FLASH and RAM sizes and starting positions. When we divide the program into BOOT, CORE, APP, or even perform driver separation, this can come in handy.
2. Specifying the locations of functions and variables, for example, loading functions to run in RAM.

From this basic scatter-loading file, we can see:

Line 6 ER_IROM1 0x08000000 0x00080000 defines ER_IROM1, which is our internal FLASH starting from 0x08000000 with a size of 0x00080000.
Line 7 *.o (RESET, +First) starts from 0x08000000, matching all .o files and using (RESET, +First) to specify that the RESET block is placed first. What is the RESET block? Please check the startup code; the interrupt vector table is an AREA named RESET, which is READONLY. This means that after linking, the RESET block will be placed at 0x08000000, so the interrupt vector table will be placed here. DCD allocates 4 bytes of space; the first entry is __initial_sp, and the second is the pointer to the Reset_Handler function. This means that the pointer (address) of the Reset_Handler function will be placed at 0x08000000+4. Therefore, when the chip resets, it can find the reset function Reset_Handler.
Line 8 *(InRoot$$Sections) What is this? GOOGLE it! We will discuss it later.
Line 9 .ANY (+RO) means all other RO will be placed subsequently. This means other code will follow the startup code.
Line 11 RW_IRAM1 0x20000000 0x00020000 defines the size of RAM.
Line 12 .ANY (+RW +ZI) means all RW and ZI will be placed in RAM. RW, ZI refer to variables, this line specifies where variables will be stored.
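The two regions defined in the scatter-loading file are just a start address and a size, so it is easy to check what they cover. The sketch below uses the values quoted above; the Region type and the helper functions are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* One execution region: where it starts and how many bytes it spans. */
typedef struct {
    uint32_t base;
    uint32_t size;
} Region;

/* Values taken from the scatter-loading file discussed above. */
static const Region ER_IROM1 = { 0x08000000u, 0x00080000u };
static const Region RW_IRAM1 = { 0x20000000u, 0x00020000u };

/* Returns 1 if `address` lies inside the region. */
int region_contains(Region r, uint32_t address)
{
    return address >= r.base && (address - r.base) < r.size;
}

/* Region size in K. */
uint32_t region_size_kib(Region r)
{
    return r.size / 1024u;
}
```

The FLASH region works out to 512K and the RAM region to 128K. The address 0x08000004, where the pointer to Reset_Handler is stored, lies inside ER_IROM1.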

Analyzing User Code

At this point, the basic startup process has been analyzed. Next, we will analyze the user code starting from the main function.

1. After the program jumps to the main function: RCC_GetClocksFreq retrieves the RCC clock frequency; SysTick_Config configures SysTick, enabling the SysTick interrupt every 10 milliseconds. Delay(5); delays for 50 milliseconds.

int main(void)
{
  GPIO_InitTypeDef GPIO_InitStructure;

  /*!< At this stage the microcontroller clock setting is already configured,
       this is done through SystemInit() function which is called from startup
       files before to branch to application main.
       To reconfigure the default setting of SystemInit() function,
       refer to system_stm32f4xx.c file */

  /* SysTick end of count event each 10ms */
  RCC_GetClocksFreq(&RCC_Clocks);
  SysTick_Config(RCC_Clocks.HCLK_Frequency / 100);

  /* Add your application code here */
  /* Insert 50 ms delay */
  Delay(5);

2. The initialization of IO is not discussed here; we enter while(1), which is an infinite loop. Embedded programs are typically infinite loops; otherwise, they will run away.

  /* Initialize LED IO port */
  RCC_AHB1PeriphClockCmd(RCC_AHB1Periph_GPIOG, ENABLE);

  GPIO_InitStructure.GPIO_Pin = GPIO_Pin_0 | GPIO_Pin_1 | GPIO_Pin_2 | GPIO_Pin_3;
  GPIO_InitStructure.GPIO_Mode = GPIO_Mode_OUT;
  GPIO_InitStructure.GPIO_OType = GPIO_OType_PP;
  GPIO_InitStructure.GPIO_Speed = GPIO_Speed_100MHz;
  GPIO_InitStructure.GPIO_PuPd = GPIO_PuPd_UP;
  GPIO_Init(GPIOG, &GPIO_InitStructure);

  /* Infinite loop */
  mcu_uart_open(3);
  while (1)
  {
    GPIO_ResetBits(GPIOG, GPIO_Pin_0|GPIO_Pin_1|GPIO_Pin_2|GPIO_Pin_3);
    Delay(100);
    GPIO_SetBits(GPIOG, GPIO_Pin_0|GPIO_Pin_1|GPIO_Pin_2|GPIO_Pin_3);
    Delay(100);
    mcu_uart_test();
    TestFun(TestTmp2);
  }
}

3. Inside the while(1), the TestFun function is called, which uses three global variables (one of them a const array) and three local variables (one of them static).

/* Private functions ---------------------------------------------------------*/
u32 TestTmp1 = 5; // Global variable, initialized to 5
u32 TestTmp2; // Global variable, uninitialized
const u32 TestTmp3[10] = {6,7,8,9,10,11,12,13,12,13};

u8 TestFun(u32 x) // Function with one parameter, returning a u8 value
{
  u8 test_tmp1 = 4; // Local variable, initialized
  u8 test_tmp2; // Local variable, uninitialized
  static u8 test_tmp3 = 0; // Static local variable

  test_tmp3++;
  test_tmp2 = x;

  if (test_tmp2 > TestTmp1)
    test_tmp1 = 9; // Last valid index of the 10-element TestTmp3
  else
    test_tmp1 = 5;

  TestTmp2 += TestTmp3[test_tmp1];
  return test_tmp1;
}

Then the program continues to execute inside the while loop of the main function. What about interrupts? Yes, there are interrupts. Interrupts break the normal execution flow of the program. Let’s check the Delay function: as long as uwTimingDelay is not 0, it just waits. Who will set uwTimingDelay to 0?

/**
  * @brief  Inserts a delay time.
  * @param  nTime: specifies the delay time length, in SysTick ticks
  *         (10 ms each with the SysTick configuration above).
  * @retval None
  */
void Delay(__IO uint32_t nTime)
{
  uwTimingDelay = nTime;

  while(uwTimingDelay != 0);
}

Searching for the uwTimingDelay variable shows that the TimingDelay_Decrement function decrements it until it reaches 0.

/**
  * @brief  Decrements the TimingDelay variable.
  * @param  None
  * @retval None
  */
void TimingDelay_Decrement(void)
{
  if (uwTimingDelay != 0x00)
  {
    uwTimingDelay--;
  }
}

Where is this function executed? Upon searching, it runs in the SysTick_Handler function. Who uses this function?

/**
  * @brief  This function handles SysTick Handler.
  * @param  None
  * @retval None
  */
void SysTick_Handler(void)
{
  TimingDelay_Decrement();
}

Upon searching, this function is in the interrupt vector table, which means this function pointer is stored in the interrupt vector table. When an interrupt occurs, this function will be executed. Of course, entering and exiting an interrupt will involve saving and restoring the context, which mainly involves assembly and will not be analyzed here for now. If interested, please study it yourself. Typically, we do not need to worry about context switching when developing programs now.

__Vectors       DCD     __initial_sp               ; Top of Stack
                DCD     Reset_Handler              ; Reset Handler
                DCD     NMI_Handler                ; NMI Handler
                DCD     HardFault_Handler          ; Hard Fault Handler
                DCD     MemManage_Handler          ; MPU Fault Handler
                DCD     BusFault_Handler           ; Bus Fault Handler
                DCD     UsageFault_Handler         ; Usage Fault Handler
                DCD     0                          ; Reserved
                DCD     0                          ; Reserved
                DCD     0                          ; Reserved
                DCD     0                          ; Reserved
                DCD     SVC_Handler                ; SVCall Handler
                DCD     DebugMon_Handler           ; Debug Monitor Handler
                DCD     0                          ; Reserved
                DCD     PendSV_Handler             ; PendSV Handler
                DCD     SysTick_Handler            ; SysTick Handler
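The cooperation between Delay and the SysTick interrupt can be imitated on a PC. There is no real interrupt in the sketch below: each pass through the waiting loop calls the handler by hand, standing in for one timer period. All names starting with sim_ and the function ticks_to_ms are hypothetical.

```c
#include <stdint.h>

static volatile uint32_t sim_timing_delay; /* plays the role of uwTimingDelay */
static uint32_t sim_tick_count;            /* number of simulated SysTick interrupts */

/* What SysTick_Handler does through TimingDelay_Decrement. */
static void sim_systick_handler(void)
{
    sim_tick_count++;
    if (sim_timing_delay != 0) {
        sim_timing_delay--;
    }
}

/* Delay loop. On hardware the interrupt fires by itself while the loop
   spins; here each pass through the loop stands for one tick elapsing.
   Returns how many ticks went by before the wait ended. */
uint32_t sim_delay(uint32_t n_ticks)
{
    sim_tick_count = 0;
    sim_timing_delay = n_ticks;
    while (sim_timing_delay != 0) {
        sim_systick_handler();
    }
    return sim_tick_count;
}

/* With SysTick_Config(HCLK / 100) there are 100 ticks per second,
   so one tick lasts 1000 / 100 = 10 ms. */
uint32_t ticks_to_ms(uint32_t ticks, uint32_t ticks_per_second)
{
    return ticks * (1000u / ticks_per_second);
}
```

The simulation confirms the numbers used in main: Delay(5) waits for 5 ticks, that is 50 ms, and Delay(100) waits for 1000 ms.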

Remaining Questions

1. What is the __main function? Is it the main function defined in main.c?
2. What is *(InRoot$$Sections) in the scatter-loading file?
3. When is the ZI segment, which is initialized to 0, initialized? Who initializes it?

Why were these questions left unanswered earlier? Because they are related to the same topic. Follow the clues!

Understanding Code Composition Through MAP Files

Compilation Results

After the program is compiled, information is output in the Build Output window below:

*** Using Compiler 'V5.06 update 5 (build 528)', folder: 'C:\Keil_v5\ARM\ARMCC\Bin'
Build target 'wujique'
compiling stm32f4xx_it.c...
...
assembling startup_stm32f40_41xxx.s...
compiling misc.c...
...
compiling mcu_uart.c...
linking...
Program Size: Code=9038 RO-data=990 RW-data=40 ZI-data=6000
FromELF: creating hex file...
".\Objects\wujique.axf" - 0 Error(s), 0 Warning(s).
Build Time Elapsed:  00:00:32

The build target is wujique.

Compiling C files and assembling assembly files; this process is called compilation.

After compilation, linking occurs.

Finally, we obtain a compilation result: 9038 bytes of code, 990 RO, 40 RW, 6000 ZI. CODE is the code, which is easy to understand; what are RO, RW, and ZI?

FromELF creates hex files; FromELF is a good tool that needs to be added to options to use.

MAP File Configuration

More detailed compilation information is in the map file. In MDK Options, we can see that all information is placed in \Listings\wujique.map

The default settings may not check many compilation information boxes; checking all information will increase compilation time.

MAP File

Open the map file; it looks messy? You will get used to it. We just need to focus on the key points.

Total MAP Information

Starting from the end, can you see? The last segment of MAP content outlines the basic overview of the entire program.

How much RO is there? What exactly is RO?

How much RW is there? What is RW?

Why doesn’t ROM include ZI Data? Why does it include RW Data?

Image Component Sizes

Looking upward, we find Image component sizes, which are more detailed than the overall statistics we just saw.

This part of the content outlines the overview of each source file.

First, regarding our own source code, our program has little code, only main.o, wujique_log.o, and some STM32 library files.

The second part consists of files in the library. Do you see? There is a __main.o inside. Is this __main the main function we defined? It is obviously not; our main function is placed in the main.o file. In such a small project, so many libraries are used. Have you ever paid attention to this? Probably not, unless you have had to squeeze a program that originally ran on 1M of flash into 512K.

The third part is also from the library, and we have not analyzed what these two are for now.

What are library files? Library files are pre-written code libraries. In code, we often include some header files, such as:
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>
These are the header files of the library. These header files are stored in the installation directory of the MDK development tool. Commonly used library functions include: memcpy, memcmp, strcmp, etc. As long as these functions are called in the code, the library files will be linked.


MAP File

Going further up, we reach the MAP file, which indicates the location of each code segment (function) and variable in ROM and RAM. First, the ROM at 0x08000000 indeed contains the RESET from startup_stm32f40_41xxx.o.

Each file has multiple lines, for example, the serial port has 4 functions.

Then there are the RAM variables, with variables in main.o stored at 0x20000000, totaling 0x0000000c, of type Data, RW. The serial port has two types of variables: data and bss. What is bss? These two names are section names, referring to segments. Look at the preceding type and attributes.

RW Data is stored in the .data segment; ZI Data is stored in the .bss segment. An entry with type Zero and attribute RW is ZI data. Which variables are RW and which are ZI?

Image Symbol Table

Going further up is the Image Symbol Table, which provides information about each function or variable.

For example, the global variable TestTmp1 is Data, 4 bytes, allocated at 0x20000004.

Where is the TestTmp3 array stored? It is stored at 0X080024E0, which is in the code area. Because we used the const modifier for this global variable array, we told the compiler that this array cannot be changed, so the compiler stored this array in the code. In programs, we often use large data arrays, such as character dot matrices, which can be several K or even tens of K large. It is unnecessary to place them in the RAM area since these data do not change during the entire program execution. Therefore, by using the const modifier, we store them in the code area.

The const modifier has multiple uses; it can be applied to variables as well as to function parameters and return types. More usage can be learned independently.

Where are local variables stored? We found test_tmp3.

We did not find test_tmp1/test_tmp2; why? Because during definition, test_tmp3 was defined as static, meaning it behaves like a global variable but is defined within the function, which restricts this global variable to be used only within this function. Where are test_tmp1 and test_tmp2 stored? Local variables do not have space allocated during compilation and linking; space is allocated from the stack only during runtime.

u8 TestFun(u32 x) // Function with one parameter, returning a u8 value
{
  u8 test_tmp1 = 4; // Local variable, initialized
  u8 test_tmp2; // Local variable, uninitialized
  static u8 test_tmp3 = 0; // Static local variable
}

In the previous section, we left a question: which variables are RW and which are ZI? Let’s look at the serial port variables: UartBuf3 is placed in the .bss segment, while the other variables are placed in the .data segment. Why is the array placed in bss? BSS stands for Block Started by Symbol; it is the segment for variables that have no initial value other than 0, so only their size has to be recorded, not their contents.
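The difference between an ordinary local variable and a static local variable is easy to observe at run time. The two functions below are a minimal sketch with made-up names; they differ only in the static keyword.

```c
#include <stdint.h>

/* Automatic local: created on the stack at every call, so it starts
   again from its initializer each time. */
uint8_t count_with_local(void)
{
    uint8_t counter = 0;
    counter++;
    return counter;
}

/* Static local: one fixed RAM location for the whole program run,
   initialized once, but visible only inside this function. */
uint8_t count_with_static(void)
{
    static uint8_t counter = 0;
    counter++;
    return counter;
}
```

Calling count_with_local repeatedly always returns 1, while count_with_static returns 1, 2, 3 and so on. This is why the linker has to reserve an address for test_tmp3 but not for test_tmp1 and test_tmp2.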

At this point, we can explain the following concepts:

Code refers to code and functions.

RO Data refers to read-only variables, such as arrays modified with const.

RW Data refers to read-write variables that have an initial value, such as initialized global variables and static local variables.

ZI Data refers to read-write variables that are automatically initialized to 0 by the system, most of which are arrays and are placed in the bss segment.

RO Size equals the sum of Code and RO Data.

RW Size equals the sum of RW Data and ZI Data; this is the amount of RAM the program occupies.

ROM Size refers to the size of the target file after compilation, which is the amount of FLASH the program occupies: Code plus RO Data plus RW Data. But why does it include RW Data? Because the initial values of initialized variables have to be stored somewhere while the power is off. For example, if we define a global variable u8 i = 8; the value 8 needs to be stored in the FLASH area and copied to RAM at startup. ZI Data is not included because it is always 0; the startup code only needs to know where the area is and how large it is, then clear it.
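These three totals can be checked against the Program Size line from the build output. The sketch below is only the arithmetic; the ProgramSize structure and the function names are hypothetical.

```c
#include <stdint.h>

/* The four numbers printed on the "Program Size" line, in bytes. */
typedef struct {
    uint32_t code;
    uint32_t ro_data;
    uint32_t rw_data;
    uint32_t zi_data;
} ProgramSize;

/* RO Size: everything that is only ever read. */
uint32_t ro_size(ProgramSize p)
{
    return p.code + p.ro_data;
}

/* RW Size: RAM occupied by variables. */
uint32_t rw_size(ProgramSize p)
{
    return p.rw_data + p.zi_data;
}

/* ROM Size: FLASH occupied, including the initial values of RW data. */
uint32_t rom_size(ProgramSize p)
{
    return p.code + p.ro_data + p.rw_data;
}
```

With Code=9038, RO-data=990, RW-data=40 and ZI-data=6000, this gives an RO Size of 10028 bytes, an RW Size of 6040 bytes and a ROM Size of 10068 bytes.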

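Let’s examine the functions. Earlier, we had a question: Is __main the same as main? Upon searching for main, we find that main is main, located at 0x08000579.

__main is __main, located at 0x08000189. Both addresses are odd because bit 0 of a function address marks Thumb code; the instructions themselves start at 0x08000578 and 0x08000188.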

What happened between __main and main? Do you remember this line in the scatter-loading file?

*(InRoot$$Sections)

__main is located within this segment. The following figure shows the address of __main, at 0x08000189. __Vectors is the interrupt vector, placed at the very beginning.

In the scatter-loading file, right after RESET is *(InRoot$$Sections).

Moreover, the size of the RESET segment is exactly 0x00000188, so the first byte after it is at 0x08000188, which is where the code of __main begins.

Coincidence? You can refer to the PPT document “ARM Embedded Software Development.ppt”.
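The match between the two numbers can be verified with a little arithmetic. This sketch assumes, as is usual for Cortex-M symbol tables, that the lowest bit of a function symbol only marks Thumb code; the function names are invented for illustration.

```c
#include <stdint.h>

/* Address of the first instruction behind a Thumb function symbol:
   bit 0 is a marker and not part of the location. */
uint32_t thumb_code_address(uint32_t symbol_value)
{
    return symbol_value & ~(uint32_t)1;
}

/* First address after a section that starts at `base` and is `size` bytes long. */
uint32_t end_of_section(uint32_t base, uint32_t size)
{
    return base + size;
}
```

The RESET section starts at 0x08000000 and is 0x188 bytes long, so it ends at 0x08000188. The symbol value 0x08000189 of __main, with the marker bit removed, is the same address.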

What functionality does this segment of code accomplish? It mainly prepares RAM before main runs: it copies the initial values of RW data from FLASH to RAM and initializes the ZI data, meaning it clears that part of RAM to 0. Other environment initializations…
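The idea behind this preparation step can be sketched in C. This is not the library's actual code, only a simplified model of the result it produces; the function name and parameters are hypothetical, and two plain arrays stand in for FLASH and RAM.

```c
#include <stdint.h>
#include <string.h>

/* Simulated startup work: copy the initial values of RW data from the
   "flash" image to the start of RAM, then clear the ZI area behind it. */
void sim_scatter_load(uint8_t *ram, const uint8_t *rw_image,
                      size_t rw_size, size_t zi_size)
{
    memcpy(ram, rw_image, rw_size);     /* .data gets its initial values */
    memset(ram + rw_size, 0, zi_size);  /* .bss is cleared to 0 */
}
```

If RAM is first filled with a junk pattern such as 0xAA, then after the call the first rw_size bytes hold the values from the image and the following zi_size bytes are all 0. This is the state a C program expects to find when main starts.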

Conclusion

At this point, we have a basic impression of how a program is composed and how it runs.