Running Embedded System Programs in FLASH or RAM

Question 1: How is the code in FLASH executed? For example, where is the PC pointer set and by whom?

Taking ARM as an example:

For ARM Cortex-M3/M4 microcontrollers (such as the STM32 family): code on these parts typically executes in place from on-chip NOR flash. The Cortex-M core fetches instructions from flash directly, so the code does not have to be copied into RAM first.

For ARM Cortex-A series SoCs (such as the Exynos 4412): these SoCs are more complex and usually include a Memory Management Unit (MMU). The code is typically stored in NAND flash and must be loaded into RAM before it can run, so loading the program is part of the boot sequence. This is similar to a PC: Windows is stored on the hard drive, and at boot the operating system code is loaded into memory (RAM).

PC pointer: Every microcontroller or SoC has a PC (program counter) register that holds the address of the next instruction to be fetched. Normally it advances automatically by the size of the instruction just fetched: 4 bytes in the ARM instruction set, 2 or 4 bytes in Thumb. When a branch occurs, the branch instruction sets it instead. So, what is a pointer? A pointer is a value that holds the address of a variable (or of code). On a system with a hardware Memory Management Unit (MMU) enabled, which is the usual case under an operating system such as Linux or Windows, that address is a virtual address, and the MMU translates it to a physical address. Without an MMU, as on a bare-metal microcontroller, it is a physical address.
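
How far the PC advances depends on the instruction set. As a minimal sketch, assuming a core that runs the Thumb-2 instruction set (as Cortex-M parts do), the size of an instruction can be read off the top five bits of its first halfword. The function names here are hypothetical helpers for illustration, not part of any vendor library.

```c
#include <stdint.h>

/* Size in bytes of the Thumb instruction whose first halfword is hw.
 * In Thumb-2, a first halfword whose top five bits are 0b11101, 0b11110
 * or 0b11111 starts a 32-bit instruction; anything else is 16-bit. */
unsigned thumb_insn_size(uint16_t hw)
{
    unsigned top5 = (unsigned)hw >> 11;
    return (top5 == 0x1D || top5 == 0x1E || top5 == 0x1F) ? 4u : 2u;
}

/* Hypothetical helper: address of the next sequential instruction,
 * i.e. where the PC points when no branch is taken. */
uint32_t next_pc(uint32_t pc, uint16_t first_halfword)
{
    return pc + thumb_insn_size(first_halfword);
}
```

For example, a 16-bit NOP (0xBF00) advances the PC by 2, while the first halfword of a BL (0xF000) marks a 4-byte instruction.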

Question 2: Does this code need to be moved to RAM to run? What goes wrong if it is not?

As mentioned above, most microcontroller code runs directly from NOR flash, and only a small number of parts need it loaded into RAM. NOR flash is byte-addressable for reads, so the CPU can fetch the instruction at any given address and run the code in place. NAND flash is read a page at a time through a command interface (and erased a block at a time), so the CPU cannot fetch individual instructions from it; code stored in NAND flash must be loaded into RAM before it can run. It is just like Windows on your hard drive, which cannot run without being loaded into memory.
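
The difference in access granularity can be modelled on a host machine. This is only a sketch: both devices are plain byte arrays, the page size of 2048 bytes is an assumption (real parts vary), and all function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define NAND_PAGE_SIZE 2048u   /* assumed page size; real parts vary */

/* NOR-style access: the flash is memory-mapped, so any byte address can
 * be read directly, which is what instruction fetch needs. */
uint8_t nor_read_byte(const uint8_t *nor_base, uint32_t addr)
{
    return nor_base[addr];
}

/* NAND-style access: the device only hands over whole pages, modelled
 * here as a copy of one page into a RAM buffer. */
void nand_read_page(const uint8_t *nand, uint32_t page, uint8_t *ram_buf)
{
    memcpy(ram_buf, nand + (size_t)page * NAND_PAGE_SIZE, NAND_PAGE_SIZE);
}

/* To get a single byte out of NAND, the page containing it has to be
 * staged in RAM first; the byte is then read from the RAM copy. */
uint8_t nand_read_byte(const uint8_t *nand, uint32_t addr, uint8_t *ram_buf)
{
    nand_read_page(nand, addr / NAND_PAGE_SIZE, ram_buf);
    return ram_buf[addr % NAND_PAGE_SIZE];
}
```

In the NAND case the byte always comes out of the RAM buffer, never out of the device itself, which is why code kept in NAND ends up being copied to RAM before it runs.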

Question 3: If it needs to be moved to RAM, is there a difference between on-chip and off-chip?

Both on-chip and off-chip can be used; it specifically depends on the SoC or CPU.

Question 4: If a user has a FLASH code size (like 1MB) that exceeds the available RAM space (like 512KB), what is the transfer process like?

Such situations are rare today. A system with very little RAM can use time-sharing with segmented loading (overlays): load one segment, run it, then load and run the next. This is strongly discouraged. Modern RAM is large: by the time your code reaches 1 MB, the system may well have 1 GB or 2 GB of memory. For example, a compiled Linux kernel image is only a few MB, while a system running Linux may have several GB of memory available.

Question 5: Compared to on-chip FLASH and SRAM, what performance speed differences are there with off-chip extensions?

It depends on the bus design of the SoC. Generally speaking, off-chip memory is somewhat slower.

Whether the program code can be executed directly in Flash depends on the access characteristics of Flash.

Flash memory is organized in blocks, and accessing it block by block is most efficient. Like ROM, Flash is non-volatile, but unlike ROM it can be rewritten. Unlike RAM, however, it cannot be written freely: the block containing the write location must be erased before new data is written. To modify data in Flash, you therefore cache the block containing the data in memory, modify the data there, erase the block, and write it back. This avoids data loss but is expensive. Reading often means locating the block first and then reading sequentially within it, so random reads across different blocks are very inefficient. Block-oriented reading and writing is thus a major feature of Flash: it cannot address storage arbitrarily. NAND Flash is the typical example.

However, there is a type of Flash memory that allows arbitrary addressing when reading, without significant overhead. Its reads behave much like RAM reads, while its writes still follow the erase-then-write pattern described above. NOR Flash is the typical example.

Because of these characteristics, Flash is usually used to store data that rarely changes and must survive power loss.
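
The cost of changing data in Flash can be illustrated with a small host-side model. This is a sketch under stated assumptions: the flash is a plain byte array, the erase-block size of 4096 bytes is arbitrary, and the function names are hypothetical. It relies on two general properties of flash: erasing sets every bit of a block to 1, and programming can only change bits from 1 to 0.

```c
#include <stdint.h>
#include <string.h>

#define FLASH_BLOCK_SIZE 4096u  /* assumed erase-block size */

/* Erase sets every bit in the block to 1 (all bytes 0xFF). */
void flash_erase_block(uint8_t *flash, uint32_t block)
{
    memset(flash + (size_t)block * FLASH_BLOCK_SIZE, 0xFF, FLASH_BLOCK_SIZE);
}

/* Programming can only clear bits (1 -> 0), modelled with a bitwise AND. */
void flash_program(uint8_t *flash, uint32_t addr,
                   const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        flash[addr + i] &= src[i];
}

/* Change one byte: cache the block in RAM, patch it, erase, write back. */
void flash_update_byte(uint8_t *flash, uint32_t addr, uint8_t value)
{
    static uint8_t cache[FLASH_BLOCK_SIZE];        /* RAM copy of the block */
    uint32_t block = addr / FLASH_BLOCK_SIZE;
    uint32_t base  = block * FLASH_BLOCK_SIZE;

    memcpy(cache, flash + base, FLASH_BLOCK_SIZE); /* 1. cache the block  */
    cache[addr - base] = value;                    /* 2. modify in RAM    */
    flash_erase_block(flash, block);               /* 3. erase the block  */
    flash_program(flash, base, cache, FLASH_BLOCK_SIZE); /* 4. write back */
}
```

Updating a single byte therefore touches a whole block, which is why frequently changing data is a poor fit for Flash.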

After introducing the background knowledge, let’s return to the question:

First, the CPU reads instructions from memory at the address given by the PC register. After each instruction executes, the PC automatically points to the next one. If instruction lengths vary, instruction addresses do not all share the same alignment. Program execution also involves jumps, which makes instruction addresses even less predictable. To execute a program directly from a given type of memory, that memory must therefore at least support reads at arbitrary addresses, and NOR Flash meets this requirement. The Flash commonly built into MCUs on the market is of this type, so the stored program can run in place without being loaded into RAM. Storage without this access characteristic cannot execute programs directly; the program must first be transferred to memory that has it, such as RAM.

1. How is the code in FLASH executed? For example, where is the PC pointer set and by whom?

Microcontrollers using the Cortex-M core fetch their startup information from address 0x00000000. On many parts (STM32, for example), the levels on external boot-configuration pins select which memory is mapped there. When booting from Flash, the exception (interrupt) vector table sits at the start of the internal Flash. The first two entries in this table are the initial stack pointer value and the reset vector. The table's location is configurable, but after reset it is at address 0x00000000. On a power-on reset, the hardware automatically loads the SP and PC registers from these two entries, and execution begins at the address loaded into the PC.
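
The reset sequence described above can be sketched as a small host-side model. The struct and function names are hypothetical, and the vector table is passed in as an ordinary array rather than read from address 0x00000000. The sketch also assumes the usual Cortex-M convention that bit 0 of the reset vector marks Thumb state and is not part of the instruction address.

```c
#include <stdint.h>

/* Register state that the core sets up by itself at reset. */
struct reset_state {
    uint32_t sp;  /* initial main stack pointer */
    uint32_t pc;  /* address of the first instruction to run */
};

/* Model of a Cortex-M reset: word 0 of the vector table is the initial
 * stack pointer, word 1 is the reset vector. Bit 0 of the vector is the
 * Thumb bit, so it is masked off to get the instruction address. */
struct reset_state cortex_m_reset(const uint32_t *vector_table)
{
    struct reset_state s;
    s.sp = vector_table[0];
    s.pc = vector_table[1] & ~1u;
    return s;
}
```

With a table that starts with the words 0x20005000 and 0x08000145 (illustrative values only), the model yields SP = 0x20005000 and a first instruction at 0x08000144.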

2. Does this code need to be moved to RAM to run? What goes wrong if it is not?

As mentioned earlier, it is not necessary. Executing from RAM may perform better, but for the MCU's internal NOR Flash it is not required. Note that a program typically consists of a code segment (text), a read-only data segment (rodata), an initialized data segment (data), and an uninitialized data segment (bss, which occupies no space in the stored image). Like the code segment, the read-only data segment never changes, so it can stay in Flash. The initial values of the data segment are stored in Flash and must be copied into RAM, and RAM must be reserved (and zeroed) for bss. This is the initialization of the runtime environment. It involves a transfer, but the transfer does not include code, and it happens before the main function is entered.
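
The runtime initialization described here can be sketched as follows. On a real target, the addresses and sizes come from symbols defined in the linker script, whose names vary by toolchain; in this sketch they are ordinary parameters so that the logic can be tried on a host machine. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Runtime initialisation performed before main():
 *  - copy the initial values of .data from their load address in Flash
 *    to their run address in RAM
 *  - zero-fill .bss in RAM
 * .text and .rodata are not touched; they stay in Flash. */
void init_runtime(const uint8_t *data_load,  /* .data image in Flash   */
                  uint8_t *data_start,       /* .data run address, RAM */
                  size_t data_size,
                  uint8_t *bss_start,        /* .bss region in RAM     */
                  size_t bss_size)
{
    memcpy(data_start, data_load, data_size);
    memset(bss_start, 0, bss_size);
}
```

Only data moves in this step. The amount of RAM needed is the size of data plus bss, regardless of how large the code in Flash is.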

3. If it needs to be moved to RAM, is there a difference between on-chip and off-chip?

On-chip RAM generally has better performance, but the capacity typically cannot be too large.

4. If a user has a FLASH code size (like 1MB) that exceeds the available RAM space (like 512KB), what is the transfer process like?

It can be executed in stages, but the program organization will become complex, and execution will be inefficient. If such a situation arises, consider changing the hardware configuration or optimizing the program.
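
Staged execution can be modelled roughly as below. This is a deliberately simplified sketch: "running" a segment is represented by a callback, the segments are processed strictly in order, and calls between segments (the part that makes real overlay schemes complex) are ignored. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Callback standing in for "execute the segment now resident in RAM". */
typedef void (*run_segment_fn)(const uint8_t *ram, size_t len, void *ctx);

/* Walk an image that is larger than the RAM window: load one segment,
 * run it, then overwrite it with the next. Returns the number of loads. */
size_t run_in_stages(const uint8_t *image, size_t image_size,
                     uint8_t *ram, size_t ram_size,
                     run_segment_fn run, void *ctx)
{
    size_t loads = 0;
    if (ram_size == 0)
        return 0;                        /* nothing can be loaded */
    for (size_t off = 0; off < image_size; off += ram_size) {
        size_t len = image_size - off;
        if (len > ram_size)
            len = ram_size;
        memcpy(ram, image + off, len);   /* transfer from storage to RAM */
        run(ram, len, ctx);              /* execute the resident segment */
        loads++;
    }
    return loads;
}

/* Example callback: just tallies how many bytes were "executed". */
void count_bytes(const uint8_t *ram, size_t len, void *ctx)
{
    (void)ram;
    *(size_t *)ctx += len;
}
```

Every pass through the loop pays for a full copy before any work is done, which is the inefficiency mentioned above.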

5. Compared to on-chip FLASH and SRAM, what performance speed differences are there with off-chip extensions?

This depends on the clock rate and access delay of the memory. Integrated internal memory generally performs better than off-chip memory. Low-end MCUs, due to low operating speeds, may not have significant differences between internal and external memory.

This can be answered from the following three aspects:

1. Computer Architecture Principles

The Von Neumann Model

Students majoring in computer science will be familiar with the diagram of this classic computer model. All computing devices (including embedded systems) fall under it. Its five components can be grouped into three parts: (1) the control unit (CU) and arithmetic logic unit (ALU), which together form the CPU; (2) Memory, the storage device (an idealized one); (3) Input and Output, the various peripheral devices (keyboard, mouse, display, etc.).

Here, we focus on memory devices. In the Von Neumann model, Memory is imagined as an ideal memory device. An ideal memory device is one that is readable and writable, non-volatile, and allows random read and write. For theoretical models, simplicity and understandability are key. However, in reality, no memory is so ideal due to cost constraints, and different memories can only meet part of the criteria. This leads to the discussion of mainstream memories.

2. Characteristics of Mainstream Memories

Current memories can be broadly divided into two categories: RAM and ROM. Many summaries have already been made regarding the specific definitions and development history of these two types of memory, so I will not elaborate further. From my personal perspective, I will discuss my understanding of these two types of memory.

(1) RAM can be specifically divided into SRAM and DRAM.

Common characteristics (the fundamental characteristics of RAM): Readable and writable, random read and write.

Differences: SRAM can be used immediately on power-up, while DRAM needs to be initialized before use, and SRAM has a higher unit cost than DRAM.

(2) ROM, used loosely here to mean non-volatile storage, can be divided into hard drives and Flash (NOR Flash and NAND Flash).

Common characteristics: Non-volatile.

Differences: Hard drives and NAND Flash are read and written in whole units (sectors or pages), while NOR Flash can be read at arbitrary addresses but must be erased in whole blocks before it is written.

The non-volatile and random read characteristics of NOR Flash allow it to serve as the system's boot medium.

3. The speed difference between the CPU and memory is the main factor limiting computer performance today.

4. Specific to the above five questions:

(1) It depends on the type of Flash. If it is NOR Flash, the system can access and execute it directly. If it is NAND Flash, the code must be loaded into RAM to run. When the CPU powers on, hardware sets the PC register to a defined value (for example, an ARM Cortex-M3 loads the PC with the reset vector stored at address 0x4).

(2) As with the first question, whether the code needs to be moved depends on the type of Flash.

(3) If the RAM types are the same, there is no difference between on-chip and off-chip. If the RAM types are different, specific situations must be analyzed.

(4) Here, we assume Flash is 1MB and RAM is 512KB. It is presumed to be NOR Flash and SRAM (like STM32), so the code does not need to be moved. If it is NAND Flash, it is generally paired with DRAM, which has a much larger capacity, so this assumption may not hold.

Copyright Statement:This article is sourced from the internet, freely conveying knowledge, and the copyright belongs to the original author. If there are copyright issues, please contact me for deletion.
