Porting U-Boot from Scratch: A Step-by-Step Dissection of the SoC Boot Process

System on Chip (SoC) technology now sits at the heart of smartphones, smart home devices, and automotive electronics, and has become a core force driving continuous innovation in modern electronic devices. It integrates the CPU, GPU, memory, and various communication modules into a single compact chip, condensing the “smart brain” of a device into one package and bringing convenience and intelligent experiences to everyday life.

However, an SoC does not operate efficiently immediately upon power-up. The process from powering the chip to the system running smoothly hides a complex and precise sequence. This process is akin to a carefully orchestrated symphony, where each link is closely connected and indispensable; any error in a single note could lead to the entire “movement” being out of tune. Next, let us delve into the mysterious journey of SoC booting, unveiling the secrets behind the moment the chip powers on to the system’s eventual stable operation.

Part 1 What is SoC?

SoC, or System on Chip, is a technology that integrates most or all functions of a computer or other electronic systems onto a single chip. This integrated design reduces the number of external components, lowers power consumption and costs, while improving system reliability and performance. The core components of an SoC include the CPU, GPU, memory, input/output interfaces, and communication modules, with different components responsible for different functions, working together to achieve system operation.

SoCs are widely used in modern electronic devices. For example, in smartphones, the SoC typically integrates high-performance CPUs and GPUs, providing a smooth user experience, easily handling both everyday applications and high-performance games; the integrated 4G/5G communication module allows smartphones to quickly connect to the internet for high-speed data transmission; the power management unit optimizes battery usage, extending the smartphone’s battery life. In addition to smartphones, SoCs are also used in smart home devices, wearable devices, autonomous vehicles, and many other fields, driving the intelligent development of these devices.

1.1 Preparation Before U-Boot Booting

⑴ Linker Script and Program Entry

In the SoC boot process, the linker script plays a crucial role: it defines the memory layout of the program, that is, where each section of code and data is placed, and it determines the program’s entry address. For example, in U-Boot’s linker script, ENTRY(_start) specifies that the program entry point is the symbol _start, which for 32-bit ARM is defined in the arch/arm/lib/vectors.S file. Reading the linker script gives a clear picture of how code and data are organized, laying the foundation for subsequent analysis of the program flow.
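The entry address chosen by the linker ends up in the header of the ELF file that the build produces. The following is a minimal sketch that reads that field (e_entry) from the first bytes of an ELF file. The helper make_demo_header and the address 0x87800000 are illustrative assumptions, not values taken from any particular board.

```python
import struct

ELF_MAGIC = b"\x7fELF"

def elf_entry_point(header: bytes) -> int:
    """Return the entry address (e_entry) stored in an ELF header."""
    if len(header) < 0x20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF header")
    is_64bit = header[4] == 2                 # EI_CLASS: 1 = ELF32, 2 = ELF64
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big
    fmt = endian + ("Q" if is_64bit else "I")
    # e_entry sits at offset 0x18 in both ELF32 and ELF64 headers.
    (entry,) = struct.unpack_from(fmt, header, 0x18)
    return entry

def make_demo_header(entry: int) -> bytes:
    """Build a minimal little-endian ELF32 header (illustration only)."""
    ident = ELF_MAGIC + bytes([1, 1, 1]) + bytes(9)   # class, data, version, padding
    # e_type = 2 (executable), e_machine = 40 (ARM), e_version = 1, e_entry
    body = struct.pack("<HHII", 2, 40, 1, entry)
    return (ident + body).ljust(52, b"\x00")

if __name__ == "__main__":
    demo = make_demo_header(0x87800000)       # hypothetical link address
    print(hex(elf_entry_point(demo)))
```

On a real build, the same function could be applied to the first 64 bytes of the ELF output of the U-Boot build to confirm that the entry address matches the address of _start.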

⑵ Image Container

The image container is used in SoC booting to store and manage the boot image. Common image containers in the U-Boot ecosystem include the legacy U-Boot image format (uImage) and FIT (Flattened Image Tree). These containers wrap the files required for booting, such as the kernel and device tree, in a defined format with metadata (for example load address, entry point, and checksums) so that the system can load and start them correctly. Different image containers have different characteristics and applicable scenarios, and developers need to choose the appropriate image container based on specific requirements.
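To make the idea of a container concrete, here is a small sketch of the 64-byte header used by the legacy U-Boot image format: a big-endian structure holding a magic number, CRCs, size, load address, entry point, and a name. The function names build_image and parse_image, and the addresses in the demo, are assumptions made for illustration; this is not U-Boot's own code.

```python
import struct
import zlib

IH_MAGIC = 0x27051956
HEADER_FMT = ">7I4B32s"                 # big-endian fields, 64 bytes in total
HEADER_SIZE = struct.calcsize(HEADER_FMT)

def build_image(payload: bytes, load: int, entry: int, name: str) -> bytes:
    """Prefix payload with a legacy-style header (OS Linux, arch ARM, type kernel)."""
    dcrc = zlib.crc32(payload) & 0xFFFFFFFF
    # magic, hcrc (0 for now), time, size, load, entry, dcrc, os, arch, type, comp, name
    fields = [IH_MAGIC, 0, 0, len(payload), load, entry, dcrc,
              5, 2, 2, 0, name.encode()[:32]]
    raw = struct.pack(HEADER_FMT, *fields)
    fields[1] = zlib.crc32(raw) & 0xFFFFFFFF   # header CRC is computed with hcrc = 0
    return struct.pack(HEADER_FMT, *fields) + payload

def parse_image(image: bytes) -> dict:
    """Validate magic and both CRCs, then return the interesting fields."""
    (magic, hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, _comp, name) = struct.unpack_from(HEADER_FMT, image)
    if magic != IH_MAGIC:
        raise ValueError("bad magic")
    zeroed = image[:4] + bytes(4) + image[8:HEADER_SIZE]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != hcrc:
        raise ValueError("header CRC mismatch")
    payload = image[HEADER_SIZE:HEADER_SIZE + size]
    if zlib.crc32(payload) & 0xFFFFFFFF != dcrc:
        raise ValueError("data CRC mismatch")
    return {"load": load, "entry": entry, "size": size,
            "name": name.rstrip(b"\x00").decode()}

if __name__ == "__main__":
    img = build_image(b"\x00" * 128, 0x80008000, 0x80008000, "demo-kernel")
    print(parse_image(img))
```

The two CRCs are what let the loader reject a truncated or corrupted image before jumping to it. FIT images serve the same purpose with a device-tree-based layout and are not modeled here.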

⑶ SPL Booting

SPL, or Secondary Program Loader, is generally loaded by the boot ROM in the boot chain, serving as the second-level boot image (BL2). It is mainly used to complete the initialization of some basic modules and DDR, as well as to load the next-level image, U-Boot. Since SPL needs to be loaded into SRAM for execution, on systems with a relatively small SRAM it may not be possible to fit the entire SPL image; in this case, TPL (Tertiary Program Loader) is introduced to solve the problem. On platforms that use this arrangement, the smaller TPL runs first from SRAM and completes DDR initialization, then returns to the boot ROM; the boot ROM then loads SPL, and SPL completes the final loading of U-Boot.

SPL (Secondary Program Loader) program flow is as follows:

  1. Initialize ARM processor
  2. Initialize serial console
  3. Configure clocks and basic dividers
  4. Initialize SDRAM
  5. Configure pin multiplexing functions
  6. Initialize the boot device (i.e., the boot device selected at power-on)
  7. Load the complete U-Boot/kernel program and transfer control
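The ordering of these steps matters because later steps rely on earlier ones. As a rough sketch, the flow can be modeled as a list of steps with prerequisites and a runner that refuses to execute a step too early. The step names and the dependency choices below are assumptions for illustration, not U-Boot function names.

```python
# Each entry is (step name, steps that must already have completed).
SPL_STEPS = [
    ("init_cpu", ()),
    ("init_serial_console", ("init_cpu",)),
    ("configure_clocks", ("init_cpu",)),
    ("init_sdram", ("configure_clocks",)),
    ("configure_pinmux", ("init_cpu",)),
    ("init_boot_device", ("configure_clocks", "configure_pinmux")),
    ("load_next_stage", ("init_sdram", "init_boot_device")),
]

def run_steps(steps):
    """Run steps in order, raising if a prerequisite has not completed yet."""
    done = []
    for name, needs in steps:
        missing = [n for n in needs if n not in done]
        if missing:
            raise RuntimeError(f"{name} ran before {missing}")
        done.append(name)
    return done

if __name__ == "__main__":
    for step in run_steps(SPL_STEPS):
        print("completed:", step)
```

Moving load_next_stage ahead of init_sdram in the list makes the runner raise, which mirrors the real constraint: U-Boot cannot be copied into DRAM before the DRAM controller is working.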

⑷ ATF Booting

ATF, or Arm Trusted Firmware (now maintained as Trusted Firmware-A, TF-A), is a trusted firmware introduced by Arm to enhance system security, targeting Armv7-A and Armv8-A application processors. Its basic boot process is BL1 – BL2 – BL31 – BL32 – BL33 (U-Boot): BL31 is the resident runtime firmware, BL32 is an optional secure OS, and U-Boot runs as BL33 once they are set up, serving as the last-level image in the boot chain, used to start the final OS.

During the boot process, ATF is responsible for setting up runtime services for the main CPU before passing control to the bootloader or operating system, completing the initialization and configuration of the CPU, memory, interrupts, etc., through a series of function calls and operations, ensuring the secure boot of the system.

1.2 U-Boot Initialization Process

⑴ U-Boot Booting

U-Boot, short for Universal Boot Loader, is an open-source project licensed under the GPL that boots the system and supports many operating systems and processor architectures. U-Boot begins executing from the entry address specified in the linker script and first performs a series of initialization operations, such as saving the parameters passed from the previous image, checking whether the code segment is 4K aligned, resetting the SCTLR register, and setting the exception vector table. These initialization operations ensure that U-Boot can run normally on the target platform, preparing for the subsequent boot process.
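The 4K alignment check mentioned above comes down to testing the low 12 bits of an address. A minimal sketch, with example addresses chosen arbitrarily:

```python
PAGE_SIZE = 4096  # 4K

def is_4k_aligned(addr: int) -> bool:
    """An address is 4K aligned when its low 12 bits are all zero."""
    return addr & (PAGE_SIZE - 1) == 0

def align_up(addr: int, alignment: int = PAGE_SIZE) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)

if __name__ == "__main__":
    for addr in (0x80000000, 0x80000800):
        print(hex(addr), is_4k_aligned(addr), hex(align_up(addr)))
```

The same mask trick is what the startup assembly does with a single AND or TST instruction; the Python version only makes the arithmetic visible.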

⑵ Driver Initialization

During the U-Boot boot process, various drivers need to be initialized to ensure that hardware devices can function properly. These drivers include serial drivers, network card drivers, storage device drivers, etc. For example, initializing the serial driver typically includes setting the baud rate, data bits, and stop bits and enabling the transmitter and receiver (U-Boot serial drivers generally poll the UART rather than rely on interrupts), ensuring that U-Boot can communicate with external devices via the serial port, output debugging information, and receive user input.
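Setting the baud rate usually means programming a clock divisor. The sketch below assumes a 16550-style UART, where the baud rate equals the input clock divided by 16 times the divisor; other UART blocks use different formulas, so the SoC reference manual is the authority. The clock values are examples only.

```python
def uart_divisor(clock_hz: int, baud: int) -> int:
    """Nearest integer divisor for a 16550-style UART (baud = clock / (16 * divisor))."""
    return max(1, (clock_hz + 8 * baud) // (16 * baud))

def actual_baud(clock_hz: int, divisor: int) -> float:
    """Baud rate actually produced by a given divisor."""
    return clock_hz / (16 * divisor)

def baud_error_percent(clock_hz: int, baud: int) -> float:
    """Relative error between the requested and the achievable baud rate."""
    achieved = actual_baud(clock_hz, uart_divisor(clock_hz, baud))
    return (achieved - baud) / baud * 100.0

if __name__ == "__main__":
    clock = 24_000_000  # example UART input clock
    div = uart_divisor(clock, 115200)
    print(div, round(actual_baud(clock, div)), round(baud_error_percent(clock, 115200), 2))
```

A large error here is a common cause of garbled console output: the divisor is computed from the clock the driver believes it has, so a wrong clock configuration shows up first on the serial port.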

⑶ Interaction Principles

U-Boot provides a powerful command-line interface, allowing users to interact with U-Boot through commands to perform various operations. For example, using the help command displays all commands and their descriptions; using the printenv command (often abbreviated to pri) shows all environment variables; using the setenv command modifies the value of a specific environment variable; using the saveenv command saves the current environment variables to persistent storage, preventing loss at power-off. Through these commands, users can conveniently configure and debug the system to meet different application needs.
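What saveenv writes is essentially a checksummed block of key=value strings. The sketch below models the simple, non-redundant layout: a 32-bit CRC followed by NUL-terminated entries, an extra NUL to end the list, and padding. The little-endian CRC, the 0xFF padding byte, and the function names are assumptions for illustration; the real layout depends on the board configuration.

```python
import struct
import zlib

def pack_env(variables: dict, size: int, pad: int = 0xFF) -> bytes:
    """Serialize variables into a CRC-prefixed environment block of the given size."""
    data = b"".join(f"{k}={v}".encode() + b"\x00" for k, v in variables.items())
    data += b"\x00"                       # an empty string terminates the list
    room = size - 4                       # 4 bytes are taken by the CRC
    if len(data) > room:
        raise ValueError("environment too large")
    data = data.ljust(room, bytes([pad]))
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return struct.pack("<I", crc) + data

def unpack_env(blob: bytes) -> dict:
    """Check the CRC and return the variables as a dictionary."""
    (crc,) = struct.unpack_from("<I", blob)
    data = blob[4:]
    if zlib.crc32(data) & 0xFFFFFFFF != crc:
        raise ValueError("bad CRC")
    variables = {}
    for entry in data.split(b"\x00"):
        if not entry:                     # reached the terminating empty string
            break
        key, _, value = entry.decode().partition("=")
        variables[key] = value
    return variables

if __name__ == "__main__":
    blob = pack_env({"bootdelay": "3", "baudrate": "115200"}, size=256)
    print(unpack_env(blob))
```

The CRC explains a message many users meet on first boot: when the stored block fails the check, U-Boot falls back to its built-in default environment.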

1.3 Kernel Initialization Process

⑴ The First Line of Code Running in the Kernel

On 32-bit ARM, the entry point of the Linux kernel is stext, defined in the file arch/arm/kernel/head.S. When U-Boot completes initialization and hands over control to the kernel, the kernel begins executing from stext. The code at this entry point consists of assembly instructions that perform low-level initialization, such as setting up the CPU mode and initializing registers, to create an environment for the subsequent kernel boot process.

⑵ Execution Process of head.S

The head.S file is mainly responsible for completing some architecture-related initialization tasks, such as initializing page tables, enabling the MMU (Memory Management Unit), setting up the kernel stack, etc. During execution, it performs corresponding initialization operations based on different processor architectures and platform configurations. For example, for ARM architecture processors, it sets the exception vector table to ensure correct jumps and handling during exceptions; it initializes system control registers to configure the processor’s operating mode and state.
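The initial page tables built at this stage are simple. As a sketch, assume the classic 32-bit ARM short-descriptor format without LPAE, where the first-level table has 4096 four-byte entries and each entry can map a 1 MB section. The virtual address used in the demo is only an example.

```python
SECTION_SHIFT = 20                         # each first-level entry covers 1 MB
ENTRY_SIZE = 4                             # bytes per first-level entry
NUM_ENTRIES = 1 << (32 - SECTION_SHIFT)    # 4096 entries, 16 KB table

def l1_index(va: int) -> int:
    """Index of the first-level entry that covers virtual address va."""
    return (va >> SECTION_SHIFT) & (NUM_ENTRIES - 1)

def l1_entry_offset(va: int) -> int:
    """Byte offset of that entry from the start of the table."""
    return l1_index(va) * ENTRY_SIZE

def section_base(addr: int) -> int:
    """Round an address down to the start of its 1 MB section."""
    return addr & ~((1 << SECTION_SHIFT) - 1)

if __name__ == "__main__":
    va = 0xC0008000                        # example kernel virtual address
    print(hex(l1_index(va)), hex(l1_entry_offset(va)), hex(section_base(va)))
```

Because a section mapping needs only one table write per megabyte, the early assembly code can map the kernel image with a short loop before turning the MMU on.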

⑶ The Entire Process of Kernel Subsystem Startup

After the kernel completes low-level initialization, it sequentially starts various kernel subsystems, such as the process management subsystem, memory management subsystem, device driver subsystem, and file system subsystem. Each subsystem has its specific initialization process and functions, and they work together to build a complete kernel runtime environment. For example, the process management subsystem creates the system’s first process, init, starts the kernel threads, and is responsible for managing and scheduling all processes in the system; the memory management subsystem initializes the memory allocator and manages the system’s physical and virtual memory, providing memory resources for processes and the kernel.

The SoC boot process is a complex and orderly procedure made up of several linked stages: hardware initialization, bootloader loading, and kernel startup. Understanding the principles and processes of SoC booting is crucial for embedded system developers, helping them better develop, debug, and optimize systems, improving system performance and stability.

Part 2 Analysis of the Entire SoC Boot Process

2.1 Power-On Initialization

When the SoC chip is powered on, the power management unit quickly starts, initializing and controlling power to each module, just like providing stable power supply to a factory about to operate. The processor core begins executing the preset boot program, initializing various modules and peripheral interfaces of the system, laying the basic framework for the normal operation of the chip. This process is akin to the staff debugging and preparing all equipment on stage before a large performance, ensuring that the show can proceed smoothly.

At this stage, the power management unit needs to precisely control the power supply sequence and voltage of each module to avoid damage to the device due to voltage fluctuations or improper power supply. The processor core needs to follow the preset program to gradually complete the initialization of key components such as the system clock, interrupt controller, memory controller, etc., ensuring they can function normally. For example, when initializing the memory controller, the processor core needs to set the memory’s operating frequency, timing parameters, etc., to ensure efficient data read and write operations.

2.2 Loading the Operating System or Firmware

After the power-on initialization is complete, the processor core loads the operating system or embedded firmware, akin to installing an operating system on a computer, providing the necessary software environment for the chip to execute various complex tasks and functions. The operating system or firmware acts like an experienced commander, managing the chip’s various resources and scheduling task execution, allowing the chip to operate in an orderly manner.

During the loading of the operating system or firmware, the processor core first needs to read the boot program from the storage device (such as flash memory, hard disk, etc.). The boot program is responsible for loading the operating system kernel and related drivers, transferring control to the operating system. This process requires ensuring the correctness and completeness of the boot program, as well as the reliability of the storage device. For example, in a smartphone, when the user presses the power button, the SoC chip first loads the customized boot program from the phone manufacturer, which then loads the Android operating system kernel and various drivers, ultimately starting the phone’s user interface, allowing the user to utilize the phone’s various functions.

2.3 Task Execution Phase

Once the software environment is established, the SoC chip can begin executing various tasks, which may include data processing, communication, control, etc., depending on the application field and design goals of the SoC chip. Due to the high integration of SoC chips, they can respond to and process various tasks more quickly, improving the overall performance of the system. This is like a well-trained special forces team, capable of swiftly and efficiently completing various complex tasks.

For example, in smart home devices, the SoC chip can process data collected from sensors, control the operating status of home appliances based on user settings, and communicate with the user’s smartphone or other smart devices via wireless networks, achieving remote control and intelligent management. In this process, the SoC chip needs to quickly process a large amount of data while ensuring task real-time performance and reliability to provide a good user experience.

Part 3 Characteristics of Different Types of SoC Booting

3.1 Simple Booting Example with Microcontrollers

Microcontrollers, as a simple type of SoC, have a relatively straightforward boot process. Typically, we can use tools like serial ports or JLINK to burn the compiled bin file or hex file into the microcontroller’s flash memory. During program execution, the stack is allocated in RAM to store the program’s variables and execution results.
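A hex file of the kind mentioned above is plain text in Intel HEX format: each line is a record with a byte count, a 16-bit address, a record type, the data bytes, and a checksum. The following sketch parses one record and verifies the checksum. The function names are chosen for illustration.

```python
def ihex_checksum(payload: bytes) -> int:
    """Two's complement of the low byte of the sum of all preceding record bytes."""
    return (-sum(payload)) & 0xFF

def parse_ihex_record(line: str):
    """Return (address, record_type, data) for one Intel HEX line."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    length, rtype = raw[0], raw[3]
    if len(raw) != length + 5:            # count, 2 address bytes, type, checksum
        raise ValueError("length field does not match record size")
    if ihex_checksum(raw[:-1]) != raw[-1]:
        raise ValueError("checksum mismatch")
    address = (raw[1] << 8) | raw[2]
    return address, rtype, raw[4:4 + length]

if __name__ == "__main__":
    addr, rtype, data = parse_ihex_record(":10010000214601360121470136007EFE09D2190140")
    print(hex(addr), rtype, data.hex())
    print(parse_ihex_record(":00000001FF"))   # end-of-file record
```

A flashing tool performs this check on every line before writing, which is why a hex file damaged in transfer is normally rejected rather than partially programmed.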

After the microcontroller powers on, the hardware automatically sets the SP (stack pointer) and PC (program counter) registers to their reset values. On an 8051-class device, the PC is a 16-bit program counter used to address program memory during CPU instruction fetching, always holding the 16-bit address of the next instruction to be executed. Each time an instruction byte is fetched, the PC automatically increments by 1, so the program executes sequentially. The SP is used for stack operations: on a subroutine call or an interrupt, the current value of the PC is automatically pushed onto the stack for protection and the address of the subroutine entry or interrupt vector is loaded into the PC, changing the program flow; on return, the saved address is popped back into the PC. Plain jump instructions change the PC without touching the stack.

Taking the common 51 microcontroller as an example, the startup file may not be easy to notice while writing code, because the development software supplies it for us. The main functions of a startup file include setting the initial value of the stack pointer SP and the reset address loaded into the program counter PC, setting the size of the heap and stack, setting the entry addresses of the exception vector table, configuring external SRAM (if any) as data storage, and branching to the C library entry __main (which ultimately calls the main function). Through these settings, the microcontroller can execute the program in an orderly manner: starting from the initialization assembly code, copying the initialized data segment to RAM, clearing the bss segment to zero, establishing the stack, and then calling the program’s main function to enter the execution phase of the C program.
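The memory preparation done by startup code before main can be modeled in a few lines. In this sketch, flash and RAM are byte arrays; the initial values of the data segment are copied from flash to RAM, and the bss segment is conventionally filled with zeros rather than copied. All offsets and sizes are made-up example values.

```python
def run_startup(flash: bytes, ram: bytearray,
                data_load: int, data_start: int, data_size: int,
                bss_start: int, bss_size: int) -> None:
    """Prepare RAM the way C startup code does before calling main()."""
    # Initialized variables: their initial values live in flash and are copied to RAM.
    ram[data_start:data_start + data_size] = flash[data_load:data_load + data_size]
    # Zero-initialized variables: nothing is stored in flash, RAM is simply cleared.
    ram[bss_start:bss_start + bss_size] = bytes(bss_size)

if __name__ == "__main__":
    flash = bytes(32) + bytes([1, 2, 3, 4])      # 32 bytes of "code", then .data values
    ram = bytearray([0xAA] * 16)                 # RAM holds garbage after power-on
    run_startup(flash, ram, data_load=32, data_start=0, data_size=4,
                bss_start=4, bss_size=8)
    print(ram.hex())
```

Only after this step do global variables hold the values the C source assigns to them, which is why code running before it (for example an early reset hook) must not rely on globals.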

3.2 Multi-Stage Booting of Complex SoCs

For complex SoCs, the boot process often requires multiple stages to complete, ensuring that the system can boot stably and efficiently. Taking a typical U-Boot boot process as an example, it usually includes three stages: boot ROM (or XIP), SPL, and U-Boot.

The boot ROM is the first-level boot image executed by the CPU during system initialization, generally hardcoded by the manufacturer in the internal memory of the SoC (usually ROM). This is because, during system initialization, the CPU can only access directly addressable memory, such as ROM, while external storage like SPI FLASH or NAND FLASH requires corresponding drivers to access, which the CPU cannot do in the initial boot phase. The main function of the boot ROM is to load and execute the BL2 image, which will be loaded to a fixed location for execution, typically in the chip’s internal SRAM.

SPL, or Secondary Program Loader, is the second stage in the U-Boot boot process. Its main functions include setting the CPU’s state, such as cache, MMU, and endianness settings; preparing the C language execution environment, including setting the stack pointer and clearing the contents of the BSS segment; allocating memory space for GD (the global data structure); initializing RAM; and copying the next-stage U-Boot image to RAM for execution. Because SRAM is small, SPL itself must stay small, which is one reason for dividing U-Boot into SPL and U-Boot stages.

U-Boot is the actual bootloader, running in DDR memory, providing complete functionality, such as initializing more peripherals (serial ports, network cards, USB, etc.), parsing the device tree (FDT), reading kernel images (zImage, Image) and DTB files, loading the Linux kernel, and jumping to execute it. At this stage, U-Boot completes comprehensive hardware initialization and loads the OS into memory, then runs the OS to achieve a complete system boot.
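Before jumping to a kernel, a loader can sanity-check what it has read. The sketch below recognizes the two image types named above by their magic values: a 32-bit ARM zImage carries 0x016F2818 at offset 0x24, and an arm64 Image carries the bytes "ARM\x64" at offset 56. The function name is hypothetical, and little-endian images are assumed.

```python
import struct

ZIMAGE_MAGIC = 0x016F2818      # 32-bit ARM zImage, stored at offset 0x24
ARM64_MAGIC = b"ARM\x64"       # arm64 Image, stored at offset 56

def identify_kernel_image(head: bytes) -> str:
    """Classify a kernel image from its first 64 bytes."""
    if len(head) >= 0x28:
        (magic,) = struct.unpack_from("<I", head, 0x24)
        if magic == ZIMAGE_MAGIC:
            return "zImage"
    if len(head) >= 60 and head[56:60] == ARM64_MAGIC:
        return "arm64 Image"
    return "unknown"

if __name__ == "__main__":
    fake_zimage = bytearray(64)
    struct.pack_into("<I", fake_zimage, 0x24, ZIMAGE_MAGIC)
    print(identify_kernel_image(bytes(fake_zimage)))
```

A check like this catches the common mistake of loading the wrong file, or loading it at the wrong offset, before control is handed over and the board simply hangs.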

Part 4 Differences Between SoC Booting and MCU Booting

In the world of chips, SoC booting and MCU booting are like two tracks running in different directions, each with its unique operating methods.

From a hardware architecture perspective, an SoC is like a fully functional large city, integrating processors (such as ARM Cortex-A series), memory, storage controllers, graphics processing units (GPUs), audio processing units, I/O interfaces, and many other modules. During booting, it not only needs to successfully start the “city center” CPU but also to wake up and initialize each functional area module one by one, ensuring the entire “city” operates normally. In contrast, an MCU is like a small village, typically integrating processors (such as ARM Cortex-M series or other architectures), less memory (RAM and Flash), basic I/O interfaces, and simple facilities. Its boot process is akin to waking up a few residents in the village, relatively simple, mainly initializing memory, configuring clocks, and I/O interfaces.

In terms of boot speed, SoCs, due to their numerous integrated modules, operate like a large machine that requires loading a lot of firmware and operating systems during booting, which may also involve complex steps like starting the operating system kernel and drivers. This is similar to starting a large luxury cruise ship, where many preparations are needed, resulting in a slower boot speed, especially when booting complex operating systems like Linux or Android. In contrast, the MCU’s boot process is like starting a small boat, usually without an operating system or running a lightweight real-time operating system (RTOS), with fewer steps and faster times, suitable for scenarios requiring high real-time performance, such as real-time monitoring and response in industrial control.

Regarding boot modes, SoCs are like a large building with multiple entrances, typically supporting booting from various external storage devices such as Flash, eMMC, SPI, SD cards, etc. Moreover, during booting, it may also require a bootloader as a “guide” to boot the operating system kernel, ensuring a smooth boot process. In contrast, MCUs are like a small house with only simple entrances, with relatively single boot modes, usually reading programs from on-chip Flash or external storage, directly running preset firmware or simple boot code without a complex guiding process.

In terms of power management, SoCs, due to the different power consumption and startup requirements of each module, need to gradually start each module to reduce power consumption or improve system stability. This is akin to managing the power supply of a large factory, requiring careful arrangement of the power supply sequence and energy distribution for each workshop, typically needing a power management unit (PMU) to control the power supply of different modules, with different power domain startup sequences involved during booting. In contrast, MCU power management is like managing a small household’s electricity, relatively simple, requiring only the activation of basic power and clocks during startup, with less power management, usually used in low-power embedded systems.

In terms of application scenarios, SoCs, with their powerful functions and complex boot processes, are typically used in devices with complex functionalities that require running complete operating systems and applications, such as smartphones, smart TVs, and intelligent driving assistance systems in cars, just like a large city can accommodate various complex social activities. In contrast, MCUs are more often used in control applications with lower resource requirements and higher real-time demands, such as appliance control, industrial control, and simple electronic control units (ECUs) in cars, akin to a small village meeting simple living and production needs.

Part 5 Common Issues and Solutions in SoC Booting

5.1 Hardware-Related Issues

During the SoC boot process, hardware issues are common sources of failure. Incorrect settings of the BOOT pin may prevent the chip from recognizing the correct boot mode, thus failing to load the boot file. This is because the boot ROM first reads the value in the boot mode register at startup, which reflects the BOOT pin levels, and selects the boot mode (such as NAND, NOR, eMMC, SD, etc.) based on that value. Therefore, if the BOOT pin is incorrectly connected or the level is set incorrectly, the chip will select the wrong boot mode.

If the NRST reset pin is not pulled high, the circuit will remain in a reset state and cannot proceed. This is akin to a machine’s reset button being held down continuously, preventing normal operation. It is also crucial to confirm that the VDD and VDDA voltages meet operational requirements, ensuring not only that the chip’s power supply voltage is normal but also that the working environment of the storage medium meets requirements, especially the frequency and voltage of the flash.

Mismatches between the crystal oscillator and the program configuration may also lead to issues. When the crystal oscillator frequency exceeds the program configuration, overclocking may occur, causing the board to malfunction and fail to start. Since the crystal oscillator provides the clock signal for the chip, if the clock signal does not match the program’s expectations, the chip cannot execute instructions according to the predetermined timing.

Firmware mismatches with the actual chip model or type can also lead to boot failures. For example, burning firmware suitable for one model of chip onto another model may result in failure due to differences in hardware architecture and instruction sets, preventing the firmware from correctly recognizing and controlling the chip’s hardware resources.

To troubleshoot these hardware issues, carefully check the hardware connections, ensuring that the BOOT pin is correctly connected and that the NRST reset pin is in its normal state; use a multimeter or similar tools to measure voltages, confirming that the VDD and VDDA voltages meet requirements; compare the actual frequency of the crystal oscillator with the program configuration to ensure consistency; and check that the firmware matches the chip model to avoid burning incorrect firmware.

5.2 Software and Other Issues

On the software side, the relocation that U-Boot performs on itself before loading the kernel is significant. The loading process of U-Boot is divided into SPL and U-Boot stages. In the SPL stage, the main task is to load the U-Boot code from Flash to a specified position in RAM; in the U-Boot stage, U-Boot moves itself from that load position to the top of RAM, occupying high address space, allowing the low address space to serve as a continuous, large memory space for the kernel and other applications. This is akin to rationally arranging items in a limited space to free up suitable positions for more important items.
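The arithmetic behind relocation can be sketched as follows. This is a simplification: it places the image directly below the top of RAM, aligned down to 4K, whereas real U-Boot also reserves space there for other things such as its heap and global data. The RAM base, RAM size, image size, and link address are example values.

```python
def relocation_target(ram_base: int, ram_size: int, image_size: int,
                      link_addr: int, align: int = 0x1000):
    """Return (new address, offset to add to every absolute address in the image)."""
    ram_top = ram_base + ram_size
    reloc_addr = (ram_top - image_size) & ~(align - 1)   # align down
    reloc_offset = reloc_addr - link_addr
    return reloc_addr, reloc_offset

if __name__ == "__main__":
    addr, offset = relocation_target(ram_base=0x80000000,
                                     ram_size=0x20000000,    # 512 MB
                                     image_size=0x80000,     # 512 KB
                                     link_addr=0x87800000)
    print(hex(addr), hex(offset))
```

The offset is the useful number when debugging: after relocation, a symbol address taken from the linked image has to be shifted by this amount to find the code that is actually running.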

Trimming the boot time is also a concern during the SoC boot process. Optimizing the bootloader can reduce its code size, minimize unnecessary hardware initialization, and only initialize essential hardware devices, thereby shortening the boot time. For example, removing some unnecessary functional modules during the boot phase and streamlining initialization code. Optimizing the kernel can reduce the number of startup services, optimize the startup order of services, and use preloading techniques. For instance, delaying the startup of some non-critical services or loading them during system idle times.

Some SoCs support fast boot modes, where the SoC skips unnecessary hardware initialization and self-check processes to boot faster. This is akin to a runner taking a shortcut in a race, allowing them to reach the finish line more quickly. Some SoCs also support suspend and resume technologies, which can save the system’s state to non-volatile memory and then shut down the system. When the system restarts, it can restore the state directly from non-volatile memory, thus booting faster. This is similar to a computer’s hibernate function, which allows quick recovery to the state before shutdown.
