Follow the public account Abao1990 below. It focuses on autonomous driving and smart cabins and brings you daily automotive insights. We start with cars but go beyond cars.
ARM has launched a series of CPU cores for autonomous driving that comply with automotive functional safety requirements: Cortex-A65AE, Cortex-A76AE, and Cortex-A78AE, where AE stands for Automotive Enhanced.
ARM Autonomous Driving Computing Platform Configuration


Image Source: Internet
ARM’s Split/Lock/Hybrid Modes for Automotive Functional Safety

Image Source: Internet
The automotive functional safety standard ISO 26262-5:2018 (product development at the hardware level) covers diagnostic coverage for processing units in Annex D. It recommends several safety measures, with hardware redundancy techniques such as dual-core lock-step, asymmetric redundancy, and coded processing listed as typical examples.
In lock-step, two cores run the same program and feed their results into comparison logic (CCM) that periodically checks whether the two outputs match. If they match, execution continues; otherwise corrective action is required, and if the error persists for some time, a restart or recheck may be necessary. The lock-step pairing is fixed in the chip design, so it cannot be adjusted afterwards. Although two cores are used, they deliver the throughput of only one, so one core is effectively spent on redundancy. The method has been proven over many years in microcontrollers and lower-complexity microprocessors. With more complex core designs, however, the two cores may fall out of synchronization even when nothing is wrong. A more recent development is to send the comparison results to a separate core called a “safety island,” which makes the decision and carries out the response. This core has its own clock and power supply and is built for high safety; ARM generally recommends the Cortex-R52 for this role.
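The compare-and-react loop described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not how lock-step hardware is built: the two cores are plain functions, the comparison runs after every step rather than on a hardware clock, and the names run_lockstep, make_faulty, and max_retries are hypothetical.

```python
def square(x):
    """Stand-in for the program that both cores execute."""
    return x * x


def make_faulty(core, fault_calls):
    """Wrap a core so the given call numbers (1-based) return a bit-flipped result."""
    state = {"calls": 0}

    def wrapped(x):
        state["calls"] += 1
        result = core(x)
        return result ^ 1 if state["calls"] in fault_calls else result

    return wrapped


def run_lockstep(core_a, core_b, inputs, max_retries=1):
    """Run both cores on every input and compare their outputs.

    On a mismatch the step is re-executed (the "recheck"). If the outputs
    still differ after max_retries re-executions, the run stops and asks
    for a reset, standing in for the restart decision in the text.
    Returns (accepted_outputs, mismatch_count, action).
    """
    outputs = []
    mismatches = 0
    for value in inputs:
        for _attempt in range(max_retries + 1):
            a, b = core_a(value), core_b(value)
            if a == b:
                outputs.append(a)
                break
            mismatches += 1
        else:
            # Every attempt for this input disagreed: treat the fault as persistent.
            return outputs, mismatches, "reset"
    return outputs, mismatches, "ok"


healthy = run_lockstep(square, square, [1, 2, 3])
transient = run_lockstep(square, make_faulty(square, {2}), [1, 2, 3])
persistent = run_lockstep(square, make_faulty(square, {2, 3}), [1, 2, 3])
```

In this sketch a single corrupted result is absorbed by one re-execution, while a fault that repeats on the retry ends the run with a reset request. A real design would hand that decision to the safety island rather than to the cores being checked.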
ARM implements lock-step at two levels: the first inside the CPU cluster, using the DSU, and the second outside it, using the safety island. DSU stands for DynamIQ Shared Unit, which first appeared with the Cortex-A75.
ARM DSU Application Example

Image Source: Internet
Basic Composition of DSU

Image Source: Internet
The core function of the DSU is to control the CPU cores so that they work together as a cluster. Each core in the cluster can be powered on or off individually and have its frequency and voltage adjusted, which improves energy efficiency. Manufacturers can even mix different cores in unequal numbers within one cluster (at launch only Cortex-A75 and Cortex-A55 were supported), such as Cortex-A75×3 + Cortex-A55×5 or Cortex-A75×1 + Cortex-A55×7, to balance cost and performance. The DSU also provides a shared L3 cache and can connect the CPU to other units in the SoC (GPU, modem, memory) at high speed over interconnects such as CCI, CCN, or CMN. With 4MB of L3 cache, it can allocate cache to each core dynamically. For example, in a Cortex-A75×1 + Cortex-A55×7 configuration, 3MB can be allocated to the A75 core while the seven A55 cores share the remaining 1MB, and L3 cache can even be allocated to other units such as the GPU, which makes it very flexible. Most importantly, the DSU controls the power state, frequency, and voltage of each CPU core in the cluster, which is the key to managing CPU performance and power consumption.
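The Cortex-A75×1 + Cortex-A55×7 example above comes down to simple bookkeeping, sketched here in Python. The CoreGroup type and the plan_l3 function are hypothetical names for illustration only. The eight-core limit is an assumption taken from the two configurations in the text (both total eight cores), and the real DSU partitions its cache in hardware rather than through any call like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreGroup:
    name: str
    count: int          # number of cores of this type in the cluster
    l3_share_kib: int   # L3 capacity reserved for the whole group


def plan_l3(groups, l3_total_kib=4096, max_cores=8):
    """Check a cluster layout and return the average L3 share per core (KiB)."""
    total_cores = sum(g.count for g in groups)
    if total_cores > max_cores:
        raise ValueError(f"{total_cores} cores exceed the cluster limit of {max_cores}")
    allocated = sum(g.l3_share_kib for g in groups)
    if allocated > l3_total_kib:
        raise ValueError(f"{allocated} KiB allocated but only {l3_total_kib} KiB of L3")
    return {g.name: g.l3_share_kib / g.count for g in groups}


# The 1 + 7 layout from the text: 3MB for the A75, 1MB shared by seven A55 cores.
plan = plan_l3([
    CoreGroup("Cortex-A75", count=1, l3_share_kib=3 * 1024),
    CoreGroup("Cortex-A55", count=7, l3_share_kib=1 * 1024),
])
```

Under this split the single big core gets 3072 KiB to itself, while each little core averages roughly 146 KiB. That is the trade-off the paragraph describes: cache goes where single-thread performance matters most.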
When designing DynamIQ, ARM also took redundancy requirements into account. Compared with smartphones, automobiles place much higher demands on reliability and redundancy. DynamIQ allows multiple clusters to be connected via CCIX, so processors can be placed in different locations in the car. If one cluster is damaged in an accident, DynamIQ can call on backup processors to keep the vehicle operating normally.
Cortex-A78AE DSU-AE Internal Block Diagram

Image Source: Internet
DSU-AE mainly adds comparators. The highlighted parts are duplicated, including the execution logic, clocks, power states, and various interfaces. The cache is not duplicated, since that would be too costly for little benefit.

Image Source: Internet
DSU-AE Split Mode

Image Source: Internet
DSU-AE Hybrid Mode

Image Source: Internet
DSU-AE Interface

Image Source: Internet
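The Split, Lock, and Hybrid modes named in the captions above are not spelled out in the text, so the following Python sketch rests on an assumption about how they behave: in Lock mode the cores run as lock-step pairs, in Split mode every core runs independently, and in Hybrid mode the cores run independently while the shared DSU-AE logic stays in lock-step. The Mode enum and the describe function are hypothetical names, not part of any ARM interface.

```python
from enum import Enum


class Mode(Enum):
    SPLIT = "split"
    LOCK = "lock"
    HYBRID = "hybrid"


def describe(mode, physical_cores):
    """Summarise what software would see for a cluster in the given mode (assumed behavior)."""
    if physical_cores % 2:
        raise ValueError("cores are assumed to be built as pairs")
    cores_locked = mode is Mode.LOCK
    shared_locked = mode in (Mode.LOCK, Mode.HYBRID)
    # A locked pair behaves as one logical core; the second core only checks the first.
    logical = physical_cores // 2 if cores_locked else physical_cores
    return {
        "logical_cores": logical,
        "cores_in_lockstep": cores_locked,
        "shared_logic_in_lockstep": shared_locked,
    }


summary = {mode.name: describe(mode, physical_cores=8) for mode in Mode}
```

Under these assumptions the trade-off is easy to see: Lock mode halves the usable core count in exchange for checked execution, Split mode gives full throughput without core-level checking, and Hybrid mode sits in between by protecting only the shared logic.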
Besides the DSU, ARM offers other IP for automotive functional safety: the Mali-G78AE GPU, the Mali-C71AE image signal processor (ISP), the GIC-600AE interrupt controller, the MMU-600AE memory management unit, and the CMN-600AE mesh interconnect. Of these, the DSU is the most critical.
II. ARM’s On-Chip Bus
ARM Full Series of Smart Driving Products

Image Source: Internet
Digital IC design has moved from timing-driven methods to methods based on IP reuse, which are now widely used in SoC design. In an IP-reuse SoC, the on-chip bus is the core system that connects the modules and arbitrates among them, which makes it a critical design issue. AMBA (Advanced Microcontroller Bus Architecture), developed by ARM, provides a mechanism for integrating RISC processors with other IP cores and peripherals, and it is an important part of ARM’s reuse strategy. It is not an interface between the chip and external peripherals but the interface through which ARM cores communicate with other components on the chip. The AMBA specification mainly includes the AHB (Advanced High-performance Bus) system bus and the APB (Advanced Peripheral Bus) peripheral bus; later revisions added AXI, ACE, and CHI. Besides the AMBA bus, ARM also offers the GIC interrupt controller for multi-core systems and MMC memory control.
ARM CMN-700

Image Source: Internet
ARM has developed CMN, an interconnect that sits between a bus and an NoC. It is used mainly to connect CPU cores, and also to connect CPU cores to accelerators. It uses a mesh topology but lacks routing capabilities, so it is essentially still a bus. The mesh nevertheless supports far more units than a typical bus: up to 512 cores and 512MB of L3 cache, aimed primarily at the HPC market.

CMN-700 also supports AMBA AXI5, ACE5-Lite, CXS, and CCIX.
Appendix: Overview of ARM Architecture


Understanding Automotive SoC Series Part 6: CPU Microarchitecture
Understanding Automotive SoC Series Part 5: Instruction Set and Computational Architecture
Understanding Automotive SoC Series Part 4: Cache, Superscalar, Out-of-Order Execution
Understanding Automotive SoC Series Part 3: ARM’s Business Model and Overview of CPU Microarchitecture
Understanding Automotive SoC Series Part 2: Automotive Chip Industry and Supply Chain
Understanding Automotive SoC Series Part 1: Overview of Automotive SoC and AEC-Q100 Automotive Standards

