Understanding the Architecture of Arm Cortex-A53 Cache

This article is reprinted with permission from the WeChat public account Arm Selected. It walks through the cache architecture of the Cortex-A53.

1 A53 in the classic big.LITTLE architecture

Below is an early, classic big.LITTLE architecture diagram.

Figure 1

Figure 2

2 A53’s cache configuration

L1 I-Cache

● Configurable: 8KB, 16KB, 32KB, or 64KB

● Cache line: 64 bytes

● 2-way set associative

● 128-bit read interface to the L2 memory system

L1 D-Cache

● Configurable: 8KB, 16KB, 32KB, or 64KB

● Cache line: 64 bytes

● 4-way set associative

● 256-bit write interface to the L2 memory system

● 128-bit read interface to the L2 memory system

● 64-bit read path from the L1 data cache to the datapath

● 128-bit write path from the datapath to the L1 data cache
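
The L1 sizes and associativities listed above determine how many sets each cache has and how many address bits select a set. The following is a minimal sketch of that arithmetic, assuming simple power-of-two indexing; the function name `cache_geometry` is hypothetical and introduced only for illustration.

```python
def cache_geometry(size_bytes, ways, line_bytes=64):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes size, ways and line size are powers of two and that the
    set index is taken directly from the address bits above the offset.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2(line_bytes)
    index_bits = sets.bit_length() - 1         # log2(sets)
    return sets, offset_bits, index_bits


KB = 1024
for size in (8, 16, 32, 64):
    print(f"L1 I-cache {size}KB, 2-way:", cache_geometry(size * KB, 2))
    print(f"L1 D-cache {size}KB, 4-way:", cache_geometry(size * KB, 4))
```

For example, a 32KB 4-way data cache with 64-byte lines has 128 sets, so 6 address bits select the byte within a line and 7 bits select the set.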

L2 memory system

● Integrates an SCU (Snoop Control Unit) that connects up to 4 cores

● The SCU holds duplicate copies of the L1 data cache tags

● The external interface of the L2 memory system can be ACE or CHI, 128 bits wide

L2 cache

● Configurable: 128KB, 256KB, 512KB, 1MB, or 2MB

● Cache line: 64 bytes

● Physically indexed and tagged cache (PIPT)

● 16-way set associative structure
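
Because the L2 cache is physically indexed and physically tagged, a lookup splits the physical address into tag, set index, and line offset. The sketch below illustrates that split for a 16-way cache with 64-byte lines. It assumes plain modulo indexing with no address hashing, which the source does not confirm for the A53, and `split_physical_address` is a hypothetical helper.

```python
def split_physical_address(paddr, size_bytes, ways=16, line_bytes=64):
    """Split a physical address into (tag, index, offset) for a PIPT cache.

    Assumption: the set index is the address bits directly above the
    line offset, with no hashing.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    offset = paddr & (line_bytes - 1)
    index = (paddr >> offset_bits) & (sets - 1)
    tag = paddr >> (offset_bits + index_bits)
    return tag, index, offset


MB = 1024 * 1024
tag, index, offset = split_physical_address(0x80012345, 1 * MB)
print(hex(tag), hex(index), hex(offset))
```

With a 1MB L2 there are 1024 sets, so the example address decomposes into tag 0x8001, index 0x8d, and offset 0x5.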

L1 data cache TAG

The A53's L1 data cache follows the MOESI protocol. As shown below, each tag entry in the L1 data cache contains the MOESI state bits.

Figure 3

MOESI state

Figure 4
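
To make the five states concrete, here is a minimal sketch of the textbook MOESI transitions for a single cache line. It is a simplified model, not the A53's actual implementation: the event names and the `next_state` function are assumptions made for illustration, and bus transactions are reduced to four events.

```python
from enum import Enum


class State(Enum):
    M = "Modified"   # dirty, only copy
    O = "Owned"      # dirty, other caches may hold shared copies
    E = "Exclusive"  # clean, only copy
    S = "Shared"     # other caches may hold copies
    I = "Invalid"


def next_state(state, event, others_have_copy=False):
    """Textbook MOESI transition for one line in one cache.

    event is one of: "local_read", "local_write",
    "snoop_read" (another core reads), "snoop_write" (another core writes).
    """
    if event == "local_read":
        if state is State.I:
            return State.S if others_have_copy else State.E
        return state
    if event == "local_write":
        # Other copies are invalidated before the write completes.
        return State.M
    if event == "snoop_read":
        if state in (State.M, State.O):
            return State.O  # keep the dirty line and supply the data
        if state is State.E:
            return State.S
        return state        # S stays S, I stays I
    if event == "snoop_write":
        return State.I
    raise ValueError(f"unknown event: {event}")


print(next_state(State.M, "snoop_read"))
```

The Owned state is what distinguishes MOESI from MESI: a dirty line can be shared with another core without first being written back to memory.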

L1 Instruction cache TAG

The L1 instruction cache is read-only, so hardware does not need to maintain coherency between the instruction caches of different cores, and the instruction cache does not follow the MOESI protocol. Below is the tag format of the L1 instruction cache: it carries only a minimal set of flags and no MOESI state bits.

Figure 5

3 Cache hierarchy

● L1 cache is private to the core.

● L2 cache is shared within the cluster.

Figure 6

4 L2 memory system introduction

In the big.LITTLE architecture, each cluster contains an SCU whose main job is to maintain coherency between the L1 caches of the cores in that cluster, using the MESI protocol or a variant such as MOESI.

Figure 7

In addition to the L2 cache, the L2 memory system contains the L1 duplicate tag RAM, which holds copies of the L1 data cache tags.

Figure 8
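
The value of the duplicate tags is that the SCU can tell which cores hold a line without interrupting their L1 caches. The following toy model illustrates the idea under stated assumptions: a 32KB, 4-way L1 data cache per core, physical indexing, and FIFO replacement standing in for whatever the real L1 evicts. The class and method names are hypothetical.

```python
LINE_BYTES = 64
SETS = 128  # assumed: 32KB, 4-way L1 D-cache with 64-byte lines
WAYS = 4


def index_and_tag(paddr):
    """Split an address into (set index, tag) for the assumed L1 geometry."""
    line = paddr // LINE_BYTES
    return line % SETS, line // SETS


class DuplicateTags:
    """Toy copy of each core's L1 D-cache tags, kept next to the SCU."""

    def __init__(self, num_cores=4):
        # tags[core][set] is a list of up to WAYS tags
        self.tags = [[[] for _ in range(SETS)] for _ in range(num_cores)]

    def record_fill(self, core, paddr):
        index, tag = index_and_tag(paddr)
        ways = self.tags[core][index]
        if tag not in ways:
            if len(ways) == WAYS:
                ways.pop(0)  # assumed FIFO replacement, for illustration
            ways.append(tag)

    def cores_to_snoop(self, requester, paddr):
        """Cores other than the requester whose L1 may hold this line."""
        index, tag = index_and_tag(paddr)
        return [core for core in range(len(self.tags))
                if core != requester and tag in self.tags[core][index]]


scu = DuplicateTags()
scu.record_fill(1, 0x1000)
print(scu.cores_to_snoop(0, 0x1000))
```

In this model, a request from core 0 for a line held only by core 1 results in a snoop to core 1 alone, and a line held by no other core results in no snoops at all.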

5 Cache coherency between multiple clusters

The interface between a cluster and the rest of the system can be ACE or CHI (ACE is currently the more common choice, but the trend may shift to CHI in the future).

Figure 9

● If ACE is used, coherency between clusters is maintained by CCI + ACE.

● If CHI is used, coherency between clusters is maintained by CMN + CHI.

Figure 10

6 Introduction to CCI (Taking CCI-550 as an example)

CCI-550 includes an inclusive snoop filter, which tracks the data held in the caches of the ACE masters.

On a miss, the snoop filter can answer a snoop transaction itself; only on a hit does it snoop the relevant masters. Snoop filter entries are maintained by observing transactions from the ACE masters to determine when entries must be allocated and deallocated.

This lets the snoop filter resolve many coherency requests without broadcasting to all ACE interfaces. For example, if the address is not in any cache, the snoop filter responds with a miss and the request is directed to memory. If the address is in a processor cache, the request is treated as a hit and directed to the ACE port whose cache contains that address.
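
The hit and miss behavior described above can be sketched as a small directory keyed by cache-line address. This is an illustrative model only, not the CCI-550 design: the class `SnoopFilter`, its method names, and the unbounded dictionary are assumptions, and a real snoop filter has finite capacity.

```python
LINE_SHIFT = 6  # 64-byte cache lines


class SnoopFilter:
    """Toy inclusive snoop filter: line address -> set of ACE port ids."""

    def __init__(self):
        self.holders = {}

    def observe_allocate(self, port, addr):
        """A master on `port` was seen fetching the line into its cache."""
        self.holders.setdefault(addr >> LINE_SHIFT, set()).add(port)

    def observe_evict(self, port, addr):
        """A master on `port` was seen evicting the line."""
        line = addr >> LINE_SHIFT
        ports = self.holders.get(line)
        if ports is not None:
            ports.discard(port)
            if not ports:
                del self.holders[line]  # deallocate the entry

    def route(self, requester, addr):
        """Decide where a coherent request goes, without broadcasting."""
        ports = self.holders.get(addr >> LINE_SHIFT, set()) - {requester}
        if not ports:
            return ("memory", [])        # miss: no other cache holds the line
        return ("snoop", sorted(ports))  # hit: snoop only the holders


sf = SnoopFilter()
sf.observe_allocate(port=1, addr=0x8000_0040)
print(sf.route(requester=0, addr=0x8000_0040))
print(sf.route(requester=0, addr=0x9000_0000))
```

The first request hits and is directed only to port 1; the second misses and goes straight to memory, so no ACE interface is snooped unnecessarily.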

Figure 11

Figure 12

7 A classic example system

Figure 13
