Understanding Memory Attributes and Properties in ARMv8/ARMv9 Architecture

ver0.3
Introduction
In the grand world of ARM memory, attributes are not the brightest star, much like you and me in front of our screens, living ordinary lives in this vast world. Even if, as the poem says, “a man down on his luck meets an old friend today without even money for wine,” do not lose heart, because “where mountains and rivers seem to block every road, past shady willows and bright flowers another village appears.” This era hones each of us and tests everyone’s willpower. As long as you persist: “Do not worry that the road ahead holds no friend; who in the world does not know you?” Something regrettable happened on the morning this article was written, and I dedicate it to every coder fighting on in front of the screen and to every person quietly providing for their family. Memory attributes may be ordinary, but they matter, because several features of the VMSA are built on them. Previous articles covered related topics such as memory types and shareability attributes; this article discusses the remaining memory attributes.
Main Content
Strictly speaking, “memory attributes” is not quite accurate; “memory space attributes” would be better. In previous articles, we learned that the CPU works in the virtual address space and relies on the MMU for translation whenever it needs to access the physical address space. The entire SoC is organized around memory space as well: both data exchange and CPU instruction fetch depend on it. The basic classification of memory space is Device and Normal, which serve different scenarios and therefore behave differently. A peripheral register mapped as Device memory should not be cached, because its value can change at any time and there is no mechanism to keep a cached copy consistent with what the CPU sees. Normal memory is mostly cached to reduce the clock cycles spent accessing external memory, and the CPU is designed with a complete cache coherency mechanism and replacement policy to keep that data consistent. Beyond this basic split, the ARM architecture subdivides memory space further. The OS or firmware fills in the memory attributes when it maps memory, and the hardware checks these attributes on every access, which lets it control the behavior of the whole system precisely. If the check passes, execution continues; if it fails, an exception is raised and the exception handling mechanism steps in to hand out the “punishment.” Since these attributes belong to memory space, we must first have a basic understanding of memory space.
1.1 Memory Space
We previously had a dedicated article discussing memory space [A-09] ARMv8/ARMv9-Memory-Address Space (Translation Regimes), and those who have not seen it can go check it out first. Here, we will briefly review what memory space is (if you are familiar with this section, you can skip it directly).
Under the ARM architecture, a virtual memory world is constructed: the AArch64 Virtual Memory System Architecture (VMSA). As the name suggests, ARM’s memory world is a virtual one: all software running on it is assigned addresses by the compiler, and every address seen while the code is built and executed is virtual. When the CPU issues a virtual address (VA), that address must be translated to a real physical address (PA) so that the whole VMSA system can operate.
With the development of operating systems and the accumulation of experience in engineering practice, it has been found that the world of computers needs levels and regulations. Otherwise, chaos will arise. For example, there is only one speaker on the system, and application A wants to make a sound, while application B also wants to make a sound. What should be done? A higher privilege module must handle these affairs and set rules, thus the ARM exception model was born, as shown in Figure 1-1.

Figure 1-1 ARM Exception Levels

For more information about the exception model, please refer to the previous article [V-05] Virtualization Basics – Exception Model (AArch64), which will not be discussed in detail here.
The higher the exception level, the greater the authority, which means more control over resources. Once order was established in the world of computers, the development of virtualization, operating systems, and all kinds of application technology accelerated. With this flourishing came another demand: security. Just as in the real world, there are always people with ulterior motives who try to spy on others’ private information by various means, such as a private key, a certificate, or biometric information (fingerprints). Security is a serious matter. As the ARM architecture iterated, security states were introduced as well, and the second round of territory acquisition in the ARM world began.

Figure 1-2 Security State

“The stars are still the same stars, the moon is still the same moon”: the CPU is still that CPU, and the memory is still that memory. But if nothing in this world ever changed, how could the Hypervisor show its status in the Non-secure state, and how could the Trusted OS show its identity in the Secure state? Under the ARM architecture, they are distinguished by dividing up the memory world. In other words, the CPU is still that CPU, but the memory is no longer that memory, because ARM has subdivided the virtual memory space.
The architecture defines all of the following translation regimes:
• Non-secure EL1&0 translation regime.
• Secure EL1&0 translation regime.
• Realm EL1&0 translation regime.
• Non-secure EL2&0 translation regime.
• Secure EL2&0 translation regime.
• Realm EL2&0 translation regime.
• Non-secure EL2 translation regime.
• Secure EL2 translation regime.
• Realm EL2 translation regime.
• EL3 translation regime.

Of course, only having these space concepts is not enough; the CPU also needs the help of system registers to work in the corresponding memory space under the appropriate state. For information about registers under the ARM architecture, please refer to the previous article: [V-04] Virtualization Basics – Register Set (Based on AArch64).
The CurrentEL register exposes the PSTATE.EL field and therefore identifies the current Exception level (as shown in Figure 1-3):

Figure 1-3 CurrentEL Register

Similarly, the SCR_EL3 system register (the Secure Configuration Register, which is not part of PSTATE) determines the Security state of the lower Exception levels (as shown in Figures 1-4 and 1-5):

Figure 1-4 SCR_EL3 Register

Figure 1-5 SCR_EL3 Register Secure State
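To make the two registers a little more concrete, here is a minimal Python sketch that decodes raw register values. The function names are hypothetical, and on real hardware the values would come from MRS reads. It assumes the architectural layout in which CurrentEL holds the Exception level in bits [3:2], SCR_EL3.NS is bit 0, and SCR_EL3.NSE is bit 62 (NSE only exists on implementations with the Realm Management Extension).

```python
# Hypothetical helpers for illustration; not taken from any library.

def current_el(currentel_value: int) -> int:
    """Return the Exception level stored in CurrentEL bits [3:2]."""
    return (currentel_value >> 2) & 0b11


def lower_el_security_state(scr_el3_value: int) -> str:
    """Decode the Security state of EL2/EL1/EL0 from SCR_EL3.{NSE, NS}."""
    ns = scr_el3_value & 1             # SCR_EL3.NS, bit 0
    nse = (scr_el3_value >> 62) & 1    # SCR_EL3.NSE, bit 62 (RME only)
    states = {
        (0, 0): "Secure",
        (0, 1): "Non-secure",
        (1, 1): "Realm",
        (1, 0): "Reserved",
    }
    return states[(nse, ns)]
```

For example, a CurrentEL value of 0x4 decodes to EL1, and an SCR_EL3 value with only bit 0 set decodes to the Non-secure state.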
With the above mechanisms in place, after the CPU is powered on and the boot program is loaded, various software in the corresponding spaces is gradually loaded into memory and begins execution. There are many things to do at this time, one of the most important is to initialize its own virtual address space. For example, when a Linux system is to execute, it needs to find a block of memory space for itself, such as Non-Secure EL1&0, and at this time the scene looks like this as shown in Figure 1-6.

Figure 1-6 High-Level Memory Mapping

When Linux is running, it operates in a virtual address space, but to fetch instructions, load data, and operate external devices it ultimately needs physical addresses, just as the U.S. used GPS coordinates to bomb Iraq rather than telling the missile the name of a village. At this point, the situation becomes complicated: a 64-bit virtual address space is enormous (0x0000_0000_0000_0000 to 0xFFFF_FFFF_FFFF_FFFF), and physical memory cannot possibly be that large; how big would a memory stick need to be? Therefore, physical memory can only be sliced into segments, each mapped to a corresponding slice of the virtual address space; at one moment a segment of physical memory may hold instructions, and at another it may hold data being operated on. In such a complex situation, to make sure the OS does not make mistakes, for example handing the space allocated to process A to process B as well, a “ledger” is needed to record each transaction clearly. To quote Hai Rui, the first gold-medal lawyer in history: “Record it.” Figure 1-6 then becomes the situation shown in Figure 1-7.

Figure 1-7 High-Level Memory Mapping (Page Table)

With the ledger in place, can we rest easy? Not really, because this ledger also needs to exist in memory, which also incurs overhead. Even with segmented mapping, the overhead is significant; if only a single-level mapping is done, a lot of memory resources will be wasted, and that costs money. Thus, the mapping process continues to evolve into the situation shown in Figure 1-8.

Figure 1-8 High-Level Memory Mapping (Multi-Level Page Table)
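The cost argument can be checked with some back-of-the-envelope arithmetic. The sketch below is only an illustration under stated assumptions: a 48-bit virtual address space, the 4KB granule, 8-byte descriptors, four levels of lookup, and a single contiguous, aligned mapping. The function names are hypothetical.

```python
VA_BITS = 48              # assumed virtual address width
PAGE_SHIFT = 12           # 4KB granule
DESC_BYTES = 8            # one descriptor is 64 bits
ENTRIES_PER_TABLE = 512   # 4KB table / 8-byte descriptors
LEVELS = 4


def flat_table_bytes() -> int:
    """Size of a single-level table with one descriptor per 4KB page."""
    return (1 << (VA_BITS - PAGE_SHIFT)) * DESC_BYTES


def multilevel_table_bytes(mapped_pages: int) -> int:
    """Table memory needed to map one contiguous, aligned run of pages."""
    total_tables = 0
    entries = mapped_pages
    for _ in range(LEVELS):
        tables = -(-entries // ENTRIES_PER_TABLE)  # ceiling division
        total_tables += tables
        entries = tables  # each table needs one entry in the level above
    return total_tables * ENTRIES_PER_TABLE * DESC_BYTES
```

Under these assumptions, the flat table costs 512GB no matter how little memory is actually mapped, while mapping 1GB (262144 pages) through four levels needs only 515 tables of 4KB each, roughly 2MB.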

Regarding the multi-level page table architecture and why it saves memory, we have a dedicated discussion in a previous article: [A-11] ARMv8/ARMv9-Memory-Multi-Level Page Table Architecture. With the support of the multi-level page table architecture, software can use virtual addresses with ease. Whenever the CPU issues a virtual address, the CPU’s assistant, the MMU, maps this VA to a PA using the page tables maintained by the operating system, as illustrated in Figure 1-9.

Figure 1-9 Virtual Address Translation Process

We will not discuss the virtual address translation process in detail; please refer to the previous article: [A-13] ARMv8/ARMv9-Memory-Virtual Address Translation (Page Table Mapping Process). Through the previous analysis, we found that the link between virtual address space and physical address space is the page table, and the architecture is a multi-level page table architecture, as simplified in Figure 1-10.

Figure 1-10 ARM Multi-Level Page Table Architecture
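As a rough illustration of how a multi-level walk consumes a virtual address, the sketch below splits a VA into its per-level table indexes. It assumes a 48-bit VA and the 4KB granule, where each level indexes 9 bits and the low 12 bits are the page offset; other granules and address widths split differently. The function name is hypothetical.

```python
def split_va(va: int) -> dict:
    """Split a 48-bit VA into level 0-3 table indexes and a page offset."""
    va &= (1 << 48) - 1  # keep only the translated bits
    return {
        "l0": (va >> 39) & 0x1FF,  # VA[47:39]
        "l1": (va >> 30) & 0x1FF,  # VA[38:30]
        "l2": (va >> 21) & 0x1FF,  # VA[29:21]
        "l3": (va >> 12) & 0x1FF,  # VA[20:12]
        "offset": va & 0xFFF,      # VA[11:0]
    }
```

Each index selects one descriptor in the table for that level; a Table descriptor points to the next-level table, while a Block or Page descriptor ends the walk and supplies the output address together with the attributes.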

In the VMSA, VA and PA are linked through a cascading structure built from D_Table, D_Block, and D_Page descriptors, whose internal layout is shown in Figure 1-11:

Figure 1-11 High-Level Page Table Descriptor

The detailed structure of the page table descriptor can be found in a previous article: [A-12] ARMv8/ARMv9-Memory-Page Table Descriptor (Translation Table Descriptor), so we will not discuss it here. With that, the protagonist of this article finally makes its grand appearance: the memory attributes (Upper and Lower). We used the short section above to lay the groundwork for two reasons. First, to let everyone understand where memory attributes come from (like that soul-stirring question: Doctor, I don’t want to know how I died; I just want to know how I came to be). Second, to let everyone understand that the macro world (the VMSA) is made up of individual micro molecules (pages) that are just as important: they have their own temper and personality (attributes), and they deserve respect.
1.2 Memory Attributes
In the previous section, we learned that memory attributes are described in the page table and that each page table entry is called a page table descriptor, so memory attributes naturally occupy a place in the page table descriptor, as shown in Figure 1-12.

Figure 1-12 Block Descriptor

If we enlarge the Block descriptor in Figure 1-12, we can see its internal structure more clearly. Other descriptors have a similar structure, so we will not elaborate on them one by one. In the following chapters, we will follow the footsteps of the page table descriptor to introduce these important memory attributes in detail.
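Before walking through the individual fields, it may help to see them extracted from a raw 64-bit descriptor. The following is a minimal sketch, assuming the stage 1 Block/Page descriptor layout for the EL1&0 regime (lower attributes in bits [11:2], upper attributes in bits [54:51]). The function name and dictionary keys are hypothetical, and fields added by later extensions are left out.

```python
def decode_stage1_attrs(desc: int) -> dict:
    """Extract the attribute fields of a stage 1 Block/Page descriptor."""
    def bit(n: int) -> int:
        return (desc >> n) & 1

    return {
        "valid": bit(0),
        "attr_indx": (desc >> 2) & 0b111,  # index into MAIR_ELx
        "ns": bit(5),                      # Non-secure bit
        "ap": (desc >> 6) & 0b11,          # AP[2:1], access permissions
        "sh": (desc >> 8) & 0b11,          # shareability
        "af": bit(10),                     # access flag
        "ng": bit(11),                     # not-global
        "dbm": bit(51),                    # dirty bit modifier
        "contiguous": bit(52),
        "pxn": bit(53),                    # privileged execute never
        "uxn": bit(54),                    # unprivileged execute never
    }
```

A descriptor whose UXN, PXN, and AF bits are set, with SH=0b11 and AP=0b01, would decode as a region that has already been accessed, is Inner Shareable, is readable and writable from both EL1 and EL0, and is not executable at either level.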
1.2.1 Memory Types and Cache Attributes
These two memory attributes are so closely related that ARM expresses them through a single field called “AttrIndx”, which selects one of the eight attribute encodings held in the MAIR_ELx register. Because they directly determine many behaviors of memory, they form the basis for many of the issues we discuss, and we have already dedicated an article to them: [A-14] ARMv8/ARMv9-Memory-Types of Memory Models (Device & Normal), so we will not go into detail here.
(1) Memory types are divided into two types: Device and Normal, as shown in Figure 1-13.
Figure 1-13 Memory Types
The above figure shows the memory configuration of a general memory space (EL1&0). The OS must partition the physical address space in the relevant configuration files at build time, so that each region is ready to be mapped into virtual address space of the corresponding type.
(2) The cache attributes of memory determine whether the corresponding memory segment can be cached into the CPU’s internal memory subsystem (L1/L2/L3 Cache). For more information on cache, please refer to previous articles in the cache series.
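The indirection through AttrIndx can be sketched as follows. This is an illustration only: the MAIR value is a made-up example layout (not taken from this article or from any particular OS), the names are hypothetical, and the table lists just a handful of well-known 8-bit encodings rather than the full encoding rules from the manual.

```python
# A few well-known MAIR_ELx attribute encodings (not exhaustive).
KNOWN_ATTRS = {
    0x00: "Device-nGnRnE",
    0x04: "Device-nGnRE",
    0x08: "Device-nGRE",
    0x0C: "Device-GRE",
    0x44: "Normal, Inner and Outer Non-cacheable",
    # 0xBB and 0xFF are Non-transient, Read-Allocate and Write-Allocate
    0xBB: "Normal, Inner and Outer Write-Through",
    0xFF: "Normal, Inner and Outer Write-Back",
}

# Hypothetical MAIR value: Attr0=0x00, Attr1=0x04, Attr2=0x44, Attr3=0xFF.
EXAMPLE_MAIR = 0x0000_0000_FF44_0400


def lookup_memory_type(mair: int, attr_indx: int) -> str:
    """Return the memory type selected by a descriptor's AttrIndx field."""
    attr = (mair >> (8 * attr_indx)) & 0xFF  # Attr<n> is byte n of MAIR
    return KNOWN_ATTRS.get(attr, "other encoding, see the manual")
```

The design point is that a descriptor spends only three bits on memory type and cacheability, while the full 8-bit encoding lives once in the register, so changing the meaning of an index does not require rewriting every descriptor.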
1.2.2 Access Permissions
Like most storage systems, memory is accessed through “read” and “write” operations. ARM uses the “AP” field (AP[2:1]) in the page table descriptor to restrict access permissions, as shown in Figure 1-14.

Figure 1-14 Memory Access Permission Configuration
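The AP encoding for the stage 1 EL1&0 regime is small enough to write out as a table. The sketch below is a simplified model with hypothetical names: it covers only the AP[2:1] field itself and deliberately ignores everything layered on top of it, such as hierarchical permissions in Table descriptors, PSTATE.PAN, and stage 2 permissions.

```python
# AP[2:1] -> (privileged EL1 access, unprivileged EL0 access)
AP_TABLE = {
    0b00: ("rw", "none"),
    0b01: ("rw", "rw"),
    0b10: ("ro", "none"),
    0b11: ("ro", "ro"),
}


def access_allowed(ap: int, el: int, write: bool) -> bool:
    """Check a data access at EL0 or EL1 against AP[2:1] only."""
    privileged, unprivileged = AP_TABLE[ap]
    perm = unprivileged if el == 0 else privileged
    if perm == "none":
        return False
    return perm == "rw" or not write
```

For instance, with AP=0b10 a read from EL1 is allowed, a write from EL1 is refused, and any access from EL0 is refused; in hardware, a refused access results in a Permission fault.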

In general, software running at a higher privilege level in the ARM architecture can, by default, access the memory of lower privilege levels, as described in the manual:

The standard permission model is that a more privileged entity can access anything belonging to a less privileged entity. For example, an Operating System (OS) can see all the resources that are allocated to an application, or a hypervisor can see all the resources that are allocated to a virtual machine (VM). This is because executing at a higher exception level means that the level of privilege is also higher.

This permission model usually works fine, but coding errors or system vulnerabilities can turn it into a risk: a software attack may trick privileged code into touching the memory of lower-level software. Therefore, ARM provides preventive measures, such as limiting access through system registers, as shown in Figure 1-15.

Figure 1-15 Access Permission Restrictions

In addition to controlling access through system register configuration, access can also be controlled through dedicated instructions. We will not discuss this part in detail; interested readers can refer to the manual.
1.2.3 Executable Permissions
Executable permissions are relatively easy to understand; they indicate whether the contents of a memory region may be treated as instructions and executed by the CPU. ARM describes this with the fields “UXN” (Unprivileged Execute Never) and “PXN” (Privileged Execute Never).
Memory space is divided into two types, Device and Normal, so let us briefly look at each.
(1) When system software maps a region with the Device attribute, most of the time the OS only wants to use it to control external devices, for example to tell a DMA controller where in memory to start copying data from. Device memory therefore usually has no need to hold executable instructions.
(2) Normal memory holds not only instructions but also the data those instructions operate on, such as the buffers used in image encoding. The regions that hold only data do not need executable permission either.
Based on the above analysis, let us look at a typical EL1&0 memory space executable permission configuration, as shown in Figure 1-16.

Figure 1-16 Executable Permission Configuration
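The reasoning above can be condensed into a small sketch. The region names and their UXN/PXN settings below are illustrative assumptions, not a reproduction of Figure 1-16, and the check looks only at the two descriptor bits; it ignores further controls such as SCTLR_ELx.WXN and hierarchical settings in Table descriptors.

```python
# Hypothetical EL1&0 layout: region -> (UXN, PXN)
EXAMPLE_LAYOUT = {
    "kernel_text": (1, 0),   # executable at EL1 only
    "user_text": (0, 1),     # executable at EL0 only
    "kernel_data": (1, 1),   # never executable
    "device_regs": (1, 1),   # never executable
}


def can_execute(uxn: int, pxn: int, el: int) -> bool:
    """Return True if instruction fetch is permitted at EL0 or EL1."""
    execute_never = uxn if el == 0 else pxn
    return not execute_never
```

Marking data and Device regions execute-never at both levels costs nothing functionally and removes a whole class of attacks that rely on running injected data as code.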

1.2.4 Access Flags
The access flag (AF) is relatively easy to understand. ARM maps memory at the granularity of pages (4KB, 16KB, 64KB) and blocks, and uses this flag to indicate whether the mapped region has been accessed.
You can set the AF bit to:
• AF=0. Region not accessed.
• AF=1. Region accessed.

In practice, this flag is mainly managed by the OS (hardware can also set it automatically when the system register field TCR_ELx.HA is enabled) and serves as an important reference for memory performance optimization, such as choosing pages to swap out when memory is tight, as shown in Figure 1-17.

Figure 1-17 Classic Page Replacement Algorithm – LRU
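To show how an OS might put the access flag to work, here is a sketch of the classic clock (second-chance) algorithm, a common approximation of LRU. It is a swapped-in textbook technique for illustration, not the exact algorithm in the figure and not how any particular kernel implements reclaim; the page records and function name are hypothetical.

```python
def pick_victim(pages: list, hand: int = 0) -> tuple:
    """Clock algorithm over page records that carry an 'af' bit.

    Returns (victim_index, next_hand). Pages with af=1 get a second
    chance: their flag is cleared and the hand moves on.
    """
    n = len(pages)
    while True:
        page = pages[hand]
        if page["af"]:
            page["af"] = 0           # recently used: clear and skip
            hand = (hand + 1) % n
        else:
            return hand, (hand + 1) % n  # not accessed since last sweep
```

A page that has not been touched since the previous sweep still has AF=0 and becomes the eviction candidate, while pages in active use keep getting their flag set again and survive.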

1.2.5 Dirty State
Those familiar with Linux will know that pages come in two kinds: “file pages” and “anonymous pages.” When we open an editable file, such as a txt file, its content is loaded into file pages. When such a page is modified, ARM can also record this in the page table, but recording this state is slightly more complex because it involves two bits in the page table descriptor. Take the Stage-1 page descriptor as an example, as shown in Figure 1-18.

Figure 1-18 Stage-1 Page Table Descriptor

Here we directly quote from the manual:
The dirty state is used to indicate a memory block or page has been modified. When hardware update of the dirty state is enabled, the descriptor DBM field indicates whether the descriptor is a candidate for hardware updates of the dirty state.
When DBM is set and hardware management is enabled, the hardware can update the dirty state itself; we will not go into the details here. The dirty state itself is simple, as shown in Figure 1-19.
Figure 1-19 Dirty State
It should be noted that experts interpret the DBM bit in different ways, and some of these interpretations are open to debate. Read the manual and judge for yourself; small deviations do not diminish the greatness of the experts.
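For readers who want something concrete to compare against the manual, here is one reading of the stage 1 encoding as a small sketch. It is an assumption-laden illustration, not an authoritative statement: it assumes hardware dirty state management is enabled, that the two bits involved are DBM (bit 51) and AP[2] (bit 7), and that hardware marks a page dirty by clearing AP[2]. The names are hypothetical.

```python
def dirty_state(dbm: int, ap2: int) -> str:
    """Classify a stage 1 descriptor from its DBM and AP[2] bits."""
    if dbm and ap2:
        return "writable-clean"   # write allowed, none seen yet
    if dbm and not ap2:
        return "writable-dirty"   # hardware has recorded a write
    if ap2:
        return "read-only"        # no DBM: a write faults
    return "writable, not a hardware-update candidate"


def hardware_write(dbm: int, ap2: int) -> tuple:
    """Model a write: return (new_ap2, faulted)."""
    if ap2 and dbm:
        return 0, False    # hardware clears AP[2]: clean becomes dirty
    if ap2:
        return ap2, True   # genuinely read-only: Permission fault
    return ap2, False      # already writable
```

Seen this way, DBM lets the OS hand out a page that looks read-only in AP[2] while still being writable, so the first write is recorded by hardware instead of being caught by a fault handler.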
1.2.6 Shared Attributes
Shared attributes are very important. We have previously discussed them in a dedicated article: [A-14] ARMv8/ARMv9-Memory-Types of Memory Models (Device & Normal), which we will not discuss further.
1.2.7 Quick Reference Table
We have introduced most of the core memory attributes; due to space constraints, we will not elaborate on the remaining ones. You can read the manual or refer to the quick reference table shown in Figure 1-20 (apart from a few attributes, most of the entries can be used as they are; tried and tested, haha).

Figure 1-20 ARM Memory Attributes Quick Reference Table

1.3 Memory Attributes and TLB
The TLB in ARM also caches some memory attributes, which is an interesting point for studying ARM architecture. Specifically, the TLB (Translation Lookaside Buffer) is part of the processor’s Memory Management Unit (MMU) and is used to accelerate the virtual address to physical address conversion process. In the ARM architecture, TLB entries not only contain the mapping relationship between virtual and physical addresses but also include properties such as memory type, cache strategy, access permissions, address space ID (ASID), and virtual machine ID (VMID), as shown in Figure 1-21.

Figure 1-21 Typical TLB Structure

How many memory attributes the TLB can cache depends on the specific CPU microarchitecture design and the implementation by chip manufacturers, such as the TLB structure of Cortex-A725. Here, we only quote part of the manual’s description and will not discuss it further.
Translation Lookaside Buffer (TLB) entries store the context information required to facilitate a match and avoid the need for a TLB clean on a context or virtual machine switch.
Each TLB entry contains:
• A Virtual Address (VA)
• A Physical Address (PA)
• A set of memory properties that includes type and access permissions
Each TLB entry is associated with either:
• A particular Address Space Identifier (ASID)
• A global indicator

Each TLB entry also contains a field to store the Virtual Machine Identifier (VMID) in the entry applicable to accesses from EL0 and EL1. The VMID permits hypervisor virtual machine switches without requiring the TLB to be invalidated.
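The matching rules quoted above can be modeled in a few lines. This is a toy software model with hypothetical names, meant only to show why tagging entries with ASID and VMID avoids invalidation on a switch; it says nothing about the real structure, size, or replacement policy of the Cortex-A725 TLB.

```python
class TinyTLB:
    """Toy TLB: entries are tagged with VMID, ASID (or global), and VA page."""

    def __init__(self):
        self.entries = {}

    def insert(self, vmid, asid, va_page, pa_page, attrs, is_global=False):
        # Global entries match any ASID, so they are stored without one.
        tag = (vmid, None if is_global else asid, va_page)
        self.entries[tag] = (pa_page, attrs)

    def lookup(self, vmid, asid, va_page):
        """Return (pa_page, attrs) on a hit, or None on a miss."""
        hit = self.entries.get((vmid, asid, va_page))
        if hit is None:
            hit = self.entries.get((vmid, None, va_page))
        return hit
```

Because the tag includes the ASID and VMID, entries that belong to another process or another virtual machine simply fail to match, so a context switch or VM switch can leave them in place. The cached attributes also mean that permission and memory type checks on a hit need no page table walk.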

Conclusion
In this article, we reviewed the background of memory space in some detail and introduced most of the memory space attributes. The content is a little long even though much has been compressed, and I hope it still offers everyone some inspiration. As the ARM architecture evolves, there are more advanced uses of memory attributes worth studying; master the basics first, and we will introduce the advanced uses in due course. In addition, because the OS can set attributes in the descriptors at every level of a multi-level page table, attributes set at an upper level can overlap with those set at a lower level. Space does not allow us to discuss the resulting override rules here; consider that homework for the reader. Finally, virtualization introduces two-stage address translation: Stage-2 mappings have their own attributes, and there are rules for combining the attributes of the two stages. We have already planned a dedicated article on these topics.
The beginning of this article may have seemed a little sad; there are indeed some things I really do not want to see. I sincerely hope that every coder and their family can spend every day happily. Thank you all, and please stay tuned.
References
[00] <corelink_dmc520_technical_reference_manual_en.pdf>
[01] <corelink_dmc620_dynamic_memory_controller_trm.pdf>
[02] <IP-Controller/DDI0331G_dmc340_r4p0_trm.pdf>
[03] <80-ARM-IP-cs0001_ARMv8基础篇-400系列控制器IP.pdf>
[04] <arm_cortex_a725_core_trm_107652_0001_04_en.pdf>
[05] <DDI0487K_a_a-profile_architecture_reference_manual.pdf>
[06] <armv8_a_address_translation.pdf>
[07] <cortex_a55_trm_100442_0200_02_en.pdf>
[08] <learn_the_architecture_aarch64_memory_management_guide_en.pdf>
[09] <learn_the_architecture_armv8-a_memory_systems_en.pdf>
[10] <79-LX-LK-z0002_奔跑吧Linux内核-V-2-卷1_基础架构.pdf>
[11] <79-LX-LD-s003-Linux设备驱动开发详解4_0内核-3rd.pdf>
[12] <learn_the_architecture_memory_systems_ordering_and_barriers.pdf>
[13] <arm_dsu_120_trm_102547_0201_07_en.pdf>
[14] <80-ARM-MM-AL0001_内存学习(三):物理地址空间.pdf>

[15] <learn_the_architecture_aarch64_memory_attributes_and_properties.pdf>

Glossary
MMU – Memory Management Unit
TLB – Translation Lookaside Buffer
VIPT – Virtual Index Physical Tag
VIVT – Virtual Index Virtual Tag
PIPT – Physical Index Physical Tag
VA – Virtual Address
PA – Physical Address
IPS – Intermediate Physical Address Size
IPA – Intermediate Physical Address
VMID – Virtual Machine Identifier
VTTBR_EL2 – Virtualization Translation Table Base Register
ASID – Address Space Identifier
DMC – Dynamic Memory Controller
DDR SDRAM – Double Data Rate Synchronous Dynamic Random Access Memory
TBI – Top Byte Ignore
DMB – Data Memory Barrier
DSB – Data Synchronization Barrier
ISB – Instruction Synchronization Barrier
DSU – DynamIQ™ Shared Unit
SoC – System on Chip
VMSA – Virtual Memory System Architecture
AP – Access Permissions
XN – Execute Never
AF – Access Flag
