Understanding ARMv8/ARMv9 Memory Attributes and Properties

ver0.3
Introduction
In the grand world of ARM memory, attributes are not the brightest star, much like you and me in front of the screen, living ordinary lives in a vast world. Even when “a man poor and humble has little to offer, and today we meet without money for wine,” do not lose heart, because “where the mountains and rivers seem to leave no road, past dark willows and bright flowers another village appears.” This era tests everyone’s willpower; as long as we persist, “do not worry that the road ahead holds no friends, for who in the world does not know you?” On the morning this article was completed, some regrettable things happened, and I write it to encourage every coder fighting in front of the screen and every person silently providing for their family. Memory attributes may be ordinary, but they matter because they are the cornerstone of several features of the VMSA. Previous articles covered some of them, such as memory types and shareability. This article discusses the remaining memory attributes.
Main Body
The term “memory attributes” is not entirely accurate; strictly speaking, these are attributes of the memory space. Previous articles showed that the CPU operates in a virtual address space and relies on the MMU to translate whenever it needs to access the physical address space. The entire SoC is organized around memory space; both data exchange and instruction fetch depend on it. Memory space is classified into two basic types, Device and Normal, which serve different scenarios and therefore behave differently. A peripheral register is mapped into Device memory and should not be cached, because its value can change at any time and there is no good mechanism to keep it consistent with the data the CPU sees. Normal memory is mostly cached to reduce the clock cycles spent accessing external memory, and the CPU is built with a complete cache coherency mechanism and replacement policy to keep the data consistent. To cover these different situations, the ARM architecture subdivides memory space further. When an OS or firmware maps memory, it fills in the memory attributes, and the CPU checks these attributes on each access so that the behavior of the whole system is precisely controlled. If the check passes, execution continues; if it fails, the CPU raises an exception and the exception handling mechanism takes over. Since attributes are properties of memory space, we first need a basic understanding of memory space.
1.1 Memory Space
We have a dedicated article discussing memory space [A-09] ARMv8/ARMv9-Memory-Address Space (Translation Regimes), and those who have not read it can take a look. Here we will briefly review what memory space is (if you are familiar with this section, you can skip directly).
Under the ARM architecture, a virtual memory world is constructed: the AArch64 Virtual Memory System Architecture (VMSA). As the name suggests, the ARM memory world is a virtual one, in which the compiler and linker place all software within a unified address space. Software sees only virtual addresses during compilation and execution. When the CPU issues a virtual address (VA), the hardware must translate it to the corresponding real physical address (PA), and this translation is what makes the entire VMSA work.
With the development of operating systems and the accumulation of experience in engineering practice, it has been found that the world of computers requires levels and regulation; otherwise, chaos will ensue. For example, if there is only one speaker on the system that can make sound, and Application A wants to make a sound while Application B also wants to make a sound, what should be done? A higher authority module should handle these affairs and set rules. Thus, the ARM exception model was born, as shown in Figure 1-1.

Figure 1-1 ARM Exception Levels

For more information about the exception model, please refer to the previous article [V-05] Virtualization Basics – Exception Model (AArch64); we will not discuss it in detail here.
The higher the exception level, the greater the privilege, and therefore the greater the control over resources. Once order was established in the world of computers, technologies such as virtualization, operating systems, and applications began to develop rapidly. Along with this flourishing scene came another demand: security. As in the real world, there are always ill-intentioned people who pry into others’ private information, such as a key, a certificate, or biometric information (a fingerprint). Security is not a trivial matter; as the ARM architecture iterated, security states were introduced, marking the second phase of the ARM world’s territorial expansion.

Figure 1-2 Security State

“The stars are still the stars, the moon is still the moon,” the CPU is still the CPU, and the memory is still the memory. If nothing else changed, how would we tell apart the Hypervisor in the Non-secure state, or identify the Trusted OS in the Secure state? Under the ARM architecture, these distinctions are made by dividing the memory world: the CPU is still the CPU, but the memory is no longer the same, because ARM has subdivided the virtual memory space.
The architecture defines all of the following translation regimes:
• Non-secure EL1&0 translation regime.
• Secure EL1&0 translation regime.
• Realm EL1&0 translation regime.
• Non-secure EL2&0 translation regime.
• Secure EL2&0 translation regime.
• Realm EL2&0 translation regime.
• Non-secure EL2 translation regime.
• Secure EL2 translation regime.
• Realm EL2 translation regime.
• EL3 translation regime.

Of course, having these concepts of space is not enough; the CPU also needs to work under the corresponding memory space in the appropriate state with the help of system registers. For more information about the registers under the ARM architecture, please refer to the previous article: [V-04] Virtualization Basics – Register Set (Based on AArch64).
The CurrentEL register, which exposes the PSTATE.EL field, can be used to determine the current Exception level (as shown in Figure 1-3):

Figure 1-3 CurrentEL Register
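To make the register layout concrete, here is a minimal Python sketch that extracts the Exception level from a raw CurrentEL value. The EL field occupies bits [3:2]; the function name and the sample values are illustrative assumptions, and on real hardware the value would come from an "mrs" instruction rather than a Python integer.

```python
# Assumption: `raw` stands in for the 64-bit value software would read from
# CurrentEL (for example with "mrs x0, CurrentEL"); here it is a plain integer.

def current_el(raw: int) -> int:
    """Return the Exception level held in CurrentEL bits [3:2]."""
    return (raw >> 2) & 0b11


if __name__ == "__main__":
    for raw in (0x0, 0x4, 0x8, 0xC):
        print(f"CurrentEL = {raw:#x} -> EL{current_el(raw)}")
```

All other bits of CurrentEL are reserved, which is why a raw value of 0x4 reads as EL1 and 0xC as EL3.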

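Similarly, the SCR_EL3 register (the Secure Configuration Register, a system register accessible at EL3 rather than part of PSTATE) determines the Security state of the lower Exception levels (as shown in Figures 1-4 and 1-5):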

Figure 1-4 SCR_EL3 Register

Figure 1-5 SCR_EL3 Register Secure State
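As a rough illustration of how the Security state falls out of SCR_EL3, the sketch below decodes the NS bit (bit 0) together with the NSE bit (bit 62). It assumes an implementation with the Realm Management Extension (FEAT_RME); without it, NSE is reserved and only the Secure and Non-secure rows apply. The function name is made up for this example.

```python
SCR_NS_BIT = 0    # SCR_EL3.NS
SCR_NSE_BIT = 62  # SCR_EL3.NSE (assumes FEAT_RME is implemented)

# Security state of the lower Exception levels, keyed by (NSE, NS).
SECURITY_STATES = {
    (0, 0): "Secure",
    (0, 1): "Non-secure",
    (1, 0): "Reserved",
    (1, 1): "Realm",
}


def lower_el_security_state(scr_el3: int) -> str:
    """Decode the Security state selected for EL2/EL1/EL0 by a raw SCR_EL3 value."""
    ns = (scr_el3 >> SCR_NS_BIT) & 1
    nse = (scr_el3 >> SCR_NSE_BIT) & 1
    return SECURITY_STATES[(nse, ns)]


if __name__ == "__main__":
    print(lower_el_security_state(0x1))              # NS=1, NSE=0
    print(lower_el_security_state((1 << 62) | 0x1))  # NS=1, NSE=1
```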
With the above mechanisms, after the CPU is powered on and the boot program is loaded, various software in the respective spaces is gradually loaded into memory and begins execution. There are many things to do at this time, one of the most important is to initialize its virtual address space. For example, when a Linux system is to execute, it needs to find a block of memory space for itself, such as Non-Secure EL1&0. The scene at this time looks like Figure 1-6.

Figure 1-6 High-Level Memory Mapping

When Linux is running, it operates in a virtual address space, but in reality it must fetch instructions, load data, and operate external devices through the physical address space, just as a missile cannot be given only the name of a village; it needs GPS coordinates to find its target. At this point the situation becomes complicated: the virtual space is enormous (a 64-bit address ranges from sixteen hexadecimal zeros, 0x0000_0000_0000_0000, to sixteen hexadecimal Fs, 0xFFFF_FFFF_FFFF_FFFF), and the physical space cannot be that large; how big would a memory module need to be? Therefore, physical memory can only be sliced up and mapped to correspondingly sliced regions of the virtual address space, so that a given segment of physical memory may hold instructions at one point in time and data at another. In such a complex situation, to ensure that the OS makes no mistakes (for example, that space allocated to Process A is not also handed to Process B), a “ledger” is needed to keep these records straight; as Hai Rui, history’s first gold-medal lawyer, said: put it on record. Figure 1-6 therefore becomes the scenario shown in Figure 1-7.

Figure 1-7 High-Level Memory Mapping (Page Table)

Now that we have a ledger, can we rest easy? Not really, because this ledger also needs to exist in memory, which incurs overhead. Even if we are slicing and mapping, this overhead is significant, and if we only do a single-level mapping, it will waste a lot of memory resources, and that costs money. Thus, the mapping process continues to evolve into the situation shown in Figure 1-8.

Figure 1-8 High-Level Memory Mapping (Multi-Level Page Table)
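A little arithmetic shows why a single-level ledger is too expensive. The sketch below assumes a 48-bit virtual address, a 4KB granule, and 8-byte descriptors (one common AArch64 configuration, chosen here only for illustration), and compares a flat table covering the whole space with the tables a four-level scheme needs to map one page.

```python
DESCRIPTOR_BYTES = 8      # each page table descriptor is 64 bits
VA_BITS = 48              # assumed virtual address width
PAGE_SHIFT = 12           # assumed 4KB granule
ENTRIES_PER_TABLE = 512   # a 4KB table holds 4096 / 8 descriptors


def flat_table_bytes() -> int:
    """Size of a single-level table with one descriptor per 4KB page."""
    return (1 << (VA_BITS - PAGE_SHIFT)) * DESCRIPTOR_BYTES


def multilevel_bytes_for_one_page(levels: int = 4) -> int:
    """Tables touched when mapping a single page: one 4KB table per level."""
    return levels * ENTRIES_PER_TABLE * DESCRIPTOR_BYTES


if __name__ == "__main__":
    print(f"flat table:        {flat_table_bytes() // 2**30} GiB")               # 512 GiB
    print(f"multi-level, 1 pg: {multilevel_bytes_for_one_page() // 2**10} KiB")  # 16 KiB
```

The flat table must exist in full even if only one page is mapped, whereas the multi-level scheme allocates lower-level tables only for regions that are actually in use.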

Regarding the multi-level page table architecture and why it saves memory, we have a dedicated discussion in the previous article: [A-11] ARMv8/ARMv9-Memory-Multi-Level Page Table Architecture. With the multi-level page table architecture in place, software can use virtual addresses comfortably. Whenever the CPU issues a virtual address, the CPU’s assistant, the MMU, uses the page tables maintained by the operating system to translate this VA to a PA, as shown in Figure 1-9.

Figure 1-9 Virtual Address Translation Process
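The first step of the walk is simply slicing the virtual address into table indexes. The sketch below assumes a 48-bit VA with a 4KB granule, where each of the four levels consumes 9 bits and the low 12 bits are the offset within the page; other granules and address sizes split differently. The function name is hypothetical.

```python
def split_va(va: int) -> dict:
    """Split a 48-bit VA into four 9-bit table indexes and a 12-bit page offset."""
    return {
        "l0": (va >> 39) & 0x1FF,   # VA[47:39] indexes the level 0 table
        "l1": (va >> 30) & 0x1FF,   # VA[38:30] indexes the level 1 table
        "l2": (va >> 21) & 0x1FF,   # VA[29:21] indexes the level 2 table
        "l3": (va >> 12) & 0x1FF,   # VA[20:12] indexes the level 3 table
        "offset": va & 0xFFF,       # VA[11:0] is the byte offset in the page
    }


if __name__ == "__main__":
    va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0xABC
    print(hex(va), split_va(va))
```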

We will not discuss the virtual address translation process in detail; please refer to the previous article: [A-13] ARMv8/ARMv9-Memory-Virtual Address Translation (Page Table Mapping Process). Through previous analysis, we find that the link between virtual address space and physical address space is the page table, and the architecture is a multi-level page table architecture, as simplified in Figure 1-10.

Figure 1-10 ARM Multi-Level Page Table Architecture

In the VMSA, VA and PA are linked through the D_Table, D_Block, and D_Page descriptors, and the internal structure of these ledger entries is shown in Figure 1-11:

Figure 1-11 High-Level Page Table Descriptor
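How the hardware tells a Table, Block, or Page descriptor apart comes down to the two lowest bits and the lookup level. The following sketch assumes the 4KB granule without 52-bit address extensions, where Block descriptors are permitted at levels 1 and 2 only; the function name and return strings are illustrative.

```python
def descriptor_kind(desc: int, level: int) -> str:
    """Classify a 64-bit descriptor found at the given lookup level (0-3)."""
    valid = desc & 1
    type_bit = (desc >> 1) & 1
    if not valid:
        return "invalid"            # bit[0] == 0: any access faults
    if level == 3:
        # At the last level, bits[1:0] == 0b11 is a Page; 0b01 is reserved.
        return "page" if type_bit else "reserved"
    if type_bit:
        return "table"              # bits[1:0] == 0b11: points to the next level
    # bits[1:0] == 0b01: Block, assumed legal only at levels 1 and 2 here.
    return "block" if level in (1, 2) else "reserved"


if __name__ == "__main__":
    for desc, level in ((0b11, 0), (0b01, 1), (0b01, 2), (0b11, 3), (0b00, 2)):
        print(f"level {level}, bits[1:0]={desc:02b} -> {descriptor_kind(desc, level)}")
```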

The detailed structure of the page table descriptor can be referenced in the previous article: [A-12] ARMv8/ARMv9-Memory-Page Table Descriptor (Translation Table Descriptor), and we will not elaborate on it here. However, our protagonist has finally made a grand appearance, which is memory Attributes (Upper & Lower). We have laid the groundwork with a brief summary for two purposes: first, to let everyone understand the origin of memory attributes (just like the soul-searching question: Doctor, I don’t want to know how I died; I just want to know how I was born); second, to let everyone understand that the macro world (VMSA) is also composed of microscopic molecules (Pages), which are equally important and have their own temper and personality (Attributes) and deserve respect.
1.2 Memory Attributes
From the previous summary, we know that memory attributes are described in the page table, and the basic elements of the page table are called page table descriptors, so memory attributes naturally have a place in the page table descriptors, as shown in Figure 1-12.

Figure 1-12 Block Descriptor

We can enlarge the Block descriptor in Figure 1-12 to see its internal structure more clearly. Other descriptors have a similar structure, and we will not elaborate on them one by one. In the following sections, we will follow the footprints of the page table descriptors to introduce these important memory attributes in detail.
1.2.1 Memory Types and Cache Attributes
Because these two attributes are so closely related, ARM expresses them through a single descriptor field called “AttrIndx,” which selects one of the eight attribute entries held in the MAIR_ELx register. They are important enough, directly determining much of memory’s behavior, that we have already dedicated an article to them: [A-14] ARMv8/ARMv9-Memory-Types of Memory Model (Device & Normal), so we will not elaborate here.
(1) Memory types are divided into two types: Device and Normal, as shown in Figure 1-13.
Figure 1-13 Memory Types
The above figure shows how memory is configured for a general memory space (EL1&0): the OS partitions the physical address space in the relevant configuration files at build time, and each region is later mapped into the virtual address space with the corresponding type.
(2) The Cache attributes of memory determine whether the corresponding memory segment can be cached in the CPU’s internal Memory subsystem (L1/L2/L3 Cache). For information on Cache, please refer to the previous Cache series articles.
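To show how AttrIndx ties the two attributes together, here is a small sketch that takes the AttrIndx field (descriptor bits [4:2]), picks the matching 8-bit entry out of a MAIR_ELx value, and names a few well-known encodings. The MAIR value below is a hypothetical layout invented for this example, not the setting of any particular OS, and only a handful of encodings are decoded.

```python
# Hypothetical MAIR_ELx layout: Attr0=0x00, Attr1=0x04, Attr2=0x44, Attr3=0xFF.
HYPOTHETICAL_MAIR = 0xFF440400

DEVICE_TYPES = {
    0b00: "Device-nGnRnE",
    0b01: "Device-nGnRE",
    0b10: "Device-nGRE",
    0b11: "Device-GRE",
}


def attr_indx(desc: int) -> int:
    """AttrIndx lives in bits [4:2] of a stage-1 Block or Page descriptor."""
    return (desc >> 2) & 0b111


def mair_attr(mair: int, index: int) -> int:
    """Return the 8-bit Attr<index> field of a MAIR_ELx value."""
    return (mair >> (8 * index)) & 0xFF


def describe_attr(attr: int) -> str:
    """Name a few common encodings; everything else is lumped together."""
    outer, inner = attr >> 4, attr & 0xF
    if outer == 0:
        if (inner & 0b11) == 0:
            return DEVICE_TYPES[inner >> 2]      # 0b0000dd00 is Device memory
        return "other/reserved encoding"
    if attr == 0x44:
        return "Normal, Inner and Outer Non-cacheable"
    if attr == 0xFF:
        return "Normal, Inner and Outer Write-Back (read/write-allocate)"
    return "Normal (other cacheability encoding)"


if __name__ == "__main__":
    for index in range(4):
        attr = mair_attr(HYPOTHETICAL_MAIR, index)
        print(f"Attr{index} = {attr:#04x}: {describe_attr(attr)}")
```

The indirection means a descriptor needs only three bits to choose among eight fully specified memory types.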
1.2.2 Access Permissions
Like most storage systems, memory is accessed through “read” and “write” operations. ARM uses the “AP” field in the page table descriptor to restrict these accesses, as shown in Figure 1-14.

Figure 1-14 Memory Access Permission Configuration
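The AP encoding is compact enough to capture in a lookup table. This sketch decodes AP[2:1] (descriptor bits [7:6]) for the stage-1 EL1&0 regime using the basic direct-permission scheme; it deliberately ignores hierarchical table permissions and any other overrides, and the names are illustrative.

```python
# AP[2:1] -> (privileged access at EL1, unprivileged access at EL0)
AP_TABLE = {
    0b00: ("read/write", "none"),
    0b01: ("read/write", "read/write"),
    0b10: ("read-only", "none"),
    0b11: ("read-only", "read-only"),
}


def access_permissions(desc: int) -> tuple:
    """Decode AP[2:1] from bits [7:6] of a stage-1 Block or Page descriptor."""
    ap = (desc >> 6) & 0b11
    return AP_TABLE[ap]


if __name__ == "__main__":
    for ap in range(4):
        el1, el0 = access_permissions(ap << 6)
        print(f"AP[2:1]={ap:02b}: EL1 {el1}, EL0 {el0}")
```

AP[2] selects read-only versus read/write, and AP[1] decides whether EL0 gets access at all.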

Under normal circumstances, software running at a higher level of privilege in the ARM architecture can unconditionally access memory space of lower level privileges, as described in the manual:

The standard permission model is that a more privileged entity can access anything belonging to a less privileged entity. For example, an Operating System (OS) can see all the resources that are allocated to an application, or a hypervisor can see all the resources that are allocated to a virtual machine (VM). This is because executing at a higher exception level means that the level of privilege is also higher.

This permission model generally works well; however, coding errors or system vulnerabilities in privileged software can be exploited to harm the memory spaces of lower-level software. ARM therefore provides preventive measures, such as limiting privileged access through system controls (the Privileged Access Never bit, PSTATE.PAN, is one example), as shown in Figure 1-15.

Figure 1-15 Access Permission Restrictions

In addition to controlling through system register configurations, control can also be achieved through instructions. We will not elaborate on this part; for detailed information, please refer to the manual.
1.2.3 Executable Permissions
Executable attributes are relatively easy to understand: they indicate whether the contents of a memory region may be fetched and executed by the CPU as instructions. ARM describes this through the “UXN” (Unprivileged Execute Never) and “PXN” (Privileged Execute Never) fields.
The memory space is divided into two types: Device and Normal, and we will briefly discuss them.
(1) When system software maps a region as Device memory, in most cases the OS only wants to control an external device through it, for example to tell a DMA controller where in memory to start copying data. Device memory therefore usually has no need to hold executable instructions.
(2) Normal memory must hold not only instructions but also the data those instructions operate on, such as images being encoded. The memory holding such data likewise does not need executable permission.
Based on the above analysis, let’s look at a typical configuration of executable permissions for EL1&0 memory space, as shown in Figure 1-16.

Figure 1-16 Configuration of Executable Permissions
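A short sketch of the check described above: in a stage-1 EL1&0 descriptor, UXN is bit 54 and PXN is bit 53, and a set bit forbids instruction fetch at the corresponding privilege level. The example descriptors are made up, and other controls that can also remove execute permission are left out of this simplified model.

```python
UXN_BIT = 54  # Unprivileged Execute Never (applies to EL0)
PXN_BIT = 53  # Privileged Execute Never (applies to EL1)


def can_execute(desc: int, el: int) -> bool:
    """Return True if the descriptor alone allows instruction fetch at EL0 or EL1."""
    if el == 0:
        return ((desc >> UXN_BIT) & 1) == 0
    if el == 1:
        return ((desc >> PXN_BIT) & 1) == 0
    raise ValueError("this sketch models only the EL1&0 translation regime")


if __name__ == "__main__":
    # Hypothetical descriptors: only the execute-never bits are filled in.
    device_regs = (1 << UXN_BIT) | (1 << PXN_BIT)   # never executable
    kernel_text = 1 << UXN_BIT                      # executable at EL1 only
    user_text = 1 << PXN_BIT                        # executable at EL0 only
    for name, desc in (("device", device_regs), ("kernel", kernel_text), ("user", user_text)):
        print(name, "EL0:", can_execute(desc, 0), "EL1:", can_execute(desc, 1))
```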

1.2.4 Access Flag
The access flag (AF) is relatively easy to understand. ARM allocates memory at page (4KB, 16KB, or 64KB) and block granularity, and uses this flag to indicate whether the allocated region has been accessed.
You can set the AF bit to:
• AF=0. Region not accessed.
• AF=1. Region accessed.

In practice, this flag is mainly maintained by the OS (hardware can also update it when TCR_ELx.HA is set) and serves as an important input for memory optimization, such as choosing which pages to swap out when memory is tight, as shown in Figure 1-17.

Figure 1-17 Classic Page Replacement Algorithm – LRU
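One way an OS can approximate LRU with nothing but the AF bit (descriptor bit 10) is to clear the flag periodically and see which pages get it set again. The toy model below is only a sketch of that idea: the page table is a Python dictionary, `touch` stands in for the access (or Access flag fault handler) that sets AF, and real kernels also have to handle TLB maintenance and hardware AF updates, which are ignored here.

```python
AF_BIT = 10  # Access flag position in a Block or Page descriptor


def touch(table: dict, vpn: int) -> None:
    """Model an access to a page: AF ends up set, by software or by hardware."""
    table[vpn] |= 1 << AF_BIT


def scan_and_age(table: dict) -> list:
    """Return pages untouched since the last scan, then clear AF everywhere."""
    cold = [vpn for vpn, desc in table.items() if ((desc >> AF_BIT) & 1) == 0]
    for vpn in table:
        table[vpn] &= ~(1 << AF_BIT)
    return cold


if __name__ == "__main__":
    # Three hypothetical valid page descriptors, all starting with AF=0.
    table = {0: 0b11, 1: 0b11, 2: 0b11}
    touch(table, 1)
    print("swap candidates:", scan_and_age(table))
```

Pages returned by the scan are the ones nobody touched in the last interval, which makes them reasonable candidates for eviction.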

1.2.5 Dirty State
Friends familiar with Linux should know that Linux divides pages into two kinds: “file pages” and “anonymous pages.” When we open an editable file, such as a txt file, its content is loaded into file pages. When such a page is modified, ARM also makes a record in the page table, but the record of this state is slightly more complicated because it involves two bits in the page table descriptor. Take the Stage-1 Page descriptor as an example, as shown in Figure 1-18.

Figure 1-18 Stage-1 Page Table Descriptor

Here we directly quote from the manual:
The dirty state is used to indicate a memory block or page has been modified. When hardware update of the dirty state is enabled, the descriptor DBM field indicates whether the descriptor is a candidate for hardware updates of the dirty state.
When the DBM bit is configured, hardware can also update the dirty state; we will not elaborate on this. The dirty state itself is simple, as shown in Figure 1-19.
Figure 1-19 Dirty State
It is important to note that the interpretation of the DBM bit is subject to debate, and experts hold differing opinions. Readers can consult the manual and judge for themselves; small disagreements do not diminish the greatness of the experts.
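For readers who want something concrete to test against the manual, here is one reading of the two bits for a stage-1 descriptor, assuming hardware dirty-state management is enabled: DBM (bit 51) marks the descriptor as a candidate, and AP[2] (bit 7) then distinguishes clean from dirty, with hardware clearing AP[2] on the first write. Treat this as a sketch of that interpretation rather than a definitive statement; the function names are invented.

```python
AP2_BIT = 7   # AP[2]: 1 = read-only, 0 = read/write
DBM_BIT = 51  # Dirty Bit Modifier


def dirty_state(desc: int) -> str:
    """Classify a stage-1 descriptor under the interpretation described above."""
    dbm = (desc >> DBM_BIT) & 1
    read_only = (desc >> AP2_BIT) & 1
    if not dbm:
        return "not a hardware-update candidate"
    return "writable-clean" if read_only else "writable-dirty"


def hardware_write(desc: int) -> int:
    """Model the hardware update on a write: writable-clean becomes writable-dirty."""
    if dirty_state(desc) == "writable-clean":
        return desc & ~(1 << AP2_BIT)
    return desc


if __name__ == "__main__":
    clean = (1 << DBM_BIT) | (1 << AP2_BIT) | 0b11   # hypothetical clean page
    print(dirty_state(clean), "->", dirty_state(hardware_write(clean)))
```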
1.2.6 Shared Attributes
Shareability attributes are very important, and we have discussed them in a dedicated article: [A-14] ARMv8/ARMv9-Memory-Types of Memory Model (Device & Normal), so we will not elaborate on them here.
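As a quick reminder of how shareability is encoded, the sketch below decodes the SH field, which sits in bits [9:8] of a Block or Page descriptor. The function name is illustrative; the meaning of each domain is covered in the article referenced above.

```python
SHAREABILITY = {
    0b00: "Non-shareable",
    0b01: "Reserved",
    0b10: "Outer Shareable",
    0b11: "Inner Shareable",
}


def shareability(desc: int) -> str:
    """Decode the SH field from bits [9:8] of a Block or Page descriptor."""
    return SHAREABILITY[(desc >> 8) & 0b11]


if __name__ == "__main__":
    for sh in range(4):
        print(f"SH={sh:02b}: {shareability(sh << 8)}")
```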
1.2.7 Quick Reference Table
We have now introduced the core memory attributes; due to space constraints, we will not elaborate on every commonly used attribute. You can read the manual or refer to the quick reference table in Figure 1-20 (apart from a few attributes, most of the descriptions can be used directly; tested and valid, haha).

Figure 1-20 ARM Memory Attributes Quick Reference Table

1.3 Memory Attributes and TLB
The TLB in ARM will also cache some memory attributes, which is an interesting point for studying the ARM architecture. Specifically, the TLB (Translation Lookaside Buffer) is part of the processor’s Memory Management Unit (MMU) used to accelerate the translation of virtual addresses to physical addresses. In the ARM architecture, TLB entries not only contain the mapping relationship between virtual addresses and physical addresses but also include attributes such as memory type, cache strategy, access permissions, address space ID (ASID), and virtual machine ID (VMID), as shown in Figure 1-21.

Figure 1-21 Typical TLB Structure

How many memory attributes the TLB can cache also depends on the specific design of the CPU microarchitecture and the implementation by the chip manufacturer, such as the TLB structure of the Cortex-A725. Here we only quote part of the manual’s description and will not elaborate on it.
Translation Lookaside Buffer (TLB) entries store the context information required to facilitate a match and avoid the need for a TLB clean on a context or virtual machine switch.
Each TLB entry contains:
• A Virtual Address (VA)
• A Physical Address (PA)
• A set of memory properties that includes type and access permissions
Each TLB entry is associated with either:
• A particular Address Space IDentifier (ASID)
• A global indicator

Each TLB entry also contains a field to store the Virtual Machine IDentifier (VMID) in the entry applicable to accesses from EL0 and EL1. The VMID permits hypervisor virtual machine switches without requiring the TLB to be invalidated.
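The quoted description maps naturally onto a small data structure. The sketch below is a toy model, not the Cortex-A725 implementation: the entry layout, field names, and linear lookup are all assumptions made for illustration. It shows why tagging entries with ASID and VMID lets entries from different processes and virtual machines coexist without invalidation on every switch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TlbEntry:
    vpn: int                 # virtual page number
    pfn: int                 # physical frame number
    attrs: str               # cached memory properties (type, permissions, ...)
    asid: int                # Address Space IDentifier
    vmid: int                # Virtual Machine IDentifier
    is_global: bool = False  # global entries match any ASID


def tlb_lookup(entries: List[TlbEntry], vpn: int, asid: int, vmid: int) -> Optional[TlbEntry]:
    """Return the first entry matching the VA and the current ASID/VMID context."""
    for entry in entries:
        if entry.vpn != vpn or entry.vmid != vmid:
            continue
        if entry.is_global or entry.asid == asid:
            return entry
    return None  # a miss: the MMU would start a page table walk


if __name__ == "__main__":
    tlb = [
        TlbEntry(vpn=0x10, pfn=0x80, attrs="Normal, RW", asid=1, vmid=0),
        TlbEntry(vpn=0x10, pfn=0x90, attrs="Normal, RW", asid=2, vmid=0),
        TlbEntry(vpn=0x20, pfn=0xA0, attrs="Normal, RO", asid=0, vmid=0, is_global=True),
    ]
    print(tlb_lookup(tlb, 0x10, asid=2, vmid=0))
    print(tlb_lookup(tlb, 0x10, asid=1, vmid=5))
```

The same virtual page resolves to different physical frames depending on the ASID, and a hit also returns the cached attributes, so no page table walk is needed to check permissions.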

Conclusion
In this article, we reviewed the background of memory space in detail and then introduced most of the attributes of memory space. The article is somewhat long; although it has been compressed considerably, it may still feel cumbersome, but I hope it inspires everyone. As the ARM architecture iterates, more advanced uses of memory attributes become worth studying. Master the basic knowledge system first, and we will introduce advanced uses at an appropriate time. In addition, under the multi-level page table architecture the OS can set attributes in the descriptors at every level, so working out how attributes in upper-level and lower-level tables overlap, and which take precedence, is left as homework for readers. Finally, with the introduction of virtualization, Stage-2 of the two-stage translation architecture also has its own attributes, and there are rules for combining the attributes of the two stages. We have already planned a dedicated article on these topics.
At the beginning, it felt a bit sentimental, and indeed there are some things that I really do not want to see. I sincerely hope that every coder and their family can spend every day happily. Thank you all, and please stay tuned.
References
[00] <corelink_dmc520_technical_reference_manual_en.pdf>
[01] <corelink_dmc620_dynamic_memory_controller_trm.pdf>
[02] <IP-Controller/DDI0331G_dmc340_r4p0_trm.pdf>
[03] <80-ARM-IP-cs0001_ARMv8基础篇-400系列控制器IP.pdf>
[04] <arm_cortex_a725_core_trm_107652_0001_04_en.pdf>
[05] <DDI0487K_a_a-profile_architecture_reference_manual.pdf>
[06] <armv8_a_address_translation.pdf>
[07] <cortex_a55_trm_100442_0200_02_en.pdf>
[08] <learn_the_architecture_aarch64_memory_management_guide_en.pdf>
[09] <learn_the_architecture_armv8-a_memory_systems_en.pdf>
[10] <79-LX-LK-z0002_奔跑吧Linux内核-V-2-卷1_基础架构.pdf>
[11] <79-LX-LD-s003-Linux设备驱动开发详解4_0内核-3rd.pdf>
[12] <learn_the_architecture_memory_systems_ordering_and_barriers.pdf>
[13] <arm_dsu_120_trm_102547_0201_07_en.pdf>
[14] <80-ARM-MM-AL0001_内存学习(三):物理地址空间.pdf>
[15] <learn_the_architecture_aarch64_memory_attributes_and_properties.pdf>

Glossary
MMU – Memory Management Unit
TLB – Translation Lookaside Buffer
VIPT – Virtual Index Physical Tag
VIVT – Virtual Index Virtual Tag
PIPT – Physical Index Physical Tag
VA – Virtual Address
PA – Physical Address
IPS – Intermediate Physical Space
IPA – Intermediate Physical Address
VMID – Virtual Machine Identifier
VTTBR_EL2 – Virtualization Translation Table Base Register
ASID – Address Space Identifier
DMC – Dynamic Memory Controller
DDR SDRAM – Double Data Rate Synchronous Dynamic Random Access Memory
TBI – Top Byte Ignore
DMB – Data Memory Barrier
DSB – Data Synchronization Barrier
ISB – Instruction Synchronization Barrier
DSU – DynamIQ™ Shared Unit
SoC – System on Chip
VMSA – Virtual Memory System Architecture
AP – Access Permissions
XN – Execute Never
AF – Access Flag
