Understanding Arm Architecture and Its Cores

The Arm architecture dominates the embedded processing and computing market today, but it has come a long way over the past few decades. Starting in the 1980s, it was initially used as a home computer processor; in the 1990s it became the foundation for mobile phone chips. Today, Arm is a strong competitor in almost every technology niche, and many regard it as the preferred choice for 32-bit or 64-bit processors. Because of this widespread adoption, there are now thousands of variants based on the Arm architecture, and understanding the differences between the cores is an important part of making selection decisions.
As early as 2004, the initial Cortex family split the Arm architecture into three core product groups, each targeting a different type of application. The first to be integrated into chips was Cortex-M, which has become a cornerstone of the Arm microcontroller (MCU) ecosystem. Although the Cortex-M series launched with cores based on the version 7 architecture, later products aimed at ultra-low-cost devices (the M0, M0+, and M1) were based on the earlier version 6 architecture. All Cortex-M processors execute only the Thumb instruction set. The other two series were designed to support both Thumb and the full A32 instruction set.
Figure 1: Silicon Labs’ EFM Tiny Gecko.
Since its introduction, the Cortex-M3 has been adopted by many MCU vendors and has helped them define their 32-bit product lines. The MCUs now on the market include relatively simple yet cost-effective products, such as Silicon Labs’ EFM Tiny Gecko for low-power systems, and Cypress Semiconductor’s PSoC 5 system-on-chip, which combines traditional MCU peripherals with highly flexible programmable analog functions.
As MCU applications began to demand higher digital signal processing (DSP) performance, Arm introduced the Cortex-M4 to meet market needs. This core can be configured with a floating-point unit (the M4F variant) and has been adopted by many vendors. A common configuration pairs the powerful Cortex-M4F core with the simpler Cortex-M0 or Cortex-M0+, giving users efficient power management and resource allocation.
In devices like Cypress’s PSoC 6 or NXP’s LPC5411x, the M0+ core can service interrupts, leaving the M4 or M4F free to run DSP tasks without interruption and thereby maximizing data throughput. This division of responsibilities also allows the more powerful M4 core to sleep for extended periods between bursts of activity, while the low-power M0+ handles relatively simple system management tasks during periods of limited activity.
Figure 2: Cypress Semiconductor’s PSoC 6.
In 2014, Arm introduced the M7 core, raising the performance of Cortex-M to a new level. This core features a six-stage, dual-issue superscalar pipeline with branch prediction (unlike the high-end Cortex-A cores, it issues instructions in order) and is further enhanced by a full floating-point unit. STMicroelectronics’ STM32F730x8 integrates the M7 core with various peripherals and the company’s proprietary ART accelerator technology, which enables zero-wait-state execution from flash memory.
Cortex-A
In 2005, in response to the mobile business’s shift towards smartphones and tablets, Arm launched the first member of the Cortex-A family. Cortex-A is designed to provide a range of application processor-specific features and paved the way for deploying Arm cores in servers and other high-end computing systems.
A major distinction between Cortex-A processors and the other series is support for a paged memory management unit (MMU). Linux and similar operating systems require an MMU because it maps each program and its data from physical memory into a separate virtual address space. This provides a degree of protection that prevents data used by one task from being corrupted by another, and it also allows physical memory to be treated as a large cache. In addition, it avoids the memory fragmentation problems that would otherwise arise as programs are dynamically loaded and unloaded.
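The mapping described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the 4 KiB page size, the single-level dictionary used as a page table, and the names `AddressSpace`, `map_page`, and `translate` are all hypothetical, whereas a real Arm MMU walks multi-level tables in hardware.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, chosen only for illustration


class PageFault(Exception):
    """Raised when a task touches a virtual page it has no mapping for."""


class AddressSpace:
    """One task's private mapping from virtual pages to physical frames."""

    def __init__(self):
        self.page_table = {}  # virtual page number -> physical frame number

    def map_page(self, virtual_page, physical_frame):
        self.page_table[virtual_page] = physical_frame

    def translate(self, virtual_address):
        # Split the address into a page number and an offset within the page.
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            raise PageFault(f"no mapping for virtual page {page}")
        return self.page_table[page] * PAGE_SIZE + offset


# Two tasks use the same virtual address but land in different frames.
task_a = AddressSpace()
task_b = AddressSpace()
task_a.map_page(0, 7)
task_b.map_page(0, 3)

print(hex(task_a.translate(0x10)), hex(task_b.translate(0x10)))
```

Because each task only ever sees its own table, neither can name the other's physical frames, which is the protection property the paragraph describes.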
A potential drawback of paged virtual addressing is that it can interfere with real-time operation, so while the Cortex-A processors include an MMU, it is absent from the product lines aimed more squarely at embedded systems. A key innovation of the Cortex-A architecture from its inception is TrustZone, a hardware-based security layer that allows a hypervisor to deny access to certain parts of the processor and memory to any task that lacks the required security credentials. TrustZone allows cryptographic and other sensitive operations to run on a virtual processor protected by a hardware firewall.
In terms of cores, the range runs from the relatively simple Cortex-A5 to high-performance superscalar processors such as the Cortex-A72, which can issue three instructions simultaneously and execute them out of order, optimizing instruction scheduling for maximum efficiency.
The second significant innovation in the Cortex-A family is the big.LITTLE architecture, introduced in 2011. It brought to the application processor market the same idea of pairing different cores that had emerged among Cortex-M devices after the introduction of the M4.
For big.LITTLE, Arm combined low-end cores (such as the A5 or A7) with high-performance, often superscalar, implementations. The operating system keeps the low-power processor active for as long as possible and only activates the high-power core when the workload exceeds a specific threshold. Unlike traditional dual-core architectures, tasks can migrate from one processor to the other depending on system conditions. As performance demands increase, more and more Cortex-A implementations are being built around processor complexes of four high-end cores. This arrangement can save power by shutting down one or more cores when performance demands are light.
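The threshold-based switching described above can be sketched as a small decision function. This is an illustrative model, not Arm's or any operating system's actual scheduler: the function names, the threshold values, and the use of a second, lower threshold for switching back (hysteresis, so the task does not bounce between clusters) are assumptions made for the example.

```python
def choose_cluster(load, current, up_threshold=0.8, down_threshold=0.3):
    """Pick 'LITTLE' or 'big' for a load in [0, 1].

    The thresholds are hypothetical. The gap between them is an assumed
    hysteresis band that avoids rapid back-and-forth migration.
    """
    if current == "LITTLE" and load > up_threshold:
        return "big"      # workload exceeded the threshold: wake the big core
    if current == "big" and load < down_threshold:
        return "LITTLE"   # workload fell away: migrate back and save power
    return current


def run(loads, start="LITTLE"):
    """Replay a sequence of load samples and record the active cluster."""
    cluster = start
    history = []
    for load in loads:
        cluster = choose_cluster(load, cluster)
        history.append(cluster)
    return history


print(run([0.1, 0.5, 0.9, 0.6, 0.2]))
# -> ['LITTLE', 'LITTLE', 'big', 'big', 'LITTLE']
```

The low-power cluster stays active until the load sample crosses the upper threshold, and the task only returns once the load has clearly dropped.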
Cortex-R
Cortex-R is Arm’s third major series of cores, supporting the next generation of complex automotive and networking systems with real-time, high-reliability features. Some target applications require deterministic performance, which means that the caches typically used to accelerate other Arm processors are not always the best solution. Because caches dynamically replace instructions and data with the most recently used entries, critical information may not be in the cache when an interrupt service routine or real-time task needs it. The Cortex-R family overcomes this issue by supporting tightly coupled memory (TCM). Critical information can be stored in TCM during operation, and because its contents are managed by software, there is no risk of instructions and data being evicted by the cache management subsystem.
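The difference between a cache and TCM can be shown with a small simulation. It is a sketch under stated assumptions: the least-recently-used replacement policy, the cache sizes, the addresses, and the names `LRUCache`, `MemorySystem`, and `isr_after_background` are all hypothetical and do not model any particular Cortex-R core.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative cache with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if address in self.lines:
            self.lines.move_to_end(address)
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[address] = True
        return False


class MemorySystem:
    """A cache plus an optional set of addresses pinned in TCM by software."""

    def __init__(self, cache_capacity, tcm_addresses=()):
        self.cache = LRUCache(cache_capacity)
        self.tcm = frozenset(tcm_addresses)

    def access(self, address):
        """Return 'tcm', 'hit', or 'miss' for one access."""
        if address in self.tcm:
            return "tcm"  # always present, never evicted
        return "hit" if self.cache.access(address) else "miss"


ISR = 0x100  # hypothetical address of an interrupt service routine


def isr_after_background(system):
    """Touch the ISR, run unrelated background traffic, touch the ISR again."""
    system.access(ISR)
    for address in range(0x1000, 0x1000 + 8):
        system.access(address)
    return system.access(ISR)


print(isr_after_background(MemorySystem(cache_capacity=4)))
print(isr_after_background(MemorySystem(cache_capacity=4, tcm_addresses=[ISR])))
```

With the small cache alone, the background traffic pushes the ISR out, so its next access is a miss with unpredictable latency. With the ISR placed in TCM, the outcome no longer depends on what ran in between.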
Since the original Cortex-R4, this family has evolved significantly, with the Cortex-R5 and R7 cores featuring low-latency peripheral ports. Most cores are designed to work with on-chip buses such as the Advanced High-performance Bus (AHB) or, in more recent cores, the Advanced eXtensible Interface (AXI). Low-latency ports connect the cores directly to important peripherals without requiring bus arbitration or waiting for other bus activity to complete.
To support highly reliable operation, the caches, TCM, and system buses on Cortex-R cores can use error-correcting codes to transparently correct single-bit errors and detect double-bit errors. Because modular redundancy is a core part of safety-critical systems, Cortex-R series cores are designed to run with a replica in lockstep. If an on-chip monitor detects a difference between their outputs, it can flag the problem so that software can take corrective action. An example of a chip built on the Cortex-R series is Cypress Semiconductor’s Traveo S6J33xx series MCU, which integrates a Cortex-R5F core that runs at up to 240 MHz and includes peripherals optimized for driving instrument clusters in automotive dashboards.
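The "correct single-bit errors, detect double-bit errors" behavior can be demonstrated with a textbook code. The example below uses an extended Hamming code over 4 data bits, which is an assumption chosen for brevity: the article does not say which code Cortex-R cores use, and real memories protect much wider words. The names `encode` and `decode` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit codeword (list of bits).

    Index 0 holds overall parity; indices 1..7 form a Hamming(7,4) code
    with parity bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1s even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when status is 'double_error'."""
    code = list(code)
    # XOR of the positions of all set bits is 0 for a valid Hamming codeword,
    # and equals the position of the flipped bit after a single error.
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    parity_odd = sum(code) % 2 == 1

    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Odd number of flips: treat as a single error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: cannot be repaired.
        return "double_error", None

    d = [code[3], code[5], code[6], code[7]]
    return status, sum(bit << i for i, bit in enumerate(d))


word = encode(0b1011)
damaged = list(word)
damaged[5] ^= 1                 # one bit flips in memory
print(decode(damaged))          # -> ('corrected', 11)
damaged[6] ^= 1                 # a second bit flips
print(decode(damaged))          # -> ('double_error', None)
```

A single flipped bit is repaired without the reader of the data noticing, while two flipped bits are reported rather than silently miscorrected, which is the point at which software would need to step in.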
Arm v8
In 2011, the creation of the version 8 architecture brought the second wave of changes to Arm’s core products. The most visible was the ability to run applications in 64-bit mode, greatly expanding the maximum addressable memory space for application processors. Arm v8 processors with 64-bit capability can operate in either 32-bit or 64-bit mode; 32-bit operation provides backward compatibility for applications written for version 7 processors. Because the version 8 processors in the Cortex-M series focus on MCU applications, they do not support 64-bit addressing. They do, however, add many instructions and features that improve performance and strengthen security.
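The scale of that expansion follows directly from the address width. The short calculation below is purely arithmetic; the helper name `addressable_bytes` is hypothetical, and the 64-bit figure is the architectural ceiling of a 64-bit address rather than what any particular chip wires up.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address of this width can name."""
    return 2 ** address_bits


print(addressable_bytes(32) // GIB, "GiB with 32-bit addresses")
print(addressable_bytes(64) // addressable_bytes(32),
      "times more with 64-bit addresses")
```

A 32-bit address reaches 4 GiB, and every additional address bit doubles that limit.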
One significant advancement is the redesigned memory protection unit (MPU), which allows for more flexible partition management. Another is full support for execute-only memory to help prevent reverse engineering and hacking. However, the most significant change in security is the support for the TrustZone mechanism, specifically optimized for deeply embedded processors.
The Cortex-M version of TrustZone does not need a software hypervisor to manage transitions between the secure and non-secure states. Instead, dedicated instructions are used to pass from a non-secure task into a secure function, guarding the functions that are allowed to run in privileged mode. Without the correct permissions, even a high-priority interrupt cannot read secure data from registers. These security features make it possible to build well-protected IoT devices using MCUs based on cores such as the Cortex-M23 and Cortex-M33.
Microchip’s SAML11 MCU pairs the Cortex-M23 with an enhanced on-chip cryptographic controller, providing hardware security assurances for sensor nodes and similar designs. Nordic Semiconductor’s nRF9160 uses the Cortex-M33 to provide processing for devices that require secure RF communication.
Figure 3: Example of Microchip’s SAML11 MCU.
Conclusion
There is no doubt that Arm is one of the greatest success stories in the global electronics industry. To meet the needs of many different markets, Arm’s extensive product portfolio continues to expand in multiple directions. The differentiation of product lines such as Cortex-A, Cortex-M, and Cortex-R has proven to be the foundation of this rapid growth and will continue to drive the adoption of Arm cores in emerging fields.
Source: Mouser Electronics
Author: Mark Patrick

Disclaimer: This article is the author’s original work, and its content reflects the author’s personal views. The Electronics Enthusiasts Network reproduces it only to convey a different perspective, which does not imply that the Electronics Enthusiasts Network endorses or supports these views. If there are any objections, please contact the Electronics Enthusiasts Network.