AI Based on This Architecture is Ubiquitous

The technological wave triggered by generative AI has brought multidimensional and large-scale demands to the cloud computing industry chain and data centers. However, AI is not limited to data centers; many practitioners are embedding AI capabilities into edge devices and terminals. In the near future, AI will permeate billions of devices worldwide, benefiting every consumer’s work and life.

“Traditionally, the AI that everyone refers to is server-based AI or generative AI. But we believe that AI can also exist at the edge, in the network, from traditional data endpoints to storage and servers, integrating into every computing process,” said Mohamed Awad, Senior Vice President and General Manager of Arm’s Infrastructure Division, at the Arm Tech Symposia annual technology conference held in Shenzhen last month.

Mohamed Awad, Senior Vice President and General Manager of Arm’s Infrastructure Division

From infrastructure for cloud services and data centers, to smart terminals such as smartphones, to the edge, which brings IoT data processing closer to data sources and users, Arm has deployed solutions at every node in the AI field.

The market competition for AI is essentially a competition of developer ecosystems. To support the customized and proprietary needs of users in the AI era, Arm empowers the developer ecosystem with a comprehensive computing platform, serving as an “accelerator” for product launches and industrial innovation.

Bringing AI Capabilities to the Cloud and Edge

In traditional infrastructure architectures, a standard off-the-shelf CPU sits at the center, with memory and multiple accelerators attached to it. This means that each accelerator has to access memory through the CPU. In the AI era, such an architecture cannot keep up with the data and computation demands of AI. Cloud service providers and data center infrastructure suppliers urgently need customized CPUs that allow each CPU core to connect directly to each accelerator, achieving system-level memory coherency.

What sets Arm apart is that it lets chip design companies customize chip architectures on demand and supports the entire development process with a robust software ecosystem, accelerating time to market. This is why leading chip makers and cloud providers such as NVIDIA, Amazon, Alibaba Cloud, and Microsoft have chosen to develop computing chips based on the Arm architecture. The NVIDIA GH200 Grace Hopper Superchip uses a custom design developed in collaboration with Arm. It combines 72 Arm Neoverse cores with NVIDIA’s GPUs and allows each core to connect directly to each accelerator, resulting in a tenfold improvement in AI performance compared to systems based on the x86 architecture.

In the smart terminal field, the technology and scenario innovations surrounding smartphones have entered a heated stage. Smartphones are not only the most commonly used communication and entertainment devices for consumers but have also become carriers for mobile office work and targets for deploying large models. The diverse product ideas and development strategies of smartphone manufacturers urgently require a computing foundation that can scale according to various consumer needs.

When smartphone manufacturers configure their CPU clusters, they usually select CPUs to match the usage scenarios of their target audience. This is part of Arm’s CPU strategy, which aims to let partners choose CPUs with the right performance for their customers. The current high-end smartphone market shows many different innovations, and each company’s strategy varies; this is precisely the diversification and differentiation that Arm hopes the flexible configuration of Arm IP will produce. This year, Arm launched Arm Total Compute Solutions 2023 (TCS23), which integrates physical IP, architecture, tools, and software to provide one-stop, simplified technical support for SoC development. As part of TCS23, the Armv9 Cortex compute cluster has delivered double-digit performance improvements for three consecutive years. Arm’s flagship GPU, Immortalis, has not only brought ray tracing and variable rate shading to mobile devices; in TCS23 it also optimizes the interaction between external memory, CPU clusters, and system-level caches, thereby enhancing overall performance.

“The Arm Total Compute Solutions we provide for mobile platforms enable AI on mobile devices. In addition, our Arm Cortex-M52 and Cortex-M55 are products launched to support AI, and they continue to support AI development,” said Mohamed Awad.

As AI moves to the edge, IoT becomes not just a collector and transmitter of device information but can also utilize AI for predictive maintenance, sensor fusion, industrial control, and other functions.

Arm has launched a comprehensive IoT solution that simplifies development and accelerates product design by combining hardware IP, platform software, machine learning (ML) models, tools, and more. Arm Helium technology, the vector extension for the Cortex-M processor series, significantly enhances the machine learning and digital signal processing capabilities of small embedded devices. Helium adds 150 new scalar and vector instructions to the Armv8.1-M architecture; in Arm’s latest Cortex-M52, it delivers a 2.7 times improvement in digital signal processing performance and a 5.6 times improvement in machine learning performance compared to previous-generation products. To address the highly varied scenarios of IoT devices, the Arm Corstone solution provides pre-integrated configurations of key IP, enabling rapid development of IoT products with different performance requirements and truly pushing AI computation to the edge.
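To make the kind of workload concrete, the following is a minimal sketch of an 8-bit dot product, a kernel that appears in both digital signal processing and quantized ML inference on small devices. It is plain portable C, not Arm’s code and not taken from any Arm library; the function name `dot_q7` is a hypothetical chosen for illustration. Loops of this shape are the sort of code that a vector extension such as Helium is designed to speed up, although whether a given compiler actually vectorizes it depends on the toolchain and build settings.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Dot product of two signed 8-bit vectors, accumulated in 32 bits.
 * dot_q7 is a hypothetical name used only for this illustration.
 * Each element is widened before multiplying, so a single product
 * cannot overflow; the caller must keep n small enough that the
 * 32-bit sum does not overflow.
 */
int32_t dot_q7(const int8_t *a, const int8_t *b, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

Because the same source handles both the signal processing case and the ML case, it also illustrates why a single CPU toolchain can cover workloads that would otherwise be split between separate DSP and NPU software stacks.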

Empowering the Developer Ecosystem with a Comprehensive Computing Platform

While AI brings enormous business opportunities to the global computing industry, it also presents severe computational challenges. On one hand, the exponential growth of connected devices and data traffic places heavy pressure on computing infrastructure such as data centers; on the other hand, as advanced process nodes approach physical limits, Moore’s Law is slowing, making it increasingly difficult to balance performance and power consumption.

In response to the development trends and latest demands of the computing industry, Arm has transformed from a well-known IP supplier to a computing platform company, focusing on providing flexibility in choices for partners through complete and comprehensive solutions while continuing its IP licensing business. From mobile platforms to infrastructure, IoT, and automotive fields, Arm has launched corresponding computing platforms that simplify the development process while providing greater freedom for chip designers to customize chips based on their own scenarios and use cases.

In addition to computing platforms such as Arm Total Compute Solutions, the Arm Neoverse platform, Arm Corstone, and SOAFEE, Arm has recently launched Arm Neoverse Compute Subsystems (CSS) and Arm Total Design, further activating the power of the ecosystem.

Neoverse is Arm’s product line for servers and infrastructure. Neoverse CSS provides users with Neoverse cores, the CMN mesh interconnect, and system IP, together with the system management, power management, software, and development tools needed for optimized performance. This enables users to deliver customized chips at lower cost, in less time, and with lower risk.

“Arm Neoverse CSS saves engineering teams the equivalent of 80 engineer-years of work. One customer reported that after adopting Neoverse CSS, their project went from concept definition to tape-out in just 13 months,” said Mohamed Awad.

The Microsoft Azure Cobalt 100 CPU is built on Arm Neoverse CSS and includes 128 Neoverse cores. The advantages of the Neoverse CSS and Arm platform’s software ecosystem allow Microsoft to focus more on unique innovations and optimizations while saving a significant amount of development work. Arm expects that in 2024, more domestic and international cloud computing and data center-related companies will put their first-generation CSS designs into production.

Building on Neoverse CSS, Arm has launched the Arm Total Design ecosystem program, which allows applications in AI, cloud, networking, and edge infrastructure to make broad use of the Arm Neoverse architecture. Several groups of partners work together with Arm to serve users: Cadence, Rambus, and Synopsys provide pre-integrated and validated IP and EDA tools; ADTechnology, Alphawave Semi, Broadcom, Capgemini, and MediaTek provide design services; Intel Foundry Services and TSMC provide foundry services; and infrastructure firmware suppliers such as AMI provide commercial software and firmware.

The Cortex product line, which covers terminal and edge products, has always built its software and hardware ecosystems in close coordination, providing end-to-end support for developers. Take the Cortex-M52 as an example. Previously, to get both digital signal processing and machine learning capabilities in a low-power processor, developers had to combine three different computing units (CPU, DSP, and NPU) with three different software toolkits, which made for a complex development process. With Cortex-M52, Arm provides a single toolchain that handles traditional computing tasks, digital signal processing, and machine learning workloads in one consistent development flow, significantly shortening time to market while improving the development experience.

For Arm, China is an important and rapidly growing market, and one of its most innovative and promising. Arm has 15 million developers worldwide, 4 million of them in China.

In the server and infrastructure sector, Arm Neoverse is fully embracing local ecosystems and open-source community building. According to Zou Ting, Global Vice President of Arm’s China Business, Arm Neoverse has many customers in the Chinese market, especially in the infrastructure sector. Arm actively participates in building local ecosystems for data centers, cloud computing, and other areas, as well as in open-source software communities such as Longxin, helping these communities integrate better into Arm’s global ecosystem. In the terminal and edge sectors, Arm also provides one-stop services and rich ecosystem resources for local developers.

Zou Ting, Global Vice President of Arm’s China Business

With the support of Arm technology, inference running on billions of devices worldwide will be transformed. At the same time, this requires the industry to work together to speed up how AI training and inference workloads are shared between data centers and devices, improving the efficiency and cost-effectiveness of AI while enhancing its security, and ultimately achieving ubiquitous AI.

Author: Zhang Xinyi
Editor: Zhao Chen
Design: Maria
Supervisor: Lian Xiaodong
