The leap in computing power for edge AI fundamentally reconstructs the interaction paradigm between humans, machines, and the physical world.
Author | Yunpeng
Editor | Mo Ying
One of the core contributions of DeepSeek’s popularity to the global AI industry is a marked acceleration in the adoption of AI applications, especially in the domestic market, where almost every type of product is integrating AI capabilities.
This means that large language models will no longer just float in the cloud but can truly land on the devices in our daily lives, moving toward the edge and the endpoint. The IoT market, a representative field of edge intelligence, has become one of the main focuses of this wave of edge AI.
With the rapid development of AI technology, IoT and edge AI increasingly involve complex algorithms, and demand for edge AI inference is growing. Applications therefore place ever higher demands on computing performance and efficiency, while data security issues are becoming more prominent.
From smart cities, smart industry, smart homes, and smart wearables to new retail, the widespread application of AI has brought new challenges in computing demand and security. Faced with these demands, the industry is calling for new solutions.
Just yesterday, Arm launched the world’s first IoT-optimized Armv9 edge AI computing platform, built around the new Cortex-A320 CPU and the Arm Ethos-U85 NPU. Rather than simply stacking a CPU and an AI accelerator, it deeply integrates the two so that each enhances the other.
This is the first time many of the Armv9 architecture’s advantages have reached the IoT market, with upgrades in efficiency, performance, and security that directly address the new demands of edge AI. Arm has also extended its software-layer KleidiAI to the IoT field, further simplifying edge AI development.
How does Cortex-A320 address the industry’s pain points in edge AI, as represented by IoT, and what key technological upgrades does Armv9 bring? How do the new software-level advances accelerate innovation and application deployment in IoT? We will attempt to find the answers.
01. The Storm of Edge AI: Computing Power and Security Become Two Major Challenges for IoT
In recent years, as AI technology has developed and spread and computing demands have grown rapidly, more and more AI workloads have shifted from data centers and the cloud to the edge. This has brought tangible benefits to production and daily life, but it has also introduced computing performance bottlenecks and security challenges.
For example, in industrial quality inspection, an AI visual inspection system that cuts a production line’s missed-detection rate by 1% could mean savings in the millions, yet achieving that 1% under complex factory conditions is technically challenging. In the automotive industry, the success of autonomous driving is closely tied to driver safety, requiring data from sensors such as LiDAR and cameras to be fused and processed by models within 100 ms.
In smart healthcare, edge AI systems analyze monitor data in real time to warn of early symptoms of related conditions. The efficiency and accuracy of this process depend heavily on the performance of the underlying chips; delayed warnings and reduced prediction accuracy caused by insufficient computing power can significantly affect patient health.
As demand grows for complex tasks such as real-time AI analysis of HD video and AI-based fault detection for industrial equipment, edge AI computing capability becomes increasingly important. The IoT field is calling for comprehensive innovation, from chip architecture to the algorithm layer, to truly unleash the revolutionary potential of AI.
Beyond computing performance and efficiency, the development of edge AI also brings data security risks.
In edge computing, as more and more edge AI devices connect to the network, data transmitted between edge devices and the cloud or other edge devices is susceptible to network attacks.
Therefore, given the rapid development of edge AI, edge devices must offer stronger inference capabilities and more secure computing architectures.
From an industry perspective, the chips in traditional edge devices cannot meet the growing demands of real-time AI processing and compute-intensive inference tasks.
At yesterday’s Arm product launch, Ma Jian, Vice President of Business Development for Arm’s IoT business, said that in her recent discussions with many partners about edge AI, the unanimous feedback was that AI will redefine their product designs, whether by incorporating AI accelerators with features such as Transformer acceleration or by adopting CPUs that better support AI.
▲Ma Jian, Vice President of Business Development at Arm IoT Division
The IoT industry has strong demand at the edge for higher-performance Cortex-A-class computing and for an upgrade to the Armv9 architecture, and Arm’s new edge AI platform directly addresses these pain points.
02. 10x AI Computing Performance: Heterogeneous Computing Expands Scenario Adaptability, Supporting Four Major Armv9 Security Features
Against this industry background, Arm has launched Cortex-A320, the first Armv9-based Cortex-A CPU designed for edge AI, along with an edge AI computing platform that combines Cortex-A320 and Ethos-U85.
In terms of AI computing performance and energy efficiency, Cortex-A320 offers up to 10 times the AI computing performance of Cortex-A35 and more than 6 times that of Cortex-A53, which is crucial for enhancing the inference capabilities of edge devices.
Compared with Cortex-A520, Cortex-A320 achieves a 50% improvement in energy efficiency, with a 15% performance improvement over Cortex-A53 and about a 30% improvement over Cortex-A35 at the same chip area.
What changes can such performance and energy-efficiency gains bring to the industry? For instance, compared with the Cortex-M series, Cortex-A320 offers significantly better memory addressing, overall performance, and security protection, enabling it to support a variety of human-machine interaction scenarios with ease, and it excels in visual interaction applications such as video streams.
Imagine running into an acquaintance at an important exhibition but being unable to recall their name: your smart glasses could instantly recognize them and display relevant information, even placing talking points in your line of sight to help you deliver an “impromptu” speech.
As technology advances, edge devices such as smart glasses are gradually becoming indispensable assistants in our lives, freeing our hands so we can focus on the tasks we want to accomplish.
The changes brought by Arm Cortex-A320 are not just in technical parameters; in some areas, they can even reshape business models. The leap in computing power for edge AI fundamentally reconstructs the interaction paradigm between humans, machines, and the physical world.
Beyond performance and energy efficiency, support for heterogeneous computing is equally important.
Here, Arm’s Cortex-A320 and Ethos-U85 together form a collaborative CPU+NPU computing architecture, creating a complete heterogeneous computing platform.
Notably, Ethos-U85 is the third-generation NPU in the Arm Ethos-U product line and currently the most powerful and energy-efficient Ethos NPU, with a 4x performance improvement and a 20% increase in energy efficiency compared to the previous generation.
With heterogeneous computing, any AI operations that developers do not wish to run on Ethos-U85 can fall back to Cortex-A320 and execute on the CPU through its Neon/SVE2 engine, flexibly and efficiently.
In this way, the smart IoT and consumer electronics ecosystem can run the most suitable workloads at the right time and place.
With the CPU and NPU deeply integrated, the new AI computing platform can cover more application scenarios, achieving multimodal perception and understanding of the environment, including vision and natural language, and then running AI agents that autonomously plan and execute complex tasks.
An 8x improvement in machine learning computing performance enables edge AI devices to run large models with over 1 billion parameters locally, helping generative AI based on large models take hold in IoT.
It is also worth mentioning that, at a time when large AI models place heavy demands on memory access performance, Cortex-A320 supports a larger addressable memory space, allowing more flexible management of multi-level memory access latency.
At the same time, Cortex-A320 can run more feature-rich operating systems, making device management more flexible.
Finally, in terms of security, Armv9 supports MTE (Memory Tagging Extension), PAC (Pointer Authentication), BTI (Branch Target Identification), and S-EL2 (Secure EL2) virtualization, providing end-to-end security protection for edge devices.
Overall, at the hardware level, Arm’s heterogeneous computing platform built on Cortex-A320 and Ethos-U85 genuinely helps enterprises meet the new demands of edge AI for performance, energy efficiency, and security.
The many advantages of the Armv9 architecture are beginning to accelerate AI innovation and application deployment for IoT enterprises.
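The CPU fallback described in this section can be pictured as a simple partitioning step: each operator in a model graph is assigned to the NPU if it is supported there and the developer has not opted out, and to the CPU otherwise. The sketch below is only an illustration under stated assumptions. The operator names, the `NPU_SUPPORTED_OPS` set, and the `force_cpu` flag are hypothetical and do not reflect the actual Ethos-U85 operator list or any Arm toolchain API.

```python
from dataclasses import dataclass

# Hypothetical set of operator kinds the NPU is assumed to accelerate.
# This is NOT the real Ethos-U85 operator list.
NPU_SUPPORTED_OPS = {"conv2d", "depthwise_conv2d", "matmul", "relu"}


@dataclass
class Op:
    name: str                # label of the node in the model graph
    kind: str                # operator type, e.g. "conv2d"
    force_cpu: bool = False  # developer opt-out: keep this op off the NPU


def assign_backend(op, npu_ops=NPU_SUPPORTED_OPS):
    """Return "npu" when the op is supported and not opted out, else "cpu"."""
    if op.force_cpu or op.kind not in npu_ops:
        return "cpu"
    return "npu"


def partition(graph):
    """Group consecutive ops that share a backend into (backend, names) segments."""
    segments = []
    for op in graph:
        backend = assign_backend(op)
        if segments and segments[-1][0] == backend:
            segments[-1][1].append(op.name)
        else:
            segments.append((backend, [op.name]))
    return segments


# Illustrative graph: one unsupported op and one developer-forced CPU op.
graph = [
    Op("conv1", "conv2d"),
    Op("act1", "relu"),
    Op("custom_norm", "layer_norm_variant"),  # unsupported kind, falls back
    Op("proj", "matmul"),
    Op("post", "matmul", force_cpu=True),     # supported, but opted out
]

for backend, names in partition(graph):
    print(backend, names)
```

Grouping consecutive operators into segments reflects a common design concern in heterogeneous platforms: every switch between NPU and CPU has a cost, so fewer and longer segments are generally preferable.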
Arm’s edge AI computing platform has reportedly already received support from companies such as Amazon Web Services, Siemens, Renesas Electronics, and Advantech.
03. Arm’s Software Breakthrough Accelerates the Scaling of IoT Edge AI Applications
Of course, Arm’s complete solution goes beyond hardware. Hardware is the foundation and software is the accelerator; together, they address edge AI pain points more efficiently. This time, Arm has also brought KleidiAI to the IoT field to accelerate the deployment of AI applications at the edge.
From the perspective of industry development, a complete software ecosystem lowers the barriers for developers, and software has always been an indispensable part of Arm’s computing platforms.
Edge devices come in a wide variety, with significant differences in hardware performance and functionality. In this setting, good software can optimize algorithms and models so that AI models run efficiently across different edge devices.
Given the limited computing resources of edge devices, software can reduce computing demand through model compression, quantization, and optimized algorithms while preserving AI performance, thereby improving computing efficiency.
KleidiAI is a library of compute kernels designed for AI framework developers, allowing them to achieve optimal performance on Arm CPUs seamlessly across a range of devices.
Since its initial launch for the end-device market last year and its subsequent expansion into infrastructure, KleidiAI has now extended to IoT as well, providing developers in each of these fields with the performance, tools, and software library support they need.
In simple terms, the core function of the Arm Kleidi software library is to help developers accelerate AI applications on Arm CPUs, as most AI inference workloads globally run on Arm CPUs.
Developers do not need to learn new tools and skills or take on complex integration work, which significantly lowers the barrier and cost of developing IoT applications.
With a strong software ecosystem and rich development tool support, the flexibility of the Arm AI computing platform is greatly unlocked.
Arm’s Cortex-A320 is compatible with operating systems such as Linux and real-time operating systems (RTOS) like Zephyr, and relies on Arm Kleidi to support mainstream AI frameworks such as llama.cpp, ExecuTorch, and MediaPipe, achieving a 70% performance improvement, which is crucial for deploying AI applications at the edge.
Ma Jian specifically noted at the launch that this gives Cortex-A320 greater flexibility across market sectors, application scenarios, and operating systems, greatly widening partners’ options and letting them better match different scenario needs when planning product roadmaps.
The expansion of Arm Kleidi into IoT further strengthens Arm’s technological advantage in IoT AI, attracting more developers and enterprises to build IoT applications on the Arm architecture and promoting cooperation and innovation across the upstream and downstream industry chain.
Looking ahead, Arm’s edge AI computing platform will undoubtedly play a crucial role in the IoT ecosystem. With various sub-markets benefiting from Cortex-A320, Arm brings more possibilities to the IoT industry and lays a new foundation for product and application innovation in the IoT market.
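To make the quantization mentioned in this section more concrete, here is a minimal sketch of symmetric per-tensor int8 quantization, together with a rough estimate of how weight storage shrinks for a model with 1 billion parameters. It is a generic textbook technique written in plain Python, not KleidiAI’s or Arm’s implementation; the function names and sample weights are made up for illustration, and the footprint estimate covers weights only, ignoring activations and runtime buffers.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: each weight w is approximated by scale * q."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floating-point weights from int8 values."""
    return [v * scale for v in q]


def weight_footprint_gib(num_params, bits_per_param):
    """Storage needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Illustrative weights (hypothetical values, not taken from any real model).
weights = [0.5, -1.27, 0.0, 0.64]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q, "scale:", scale, "max error:", max_error)

# Weight storage for a 1-billion-parameter model at different precisions.
for bits in (32, 16, 8, 4):
    print(bits, "bits:", round(weight_footprint_gib(1_000_000_000, bits), 2), "GiB")
```

Under these assumptions, the weights of a 1-billion-parameter model take roughly 3.73 GiB at 32 bits but about 0.93 GiB at 8 bits, which illustrates why lower-precision formats matter for running such models within the memory of an edge device.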
04. Conclusion: Accelerating IoT Edge AI with Arm’s Dual Focus on Software and Hardware
In recent years, Arm has been steadily transforming toward a platform-based approach. The launch of Cortex-A320, the first Armv9 processor aimed at IoT, together with the edge AI computing platform centered on Cortex-A320 and Ethos-U85 and the expansion of KleidiAI into IoT, adds momentum to the development and deployment of edge AI through hardware-software collaboration, and strongly demonstrates Arm’s active role in empowering AI across fields.
Looking to the future, AI computing is accelerating from the cloud to the edge, and demand for edge AI computing will undoubtedly continue to grow. As Ma Jian said at the event, “The future of AI is at the edge, and the future of edge AI belongs to Arm.” We look forward to it.
(This article is original content from NetEase News • NetEase’s special content incentive program signed account [Smart Things], unauthorized reproduction is prohibited.)