Case Study: China Mobile’s ARM-Based NFV Solution Deployment

Project Contributors: Duan Xiaodong, Zhang Hao, Cai Yali, Li Ji, Yue Lei, Yu Qing, Chen Jiayuan, Gao Congwen, Qin Jie, Wang Yihui
1. Case Overview
In response to the national call to “accelerate breakthroughs in cutting-edge technologies for network development and key core technologies with international competitiveness, speed up the promotion of domestic autonomous and controllable replacement plans, and build a secure and controllable information technology system,” China Mobile carried out a comprehensive analysis in early 2019 of the risks to the autonomy and controllability of its NFV product system. Combining theoretical research with pilot verification, China Mobile actively promoted the maturation and deployment of ARM servers and their end-to-end NFV solutions, effectively avoiding the risk of depending on a single technical route in its network cloud infrastructure.
1. Case Background
Servers are the foundation that provides computing power for data centers. In hardware, a server mainly consists of a motherboard, processors, memory, hard disks, network chips, and power supplies; in software, it mainly includes the operating system, middleware, and applications. Together these form the server hardware and software ecosystem. Intel’s x86 ecosystem remains mainstream today, but with the release of a new generation of general-purpose ARM processors such as Huawei’s Kunpeng, ARM-based servers are expected to make breakthroughs in the market.
x86 and ARM are two independent processor architectures whose main difference is the instruction set. x86 uses a CISC instruction set, while ARM uses a RISC instruction set, which is less constrained by backward compatibility and carries little historical burden. Through optimization for specific services, ARM can greatly improve processing efficiency and has become a strong competitor to x86. Many companies worldwide have developed ARM processors, including Huawei HiSilicon, Marvell, Amazon, Ampere, and Feiteng, and their products have been deployed at scale.
ARM processors originated in low-power, low-computation scenarios in the mobile internet field, such as smartphones and wearable devices, and now hold over 99% of the mobile market. As ARM technology has advanced, the architecture has supported 64-bit computing since the ARMv8 instruction set and has introduced more physical cores to meet the computing power needs of data centers. The resulting gains in multi-core performance bring a qualitative improvement to the processors and support the expansion of the ARM architecture from mobile devices to data center servers.
In the software ecosystem, the number of operating systems supporting the ARM architecture is increasing and includes Red Hat, SUSE, Ubuntu, Kylin, and Deepin OS. In the telecommunications field, Huawei’s cloud-based 4G/5G core network products fully support ARM.

Figure 1 ARM Processor Development and Operating System Support
2. User Needs and Pain Points
Before 2018, China Mobile’s centrally procured servers were mainly based on the x86 architecture and relied on a single CPU supplier. Since 2019, changes in the external environment have increased the risks of depending on a single technical route, and the supply chain faces security issues, posing significant challenges to the construction of China Mobile’s cloud infrastructure. With 5G commercialization imminent, network security also faces higher requirements.
Against this backdrop, Huawei launched ARM-based NFV system products. To comprehensively test product maturity and seek effective measures to avoid supply chain risks, China Mobile initiated a feasibility study on introducing the ARM technology route and constructing a dual-plane resource pool.
3. Case Overview
China Mobile’s NFV technical architecture follows the ETSI standard “three layers and one domain” architecture and defines the functions and interfaces of the orchestration management domain according to its own operation and maintenance needs. Because the core differences between the ARM and x86 architectures stem from their instruction sets, the impact of migrating to ARM grows the closer a component sits to the underlying hardware. The feasibility study for introducing ARM therefore focuses first on the underlying hardware capabilities and how they compare with x86, and second on whether the platform layer can stably support upper-layer business capabilities and meet the “five nines” (99.999% availability) reliability requirement of basic communication services.

Figure 2 The Impact of Introducing ARM on the NFV System
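To make the “five nines” reliability target mentioned above concrete, the following minimal sketch converts an availability figure into a yearly downtime budget. The function name and the 365-day year are illustrative assumptions, not part of China Mobile’s specifications.

```python
def allowed_downtime_minutes(availability: float,
                             period_hours: float = 365 * 24) -> float:
    """Return the downtime budget, in minutes, for an availability target.

    availability is a fraction between 0 and 1 (0.99999 for "five nines").
    period_hours defaults to one non-leap year (an assumption).
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_hours * 60.0


five_nines = allowed_downtime_minutes(0.99999)
print(f"{five_nines:.2f} minutes per year")  # prints "5.26 minutes per year"
```

In other words, a platform layer that meets this requirement may be unavailable for only a little over five minutes per year in total.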
Following preliminary theoretical analysis and research, China Mobile launched a pilot project for the ARM-based NFV solution. The pilot ran from early June 2019 to late December of the same year and was conducted in parallel in the laboratory and at a field site in Zhejiang. The verification covered servers, distributed storage, the virtual layer, MANO, and IMS/EPC/5GC, and fully validated the functionality, three-layer decoupling, performance, reliability, and security of China Mobile’s network cloud platform and network elements.
The pilot verification addressed each layer in turn. At the hardware layer, a cross-architecture performance evaluation system was established for the first time, server performance was tested in detail, and the Redfish management interface was introduced to achieve unified management of different server types from different vendors. At the virtual layer, the project studied the ARM-architecture Linux kernel, virtualization components, and OpenStack’s ARM adaptation and telecom-grade enhancements. At the business layer, end-to-end business processes and performance were tested, including core network elements pooled across heterogeneous resource pools and MANO’s unified management of heterogeneous resource pools. The project ultimately produced a technical specification system based on more than 60 company standards and over 4,000 use cases. After nearly six months of rigorous testing, the results showed that the ARM-based NFV end-to-end system can meet the commercial requirements of China Mobile’s network cloud.
2. Solution
1. Technical Architecture
China Mobile’s ARM-based NFV solution uses the same architecture as the x86-based solution, following the ETSI standard “three layers and one domain” architecture shown in the figure below: the hardware layer, virtual layer, network element layer, and orchestration management domain. The hardware layer adopts ARM computing servers and distributed storage, the virtual layer uses OpenStack, and all virtualized network elements and resources are managed uniformly by MANO.

Figure 3 ARM-Based NFV System Architecture
2. Technical Advantages and Highlights of the Solution
(1) Hardware: Innovatively Proposes a Cross-Architecture Hardware Performance Evaluation System and Develops Unified Hardware Management Based on Redfish
The hardware research is based on Huawei’s Taishan ARM servers with Kunpeng 920 processors, which use a 7nm process, run at a clock frequency of 2.6GHz, and support up to 64 cores. Each server is configured with three 10G network cards to physically isolate the business, management, and storage networks.
Because of the limitations of its test algorithms, the industry-standard SPEC CPU tool cannot fully reflect a CPU’s capabilities in areas such as concurrent computing. For the performance comparison of contemporary ARM and x86 processors, the project therefore proposed a cross-architecture processor performance testing system. Its main benchmarking items are concurrent processing capability, real-time capability, and encryption and decryption processing capability (a traditional x86 strength), so that processor performance is validated across multiple dimensions. Because NFV provides virtualized computing resources to upper-layer applications, unit computing capability was also included as an important comparison dimension. Testing showed that the single-core capability of the ARM processor is lower than the single-thread capability of the same-generation x86 processor, which is consistent with ARM’s lightweight-core, many-core architecture. It is therefore recommended that ARM servers in cloud computing scenarios leverage their multi-core advantage by allocating more computing resources, ensuring upper-layer application performance.
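The recommendation to allocate more computing resources on ARM can be illustrated with a simple sizing calculation. This is a minimal sketch that assumes throughput scales linearly with vCPU count; the function name and the ratio values are hypothetical placeholders, not results from the pilot, and a real ratio would have to be measured per workload.

```python
import math


def arm_vcpus_needed(x86_vcpus: int, arm_per_vcpu_ratio: float) -> int:
    """Estimate the vCPUs to allocate on ARM to match an x86 allocation.

    arm_per_vcpu_ratio is the measured per-vCPU throughput on ARM divided
    by the per-vCPU throughput on x86 for the same workload. Linear
    scaling with vCPU count is an assumption of this sketch.
    """
    if x86_vcpus <= 0:
        raise ValueError("x86_vcpus must be positive")
    if arm_per_vcpu_ratio <= 0:
        raise ValueError("arm_per_vcpu_ratio must be positive")
    return math.ceil(x86_vcpus / arm_per_vcpu_ratio)


# Hypothetical example: a VNF sized at 8 vCPUs on x86, with ARM
# delivering 75% of the per-vCPU throughput for this workload.
print(arm_vcpus_needed(8, 0.75))  # prints 11
```

Rounding up keeps the ARM allocation from falling short of the x86 baseline, at the cost of a small amount of spare capacity.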
To reduce the adaptation and development cost of upper-layer management platforms that interface with multiple vendors and server models, and to further promote the complete decoupling of software and hardware, a unified server management interface should be provided for upper-layer calls. In this project, the Redfish interface protocol was applied for the first time to the out-of-band management interface of ARM servers. Redfish is based on a RESTful architecture and is promoted by the DMTF standards organization. Compared with the traditional IPMI and SNMP interfaces, Redfish is fully functional, has an advanced architecture, and is highly scalable; it represents data in JSON, a clear, self-describing format that needs no custom decoding; and it runs over HTTPS, ensuring high security. Through the unified Redfish management interface, more than 180 indicators are defined covering server asset management, component information queries, sensor monitoring, power and fan management, fault alarms, log management, and BMC and BIOS parameter configuration, significantly improving management and adaptation efficiency.
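As an illustration of why JSON-based Redfish data is easy for an upper-layer platform to consume, the sketch below inspects a temperature payload shaped like a Redfish Thermal resource. The payload is hand-written and hypothetical, and the helper name `sensors_needing_attention` is an assumption of this example; a real client would retrieve the resource from the server’s BMC over HTTPS rather than from a string.

```python
import json

# Hand-written, hypothetical payload shaped like a Redfish Thermal resource.
SAMPLE_THERMAL = """
{
  "@odata.id": "/redfish/v1/Chassis/1/Thermal",
  "Temperatures": [
    {"Name": "CPU1 Temp", "ReadingCelsius": 48,
     "UpperThresholdCritical": 95,
     "Status": {"State": "Enabled", "Health": "OK"}},
    {"Name": "Inlet Temp", "ReadingCelsius": 51,
     "UpperThresholdCritical": 50,
     "Status": {"State": "Enabled", "Health": "Critical"}}
  ]
}
"""


def sensors_needing_attention(thermal: dict) -> list:
    """Return names of sensors that report bad health or exceed their limit."""
    flagged = []
    for sensor in thermal.get("Temperatures", []):
        health = sensor.get("Status", {}).get("Health")
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        over_limit = (reading is not None and limit is not None
                      and reading >= limit)
        if health not in (None, "OK") or over_limit:
            flagged.append(sensor.get("Name", "<unnamed>"))
    return flagged


thermal = json.loads(SAMPLE_THERMAL)
print(sensors_needing_attention(thermal))  # prints ['Inlet Temp']
```

Because the same property names apply regardless of vendor or processor architecture, one routine of this kind can serve both ARM and x86 servers.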
(2) Virtual Layer: Complete the Overall Migration from x86 to ARM, Retaining Telecom-Grade Enhancements
The virtual layer is responsible for the virtualization, abstraction, and management of hardware resources and is therefore closely tied to the underlying hardware architecture. Migrating the virtual layer from x86 to ARM involves switching the entire operating system (Linux kernel and virtualization components) and adapting OpenStack to ARM. The operating system switch adds new requirements for hyper-threading functionality, ARM CPU instruction sets, and image specifications. On this basis, a large number of telecom-grade enhancements were made, reflected in the standards for software-hardware decoupling, virtual interrupts, virtual machine migration performance, and forwarding performance optimization. The adaptation of OpenStack to ARM is reflected in the implementation details and operational capabilities of components such as Nova, Glance, and Neutron.
Verification of the ARM-based virtual layer covered system real-time performance, scaling performance, fault recovery performance, performance degradation, and other aspects. Among these, the ARM virtual layer software showed advantages in concurrent creation and deletion operations.
(3) Network Element Layer: Complete Application Software Adaptation and Build Dual-Plane Resource Pools to Ensure Business Disaster Recovery
Migrating network element software from x86 to ARM requires adaptation and performance tuning for the new architecture to meet the operational requirements of basic communication services. Introducing the ARM technology architecture makes it possible to construct a dual-plane network cloud resource pool of x86 and ARM, significantly enhancing business disaster recovery and security capabilities.
  • Adaptation of Network Element Software to ARM

There are two ways to port software from another architecture to ARM. The first is to re-adapt, recompile, and tune the source code so that the software supports the ARM architecture natively; this method suits software with high performance requirements, such as telecommunications products. The second is to use instruction translation technology to run the software directly on ARM without access to the source code; it requires no new development or adaptation, but overall performance is only about 50%-80% of a native application, so it suits applications with low performance requirements. This project adopted the first method, adapting and recompiling the GuestOS software for IMS, EPC, and 5GC network elements, with a particular focus on performance tuning for media forwarding network elements, so that business functions are preserved while the deployment requirements for compatibility, functionality, and performance are met.

Figure 5 Adaptation Process of Network Element Software to ARM Architecture
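Once software has been recompiled natively for each architecture, a deployment tool has to pick the build that matches the host. The following minimal sketch shows one way to do this with Python’s standard `platform.machine()`; the alias table, the function names, and the package file names are hypothetical and are not taken from the project.

```python
import platform

# Common spellings reported by different operating systems, mapped to
# two canonical names. The table is illustrative, not exhaustive.
_ARCH_ALIASES = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "x86_64": "x86_64",
    "amd64": "x86_64",
}


def normalize_arch(machine: str) -> str:
    """Map a machine string such as 'aarch64' or 'AMD64' to a canonical name."""
    key = machine.strip().lower()
    if key not in _ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {machine!r}")
    return _ARCH_ALIASES[key]


def select_package(packages: dict, machine: str = None) -> str:
    """Return the natively compiled package for the given (or local) host."""
    arch = normalize_arch(machine or platform.machine())
    return packages[arch]


# Hypothetical package names for one network element.
PACKAGES = {"arm64": "cscf-arm64.qcow2", "x86_64": "cscf-x86_64.qcow2"}
print(select_package(PACKAGES, "aarch64"))  # prints cscf-arm64.qcow2
```

Rejecting unknown architectures outright, rather than falling back to a default build, avoids silently deploying a binary that cannot run on the host.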
  • Dual-Plane Resource Pools of ARM and x86 Support Business Disaster Recovery

ARM-based core network elements for services such as IMS, EPC, and 5GC can be pooled together with their x86 counterparts. Network elements deployed in the ARM resource pool, such as CSCF, MME, and AMF, pool indistinguishably with the same types of network elements in the x86 resource pool. The two resource pools are deployed independently, forming a dual-plane resource pool of x86 and ARM with clear orchestration and scheduling processes and low operational difficulty, which effectively ensures the secure and reliable operation of services.

Figure 7 Dual-Plane Disaster Recovery of ARM and x86 Resource Pools
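The dual-plane pooling described above can be sketched as a pool whose members are selected without regard to architecture, so that the loss of one plane leaves the other serving traffic. This is a simplified illustration using round-robin selection; the class names, member names, and selection policy are assumptions of this example, not the mechanism used by the actual network elements.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    plane: str            # "arm" or "x86"
    healthy: bool = True


class DualPlanePool:
    """Round-robin pool that treats ARM and x86 members identically."""

    def __init__(self, members):
        self.members = list(members)
        self._next = 0

    def pick(self) -> Member:
        """Return the next healthy member, skipping failed ones."""
        count = len(self.members)
        for offset in range(count):
            index = (self._next + offset) % count
            member = self.members[index]
            if member.healthy:
                self._next = (index + 1) % count
                return member
        raise RuntimeError("no healthy member in pool")

    def fail_plane(self, plane: str) -> None:
        """Mark every member in one plane as failed (simulated outage)."""
        for member in self.members:
            if member.plane == plane:
                member.healthy = False


pool = DualPlanePool([
    Member("amf-arm-1", "arm"),
    Member("amf-x86-1", "x86"),
    Member("amf-arm-2", "arm"),
])
print([pool.pick().name for _ in range(3)])
# prints ['amf-arm-1', 'amf-x86-1', 'amf-arm-2']
pool.fail_plane("arm")
print([pool.pick().name for _ in range(2)])
# prints ['amf-x86-1', 'amf-x86-1']
```

Because selection never inspects the `plane` field, callers see a single pool, and disaster recovery reduces to health tracking.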
(4) MANO: Reduce Operational Difficulty Through Unified Management of ARM and x86 Resource Pools
The introduction of the ARM technology route expands the resource pool types from a single x86 resource pool to the coexistence of x86 and ARM resource pools. To maintain business flexibility and reduce deployment dependencies, business network elements and management domain elements must be deployable in both types of resource pool, which means that MANO must be able to manage both types simultaneously, as shown in the figure below:

Figure 4 MANO Unified Management of Dual-Plane Resource Pools
The pilot verification shows that MANO has the capability to manage resource pools based on both ARM and x86 uniformly, including:
  • Resource Management: Supports unified resource management and display of VIM resource pools based on ARM and x86;

  • VNF/NS Lifecycle Management: Supports unified lifecycle management of VNFs and NSs within ARM and x86 resource pools;

  • Alarm Management: Supports unified alarm management for VIMs based on ARM and x86, with alarm filtering and processing according to ARM/x86 VIM types;

  • Performance Management: Supports unified performance management for VIMs based on ARM/x86.
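The unified management capabilities listed above can be sketched as a single inventory that records the architecture of each VIM and uses it for resource display, alarm filtering, and VNF placement. The class names, fields, and placement rule below are hypothetical and do not represent the ETSI MANO interfaces or any vendor implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vim:
    vim_id: str
    arch: str             # "arm" or "x86"
    free_vcpus: int


@dataclass(frozen=True)
class Alarm:
    vim_id: str
    severity: str
    text: str


class UnifiedInventory:
    """One view over ARM and x86 VIM resource pools."""

    def __init__(self, vims):
        self._vims = {vim.vim_id: vim for vim in vims}

    def free_vcpus_by_arch(self) -> dict:
        """Resource management: total free vCPUs per architecture."""
        totals = {}
        for vim in self._vims.values():
            totals[vim.arch] = totals.get(vim.arch, 0) + vim.free_vcpus
        return totals

    def alarms_for_arch(self, alarms, arch: str) -> list:
        """Alarm management: keep alarms raised by VIMs of one type."""
        return [alarm for alarm in alarms
                if alarm.vim_id in self._vims
                and self._vims[alarm.vim_id].arch == arch]

    def place(self, supported_archs, vcpus: int) -> Vim:
        """Lifecycle management: choose a VIM the VNF image can run on."""
        candidates = [vim for vim in self._vims.values()
                      if vim.arch in supported_archs
                      and vim.free_vcpus >= vcpus]
        if not candidates:
            raise LookupError("no VIM can host this VNF")
        return max(candidates, key=lambda vim: vim.free_vcpus)


inventory = UnifiedInventory([
    Vim("vim-arm-1", "arm", 128),
    Vim("vim-x86-1", "x86", 96),
])
alarms = [
    Alarm("vim-arm-1", "major", "host unreachable"),
    Alarm("vim-x86-1", "minor", "disk usage high"),
]
print(inventory.free_vcpus_by_arch())  # prints {'arm': 128, 'x86': 96}
print(inventory.place({"x86"}, 64).vim_id)  # prints vim-x86-1
```

A VNF that ships images for both architectures would pass `{"arm", "x86"}` to `place`, leaving the choice of plane to the orchestrator.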

3. Commercial Value
The pilot project of China Mobile’s ARM-based NFV solution produced a complete technical solution based on the ARM system and built a dual x86+ARM hardware platform for the network cloud, covering the underlying hardware, the platform layer, and upper-layer services. It validated the feasibility of a dual x86 and ARM technology route in the network cloud system, provided an alternative that avoids reliance on a single technology and guards against supply chain security risks, and accumulated experience for the later introduction of more diverse technology routes. The pilot project successfully supported Zhejiang Mobile in building the world’s first 5G SA cloud network based on ARM’s autonomous and controllable technology. Based on the pilot results, Southeast China, one of China Mobile’s eight major regions, is deploying its 5G network on ARM, which may give it a competitive edge in future 5G business development.
The pilot project has also effectively promoted the maturity and technological evolution of equipment suppliers’ products. Comprehensive adaptation and systematic research on the evolution of the ARM architecture are being carried out for the NFV application scenario, accelerating suppliers’ technology iteration and product maturity and supporting phased rollout and mass commercialization in NFV deployment.
As the world’s largest NFV operator, China Mobile has proactively introduced the ARM technology route and achieved the world’s first large-scale application of ARM-based NFV solutions in the telecommunications field. To address the relatively weak ARM application ecosystem in the data center field, it is actively expanding the upper-layer application ecosystem and demonstrating the commercial prospects of ARM in telecommunications to the industry. This supports the company’s future business development and energizes ecosystem partners, which is conducive to building a healthy and open ARM ecosystem.

END

This case study was selected as one of the award-winning cases in the “2019 Annual SDN, NFV, and Network AI Excellent Case Collection and Evaluation” event, guided by the SDN/NFV/AI Standards and Industry Promotion Committee, and jointly hosted by C114 and IT168 in September 2020.

About the Green Computing Industry Alliance (GCC)
Since its establishment in 2016, GCC has aimed to collaboratively build a green, open, autonomous, and shared ecosystem. It is committed to promoting the development of the green computing industry, building a platform for industry communication and cooperation, strengthening enterprises in fields such as PCs, servers, storage, operating systems, and databases, and promoting win-win cooperation in the computing field. It has become the global alliance with the most complete set of ARM infrastructure server chip partners, including Tianjin Feiteng, HiSilicon, Marvell, Ampere, and others.
