Produced by | OSCHINA
Written by | Da Dong
In the era of cloud-native, the Go language has become the preferred language for cloud-native infrastructure construction due to its native support for high concurrency, ranking high in various programming language rankings and becoming one of the fastest-growing emerging programming languages. In contrast, Rust, which was created to replace C/C++, has been relatively lukewarm for a long time, falling into the awkward situation of being well-received but not widely adopted.
Fortunately, as the cloud-native wave gradually enters its second half, enterprises and users have higher security requirements for data, and new concepts such as “trusted native” and “confidential computing” have emerged in the industry, attracting many software and hardware giants to enter the field, providing Rust, a high-performance programming language focused on memory safety, with an excellent development opportunity.
Ant Group, a pioneer in the field of confidential computing technology both domestically and internationally, has its open-source SOFAEnclave confidential computing solution adopted by several cloud computing giants, including Microsoft. The technical papers produced by the team have also been published multiple times at international top conferences, receiving wide recognition from both industry and academia. To gain deeper insights into this cutting-edge technology field, we invited Yan Shoumeng, the Director of Confidential Computing at Ant Group, to unveil the mysteries of confidential computing.
What is Confidential Computing?
With the rapid development of cloud computing, more and more critical services and high-value data have been migrated to the cloud, making cloud security a focus of attention in both academia and industry.
Confidential computing fills a gap in current cloud security—encryption of data-in-use. The past practice was to encrypt data at rest (e.g., on hard drives) and in transit (e.g., over networks), while decrypting it during use (e.g., in memory) for processing. Confidential computing can protect the confidentiality and integrity of data in use.
In other words, with confidential computing technology, traditional enterprises with high data security requirements, such as finance, banking, and government enterprises, can confidently use public cloud services.
Currently, many cloud computing giants are promoting this technology: Microsoft announced in July 2017 that it began accepting early trial applications for Azure confidential computing; IBM announced the preview version of IBM Cloud Data Guard in December 2017; Google also open-sourced a confidential computing framework named Asylo in May 2018.
In August 2019, the Linux Foundation announced the formation of the Confidential Computing Consortium (CCC) with several technology giants, including Alibaba, Arm, Baidu, Google, IBM, Intel, Microsoft, Red Hat, Swisscom, and Tencent. This has brought confidential computing technology into the view of more developers.
So, how is confidential computing achieved?
In fact, all the aforementioned cloud computing giants rely on a technology called “Trusted Execution Environment (TEE)” to implement confidential computing.
As the name suggests, TEE provides a secure computing environment isolated from untrusted environments, and it is this isolation and trusted verification mechanism that makes confidential computing possible.
TEE is generally implemented directly based on hardware, such as Intel SGX, Intel TDX, AMD SEV, ARM TrustZone, and RISC-V Keystone; TEE can also be constructed based on virtualization technology, such as Microsoft’s VSM, Intel’s Trusty for iKGT & ACRN.
Among them, Intel Software Guard Extensions (SGX) is currently the most advanced TEE implementation in commercial CPUs, providing a new instruction set that allows users to define a secure memory area called an Enclave. The CPU ensures that the Enclave is strongly isolated from the outside world and provides memory encryption and remote attestation mechanisms to protect the confidentiality, integrity, and verifiability of Enclave code and data. Unlike previous TEE implementations, such as ARM TrustZone, SGX allows each app to have its own independent TEE and even create multiple TEEs, while TrustZone has only one TEE for the entire system; this also eliminates the need to request device manufacturers to load Trusted Applications (TAs) into the TEE. Due to the advanced nature of SGX, the cloud-based confidential computing field has even recognized the term Enclave to refer to TEE.
Confidential computing thus addresses the trust concerns that users with sensitive business data have about cloud-native platforms. This gives rise to the concept of “trusted native,” in which cloud-native infrastructure becomes more trustworthy from the user’s point of view.
Challenges in Developing Confidential Computing Applications
Clearly, the maturity of this technology means that users with sensitive business data, including finance and government enterprises, can migrate to the cloud, presenting a broad market prospect. Although this sounds promising, confidential computing still faces many challenges in practical applications.
- First, the Enclave is a restricted environment, and the programming interfaces and models differ significantly from the familiar Linux environment for developers.
- Second, developers need to invest effort in learning various Enclave hardware architectures available in the market.
- Furthermore, the mainstream cluster scheduling systems (like K8s) do not yet support Enclave, limiting the large-scale usage of Enclaves.
Take the development of confidential computing applications on Intel SGX CPUs as an example.
SGX applications are based on a partitioned model: the user-mode (untrusted) application (shown in red in the image) can embed a protected area (shown in green in the image) that is protected by SGX TEE, called an Enclave. Intel CPUs that support SGX ensure that the protected content within the Enclave is encrypted in memory and strongly isolated from the outside. If external code wants to enter the Enclave to execute trusted code, it must go through a designated entry point, which can implement access control and security checks to ensure that the Enclave cannot be abused by the outside.
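The idea of a designated entry point can be sketched with a toy model. This is a plain Python simulation, not SGX code: the class ToyEnclave, the method ecall, and the entry point name check_password are all hypothetical names chosen for illustration, and Python itself enforces none of the isolation that the CPU provides for a real Enclave.

```python
import hmac


class ToyEnclave:
    """Toy model: private state that is reachable only through declared entry points."""

    def __init__(self, secret: bytes):
        # Stands in for data that a real Enclave keeps encrypted in memory.
        self.__secret = secret
        # The only functions the untrusted side is allowed to invoke.
        self._entry_points = {"check_password": self._check_password}

    def ecall(self, name: str, *args):
        """The single gate into the enclave; undeclared names are rejected."""
        handler = self._entry_points.get(name)
        if handler is None:
            raise PermissionError(f"{name!r} is not a declared entry point")
        return handler(*args)

    def _check_password(self, candidate: bytes) -> bool:
        # The comparison runs "inside"; only a yes/no answer leaves the enclave.
        return hmac.compare_digest(candidate, self.__secret)
```

In this sketch the untrusted side would call enclave.ecall("check_password", b"guess") and receive only a boolean. A request for any name that was not declared, such as a hypothetical "read_secret", raises PermissionError at the gate.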
Since SGX applications are based on this partitioned architecture, application developers typically need to use some SGX SDKs, such as Intel SGX SDK, Open Enclave SDK, Google Asylo, or Apache Rust SGX SDK, etc. However, regardless of which SDK is used, developers will encounter the following development dilemmas:
- They must partition the target application: developers need to decide which components should be placed inside the Enclave and which should be outside, as well as how the two sides will communicate. For complex applications, determining an efficient, reasonable, and secure partitioning scheme is itself a considerable challenge, not to mention the engineering effort required to implement the partition.
- They are limited to a specific programming language: regardless of which SDK is used, a developer will be restricted to the language supported by that SDK, which usually means C/C++ (when using Intel SGX SDK, Open Enclave SDK, or Google Asylo), and cannot use more user-friendly programming languages such as Java, Python, or Go.
- They can only obtain very limited functionality: due to hardware limitations and security considerations, the Enclave cannot directly access the (untrusted) OS outside the Enclave. The lack of OS support within the Enclave means that various SDKs can only provide a very small subset of functionalities that would be available in an untrusted environment, making it impossible for many existing software libraries or tools to run within the Enclave.
The above dilemmas make developing applications for SGX a painful experience, restricting the popularity and acceptance of SGX and confidential computing.
Ant Group’s Confidential Computing Software Stack SOFAEnclave
To address these challenges, Ant Group developed the SOFAEnclave confidential computing software stack, which consists of three parts as shown in the figure:
Occlum LibOS
Occlum is Ant Group’s open-source TEE operating system and the first open-source project initiated by a Chinese company in the Confidential Computing Consortium (CCC).
Occlum provides a POSIX programming interface, supports various mainstream languages (C/C++, Java, Python, Go, Rust, etc.), and supports multiple secure file systems. It can be said that Occlum provides a Linux-compatible Enclave runtime environment, making it easy for confidential computing to support existing applications and enabling confidential application developers to reuse their existing development skills. Occlum has been widely used in industrial scenarios and has also published academic papers at top system conferences like ASPLOS 2020, representing the leading level of the confidential computing industry.
From an architectural perspective, Occlum not only provides basic Linux-like operating system capabilities but also offers a user interface similar to Docker: commands such as occlum build and occlum run resemble their Docker counterparts.
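To make the appeal of a POSIX-compatible runtime concrete, here is a small sketch of the kind of program that contains nothing Enclave-specific. It uses only ordinary file calls from the Python standard library; the function name save_and_load and the file name record.bin are illustrative assumptions. How such a script is packaged and launched under Occlum is not shown here and would follow the project’s own documentation.

```python
import os
import tempfile


def save_and_load(directory: str, payload: bytes) -> bytes:
    """Write a small file and read it back using only ordinary OS-level calls."""
    path = os.path.join(directory, "record.bin")

    # open/write/close: the same calls an existing Linux application would make.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(save_and_load(workdir, b"hello from an ordinary program"))
```

The point of the sketch is what is absent: no partitioning into trusted and untrusted halves, and no SDK-specific calls. With a partitioning SDK, the same logic would have to be split and its file access routed through explicitly defined interfaces.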
In terms of community, Occlum is the default runtime for Alibaba’s Inclavare Containers and is collaborating with other community projects like Hyperledger Avalon. At the same time, Occlum has been donated to the Confidential Computing Consortium (CCC) and is currently the only open-source project from China. Additionally, it is worth mentioning that Microsoft Azure Cloud introduced new advancements in confidential computing technology at the Microsoft Ignite conference last September and publicly recommended developing confidential computing applications based on Occlum on Azure.
Occlum open-source address: https://github.com/occlum/occlum
HyperEnclave
As mentioned earlier, there are currently multiple Enclave hardware platforms in the market. Each of these Enclaves has its own characteristics, but they also impose a significant learning burden on developers. As users of these hardware, Ant’s technology team hopes for a unified Enclave abstraction and also desires more flexible control over the startup and attestation of Enclaves.
To address these issues, Ant’s confidential computing team proposed the HyperEnclave confidential computing hardware virtualization technology. This is a unified Enclave platform. As an abstraction layer, it can map to various existing Enclave hardware implementations and utilize future hardware capabilities, such as Intel MKTME/TDX. “It can even support machines without Enclave extensions; on such machines, we implemented an isolation mechanism based on virtualization technology—we developed a Type 1.5 hypervisor to create and manage virtualization-based Enclaves. In terms of trust, we implemented a user-controllable trust mechanism based on trusted computing technologies (such as TPM, etc.).”
Combined with memory encryption hardware capabilities such as AMD SEV or Intel MKTME, HyperEnclave can also protect against physical attacks. Notably, HyperEnclave supports existing Enclave SDKs. This means that users’ existing Enclave applications, which could previously run only on x86 platforms, can now run on any hardware platform supported by HyperEnclave (including domestic CPUs). This greatly eases the cross-platform porting of Enclave code while giving users more flexible control over the trust chain.
Let’s take a closer look at the lifecycle stages of this system. First, the Linux system starts as usual. Next, the hypervisor module begins to load. After the hypervisor is loaded, it demotes the original Linux host to an untrusted guest.
This hypervisor supports the creation of Enclave virtual machines. Enclave virtual machines support the partitioned programming model provided by traditional confidential computing SDKs. Enclave virtual machines also support running the entire application inside the Enclave using Occlum.
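The lifecycle described above can be written down as a small state machine. This is only a model of the ordering of the stages as the article describes them, not HyperEnclave code: the names ToyPlatform, Stage, and the mode strings "sdk-partitioned" and "libos" are hypothetical.

```python
from enum import Enum, auto


class Stage(Enum):
    LINUX_BOOTED = auto()
    HYPERVISOR_LOADED = auto()
    HOST_DEMOTED = auto()


class ToyPlatform:
    """Models the ordering of stages only; no real virtualization is involved."""

    # Two ways of using an enclave VM: a partitioned SDK app, or a whole app on a LibOS.
    MODES = {"sdk-partitioned", "libos"}

    def __init__(self):
        self.stage = Stage.LINUX_BOOTED
        self.linux_role = "host"
        self.enclave_vms = []

    def load_hypervisor(self):
        if self.stage is not Stage.LINUX_BOOTED:
            raise RuntimeError("the hypervisor is loaded once, after Linux has booted")
        self.stage = Stage.HYPERVISOR_LOADED

    def demote_host(self):
        if self.stage is not Stage.HYPERVISOR_LOADED:
            raise RuntimeError("the hypervisor must be loaded before demotion")
        self.linux_role = "untrusted guest"
        self.stage = Stage.HOST_DEMOTED

    def create_enclave_vm(self, name: str, mode: str):
        if self.stage is not Stage.HOST_DEMOTED:
            raise RuntimeError("enclave VMs exist only after the host is demoted")
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        self.enclave_vms.append((name, mode))
```

The checks in each method encode the one property worth noticing: an enclave VM cannot be created while Linux still holds the host role.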
To summarize the characteristics of this virtualization technology:
First, the design principle is security-first. The TCB is a very small, formally verifiable hypervisor developed in Rust, a memory-safe language.
Second, it supports trusted boot and remote attestation based on TPM/TXT.
Third, it is compatible with the existing Linux ecosystem. As mentioned earlier, this is a Type 1.5 Hypervisor, meaning it has characteristics of both Type 1 and Type 2 hypervisors; more specifically, it boots like Type 2 but runs like Type 1. This allows us to adapt well to the current mainstream Linux deployment methods. Additionally, this hypervisor can work well with KVM inside the demoted Linux.
Fourth, we can easily introduce hardware-provided memory encryption capabilities, such as Intel MKTME/TDX or AMD SEV.
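The TPM-based trusted boot mentioned in the second point rests on a measurement chain, which can be sketched in a few lines. The sketch uses the standard TPM-style extend operation, in which the new register value is the hash of the old value concatenated with the digest of the next component. The function names, the component byte strings, and the assumption that the register starts at all zeros are illustrative; this is not HyperEnclave’s actual boot code.

```python
import hashlib


def extend(pcr: bytes, component: bytes) -> bytes:
    """TPM-style extend: new PCR = SHA-256(old PCR || SHA-256(component))."""
    return hashlib.sha256(pcr + hashlib.sha256(component).digest()).digest()


def measure_chain(components) -> bytes:
    """Fold a boot sequence into one register value, in order."""
    pcr = bytes(32)  # assumption: the register starts at all zeros
    for component in components:
        pcr = extend(pcr, component)
    return pcr
```

Because each value depends on everything measured before it, changing any component or swapping the order of two components yields a different final value. A remote verifier that knows the expected value can therefore tell whether the expected software stack was booted.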
KubeTEE
The previously mentioned Occlum and HyperEnclave technologies are still aimed at single computing nodes. However, current internet applications are based on large-scale clusters, especially those based on Kubernetes. Kubernetes provides many basic capabilities for cluster management, scheduling, and monitoring, but these capabilities do not fit well into confidential computing scenarios. First, we need Kubernetes to recognize Enclave hardware, expose enclaves to containers, monitor Enclave resources, and handle Enclave-specific transactions like remote attestation, etc.
Yan Shoumeng’s team developed KubeTEE, which is an organic combination of Kubernetes and Enclave, i.e., TEE. Based on KubeTEE, users can easily manage confidential computing clusters using Kubernetes workflows, deploy Enclave services, utilize Enclave middleware, and more.
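What it means for a scheduler to recognize Enclave hardware can be illustrated with a toy filtering step, loosely modelled on the way Kubernetes treats extended resources advertised by nodes. This is not KubeTEE’s implementation: the resource name example.com/enclave-memory-mib and the Node and Pod classes are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical resource name; a real cluster would use whatever its device plugin advertises.
ENCLAVE_RESOURCE = "example.com/enclave-memory-mib"


@dataclass
class Node:
    name: str
    allocatable: dict  # resource name -> available quantity


@dataclass
class Pod:
    name: str
    requests: dict  # resource name -> requested quantity


def feasible_nodes(pod: Pod, nodes: list) -> list:
    """Return the names of nodes that can satisfy every resource the pod requests."""
    return [
        node.name
        for node in nodes
        if all(node.allocatable.get(res, 0) >= qty for res, qty in pod.requests.items())
    ]
```

A pod that requests the enclave resource is placed only on nodes that advertise it, while ordinary pods can still land anywhere. Monitoring Enclave resources and handling remote attestation, which the article also lists, are outside this sketch.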
KubeTEE also includes a component called AECS, based on the remote attestation mechanism of confidential computing, which simplifies the key distribution and deployment process for Enclaves within the cluster.
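Key distribution gated by remote attestation can be sketched as follows. The sketch is deliberately simplified and is not the AECS protocol: a shared HMAC key stands in for the hardware-rooted signature that real attestation uses, and every name in it (HARDWARE_KEY, make_quote, ToyKeyService) is hypothetical.

```python
import hashlib
import hmac

# Stand-in for a hardware-rooted attestation key; real schemes use asymmetric signatures.
HARDWARE_KEY = b"toy-hardware-root-key"


def measure(code: bytes) -> bytes:
    """The 'identity' of an enclave: a hash of the code loaded into it."""
    return hashlib.sha256(code).digest()


def make_quote(code: bytes) -> dict:
    """What the (simulated) hardware issues: a measurement plus a MAC over it."""
    measurement = measure(code)
    mac = hmac.new(HARDWARE_KEY, measurement, hashlib.sha256).digest()
    return {"measurement": measurement, "mac": mac}


class ToyKeyService:
    """Releases a service key only to enclaves whose quote verifies and is allowlisted."""

    def __init__(self, trusted_measurements, service_key: bytes):
        self._trusted = set(trusted_measurements)
        self._key = service_key

    def request_key(self, quote: dict) -> bytes:
        expected = hmac.new(HARDWARE_KEY, quote["measurement"], hashlib.sha256).digest()
        if not hmac.compare_digest(expected, quote["mac"]):
            raise PermissionError("quote is not authentic")
        if quote["measurement"] not in self._trusted:
            raise PermissionError("enclave code is not on the allowlist")
        return self._key
```

The design choice shown here is that operators never hand keys to machines or containers directly. A key is released only in response to evidence about which code is running, so the same deployment step works for every Enclave instance in the cluster.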
Through these three components, Ant Group’s open-source SOFAEnclave technology stack addresses the three major challenges currently faced in the practical application of confidential computing. The related technologies and concepts have received widespread recognition in both industry and academia, placing them at the leading level globally.
Rust Shines in Confidential Computing
We note that in Ant Group’s open-source SOFAEnclave software stack, the Rust language plays a very important role. Both the Occlum and HyperEnclave components are primarily developed using Rust.
Yan Shoumeng told us that as an emerging programming language that balances safety and high performance, Rust has been widely used within Ant Group, especially in the field of confidential computing, where it has become the main programming language for project development.
“Most of our work is basically written in Rust. On one hand, the Rust language ecosystem is mature enough, and its features like memory safety are highly valued by us. Moreover, the development efficiency of Rust is also very high, greatly enhancing our team’s productivity.”
It is reported that within Ant’s confidential computing technology team, there is a senior Rust evangelist—Tian Hongliang, the core developer of Occlum. There is a small story behind the promotion of Rust within Ant. At that time, the Java technology stack was mainstream within Ant, and Tian Hongliang was a loyal advocate of Rust. Although the company allowed him to work with Rust, he felt a bit lonely due to the lack of like-minded colleagues. However, he soon participated in an internal programming competition at Alipay, where he was the only one using Rust among 100 participants; others used either Java or Python, which were significantly slower than Rust in terms of performance. In this competition, Tian Hongliang excelled and dominated the results.
It was through this competition that Rust’s reputation soared within the company, and many colleagues expressed interest in Rust. Seizing the opportunity, Tian Hongliang conducted public lectures on Rust within the company and became Alibaba Cloud’s Rust evangelist.
In the field of cloud computing, the Rust language is receiving increasing attention from cloud service providers. Last November, AWS brought on board Felix Klock, co-lead of the Rust compiler team, and opened nearly 120 Rust-related positions to further expand its Rust development team. In February of this year, Microsoft also released recruitment information, announcing the formation of a Rust developer team. On February 9, Mozilla announced the establishment of the Rust Foundation in collaboration with AWS, Microsoft, Google, and other cloud computing giants, elevating Rust’s status in the cloud computing field to a new height.
Meanwhile, in the 2020 Gartner Cloud Security Technology Maturity Curve report, confidential computing was listed as one of 33 important technologies and is predicted to become the most prevalent cloud-native security technology in the next 5 to 10 years. The rising Rust may usher in a new wave of growth alongside the rise of confidential computing.
Yan Shoumeng stated, “We hope that the SOFAEnclave confidential computing software stack can help lower the barriers to confidential computing and promote the evolution from cloud-native to trusted native. Of the three components of SOFAEnclave, Occlum and KubeTEE have already been open-sourced, and HyperEnclave will soon be open-sourced as well. We look forward to strengthening communication and cooperation with the industry!”
Interviewee Introduction
Yan Shoumeng is a senior technical expert at Ant Group and the R&D director responsible for confidential computing. His current research interest is confidential computing technology based on Trusted Execution Environments (TEEs). He leads the development of Ant Group’s SOFAEnclave (Occlum, HyperEnclave, KubeTEE, etc.) confidential computing software stack and has initiated and led the formulation of multiple TEE standards both domestically and internationally. Before joining Ant Group, Yan Shoumeng was a senior principal researcher at Intel’s China Research Institute, primarily engaged in research on security isolation technologies, with research results incorporated into related Intel hardware and software products. He obtained his Ph.D. in 2005 from the School of Computer Science at Northwestern Polytechnical University. He holds over 20 patents and has published papers at top conferences such as ASPLOS, PLDI, FSE, and MM.
This article is an original piece by OSCHINA; please indicate the source when reprinting!