Rust Takes Over C: Notable C Projects Rewritten in Rust

You are welcome to subscribe to my paid column, “Zhang Handong’s Rust Channel,” on Mowen Dongxi, where I cover the Rust language, its ecosystem, and its commercial applications in depth. This article is an excerpt from that column.


This article is not meant to praise Rust, but to give companies considering Rust more case studies to refer to. The first article in this series is “Rust Takes Over C Language: The Technological Transformation Happening in Rust for Linux”.

The previous article introduced the transformation of Rust taking over C in Rust for Linux, and this article continues to explore which notable C projects have been rewritten in Rust.

sudo-rs

sudo-rs^[1] is part of the Prossimo project led by ISRG, and has been independently security-audited with funding from the NLnet Foundation. The development team consists of members from Ferrous Systems and Tweede Golf.

The sudo tool provides privileged users on Unix-like systems (such as Linux and FreeBSD) a way to run commands as root. It poses certain risks, as low-privilege malicious users or software may find ways to abuse it, such as exploiting vulnerabilities in the code to elevate their access to root or superuser level. Ideally, sudo and su should be as secure and bug-free as possible, as they serve as the entry point to complete control of the system.


“The sudo command is a typical security-critical tool that is ubiquitous yet often overlooked. Security improvements for such tools will have a huge impact on the entire industry.” – Dan Lorenc, CEO and co-founder of Chainguard (a cybersecurity company)

According to Josh Aas, Executive Director of ISRG, which runs the Prossimo project, one-third of the security vulnerabilities in the original sudo stem from memory management issues.

In August 2023, `sudo-rs` released its first stable version^[2]. The rewrite also brought an additional benefit. Because sudo is very mature software, rewriting it in Rust required a comprehensive functional test suite, and the suite that the sudo-rs team developed helped uncover bugs in the original C implementation of sudo.

From September 4 to September 15, 2023, ROS (**Radically Open Security^[3]**) conducted a crystal box penetration test on sudo-rs, aimed at verifying that privileged operations could not be executed without proper authentication. The audit was conducted on commit b5eb2c6 of the sudo-rs codebase; the complete audit report is available online^[4].

The ROS team discovered one medium severity issue and two low severity issues:

CLN-001: Relative path traversal vulnerability (medium)
CLN-003: Cargo configuration does not strip symbols (low)
CLN-004: Incorrect default permissions set for chown calls (low)

In addition to these findings, ROS also fuzz tested different components of the sudo-rs codebase but found no issues.

Pendulum

Pendulum^[5] is also part of the Prossimo project led by ISRG. In July 2023, the Sovereign Tech Fund^[6] invested in Pendulum, funding development and maintenance in 2023 and further maintenance and adoption work in 2024. The Pendulum project includes Statime (PTP) and ntpd-rs (NTP), and aims to build modern, open-source implementations of the Precision Time Protocol and the Network Time Protocol, with the following two goals:

Provide reliable time synchronization
Be extensible to accommodate future improvements in time standards


NTP and PTP run on millions of devices and servers, and are crucial components of the internet and of other critical infrastructure: finance and broadcasting, power grids, telecommunications, and security protocols.


The Network Time Protocol (NTP) is an internet protocol used to synchronize computer clocks with time sources over a network. It is one of the oldest parts of the TCP/IP suite. The term NTP refers to both the protocol and the client-server program running on computers. David Mills, a professor at the University of Delaware, developed NTP in 1981. It was designed to be highly fault-tolerant and scalable while supporting time synchronization.

NTP

The NTP time synchronization process involves the following three steps:

The NTP client initiates a time request exchange with the NTP server.
The client can then calculate link delay and local offset, adjusting its local clock to match the clock on the server computer.
Typically, about six exchanges are needed over five to ten minutes to initially set the clock.
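The offset and delay calculation in the second step can be sketched as follows. This is a minimal illustration that uses the standard four-timestamp formulas of the NTP specification; it is not code from ntpd-rs. The function name `offset_and_delay` and the use of whole milliseconds are assumptions made for readability (real NTP timestamps are fixed-point values).

```rust
/// Computes the clock offset and round-trip delay from one NTP exchange.
///
/// All timestamps are in milliseconds (an assumption made for this sketch):
/// - `t1`: client sends the request (client clock)
/// - `t2`: server receives the request (server clock)
/// - `t3`: server sends the reply (server clock)
/// - `t4`: client receives the reply (client clock)
///
/// Returns `(offset, delay)`. A positive offset means the client clock is
/// behind the server clock. The delay is the time spent on the network,
/// excluding the server's processing time.
pub fn offset_and_delay(t1: i64, t2: i64, t3: i64, t4: i64) -> (i64, i64) {
    // Average of the two one-way differences; assumes a symmetric path.
    let offset = ((t2 - t1) + (t3 - t4)) / 2;
    // Total round trip minus the time the server held the request.
    let delay = (t4 - t1) - (t3 - t2);
    (offset, delay)
}
```

For instance, if the server clock is 500 ms ahead and each direction takes 100 ms, an exchange with `t1 = 1000`, `t2 = 1600`, `t3 = 1700`, `t4 = 1300` yields an offset of 500 ms and a delay of 200 ms.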

Once synchronized, the client updates its clock approximately every 10 minutes, usually with a single message exchange. This exchange takes place over the User Datagram Protocol (UDP) on port 123. In addition to client-server synchronization, NTP also supports broadcast synchronization of peer computer clocks. There are thousands of NTP servers worldwide, with access to high-precision atomic clocks and GPS clocks. Specialized receivers are needed to communicate directly with NTP servers for time services.

Accurate time across all devices on a computer network is important for many reasons; even a one-second difference can cause problems. The security, robustness, and performance of NTP are therefore crucial.

ntpd-rs^[7] is an open-source implementation of the Network Time Protocol fully written in Rust, focusing on exposing a minimal attack surface. It is expected to support the NTPv5 protocol.

In May 2023, `ntpd-rs` underwent a security audit by ROS^[8].

The audit uncovered the project’s first CVE^[10]: ntpd-rs did not validate the length of NTS cookies in received NTP packets, so an attacker could crash the server by sending a specially crafted NTP packet containing a cookie shorter than the server expected. The crash could also be triggered when the server was not configured to handle NTS packets. This CVE is not a memory safety issue, but it does pose a denial-of-service (DoS) risk.
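To illustrate this class of bug: in Rust, slicing a buffer past its end does not corrupt memory, but it does panic, and a panic in a server is a crash. The sketch below shows the general remedy, which is to check the length and return an error instead. It is not the actual ntpd-rs fix; `EXPECTED_COOKIE_LEN`, `CookieError`, and `parse_cookie` are hypothetical names, and the cookie size is an arbitrary assumption.

```rust
/// Hypothetical fixed cookie size, chosen only for this sketch.
const EXPECTED_COOKIE_LEN: usize = 64;

#[derive(Debug, PartialEq)]
pub enum CookieError {
    /// The packet carried fewer cookie bytes than the server needs.
    TooShort { got: usize, expected: usize },
}

/// Returns the cookie bytes at the start of `payload`, or an error if the
/// payload is too short. Never panics on attacker-controlled input.
pub fn parse_cookie(payload: &[u8]) -> Result<&[u8], CookieError> {
    // `get` returns `None` for an out-of-range slice, whereas
    // `&payload[..EXPECTED_COOKIE_LEN]` would panic on a short payload.
    payload
        .get(..EXPECTED_COOKIE_LEN)
        .ok_or(CookieError::TooShort {
            got: payload.len(),
            expected: EXPECTED_COOKIE_LEN,
        })
}
```

With this shape, a malformed packet becomes an ordinary error value that the server can log and drop.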

Additionally, the NTS (Network Time Security) extension of NTP establishes a trusted link between NTP servers and clients using TLS. This means that some sensitive security keys are stored in memory and may be extracted by attackers. Although this attack is difficult to implement, the audit recommends further increasing its difficulty.

When keys are discarded, the zeroize crate ensures that the memory storing the keys is set to zero. However, this does not completely guarantee that the keys no longer exist in memory, as Rust allows moving memory. Key bytes may remain at their original location. Therefore, ensuring that sensitive data is not unnecessarily copied or moved and promptly cleaning up all possible copies is a key consideration for enhancing security.
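The interaction between zeroing and moves can be sketched with the standard library alone. The example below is a simplified stand-in for what the zeroize crate provides, not its API: `wipe` and `SecretKey` are hypothetical names. It keeps the key bytes on the heap so that moving the wrapper copies only a pointer, which is one possible way to avoid leaving stale copies behind.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Overwrites a buffer with zeros using volatile writes, so the compiler
/// cannot remove the wipe as a dead store.
pub fn wipe(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        // SAFETY: `byte` is a valid, aligned, exclusive reference to a `u8`.
        unsafe { ptr::write_volatile(byte, 0) };
    }
    // Keep the compiler from reordering later operations before the wipe.
    compiler_fence(Ordering::SeqCst);
}

/// Hypothetical key wrapper that wipes its bytes when dropped.
/// The bytes live on the heap, so moving a `SecretKey` copies only the
/// pointer and the key material stays at a single address.
pub struct SecretKey {
    bytes: Box<[u8; 32]>,
}

impl SecretKey {
    pub fn new(bytes: [u8; 32]) -> Self {
        // Caveat: the `bytes` argument is itself a stack copy that this
        // sketch does not wipe.
        SecretKey { bytes: Box::new(bytes) }
    }

    pub fn expose(&self) -> &[u8; 32] {
        &self.bytes
    }
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        wipe(&mut self.bytes[..]);
    }
}
```

A plain `[u8; 32]` field, by contrast, would be copied bit for bit on every move, and only the final location would be wiped on drop.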

PTP

NTP (Network Time Protocol) is a network protocol used for synchronizing clocks between different computers. Its design goal is to ensure that the clocks of all interconnected machines differ from UTC time by only a few milliseconds.

PTP stands for “Precision Time Protocol”. The design goal of PTP is to ensure that the clock deviation between machines is in the sub-microsecond range.

Clocks managed by PTP follow a master-slave hierarchy. Slave clocks synchronize to their master clock. The hierarchy is updated by the Best Master Clock (BMC) algorithm running on each clock. A clock with only one port can be either a master clock or a slave clock. Such clocks are called ordinary clocks (OC). Clocks with multiple ports can be a master clock on one port and a slave clock on another. Such clocks are called boundary clocks (BC). The top-level master clock is referred to as the _grandmaster clock_. The grandmaster clock can be synchronized with GPS. This allows different networks to achieve synchronization with high accuracy.
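At its core, the BMC decision is an ordered comparison of the attributes that each clock announces. The following is a simplified sketch, assuming the attribute order used by IEEE 1588 (priority1, clock class, clock accuracy, variance, priority2, then clock identity as the tie-breaker, with lower values winning). It ignores the topology-related steps of the full algorithm, and the names `ClockQuality` and `best_master` are illustrative rather than Statime's API.

```rust
/// Attributes a clock advertises about itself; lower values win.
/// The field order matters: the derived `Ord` compares fields from top
/// to bottom, which mirrors the assumed comparison order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ClockQuality {
    pub priority1: u8,
    pub clock_class: u8,
    pub clock_accuracy: u8,
    pub offset_scaled_log_variance: u16,
    pub priority2: u8,
    /// Unique identity, used only to break ties.
    pub clock_identity: [u8; 8],
}

/// Picks the best candidate, or `None` if there are no candidates.
pub fn best_master(candidates: &[ClockQuality]) -> Option<ClockQuality> {
    candidates.iter().copied().min()
}
```

Because `priority1` is compared first, an administrator can force a particular clock to win regardless of its measured quality.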

PTP implementations fall into two broad categories: hardware and software. Hardware support is the main advantage of PTP and is typically used where network performance and security requirements are high; other deployments rely on software implementations. Various network switches and network interface controllers (NICs) support PTP. Although non-PTP hardware can be used within the network, maximum accuracy is achieved when every PTP clock runs on PTP-enabled hardware.

With PTP, we can implement many high-precision real-time applications, such as audio and video transmission, financial transactions, and autonomous vehicles. Additionally, PTP is widely used in industrial automation, achieving efficient and high-quality automated production through precise time synchronization of industrial control systems. Furthermore, PTP can help network administrators detect and address network latency by determining delays or clock offsets, enabling real-time monitoring and communication optimization.

Statime^[11] is an open-source implementation of the Precision Time Protocol (PTP) written in Rust. High-precision timing is a critical part of network infrastructure, and Statime provides a memory-safe alternative to existing implementations. Because both the software and the hardware support modules are written in Rust, the whole implementation benefits from Rust’s guarantees.

Binder

In November 2023, the Google Android team announced that they had rewritten the Binder code for Android in Rust and submitted it to the Linux kernel.

Binder is responsible for inter-process communication (IPC) and other tasks on Android, and replacing it with memory-safe Rust code should significantly enhance system security.

In an RFC^[12] sent by Google engineers to the Linux kernel mailing list, they wrote: “We generally do not endorse rewrites, but…”. Why rewrite in Rust?

Binder has evolved over the past 15 years to meet the ever-changing demands of Android. During this time, its responsibilities, expectations, and complexity have significantly increased. While the team expects Binder to continue evolving with Android, several factors currently limit their ability to develop and maintain it. Briefly, these factors include:

Complexity: Binder sits at the intersection of many parts of Android and takes on many responsibilities beyond IPC. It serves different purposes for different people, and its many functions and interactions make it highly complex. In only 6kLOC, it must pass transactions to the correct thread. It must correctly parse and convert the transaction contents, which may contain several different types of objects (e.g., pointers, file descriptors) that can interact with each other. It controls the size of the thread pool in user space and ensures transactions are assigned to threads in a way that avoids exhausting the thread pool. It must correctly forward reference count changes of shared objects between multiple processes. It must handle numerous error scenarios, and it combines and nests 13 different locks, 7 reference counters, and atomic variables. Finally, it must accomplish all this as quickly and efficiently as possible: even a slight performance regression can lead to a noticeable decline in user experience.
Improvements needed: As the codebase has organically grown, there may be functions with thousands of lines, error-prone error handling, and chaotic structures. After more than a decade of development, this codebase requires comprehensive improvements.
Safety-critical: Binder is a key part of Android’s sandboxing strategy. Even the lowest-privileged sandboxes in Android (such as Chrome renderers or SW codecs) can directly access Binder. More than for any other component, it is critical that Binder provides robust security and is itself resilient to vulnerabilities.

Using Rust addresses some of the challenges the Google Android team has encountered in Binder over the past few years. It can prevent errors related to reference counting, locking, bounds checking, and so on, and its improved error handling reduces complexity. Additionally, a more expressive type system can encode the ownership semantics of the various structures and pointers, freeing programmers from managing object lifetimes manually and reducing the risk of use-after-free and similar issues.

Rust’s use of many different pointer types in its type system to encode ownership semantics may be one of the most important aspects of its assistance in Binder. The Binder driver has many different objects with complex ownership semantics; some pointers have reference counting, some have exclusive ownership, while others merely reference objects and keep them active in other ways. Using Rust allows different pointer types to be used for each pointer, enabling the compiler to enforce correct implementation of ownership semantics.
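The idea of one pointer type per ownership mode can be illustrated with standard-library types. This is a generic sketch and not code from the Binder driver; `Node` and the helper functions are hypothetical.

```rust
use std::sync::Arc;

#[derive(Debug)]
pub struct Node {
    pub id: u32,
}

/// Exclusive ownership: exactly one owner; freed when the `Box` is dropped.
pub fn exclusive(id: u32) -> Box<Node> {
    Box::new(Node { id })
}

/// Shared, reference-counted ownership: freed when the last `Arc` is dropped.
pub fn shared(id: u32) -> Arc<Node> {
    Arc::new(Node { id })
}

/// Borrowing: no ownership at all. The compiler checks that the `Node`
/// outlives the reference.
pub fn peek(node: &Node) -> u32 {
    node.id
}

/// Returns (count while two handles exist, count after one is dropped, id).
pub fn demo() -> (usize, usize, u32) {
    let owner = exclusive(1);
    let a = shared(2);
    let b = Arc::clone(&a); // increments the reference count
    let during = Arc::strong_count(&a);
    drop(b); // decrements the reference count
    let after = Arc::strong_count(&a);
    let id = peek(&owner);
    // Writing `drop(owner); peek(&owner);` here would not compile:
    // the compiler rejects the use after move.
    (during, after, id)
}
```

In C, all three of these would typically be a plain `struct node *`, and the ownership rule would live only in comments or convention.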

Another useful feature is Rust’s error handling. Features such as destructors simplify error handling, and compilation fails if errors are not handled correctly. As a result, although Rust requires you to write more lines of code than C in places (for example, writing down invariants that are implicit in C), the Rust driver is still slightly smaller than the C Binder driver: 5.5kLOC of Rust versus 5.8kLOC of C. (These numbers exclude empty lines, comments, binderfs, and any C debugging tools not yet implemented in the Rust driver. They include the abstractions in rust/kernel/ that are unlikely to be used by any driver other than Binder.)
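How destructors simplify error paths can be shown with a small sketch. It is a generic illustration under assumed names (`Guard`, `handle`), not Binder code: the guard's destructor runs on every exit path, including the early return produced by `?`, so no manual cleanup label is needed.

```rust
use std::cell::Cell;
use std::num::ParseIntError;

/// Hypothetical resource that must be released on every exit path.
/// For the sketch, "releasing" just increments a counter.
pub struct Guard<'a> {
    released: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

/// Parses `input` and doubles it while holding a guard.
pub fn handle(input: &str, released: &Cell<u32>) -> Result<usize, ParseIntError> {
    let _guard = Guard { released };
    // `?` returns early on a parse error; `_guard` is dropped either way,
    // where C would need a `goto cleanup` on each failure branch.
    let len = input.trim().parse::<usize>()?;
    Ok(len * 2)
}
```

Calling `handle` once with valid input and once with invalid input releases the guard twice, once per call, with no cleanup code in the function body.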

Although the rewrite fundamentally rethinks the structure of the code and how its assumptions are enforced, the team did not fundamentally change the way the driver performs its tasks, and the existing design was considered carefully throughout. The purpose of the rewrite is to improve the health, structure, readability, robustness, security, maintainability, and scalability of the code. The team also added more inline documentation and improved the way assumptions are enforced in the code. Furthermore, all unsafe code is annotated with a “SAFETY” comment explaining why it is correct.

PubNub

PubNub^[13] builds an edge messaging network on which any combination of real-time features can be built, including chat, live audience participation, multi-user collaboration, device control, data streaming, and geolocation/scheduling. PubNub operates at massive scale, with over 800 million device connections and over 30 trillion API calls per month, and a service level of five nines (99.999% uptime).

PubNub was previously written in C, and the team invested a lot of time and effort to make the service stable and fast. So why switch to Rust? In a recent interview^[14], PubNub’s CTO discussed this question.

About five years ago, PubNub began exploring Rust internally. Over time, they found its combination of memory safety and near-C performance very attractive, especially at PubNub’s scale, and that is essentially why PubNub adopted Rust.

While using C, the PubNub team frequently encountered segmentation faults. Such faults typically indicate potential data corruption or loss, which is a significant problem. C offers strong performance and saves on hardware costs, but it does not save on engineering costs, and for large-scale systems like PubNub, engineering costs far exceed hardware costs. PubNub also exhausted every recruiting channel in search of C experts; even ten years ago, finding C specialists was already a challenge, and those who could be found often no longer wanted to write C. At PubNub, writing extremely stable C code is a must, yet for any C developer, segmentation faults and similar issues are an inevitable part of the journey: the question is not whether problems will occur, but when.

PubNub also tried rewriting part of its PubSub (publish/subscribe) bus in Go, but the performance was far inferior to C. Even under low load, latency immediately increased tenfold, and GC pauses caused periodic latency spikes. Hence, they switched to Rust.

Today, Rust is the most popular language at PubNub. New services are typically written in Rust, and all future services are expected to be as well, thanks to the excellent results the team has seen.

Ockam

Ockam^[15] is a set of open-source programming libraries and command-line tools for coordinating end-to-end encryption, mutual authentication, key management, credential management, and execution of authorization policies in large-scale environments.


In a video^[16], Ockam and InfluxData’s CTO Paul Dix discussed why InfluxDB and Ockam were rewritten in Rust. InfluxDB was indeed also rewritten in Rust, although it moved from Go rather than from C.

Ockam was originally developed in C, but after a few months the team decided to abandon tens of thousands of lines of C code and rewrite the project in Rust^[17]. Here is the story of that rewrite.

In 2019, Ockam began building with C, hoping that Ockam could run on various devices, from constrained edge devices to powerful cloud servers. They also wanted Ockam to be usable in any type of application, regardless of the language used to build that application.

Based on this goal, C became the team’s preferred choice for building this system. C can be compiled to run on 99% of computers and can run almost anywhere (once you figure out how to handle the toolchain for each specific target). Furthermore, all other popular languages can call C libraries through some native function interface.

At the core of Ockam is a set of layered encryption and message-based protocols, such as Ockam secure channels and Ockam routing. These are asynchronous, multi-step, stateful communication protocols, and they wanted to abstract all the details of these protocols away from application developers. The user experience they envisioned was a single function call to create a secure channel for end-to-end authentication and encryption.

However, code related to encryption often has many pitfalls; even a small mistake can compromise system security. Therefore, simplicity is not just an aesthetic concept for Ockam, but a key requirement to ensure that everyone can build secure systems. Ockam aims to hide these security pitfalls and provide an interface that is easy to use correctly yet difficult to misuse for developers.

This is where C fell seriously short. The Ockam team was unable to expose a secure and simple interface in C. With each iteration, they found that application developers needed to understand too much about protocol states and state transitions.

At the same time, they also created a prototype of an Ockam secure channel overlaying Ockam routing using the Elixir language. Elixir programs run on BEAM, the Erlang virtual machine. BEAM provides Erlang processes, which are lightweight, stateful concurrent executors. As executors can maintain internal states while running concurrently, a set of stateful protocol stacks can be easily run: Ockam transport + Ockam routing + Ockam secure channel. This enabled them to hide all stateful layers and create a simple one-line function that anyone could call to establish an end-to-end encrypted secure channel that can be routed through multiple hops and protocols. Application developers would call this simple function, and multiple concurrent executors would run the underlying stateful protocols. The function would return when the channel is established or an error occurs. This was precisely the interface the Ockam team desired.

However, Elixir does not run well on small, constrained (embedded) computers, and it is hard to wrap in idiomatic, language-specific wrappers.

Thus, these issues prompted the team to start exploring the Rust language. Soon, several features of Rust attracted them:

Compatibility with C-ABI calling conventions. Rust libraries can export interfaces compatible with C calling conventions. This means that any language or runtime capable of statically or dynamically linking to C libraries and calling their functions can link to Rust libraries and call their functions in the same way. Since most languages support calling native C functions, they can call native Rust functions too. For the purpose of building language-specific wrappers around the core library, this makes Rust equivalent to C.
Cross-platform support. Rust compiles through LLVM, so it can target a wide variety of computers. This set may not be as large as what C covers with GCC and the various proprietary GCC forks, but it is still a very large subset, and work is ongoing to enable Rust to compile with GCC. With continued support for new LLVM targets and potential GCC support for Rust, this looked like a good choice given the team’s need to run anywhere.
Strong typing and powerful type system. Rust’s memory safety features eliminate the possibility of use-after-free, double free, overflow, out-of-bounds access (non-compile-time), data races, and many other common errors known to cause 60-70% of high-severity vulnerabilities in large C or C++ codebases. Rust provides this safety at compile time without needing to use a garbage collector to manage memory safely at runtime, avoiding performance overhead. This gives Rust a significant advantage in writing high-performance, secure code that runs in constrained environments.
async/await asynchronous programming and pluggable asynchronous runtimes. The last feature that convinced the team that Rust suits Ockam is async/await. The team had determined that lightweight actors were needed to create simple and secure interfaces for the Ockam protocol stack, and in the Rust ecosystem, Ockam’s actor implementation can easily be built on tokio or async-std. Rust’s async/await also differs from async/await in other languages (like JavaScript) in one important way: the asynchronous runtime (tokio, async-std) is pluggable, so a lighter runtime can be chosen for embedded environments. This means Ockam can present the same interface to users wherever it runs, on large computers or small ones. All protocol interfaces based on Ockam Workers can likewise present the same simple interface, regardless of where they run.
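The first point in the list above, C-ABI compatibility, can be sketched in a few lines. This is an illustrative example and not part of Ockam: the function name `demo_sum_bytes` is hypothetical. The attribute syntax `#[unsafe(no_mangle)]` assumes Rust 1.82 or later; older toolchains write `#[no_mangle]`.

```rust
/// Sums the bytes of a buffer, exported with the C calling convention.
///
/// A C caller would declare it as:
///     uint32_t demo_sum_bytes(const uint8_t *data, size_t len);
///
/// `no_mangle` keeps the symbol name unmangled so that C code (or any
/// language with a C foreign-function interface) can link against it.
#[unsafe(no_mangle)]
pub extern "C" fn demo_sum_bytes(data: *const u8, len: usize) -> u32 {
    if data.is_null() {
        return 0;
    }
    // SAFETY: the caller promises that `data` points to `len` readable
    // bytes that stay valid for the duration of this call.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    bytes
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}
```

Built as a `cdylib` or `staticlib` crate, a function like this can be called from C, or from Python, Java, and other languages through their native-function interfaces, exactly as if the library had been written in C.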

Based on these advantages, Ockam decided to rewrite the entire project in Rust.

Postscript

The purpose of this article is not to praise Rust but to provide more cases for companies considering adopting Rust as a reference.

This article records only the tip of the iceberg. I believe that in the future, we will see more and more cases.

Thank you for reading! Happy New Year 2024!

References

[1] sudo-rs: https://github.com/memorysafety/sudo-rs

[2] In August 2023, sudo-rs released its first stable version: https://www.memorysafety.org/blog/sudo-first-stable-release/

[3] Radically Open Security: https://www.radicallyopensecurity.com/

[4] Click here to view the complete audit report: https://github.com/memorysafety/sudo-rs/blob/audit-report/docs/audit/audit-report-sudo-rs.pdf

[5] Pendulum: https://tweedegolf.nl/en/pendulum

[6] Sovereign Tech Fund: https://www.sovereigntechfund.de/

[7] ntpd-rs: https://github.com/pendulum-project/ntpd-rs

[8] ntpd-rs underwent a security audit by ROS: https://tweedegolf.nl/en/blog/94/report-ntp-security-audit

[9] ntpd-rs underwent a security audit by ROS: https://tweedegolf.nl/en/blog/94/report-ntp-security-audit

[10] First CVE: https://github.com/pendulum-project/ntpd-rs/security/advisories/GHSA-qwhm-h7v3-mrjx

[11] Statime: https://github.com/pendulum-project/statime

[12] RFC: https://lore.kernel.org/lkml/[email protected]/

[13] PubNub: https://www.pubnub.com/

[14] Recent interview: https://corrode.dev/podcast/s01e02-pubnub/

[15] Ockam: https://github.com/build-trust/ockam

[16] Video discussing why InfluxDB and Ockam were rewritten in Rust with Ockam and InfluxData’s CTO Paul Dix: https://www.influxdata.com/resources/meet-the-founders-an-open-discussion-about-rewriting-using-rust/

[17] Then a few months later decided to abandon that tens of thousands of lines of C code and rewrite it in Rust: https://www.ockam.io/blog/rewriting_in_rust