Mixing Rust in C++

This article will help you gradually introduce some Rust code into your C++ projects.

A few days ago, I read in a book about a certain programming language:

“… is a general-purpose programming language that emphasizes rich types and lightweight abstraction design and usage. It is particularly suitable for resource-constrained application scenarios, such as those found in software infrastructure. … rewards programmers who take the time to master the skills of writing high-quality code. … is a language prepared for those who take programming tasks seriously. Our civilization heavily relies on software; we must ensure it is of high quality.”

This article is primarily about Rust, but if you guessed that the deliberately omitted language is C++, you are correct. Perhaps the part about “taking programming seriously” gave it away.

In fact, it is not surprising that there is interest in Rust within the C++ world: both languages operate in the same field. They both aim to make system-level programming more scalable and less error-prone. Rust has clearly drawn inspiration from C++: the <span>const</span> keyword, for instance, is familiar from C++, and Rust's generics fill much the same role as the <span>template</span> mechanism in C++.

Beyond borrowing, Rust strives to improve the state of systems programming. Its emphasis on correctness and memory safety has piqued the curiosity of many C++ developers. That leads to the question this series of articles keeps returning to: what if you want to “oxidize” your C++ project? In other words, how do you gradually introduce Rust into an existing C++ codebase?

Toolchain Support

As with other languages, there are ready-made tools for interoperability between Rust and C++, and they are much more convenient than writing C-ABI interfaces by hand. One important tool in this regard is cxx, written by David Tolnay, who is also the author of popular crates like [syn] and [quote]. cxx promises a “safe mechanism” for calls between Rust and C++ in both directions.

cxx implements cross-language interoperability differently than [PyO3], and there are reasons for that. PyO3 allows you to describe various Python concepts from the Rust side, while cxx only covers concepts that exist in both C++ and Rust. These concepts can be converted between languages with “zero or near-zero overhead,” which is crucial in a systems programming environment. More importantly, cxx is about “defining a highly expressive set of functionalities so that we can make strong safety guarantees today and expand over time.”

In short, you get a boundary that is both safe and efficient for FFI.

However, this comes at a cost: you make some compromises in expressiveness. You need to spend some time adapting and “squeezing” problems into the common subset supported by cxx.

Getting Started: Calculating Hash Values

Let’s start with a simple example: a C++ application that uses a Rust implementation to calculate the CRC32 hash of its input. The program reads bytes from standard input, drops the trailing newline character, and writes the hash in hexadecimal format to standard output:

$ cat hello.txt | cxx_crc32fast
1cf81ca7

First, quickly set up a Rust library that can accomplish this task:

$ cargo new cxx-crc32fast --lib && cd cxx-crc32fast
$ cargo add crc32fast

We will delegate the actual CRC32 calculation to the <span>crc32fast</span> crate. Our crate will be responsible for exposing its functionality to the C++ caller. The <span>crc32fast</span> crate provides two APIs: one is the function <span>hash</span>, and the other is the struct <span>Hasher</span>. The former is very simple, requiring just a slice to compute the hash in one go, but the downside is that it requires all data to be passed in at once. We want to implement the ability to process data in chunks, so we will use the <span>Hasher</span> struct. To ensure that <span>crc32fast::Hasher</span> works well with cxx, we need to wrap it and expose some methods. Here is a preliminary attempt:
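To make the difference between the two styles of API concrete, here is a minimal std-only sketch of a chunk-wise CRC-32. The <span>Crc32</span> type, the helper names and the bit-by-bit loop are illustrative stand-ins and not the implementation used by <span>crc32fast</span>. The only point is that feeding the data in chunks gives the same value as hashing it in one go.

```rust
/// Illustrative incremental CRC-32 (IEEE); NOT the implementation used by crc32fast.
struct Crc32 {
    state: u32,
}

impl Crc32 {
    fn new() -> Self {
        // The CRC-32 register starts with all bits set.
        Self { state: 0xFFFF_FFFF }
    }

    /// Feed one more chunk into the running checksum.
    fn update(&mut self, buf: &[u8]) {
        for &byte in buf {
            self.state ^= u32::from(byte);
            for _ in 0..8 {
                // `mask` is all ones when the lowest bit is set, zero otherwise.
                let mask = (self.state & 1).wrapping_neg();
                self.state = (self.state >> 1) ^ (0xEDB8_8320 & mask);
            }
        }
    }

    /// Consume the hasher and return the checksum.
    fn finalize(self) -> u32 {
        !self.state
    }
}

/// One-shot variant: all data must be available at once.
fn hash(buf: &[u8]) -> u32 {
    let mut hasher = Crc32::new();
    hasher.update(buf);
    hasher.finalize()
}

/// Chunked variant: data can arrive piece by piece.
fn hash_chunked(chunks: &[&[u8]]) -> u32 {
    let mut hasher = Crc32::new();
    for chunk in chunks {
        hasher.update(chunk);
    }
    hasher.finalize()
}
```

Splitting the standard check string "123456789" into "1234" and "56789" and passing both pieces to <span>hash_chunked</span> produces the same checksum as a single call to <span>hash</span>.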

struct Hasher(crc32fast::Hasher);

impl Hasher {
    fn new() -> Self {
        Self(crc32fast::Hasher::new())
    }

    fn update(&mut self, buf: &[u8]) {
        self.0.update(buf)
    }

    fn finalize(self) -> u32 {
        self.0.finalize()
    }
}

Great! Now we can create our own <span>Hasher</span> instance and forward calls to its <span>update</span> and <span>finalize</span> methods. Next, let’s add the magic of cxx to this code.

Building the Bridge: Interaction Between Rust and C++

cxx allows you to specify how Rust functionalities are exposed to C++ and vice versa using the <span>#[cxx::bridge]</span> procedural macro. Here is the basic structure:

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        /* Define and expose content to C++ in Rust */
    }

    unsafe extern "C++" {
        /* Define and expose content to Rust in C++ */
    }
}

In this article, we mainly focus on the <span>extern "Rust"</span> block, which declares the types and functions defined in Rust and exposes them for use in C++. This allows C++ applications to call these functionalities.

We want to share the <span>Hasher</span> type with C++. cxx supports “shared types,” which let both languages access the fields of the type. That sounds good, so let’s give it a try.

First, add the cxx dependency:

$ cargo add cxx@1

Then set up the bridge module, defining <span>Hasher</span> as a shared type:

#[cxx::bridge]
mod ffi {
    struct Hasher(crc32fast::Hasher);
}

Run a <span>cargo check</span>:

$ cargo check
   Checking cxx-crc32fast v0.1.0 (/home/hd/dev/tg/edu/cxx-crc32fast)
error: tuple structs are not supported

Hmm, it seems that tuple structs cannot be used directly. We can change to named fields:

#[cxx::bridge]
mod ffi {
    struct Hasher {
        hash: crc32fast::Hasher
    }
}

Check again:

$ cargo check
   Checking cxx-crc32fast v0.1.0 (/home/hd/dev/tg/edu/cxx-crc32fast)
error: unsupported type
--> src/lib.rs:5:15
 |
5 |         hash: crc32fast::Hasher
 |               ^^^^^^^^^^^^^^^^^

Oh, it turns out that cxx cannot reason about arbitrary external types, so <span>crc32fast::Hasher</span> cannot be used as a field of a shared type. In this case, shared types are of no help.

At this point, we can consider using “opaque types.” They hide fields and are accessed indirectly through pointers. We declare the opaque type in the <span>extern "Rust"</span> block, and cxx will look for a matching definition in the Rust code. Let’s try the following:
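For intuition, the following std-only sketch shows roughly what an opaque type amounts to when the C-ABI interface is written by hand. The function names <span>hasher_new</span>, <span>hasher_update</span> and <span>hasher_finish</span> and the toy byte-sum "hash" are hypothetical. cxx generates its own glue code, so treat this as an illustration of the idea rather than as cxx's output.

```rust
/// Toy stand-in for an opaque Rust type; the "checksum" is just a byte sum.
pub struct Hasher {
    sum: u32,
}

/// Allocate on the heap and hand out a raw pointer; the caller never sees the fields.
/// (A real export would also need a stable symbol name, which is omitted here.)
pub extern "C" fn hasher_new() -> *mut Hasher {
    Box::into_raw(Box::new(Hasher { sum: 0 }))
}

/// # Safety
/// `h` must come from `hasher_new` and must not have been finished yet;
/// `data` must point to `len` readable bytes.
pub unsafe extern "C" fn hasher_update(h: *mut Hasher, data: *const u8, len: usize) {
    let (hasher, bytes) = unsafe { (&mut *h, std::slice::from_raw_parts(data, len)) };
    for &b in bytes {
        hasher.sum = hasher.sum.wrapping_add(u32::from(b));
    }
}

/// # Safety
/// `h` must come from `hasher_new` and must not be used after this call.
pub unsafe extern "C" fn hasher_finish(h: *mut Hasher) -> u32 {
    // Reclaim ownership so the allocation is freed when `hasher` goes out of scope.
    let hasher = unsafe { Box::from_raw(h) };
    hasher.sum
}
```

Every safety requirement in the comments above has to be upheld by the caller. That is the kind of manual bookkeeping a bridging tool is meant to take off your hands.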

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type Hasher;
    }
}

struct Hasher(crc32fast::Hasher);

impl Hasher {
    // Methods omitted
}

This time the compilation passes, although there are still some warnings about unused code.

Implementing Hasher Methods

Next, we need to expose the methods of <span>Hasher</span>. Associated functions cannot be exposed directly, so we add a free function <span>init()</span> that calls <span>Hasher::new()</span>:

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type Hasher;
        fn init() -> Hasher;
    }
}

struct Hasher(crc32fast::Hasher);

fn init() -> Hasher {
    Hasher::new()
}

// ...other methods omitted...

Run the check:

$ cargo check -q
error: returning opaque Rust type by value is not supported

Of course: opaque types cannot cross the boundary by value. They must be returned behind a pointer, typically one that points to a heap-allocated value. So we modify it to:

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type Hasher;
        fn init() -> Box<Hasher>;
    }
}

struct Hasher(crc32fast::Hasher);

fn init() -> Box<Hasher> {
    Box::new(Hasher::new())
}

This time the compilation passes smoothly, with one fewer warning. Next, let’s continue to expose the <span>update</span> and <span>finalize</span> methods:

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type Hasher;
        fn init() -> Box<Hasher>;
        fn update(self: &mut Hasher, buf: &[u8]);
        fn finalize(self: Hasher) -> u32;
    }
}

The corresponding implementation remains unchanged:

impl Hasher {
    fn new() -> Self {
        Self(crc32fast::Hasher::new())
    }

    fn update(&mut self, buf: &[u8]) {
        self.0.update(buf)
    }

    fn finalize(self) -> u32 {
        self.0.finalize()
    }
}

At this point, the bridge declaration looks complete. Before we can call these Rust functions from the C++ side, however, there is one more pitfall to deal with: the <span>finalize</span> method, which consumes the hasher. Working around it takes a small redesign.

Problem Background

Let’s first look at the general structure of the bridging code:

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type Hasher;
        fn init() -> Box<Hasher>;
        fn update(&mut self, buf: &[u8]);
        fn finalize(self) -> u32;
    }
}

This code defines a type called <span>Hasher</span> and aims to expose three methods: initialization, updating data, and finally calculating the CRC32 value. It seems straightforward, but the compiler throws an error:

error: unsupported method receiver
 --> src/lib.rs:10:21
  |
10 |         fn finalize(self) -> u32;
  |                     ^^^^

In other words, a by-value <span>self</span> receiver is not supported here. So where is the problem?

In-Depth Analysis

It turns out that the C++ side only sees <span>Hasher</span> as an opaque type: it does not know the size or layout of the type. Therefore, when <span>finalize(self)</span> takes the object by value, C++ has no way to perform the value passing; it can neither copy nor move the object.

In other words, cxx does not support passing opaque Rust types by value across the boundary.

What to do? There are several ideas to work around this issue.

Method 1: Use Cloning Instead of Consuming Operations

If this type is lightweight and implements <span>Clone</span>, you can modify the <span>finalize</span> method to accept a reference:

fn finalize(&self) -> u32;

Then, in the implementation, perform a clone:

impl Hasher {
    fn finalize(&self) -> u32 {
        self.0.clone().finalize()
    }
}

This will compile successfully. However, this method has a prerequisite: cloning this type must be cheap. Otherwise, it may not be suitable.
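The following std-only sketch illustrates what the clone-based approach changes. The <span>Inner</span> type is a hypothetical stand-in for the wrapped hasher (a toy byte sum with a few bytes of state, assumed cheap to clone), and <span>running_checksums</span> is an illustrative helper. Because <span>finalize</span> now only borrows the hasher, the hasher remains usable afterwards.

```rust
/// Stand-in for a cheaply clonable inner hasher (assumption: a few bytes of state).
#[derive(Clone)]
struct Inner {
    sum: u32,
}

impl Inner {
    fn update(&mut self, buf: &[u8]) {
        for &b in buf {
            self.sum = self.sum.wrapping_add(u32::from(b));
        }
    }

    /// Consuming finalize, like the one on the wrapped hasher.
    fn finalize(self) -> u32 {
        self.sum
    }
}

struct Hasher(Inner);

impl Hasher {
    fn new() -> Self {
        Self(Inner { sum: 0 })
    }

    fn update(&mut self, buf: &[u8]) {
        self.0.update(buf)
    }

    /// Borrowing finalize: clone the state, then consume the clone.
    fn finalize(&self) -> u32 {
        self.0.clone().finalize()
    }
}

/// The hasher stays usable after `finalize`, so intermediate values can be read.
fn running_checksums(chunks: &[&[u8]]) -> Vec<u32> {
    let mut hasher = Hasher::new();
    chunks
        .iter()
        .map(|chunk| {
            hasher.update(chunk);
            hasher.finalize()
        })
        .collect()
}
```

Note the design consequence: the C++ caller can no longer tell from the signature that a hasher is "used up", so this variant trades the consuming semantics for convenience.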

Method 2: Expose as a Static Function

If this type cannot be cloned efficiently, or you want to maintain the original design intent, you can expose the <span>finalize</span> method in a different way.

You can change it to a free function that accepts a <span>Box<Hasher></span> parameter:

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type Hasher;
        fn init() -> Box<Hasher>;
        fn update(&mut self, buf: &[u8]);
        fn finalize(h: Box<Hasher>) -> u32;
    }
}

The corresponding implementation also changes:

fn finalize(h: Box<Hasher>) -> u32 {
    h.finalize()
}

impl Hasher {
    fn finalize(self) -> u32 {
        self.0.finalize()
    }
}

Now, the C++ side can use it like this:

auto hasher = init();
hasher->update(...);
uint32_t crc = finalize(std::move(hasher));

This approach is more general and better suited for complex types.
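Seen purely from the Rust side, the ownership transfer can be sketched without cxx. In this std-only example the toy <span>Hasher</span> (a byte sum), the <span>DROPPED</span> counter and <span>dropped_count</span> are hypothetical and exist only to make it observable that the boxed value is consumed and destroyed exactly once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many hashers have been dropped (for illustration only).
static DROPPED: AtomicUsize = AtomicUsize::new(0);

/// Toy stand-in for the wrapped hasher; the "checksum" is a byte sum.
struct Hasher {
    sum: u32,
}

impl Drop for Hasher {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

impl Hasher {
    fn update(&mut self, buf: &[u8]) {
        for &b in buf {
            self.sum = self.sum.wrapping_add(u32::from(b));
        }
    }

    /// Consuming finalize: `self` is dropped when this returns.
    fn finalize(self) -> u32 {
        self.sum
    }
}

fn init() -> Box<Hasher> {
    Box::new(Hasher { sum: 0 })
}

/// Free function taking the box by value: ownership moves in, nothing comes back out.
fn finalize(h: Box<Hasher>) -> u32 {
    // Calling a by-value method through the box moves the Hasher out of it.
    h.finalize()
}

fn dropped_count() -> usize {
    DROPPED.load(Ordering::SeqCst)
}
```

After <span>finalize(h)</span> the binding <span>h</span> can no longer be used in Rust. On the C++ side the <span>std::move</span> in the snippet above expresses the same hand-over.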

Generating the C++ Bridging Part

Next, we need to enable C++ to actually call these methods. We need to add the <span>cxx-build</span> build dependency:

$ cargo add cxx-build@1 --build

Then write a <span>build.rs</span> file:

fn main() {
    cxx_build::bridge("src/lib.rs")
        .compile("cxx-crc32fast");
    println!("cargo:rerun-if-changed=src/lib.rs");
}

After running, C++ header files and implementation files will be generated in the <span>target/cxxbridge/</span> directory.

For example, in the generated <span>lib.rs.h</span>, the declaration of <span>Hasher</span> is as follows:

struct Hasher final : public ::rust::Opaque {
    void update(::rust::Slice<::std::uint8_t const> buf) noexcept;
    ~Hasher() = delete;
private:
    friend ::rust::layout;
    struct layout {
        static ::std::size_t size() noexcept;
        static ::std::size_t align() noexcept;
    };
};

Here are a few key points:

  • <span>Hasher</span> inherits from <span>::rust::Opaque</span>, indicating that it is an opaque type managed by Rust.

  • The destructor is deleted (<span>= delete</span>), and the constructors are deleted in the <span>::rust::Opaque</span> base class, so the C++ side cannot create or destroy this object on its own.

  • All memory management is left to Rust.

Let’s Take a Look at Macro Expansion

If you are interested in the underlying mechanism, you can use <span>cargo expand</span> to see what the <span>#[cxx::bridge]</span> macro expansion looks like:

$ cargo expand ffi

You will find many “magical” trait implementations and boundary checks, such as:

fn __AssertUnpin<T: ?Sized + Unpin>() {}

This is a technique to ensure that the <span>Hasher</span> implements the <span>Unpin</span> trait — if it does not, the compilation will fail.

Additionally:

#[export_name = "cxxbridge1$Hasher$operator$sizeof"]
extern "C" fn __sizeof_Hasher() -> usize {
    Layout::new::<Hasher>().size()
}

These functions will export the size and alignment information of <span>Hasher</span> for C++ to use.
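Both techniques can be tried out in isolation. The following std-only sketch uses hypothetical names (<span>assert_unpin</span>, <span>sizeof_hasher</span>, <span>alignof_hasher</span>, <span>checks</span>) and a toy <span>Hasher</span>. It is modeled on the expanded code above but is not a copy of what cxx generates.

```rust
use std::alloc::Layout;

/// Toy stand-in for the opaque type.
struct Hasher(u32);

/// Only compiles for `T: Unpin`, so calling it acts as a compile-time assertion.
fn assert_unpin<T: ?Sized + Unpin>() {}

/// Size in bytes, in a form a foreign caller could query.
extern "C" fn sizeof_hasher() -> usize {
    Layout::new::<Hasher>().size()
}

/// Alignment in bytes, in a form a foreign caller could query.
extern "C" fn alignof_hasher() -> usize {
    Layout::new::<Hasher>().align()
}

/// Run the compile-time check and return (size, alignment).
fn checks() -> (usize, usize) {
    assert_unpin::<Hasher>();
    // assert_unpin::<std::marker::PhantomPinned>(); // would be rejected at compile time
    (sizeof_hasher(), alignof_hasher())
}
```

The assertion costs nothing at run time: <span>assert_unpin</span> has an empty body, and the check happens entirely in the type system.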

Reflection

Although we are only exposing a simple CRC32 calculation functionality, there are still many challenges to face when interacting across languages. Especially when one side (like C++) knows very little about the types managed by the other side (like Rust), many originally simple methods need to be redesigned.

However, it is precisely these “constraints” that provide dual guarantees of safety and performance. You can think of it as a “security gate” — although it is a bit cumbersome to pass through, it prevents accidental data corruption or memory leaks.

So, the next time you encounter a similar problem, consider thinking from a different angle: how to find the most elegant solution while meeting the interface specifications?

This is not only a technical issue but also a test of thinking style.

Back to the macro expansion: the following wrapper is what <span>#[cxx::bridge]</span> generates for <span>init</span>.

#[doc(hidden)]
#[export_name = "cxxbridge1$init"]
unsafe extern "C" fn __init() -> *mut Hasher {
   let __fn = "cxx_crc32fast::ffi::init";
   fn __init() -> ::cxx::alloc::boxed::Box<Hasher> {
       super::init()
   }
   ::cxx::private::prevent_unwind(
       __fn,
       move || ::cxx::alloc::boxed::Box::into_raw(__init()),
   )
}

The above code defines a function named <span>__init</span>, which returns a raw pointer to the heap-allocated <span>Hasher</span>; it is the wrapper through which C++ reaches <span>init()</span>. The <span>prevent_unwind</span> function plays a key role here, as it prevents panics in Rust from propagating across the FFI boundary to the C++ side.
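The idea of stopping a panic at the boundary can be sketched with the standard library alone. The function <span>stop_unwind</span> below is hypothetical and is not how cxx implements <span>prevent_unwind</span>. It returns <span>None</span> purely so that the behavior can be observed, whereas at a real FFI boundary aborting the process is the usual reaction.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Run `f` and stop any panic from travelling further up the stack.
/// Assumption: returning `None` is for demonstration only; a real FFI
/// boundary would typically abort instead of continuing.
fn stop_unwind<T>(label: &str, f: impl FnOnce() -> T) -> Option<T> {
    match catch_unwind(AssertUnwindSafe(f)) {
        Ok(value) => Some(value),
        Err(_) => {
            eprintln!("panic in ffi function {}, not propagating", label);
            None
        }
    }
}
```

A closure that returns normally yields <span>Some(value)</span>, and a closure that panics yields <span>None</span> instead of unwinding into the caller.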

Continuing on:

#[doc(hidden)]
#[export_name = "cxxbridge1$Hasher$update"]
unsafe extern "C" fn __Hasher__update(
   __self: &mut Hasher,
   buf: ::cxx::private::RustSlice,
) {
   let __fn = "cxx_crc32fast::ffi::Hasher::update";
   fn __Hasher__update(__self: &mut Hasher, buf: &[u8]) {
       Hasher::update(__self, buf)
   }
   ::cxx::private::prevent_unwind(
       __fn,
       move || unsafe { __Hasher__update(__self, buf.as_slice::<u8>()) },
   )
}

This part of the code defines the <span>__Hasher__update</span> function, which is used to update the data buffer in the hash calculator. This is equivalent to feeding the data to be calculated into the hash calculator.
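The conversion from an FFI-friendly slice representation back to <span>&[u8]</span> can be illustrated with a small std-only sketch. <span>RawSlice</span> and <span>byte_sum_via_raw</span> are hypothetical names. The layout shown (pointer plus length) is an assumption for illustration and makes no claim about the private <span>RustSlice</span> type inside cxx.

```rust
/// Hypothetical FFI-safe view of a byte slice: pointer plus length.
#[repr(C)]
#[derive(Clone, Copy)]
struct RawSlice {
    ptr: *const u8,
    len: usize,
}

impl RawSlice {
    fn from_slice(bytes: &[u8]) -> Self {
        Self {
            ptr: bytes.as_ptr(),
            len: bytes.len(),
        }
    }

    /// # Safety
    /// The memory must stay valid and unmodified for the lifetime `'a`.
    unsafe fn as_slice<'a>(self) -> &'a [u8] {
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

/// Round-trip a slice through the raw view and sum its bytes.
fn byte_sum_via_raw(bytes: &[u8]) -> u32 {
    let raw = RawSlice::from_slice(bytes);
    // Sound here because `bytes` outlives both `raw` and the reconstructed slice.
    let view = unsafe { raw.as_slice() };
    view.iter().map(|&b| u32::from(b)).sum()
}
```

The <span>unsafe</span> block marks the one place where the compiler has to trust that the pointer and length really describe live memory.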

Finally, for the cleanup work:

#[doc(hidden)]
#[export_name = "cxxbridge1$finalize"]
unsafe extern "C" fn __finalize(h: *mut Hasher) -> u32 {
   let __fn = "cxx_crc32fast::ffi::finalize";
   fn __finalize(h: ::cxx::alloc::boxed::Box<Hasher>) -> u32 {
       super::finalize(h)
   }
   ::cxx::private::prevent_unwind(
       __fn,
       move || unsafe { __finalize(::cxx::alloc::boxed::Box::from_raw(h)) },
   )
}

The <span>__finalize</span> function turns the raw pointer back into a <span>Box</span>, completes the final hash calculation, and returns the result value.

To make this mechanism work, we also need to write a corresponding C++ program. Create a file named <span>src/crc32fast.cc</span> with the following content:

#include "cxx-crc32fast/include/crc32fast.h"
#include "cxx-crc32fast/src/lib.rs.h"
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <utility>
#include <vector>

int main() {
   // Read all input from stdin
   std::istreambuf_iterator<char> begin{std::cin}, end;
   std::vector<unsigned char> input{begin, end};

   // Drop the trailing linefeed, if there is one
   std::size_t len = input.size();
   if (len > 0 && input[len - 1] == '\n') {
      --len;
   }
   rust::Slice<const uint8_t> slice{input.data(), len};

   // Hash it: ownership of the hasher moves into finalize()
   rust::Box<Hasher> h = init();
   h->update(slice);
   uint32_t output = finalize(std::move(h));

   // Write to stdout as eight zero-padded hex digits
   std::cout << std::setw(8) << std::setfill('0') << std::hex << output << std::endl;
}

This program will read data from standard input, then call the previously defined Rust interface for hash calculation, and finally output the result.
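For comparison, the two small input and output steps of this program, dropping the linefeed and printing eight zero-padded hex digits, look like this in Rust. The helper names <span>strip_trailing_linefeed</span> and <span>format_crc</span> are hypothetical and are not part of the project.

```rust
/// Drop one trailing linefeed, if present.
fn strip_trailing_linefeed(input: &[u8]) -> &[u8] {
    match input.split_last() {
        Some((&b'\n', rest)) => rest,
        _ => input,
    }
}

/// Format like `std::setw(8)`, `std::setfill('0')` and `std::hex` combined.
fn format_crc(value: u32) -> String {
    format!("{:08x}", value)
}
```

The zero padding matters for small values: without it, a checksum with leading zero digits would be printed with fewer than eight characters.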

To let the build system know how to handle these mixed-language codes, we need to add some configurations in the <span>build.rs</span> file:

fn main() {
   cxx_build::bridge("src/lib.rs")
       .file("src/crc32fast.cc")
       .compile("cxx-crc32fast");

   println!("cargo:rerun-if-changed=src/lib.rs");
   println!("cargo:rerun-if-changed=src/crc32fast.cc");
   println!("cargo:rerun-if-changed=include/crc32fast.h");
}

To let the Cargo build system understand that this is an executable program rather than a library file, we also need to modify the <span>Cargo.toml</span> file:

[lib]
crate-type = ["bin"]

Since the main function is defined in C++ and not in Rust, we need to add the following at the beginning of the <span>src/lib.rs</span> file:

#![no_main]

This keeps the compiler from failing because it cannot find a <span>main</span> function.

Now you can try to build and run this program:

$ cargo build
$ cat hello.txt | target/debug/cxx_crc32fast
01d7afb4

Although there is room for improvement in the build configuration, it is at least running normally.

This implementation demonstrates how the cxx library helps us build a bridge between Rust and C++. Although we only used part of its functionality this time, we will explore more complex usages in the future, such as building JSON formatting tools and large projects based on Qt.


Reference link: https://tweedegolf.nl/en/blog/135/mix-in-rust-with-c++
