A Comprehensive Guide to Rust Standard Library Traits: Memory Management and Type Conversion

Introduction

When building software systems, defining and using appropriate traits is key to making the code structure highly extensible and flexible. The Rust standard library provides a rich set of traits, and using them correctly not only clarifies the code structure but also aligns better with the conventions of the Rust ecosystem. This article will delve into the key traits in Rust, including those related to memory management, marker traits, type conversion, and operator traits.

Memory-Related Traits: Clone / Copy / Drop

Clone Trait

The <span>Clone</span> trait defines the deep copy behavior of data:

pub trait Clone {
    fn clone(&self) -> Self;
    fn clone_from(&mut self, source: &Self) {
        *self = source.clone()
    }
}

The <span>Clone</span> trait has two methods, <span>clone()</span> and <span>clone_from()</span>. The latter has a default implementation, so usually we only need to implement <span>clone()</span>. You might wonder what <span>clone_from()</span> is for, since <span>a.clone_from(&b)</span> seems equivalent to <span>a = b.clone()</span>.

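In fact, they are not entirely the same. If <span>a</span> already exists and cloning would allocate memory, <span>a.clone_from(&b)</span> can reuse the memory <span>a</span> already owns instead of allocating again, which is more efficient. The default implementation shown above gains nothing by itself; the benefit comes from types that override <span>clone_from()</span>, such as <span>String</span> and <span>Vec<T></span>.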
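
As a minimal sketch of how a type might override <span>clone_from()</span> to reuse memory, consider the following. The <span>Frame</span> type and its <span>pixels</span> field are hypothetical names chosen for illustration; they do not come from the standard library.

```rust
/// Hypothetical buffer type used only for illustration.
#[derive(Debug, PartialEq)]
struct Frame {
    pixels: Vec<u8>,
}

impl Clone for Frame {
    fn clone(&self) -> Self {
        // Always allocates a fresh Vec.
        Frame { pixels: self.pixels.clone() }
    }

    fn clone_from(&mut self, source: &Self) {
        // Keep the existing allocation: clear() preserves capacity, and
        // extend_from_slice() only reallocates if the capacity is too small.
        self.pixels.clear();
        self.pixels.extend_from_slice(&source.pixels);
    }
}

fn main() {
    let src = Frame { pixels: vec![1, 2, 3] };
    let mut dst = Frame { pixels: Vec::with_capacity(16) };
    let before = dst.pixels.as_ptr();

    dst.clone_from(&src);

    // Same contents, and the buffer was reused rather than reallocated.
    assert_eq!(dst, src);
    assert_eq!(before, dst.pixels.as_ptr());
}
```

Whether the reuse pays off depends on the type: it only helps when the destination already owns a large enough buffer.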

If every field of a data structure implements <span>Clone</span>, you can simplify the code using the <span>#[derive(Clone)]</span> macro:

#[derive(Clone, Debug)]
struct Developer {
    name: String,
    age: u8,
    lang: Language
}

#[allow(dead_code)]
#[derive(Clone, Debug)]
enum Language {
    Rust,
    TypeScript,
    Elixir,
    Haskell
}

fn main() {
    let dev = Developer {
        name: "Tyr".to_string(),
        age: 18,
        lang: Language::Rust
    };
    let dev1 = dev.clone();
    println!("dev: {:?}, addr of dev name: {:p}", dev, dev.name.as_str());
    println!("dev1: {:?}, addr of dev1 name: {:p}", dev1, dev1.name.as_str());
}

Running this code shows that the two printed addresses differ: for <span>name</span> (a <span>String</span>), the heap memory is cloned as well. So <span>Clone</span> here is a deep copy, in which both stack and heap contents are copied.

Copy Trait

Unlike <span>Clone</span>, <span>Copy</span> has no methods—it is a marker trait. Its definition is:

pub trait Copy: Clone {}

From the definition, it can be seen that to implement <span>Copy</span>, a type must first implement <span>Clone</span>, and then implement an empty <span>Copy</span> trait. You might ask: what is the use of a trait with no behavior?

Although it has no methods, it can serve as a trait constraint for type safety checks—hence it is called a marker trait.

If all fields of a data structure implement <span>Copy</span>, you can derive <span>Copy</span> using <span>#[derive(Copy)]</span>. If you try to add <span>Copy</span> to <span>Developer</span> and <span>Language</span>:

#[derive(Clone, Copy, Debug)]
struct Developer {
    name: String,
    age: u8,
    lang: Language
}

#[derive(Clone, Copy, Debug)]
enum Language {
    Rust,
    TypeScript,
    Elixir,
    Haskell
}

This code fails to compile: <span>String</span> does not implement <span>Copy</span>, so <span>Developer</span> cannot derive it and can only be cloned, not copied. <span>Language</span>, whose variants carry no data, can derive <span>Copy</span> without problems. Remember: if a type implements <span>Copy</span>, assignment and passing it to a function copy the value; otherwise, ownership is moved.
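
The difference between copying and moving, and the use of <span>Copy</span> as a pure constraint, can be sketched as follows. <span>Point</span>, <span>duplicate</span>, and <span>sum</span> are hypothetical names introduced for this example.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Accepts only `Copy` types: the marker trait acts purely as a bound.
fn duplicate<T: Copy>(value: T) -> (T, T) {
    (value, value)
}

/// Takes `Point` by value; the caller keeps its own copy.
fn sum(p: Point) -> i32 {
    p.x + p.y
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // copied, not moved
    let total = sum(p); // copied again; `p` is still usable afterwards

    assert_eq!(p, q);
    assert_eq!(total, 3);
    assert_eq!(duplicate(p), (p, p));
    // duplicate(String::from("x")); // would not compile: String is not Copy
}
```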

Drop Trait

We have discussed <span>Drop</span> in the memory management section. Here is its definition:

pub trait Drop {
    fn drop(&mut self);
}

In most cases, you do not need to implement <span>Drop</span> manually for your types: when a value goes out of scope, the compiler automatically drops each field of the data structure in declaration order. However, there are two situations where you might need to implement <span>Drop</span> manually:

  1. When you want to perform certain operations at the end of the data’s lifecycle, such as logging
  2. When you need to release external resources. The compiler does not know about any additional resources you might hold, so it cannot release them for you
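
The first situation in the list above, together with the field drop order, can be sketched as follows. <span>Noisy</span>, <span>Pair</span>, and <span>drop_order</span> are hypothetical names; the shared log exists only so the order can be observed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

/// Records a message in a shared log when it is dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Has no Drop impl of its own; its fields are dropped automatically.
#[allow(dead_code)]
struct Pair {
    first: Noisy,
    second: Noisy,
}

/// Builds a Pair, lets it go out of scope, and returns what was logged.
fn drop_order() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _pair = Pair {
            first: Noisy { name: "first", log: Rc::clone(&log) },
            second: Noisy { name: "second", log: Rc::clone(&log) },
        };
    } // `_pair` goes out of scope here; fields drop in declaration order
    let entries = log.borrow().clone();
    entries
}

fn main() {
    assert_eq!(drop_order(), vec!["drop first", "drop second"]);
}
```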

Note that the <span>Copy</span> trait and the <span>Drop</span> trait are mutually exclusive. If you try to implement both for the same data type, the compiler reports an error. This is easy to understand: <span>Copy</span> performs shallow bitwise copies, assuming that the copied data has no resources to release, while <span>Drop</span> exists precisely to release additional resources.

Marker Traits: Sized / Send / Sync / Unpin

Sized Trait

The <span>Sized</span> trait marks types whose size is known at compile time. When you use generic parameters, Rust automatically adds a <span>Sized</span> constraint to them:

struct Data<T> {
    inner: T,
}

fn process_data<T>(data: Data<T>) {
    todo!();
}

This is equivalent to:

struct Data<T: Sized> {
    inner: T,
}

fn process_data<T: Sized>(data: Data<T>) {
    todo!();
}

In most cases, we want this constraint to be added automatically, as it allows the size of the generic structure to be fixed at compile time, enabling it to be passed as a function parameter. However, this automatically added constraint is not always suitable. In rare cases, we want <span>T</span> to be dynamically sized. What should we do? Rust provides <span>?Sized</span> to lift this constraint.
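
A minimal sketch of lifting the constraint with <span>?Sized</span> follows. The helper <span>size_in_bytes</span> is a hypothetical name; <span>std::mem::size_of_val</span> is the standard library function it wraps.

```rust
use std::mem::size_of_val;

/// `T: ?Sized` lets callers pass references to dynamically sized types
/// such as `str` and `[u16]`, in addition to ordinary sized types.
fn size_in_bytes<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    assert_eq!(size_in_bytes("hello"), 5); // T = str
    assert_eq!(size_in_bytes(&[1u16, 2, 3][..]), 6); // T = [u16]
    assert_eq!(size_in_bytes(&7u32), 4); // T = u32
}
```

Without <span>?Sized</span>, the first two calls would be rejected, because <span>str</span> and <span>[u16]</span> have no size known at compile time. Note that a dynamically sized <span>T</span> can only be used behind a pointer such as <span>&T</span>.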

Send / Sync

<span>Send</span> and <span>Sync</span> are the foundation of Rust’s concurrency safety:

  • If a type <span>T</span> implements the <span>Send</span> trait, it means that <span>T</span> can be safely moved from one thread to another, i.e., its ownership can be transferred between threads
  • If a type <span>T</span> implements the <span>Sync</span> trait, it means that <span>&T</span> can be safely shared between multiple threads. A type <span>T</span> satisfies the <span>Sync</span> trait if and only if <span>&T</span> satisfies the <span>Send</span> trait

For user-defined data structures, if all their internal fields implement <span>Send</span> / <span>Sync</span>, then the data structure itself automatically implements <span>Send</span> / <span>Sync</span>. Almost all primitive types implement <span>Send</span> / <span>Sync</span>, so the vast majority of custom data structures do as well. In the standard library, the data structures that do not implement <span>Send</span> / <span>Sync</span> mainly include:

  • Raw pointers <span>*const T</span> / <span>*mut T</span>. They carry no safety guarantees, so they implement neither <span>Send</span> nor <span>Sync</span>
  • <span>UnsafeCell<T></span> does not implement <span>Sync</span>. Consequently, any data structure containing a <span>Cell</span> or <span>RefCell</span> is not <span>Sync</span> either
  • Reference counting <span>Rc</span> supports neither <span>Send</span> nor <span>Sync</span>, so <span>Rc</span> cannot be used across threads
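
As a sketch of what these bounds mean in practice, the following shares a counter between threads with <span>Arc<Mutex<T>></span>, which is both <span>Send</span> and <span>Sync</span>. The helpers <span>assert_send_sync</span> and <span>count_in_threads</span> are hypothetical names introduced for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Compile-time check: only callable for types that are Send + Sync.
fn assert_send_sync<T: Send + Sync>() {}

/// Spawns `n` threads that each increment a shared counter once.
fn count_in_threads(n: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Each thread gets its own Arc handle to the same Mutex.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_send_sync::<Arc<Mutex<usize>>>();
    // assert_send_sync::<std::rc::Rc<usize>>(); // would not compile: Rc is neither
    assert_eq!(count_in_threads(4), 4);
}
```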

Type Conversion: From / Into / AsRef / AsMut

In software development, we often need to convert one data structure into another in certain contexts. Rust provides two sets of traits for conversions between value types and reference types:

  • Value to value conversion: <span>From<T></span> / <span>Into<T></span> / <span>TryFrom<T></span> / <span>TryInto<T></span>
  • Reference to reference conversion: <span>AsRef<T></span> / <span>AsMut<T></span>
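
The fallible variants in the first group, <span>TryFrom<T></span> / <span>TryInto<T></span>, return a <span>Result</span> instead of always succeeding. A minimal sketch using the standard integer conversions follows; <span>to_port</span> is a hypothetical helper name.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts to a port number only if the value fits in u16.
fn to_port(n: i64) -> Option<u16> {
    // u16::try_from returns Err when `n` is negative or larger than u16::MAX.
    u16::try_from(n).ok()
}

fn main() {
    assert_eq!(to_port(8080), Some(8080));
    assert_eq!(to_port(70_000), None);
    assert_eq!(to_port(-1), None);
}
```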

From / Into

First, let’s look at <span>From<T></span> and <span>Into<T></span>. Their definitions are as follows:

pub trait From<T> {
    fn from(value: T) -> Self;
}

pub trait Into<T> {
    fn into(self) -> T;
}

When you implement <span>From<T></span>, <span>Into<T></span> is automatically implemented. This is because:

// Implementing From automatically implements Into
impl<T, U> Into<U> for T where U: From<T> {
    fn into(self) -> U {
        U::from(self)
    }
}

So in most cases, it is sufficient to implement <span>From<T></span>, and both conversion methods will work. For example:

let s = String::from("Hello world!");
let s: String = "Hello world!".into();

These two methods are equivalent. Which one should you choose? <span>From<T></span> can infer types based on context and has more use cases. Additionally, since implementing <span>From<T></span> automatically implements <span>Into<T></span> but not the other way around, you should implement <span>From<T></span> when you need a conversion, not <span>Into<T></span>.
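
For a user-defined type, this can be sketched as follows. <span>Kilometers</span> and <span>Meters</span> are hypothetical types introduced for this example; only <span>From</span> is implemented, and <span>into()</span> comes for free.

```rust
#[derive(Debug, PartialEq)]
struct Kilometers(u32);

#[derive(Debug, PartialEq)]
struct Meters(u32);

// Only From is implemented; the blanket impl provides Into<Meters> for Kilometers.
impl From<Kilometers> for Meters {
    fn from(km: Kilometers) -> Self {
        Meters(km.0 * 1000)
    }
}

fn main() {
    let a = Meters::from(Kilometers(3));
    let b: Meters = Kilometers(3).into();
    assert_eq!(a, b);
    assert_eq!(a, Meters(3000));
}
```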

Using <span>From<T></span> and <span>Into<T></span> can make function interfaces more flexible. For example:

use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn print(v: impl Into<IpAddr>) {
    println!("{:?}", v.into());
}

fn main() {
    let v4: Ipv4Addr = "2.2.2.2".parse().unwrap();
    let v6: Ipv6Addr = "::1".parse().unwrap();
    
    // IpAddr implements From<[u8; 4]> to convert IPv4 addresses
    print([1, 1, 1, 1]);

    // IpAddr implements From<[u16; 8]> to convert IPv6 addresses
    print([0xfe80, 0, 0, 0, 0xaede, 0x48ff, 0xfe00, 0x1122]);

    // IpAddr implements From<Ipv4Addr>
    print(v4);

    // IpAddr implements From<Ipv6Addr>
    print(v6);
}

AsRef / AsMut

Having understood <span>From<T></span> / <span>Into<T></span>, understanding <span>AsRef<T></span> and <span>AsMut<T></span> becomes easy. They are used for conversions from reference to reference. First, let’s look at their definitions:

pub trait AsRef<T> where T: ?Sized {
    fn as_ref(&self) -> &T;
}

pub trait AsMut<T> where T: ?Sized {
    fn as_mut(&mut self) -> &mut T;
}

In the trait definitions, <span>T</span> can be a dynamically sized type, such as <span>str</span>, <span>[u8]</span>, etc. <span>AsMut<T></span> is similar to <span>AsRef<T></span>, except it generates a mutable reference from a mutable reference, so we mainly focus on <span>AsRef<T></span>.
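
For completeness, here is a minimal sketch of <span>AsMut<T></span>. The helper <span>zero_out</span> is a hypothetical name; it relies on the standard library implementations of <span>AsMut<[u8]></span> for <span>Vec<u8></span> and for byte arrays.

```rust
/// Hypothetical helper: zeroes any buffer that can be viewed as `&mut [u8]`.
fn zero_out<B: AsMut<[u8]>>(buf: &mut B) {
    let bytes: &mut [u8] = buf.as_mut();
    for b in bytes.iter_mut() {
        *b = 0;
    }
}

fn main() {
    let mut v = vec![1u8, 2, 3];
    let mut a = [4u8, 5];

    zero_out(&mut v); // Vec<u8> implements AsMut<[u8]>
    zero_out(&mut a); // so does [u8; 2]

    assert_eq!(v, vec![0, 0, 0]);
    assert_eq!(a, [0, 0]);
}
```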

Let’s look at the interface for opening files in the standard library: <span>std::fs::File::open</span>:

pub fn open<P: AsRef<Path>>(path: P) -> Result<File>

The parameter <span>path</span> can be any type that satisfies <span>AsRef<Path></span>, so you can pass <span>String</span>, <span>&str</span>, <span>PathBuf</span>, <span>&Path</span>, and so on. Inside the function, calling <span>path.as_ref()</span> yields a <span>&Path</span>.
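
The same signature shape can be used in your own functions. The sketch below assumes a hypothetical helper named <span>extension_of</span>; it touches no files and only inspects the path.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper with the same parameter shape as `File::open`.
fn extension_of<P: AsRef<Path>>(path: P) -> Option<String> {
    path.as_ref() // &Path, whatever the caller passed in
        .extension()
        .and_then(|ext| ext.to_str())
        .map(str::to_owned)
}

fn main() {
    let expected = Some("txt".to_string());
    assert_eq!(extension_of("notes/todo.txt"), expected); // &str
    assert_eq!(extension_of(String::from("todo.txt")), expected); // String
    assert_eq!(extension_of(PathBuf::from("todo.txt")), expected); // PathBuf
    assert_eq!(extension_of(Path::new("Makefile")), None); // &Path
}
```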

Let’s write some code to experience the use and implementation of <span>AsRef<T></span>:

#[allow(dead_code)]
enum Language {
    Rust,
    TypeScript,
    Elixir,
    Haskell,
}

impl AsRef<str> for Language {
    fn as_ref(&self) -> &str {
        match self {
            Language::Rust => "Rust",
            Language::TypeScript => "TypeScript",
            Language::Elixir => "Elixir",
            Language::Haskell => "Haskell",
        }
    }
}

fn print_ref(v: impl AsRef<str>) {
    println!("{}", v.as_ref());
}

fn main() {
    let lang = Language::Rust;
    // &str implements AsRef<str>
    print_ref("Hello world!");
    // String implements AsRef<str>
    print_ref("Hello world!".to_string());
    // Our custom enum also implements AsRef<str>
    print_ref(lang);
}

Operator-Related: Deref / DerefMut

<span>Deref</span> and <span>DerefMut</span> are operator-related traits, defined as follows:

pub trait Deref {
    // The type of the dereferenced result
    type Target: ?Sized;
    fn deref(&self) -> &Self::Target;
}

pub trait DerefMut: Deref {
    fn deref_mut(&mut self) -> &mut Self::Target;
}

As you can see, <span>DerefMut</span> “inherits” from <span>Deref</span>, but additionally provides a <span>deref_mut</span> method to obtain a mutable dereference.

For ordinary references, dereferencing is intuitive: a reference holds a single address pointing to the value, and the value can be reached through it, as shown in the following example:

let mut x = 42;
let y = &mut x;
// Dereferencing internally calls DerefMut (which is implemented as *self)
*y += 1;

But for smart pointers, it is less obvious which field dereferencing should yield. Let’s see how the <span>Rc</span> we learned earlier implements <span>Deref</span>:

impl<T: ?Sized> Deref for Rc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner().value
    }
}

It can be seen that it ultimately points to the <span>value</span> inside the <span>RcBox</span> on the heap, and dereferencing yields the value corresponding to <span>value</span>.
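
From the caller's side, this behaviour can be sketched as follows. The function <span>total</span> is a hypothetical helper; the rest uses only <span>std::rc::Rc</span>.

```rust
use std::rc::Rc;

/// Sums the elements behind an Rc without unwrapping it.
fn total(shared: &Rc<Vec<i32>>) -> i32 {
    // `iter` is a slice method: Rc<Vec<i32>> derefs to Vec<i32>, then to [i32].
    shared.iter().sum()
}

fn main() {
    let n = Rc::new(41);
    assert_eq!(*n + 1, 42); // explicit dereference goes through Deref::deref

    let list = Rc::new(vec![1, 2, 3]);
    assert_eq!(list.len(), 3); // method calls dereference the Rc automatically
    assert_eq!(total(&list), 6);
}
```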

In Rust, most smart pointers implement <span>Deref</span>, and we can also implement <span>Deref</span> for our own data structures. Here is an example:

use std::ops::{Deref, DerefMut};

#[derive(Debug)]
struct Buffer<T>(Vec<T>);

impl<T> Buffer<T> {
    pub fn new(v: impl Into<Vec<T>>) -> Self {
        Self(v.into())
    }
}

impl<T> Deref for Buffer<T> {
    type Target = [T];
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl<T> DerefMut for Buffer<T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

fn main() {
    let mut buf = Buffer::new([1, 3, 2, 4]);
    // Because we implemented Deref and DerefMut, buf can directly access the sort method
    // This line is equivalent to: (&mut buf).deref_mut().sort()
    buf.sort();
    println!("buf: {:?}", buf);
}
