Official Rust Implementation of Model2Vec: A Lightweight Tool for Embedding Model Loading and Inference

Introduction

In today’s natural language processing landscape, embedding technology has become indispensable. Whether for text classification, sentiment analysis, or information retrieval, high-quality embedding models can significantly improve task performance. However, as model sizes continue to grow, loading these models and running inference with them efficiently has become a pressing problem.

Today, we will introduce a Rust implementation developed by the MinishLab team: model2vec-rs. This tool not only provides lightweight static embedding model loading and inference, but also runs roughly 1.7 times faster than the Python implementation. Let’s dive into this powerful tool!

Main Content

1. What is model2vec-rs?

model2vec-rs is the official Rust implementation of Model2Vec, focused on providing a fast, efficient solution for loading static embedding models and running inference. On a single-threaded CPU it reaches a throughput of about 8000 samples/second, roughly 1.7 times as fast as the Python version!

Core Features:

  • High Performance: compiled Rust code delivers markedly higher single-threaded throughput than the Python implementation.
  • Lightweight: supports a range of pre-trained models and works out of the box, with no complex configuration.
  • Ease of Use: a simple, intuitive interface, whether called from code or from the command line.

2. Quick Start

To help developers get started quickly, model2vec-rs provides a very friendly onboarding experience. Here are a few simple steps:

2.1 Add Dependency

First, add the model2vec-rs dependency to your project:

cargo add model2vec-rs
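
Alternatively, you can declare the dependency in Cargo.toml by hand (the version below is a placeholder; check crates.io for the latest release):

# Cargo.toml
[dependencies]
model2vec-rs = "0.1"  # placeholder version; use the latest from crates.io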

2.2 Load Model and Generate Embeddings

Next, you can easily load a pre-trained model and generate embedding vectors. Here is a complete example code:

use anyhow::Result;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    // Load model
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)?;

    // Prepare input sentences
    let sentences = vec![
        "Hello world".to_string(),
        "Rust is awesome".to_string(),
    ];

    // Generate embeddings
    let embeddings = model.encode(&sentences);
    println!("Embeddings: {:?}", embeddings);

    Ok(())
}
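
The encode method returns one embedding vector per input sentence. As a quick illustration of putting them to use, here is a minimal cosine-similarity sketch; the cosine_similarity helper is our own, not part of the model2vec-rs API:

// Cosine similarity between two embedding vectors.
// Illustrative helper; not part of the model2vec-rs API.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

// Continuing from the example above:
// let sim = cosine_similarity(&embeddings[0], &embeddings[1]);
// println!("Similarity: {sim}");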

2.3 Use CLI Tool

If you prefer command line operations, you can also generate embeddings directly through the CLI:

# Single sentence input
cargo run -- encode "Hello world" minishlab/potion-base-8M

# File input
echo -e "Hello world\nRust is awesome" > input.txt
cargo run -- encode input.txt minishlab/potion-base-8M --output embeds.json
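
The --output flag writes the embeddings to a file. Assuming embeds.json holds a JSON array of float vectors, one per input line (check your actual output if it differs), you could read it back in Rust with a sketch like this, using serde_json:

use anyhow::Result;
use std::fs;

fn main() -> Result<()> {
    // Assumption: embeds.json contains a JSON array of embedding vectors,
    // one Vec<f32> per input line. Adjust if the real layout differs.
    let raw = fs::read_to_string("embeds.json")?;
    let embeddings: Vec<Vec<f32>> = serde_json::from_str(&raw)?;
    println!("Loaded {} embeddings of dimension {}",
             embeddings.len(), embeddings[0].len());
    Ok(())
}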

3. Supported Models

model2vec-rs comes with several high-quality pre-trained models, covering various needs from general tasks to specific scenarios. Here is a list of some available models:

Model Name                  Language      Sentence Transformer  Parameter Count  Applicable Tasks
potion-base-32M[1]          English       bge-base-en-v1.5      32.3M            General tasks
potion-base-8M[2]           English       bge-base-en-v1.5      7.5M             General tasks
potion-base-4M[3]           English       bge-base-en-v1.5      3.7M             General tasks
potion-retrieval-32M[4]     English       bge-base-en-v1.5      32.3M            Information retrieval
M2V_multilingual_output[5]  Multilingual  LaBSE                 471M             General tasks

These models are hosted on the Hugging Face Hub[6], and users can load them directly using the from_pretrained method.
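
Switching between these models only requires changing the repository id passed to from_pretrained. For example, a minimal sketch for loading the retrieval-tuned model (the trailing None arguments for token, normalization, and subfolder match the earlier example):

use anyhow::Result;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    // Same call shape as before; only the repo id changes.
    let model = StaticModel::from_pretrained("minishlab/potion-retrieval-32M", None, None, None)?;
    let embeddings = model.encode(&vec!["rust embedding inference".to_string()]);
    println!("Embedding dimension: {}", embeddings[0].len());
    Ok(())
}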

4. Performance Comparison

To validate the superiority of the Rust implementation, we conducted benchmark tests comparing model2vec-rs with the Python version. The results show that in a single-threaded CPU environment, the Rust version achieves a throughput of 8000 samples/second, while the Python version reaches only 4650 samples/second: a performance improvement of approximately 1.7 times!

Implementation  Throughput (samples/second)
Rust            8000
Python          4650

This result fully demonstrates the tremendous potential of the Rust language in high-performance computing.
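
If you want to reproduce a rough throughput number on your own machine, a minimal timing loop looks like the sketch below. This is our own illustration, not the project’s official benchmark; a rigorous comparison would control for warm-up, batching, and input length:

use anyhow::Result;
use std::time::Instant;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)?;

    // Synthetic workload: many short, distinct sentences.
    let sentences: Vec<String> = (0..10_000)
        .map(|i| format!("sample sentence number {i}"))
        .collect();

    let start = Instant::now();
    let _embeddings = model.encode(&sentences);
    let elapsed = start.elapsed().as_secs_f64();

    println!("{:.0} samples/second", sentences.len() as f64 / elapsed);
    Ok(())
}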

Conclusion

model2vec-rs, with its outstanding performance and simple API design, provides developers with a new solution for static embedding model loading and inference. Whether you are a beginner or an experienced engineer, you can easily get started and benefit from it. If you are looking for an efficient and reliable embedding tool, consider trying model2vec-rs; it will surely become a valuable assistant in your development!

Finally, don’t forget to give this excellent open-source project a star! Repository: https://github.com/MinishLab/model2vec-rs[7]

I hope this article is helpful to you! If you have any questions or suggestions, feel free to leave a comment for discussion~ 😊

References

[1] potion-base-32M: https://huggingface.co/minishlab/potion-base-32M
[2] potion-base-8M: https://huggingface.co/minishlab/potion-base-8M
[3] potion-base-4M: https://huggingface.co/minishlab/potion-base-4M
[4] potion-retrieval-32M: https://huggingface.co/minishlab/potion-retrieval-32M
[5] M2V_multilingual_output: https://huggingface.co/minishlab/M2V_multilingual_output
[6] Hugging Face Hub: https://huggingface.co/
[7] model2vec-rs repository: https://github.com/MinishLab/model2vec-rs
