Rust Axum Backend Development: Practical Implementation of Data Models, Migrations, DTOs, and Repository Patterns

Introduction

When building modern web applications, good database design and code architecture are crucial. This article will take you deep into how to design a user data storage solution within the Rust Axum framework, learning best practices for database migrations, DTOs (Data Transfer Objects), index optimization, and the repository pattern.

This article is the third part of the Axum backend series, laying the groundwork for the next article that will implement a complete JWT authentication system (including email verification, OAuth, 2FA, Magic Link, etc.). If you are learning Rust backend development, this will be an excellent opportunity to gain insights into enterprise-level application architecture.

Key Decisions in Database Design

Primary Key Selection: Auto-increment ID vs UUID

When designing the user table, the first question we face is: should we use an auto-incrementing integer or a UUID as the primary key?

Issues with Auto-increment Integers:

Users can guess other users’ IDs through incrementing numbers.
Exposes business metrics (like total user count and growth rate).
Seeing a user ID of 50000 indicates there are about 50,000 users.

Advantages of UUID:

ID is random and unpredictable (e.g., f47ac10b-58cc-4372-a567-0e02b2c3d479).
Does not leak business data.
Safer URL structure:GET /api/users/f47ac10b-58cc-4372-a567-0e02b2c3d479

Therefore, we choose UUID for better security and scalability.

Text Field Types: VARCHAR vs TEXT

For text fields like username, email, password, and bio, should we use VARCHAR or TEXT?

**VARCHAR(n)**: Fixed maximum length, suitable for short strings, allowing the database to optimize storage and indexing better.
TEXT: Variable length with no limit, suitable for unknown or variable-length content.

Our choice:

Username, email, and password use VARCHAR (with reasonable length limits).
Bio uses TEXT (as users may write longer descriptions).

Timezone Handling: Local Time vs UTC

Consider a scenario where users from New York, London, and Tokyo create accounts simultaneously.

Issues with Storing Local Timezones:

user A creates account: "2024-12-01 10:00:00 EST"
user B creates account: "2024-12-01 15:00:00 GMT"
user C creates account: "2024-12-02 00:00:00 JST"

Although all three events occur at the same moment, sorting by creation time becomes complex, requiring timezone conversion on the server side.

Advantages of Storing in UTC:

All timestamps are uniformly stored as UTC.
Displayed times are converted to the user’s local timezone.
Sorting, filtering, and comparison operations are simple and efficient.

We use TIMESTAMP WITH TIME ZONE, which PostgreSQL stores internally in UTC but retains timezone information for correct conversion.

Database Migration: Version Control Your Database

What is Database Migration?

Imagine you and your team are developing together. What happens without migrations?

Problems Without Migrations:

Developer A manually creates tables.
Developer B is unaware of this change, leading to code errors.
Everyone’s database structure is inconsistent.
When deploying to production, it’s unclear which version to use.

Solution Using Migrations:

Developer A creates a migration file:

-- migrations/001_create_users.sql
CREATE TABLE users (
    id UUID PRIMARY KEY,
    username VARCHAR(255) NOT NULL,
    email VARCHAR(255) NOT NULL
);

This file is committed to Git. After Developer B pulls the code, they run:

sqlx migrate run

Now everyone has the same database structure!

Creating User Table Migration

First, install the SQLx CLI tool:

cargo install sqlx-cli

Create a migration file:

sqlx migrate add create_users_table

Write the complete migration script:

-- Enable UUID extension
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Create user table
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    username VARCHAR(255) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    bio TEXT,
    image VARCHAR(255),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Create indexes to optimize query performance
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_username ON users(username);
CREATE INDEX idx_users_created_at ON users(created_at);

-- Create a trigger function to automatically update timestamps
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ language 'plpgsql';

-- Create trigger
CREATE TRIGGER update_users_updated_at
    BEFORE UPDATE ON users
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at_column();

Database Indexes: Improving Query Performance

Indexes are like the index pages at the back of a book. Instead of searching page by page, you can directly locate a specific page.

Why Do We Need Indexes?

Consider a user login scenario:

SELECT * FROM users WHERE email = '[email protected]';

Without Index:

The database must scan every row (table scan).
100,000 users require checking 100,000 rows.
As users grow, logins become slower.

With Index:

The database maintains a sorted lookup table.
Lookups are almost instantaneous, whether there are 1,000 or 1 million users.

We create indexes for the following fields:

email: for login authentication.
username: for profile lookups.
created_at: for analytics and registration reports.

Database Triggers: Automatically Updating Timestamps

We want the updated_at field to automatically update when user data changes. There are two ways:

Method One: Application Layer Handling

// Each update requires manually setting the timestamp
user.updated_at = Utc::now();
database.update_user(user).await?;

Problem:Easy to forget, especially as the application grows with multiple update entry points.

Method Two: Database Layer Trigger

-- Create function
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();  -- Set new value
    RETURN NEW;
END;
$$ language 'plpgsql';

-- Create trigger
CREATE TRIGGER update_users_updated_at
    BEFORE UPDATE ON users  -- Trigger before update
    FOR EACH ROW  -- Execute for each row
    EXECUTE FUNCTION update_updated_at_column();

Workflow:

The database detects an UPDATE operation on the users table.
The trigger fires before the update.
The function sets updated_at = NOW().
The original update continues, including the updated timestamp.

This method is bulletproof; regardless of how user data is updated, the timestamp is always correct.

Creating Rust Data Models

User Model Implementation

Define the user structure in src/models/user.rs:

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use sqlx::FromRow;
use uuid::Uuid;

#[derive(Debug, Clone, Serialize, Deserialize, FromRow)]
pub struct User {
    pub id: Uuid,                      // Maps to database UUID
    pub username: String,              // Maps to VARCHAR(255)
    pub email: String,                 // Maps to VARCHAR(255)
    pub password_hash: String,         // Stores hashed password, not plaintext
    pub bio: Option,           // Maps to TEXT (nullable)
    pub image: Option,         // Maps to VARCHAR(255) (nullable)
    pub created_at: DateTime,     // Maps to TIMESTAMP WITH TIME ZONE
    pub updated_at: DateTime,     // Maps to TIMESTAMP WITH TIME ZONE
}

Derived Traits Explanation:

Debug: for debugging prints.
Clone: allows creating copies.
Serialize/Deserialize: JSON serialization support.
FromRow: automatically converts database rows to Rust structures.

Security Note:We store password_hash instead of plaintext passwords. User passwords are hashed before storage, and the hash is compared during login.

Repository Pattern: Separating Data Access Logic

What is the Repository Pattern?

The repository pattern creates an abstraction layer between business logic and data storage.

Poor Practice (Mixed Concerns):

// API handler directly includes SQL queries
async fn get_user_handler(id: Uuid, db: DatabasePool) -> Result<json, AppError> {
    let user = sqlx::query_as!(User, "SELECT * FROM users WHERE id = $1", id)
        .fetch_one(&db)
        .await?;
    
    Ok(Json(user))
}
</json

Good Practice (Repository Pattern):

// Repository handles database details
struct UserRepository {
    db: DatabasePool,
}

impl UserRepository {
    async fn find_by_id(&self, id: Uuid) -> Result {
        // Database logic goes here
    }
}

// API handler remains clean
async fn get_user_handler(id: Uuid, repo: UserRepository) -> Result<json, AppError> {
    let user = repo.find_by_id(id).await?;
    Ok(Json(user))
}
</json

Implementing User Repository

In src/repositories/user_repository.rs:

use sqlx::{PgPool, Row};
use uuid::Uuid;
use crate::models::User;

#[derive(Clone)]
pub struct UserRepository {
    db: PgPool,
}

impl UserRepository {
    // Constructor
    pub fn new(db: PgPool) -> Self {
        Self { db }
    }
    
    // Create new user
    pub async fn create(
        &self, 
        username: &str, 
        email: &str, 
        password_hash: &str
    ) -> Result {
        let user = sqlx::query_as::<_, User>(
            r#"
            INSERT INTO users (username, email, password_hash)
            VALUES ($1, $2, $3)
            RETURNING id, username, email, password_hash, bio, image,
                      created_at, updated_at
            "#
        )
        .bind(username)
        .bind(email)
        .bind(password_hash)
        .fetch_one(&self.db)
        .await?;
        
        Ok(user)
    }
    
    // Find user by ID
    pub async fn find_by_id(&self, id: Uuid) -> Result<option, sqlx::Error> {
        let user = sqlx::query_as::<_, User>(
            r#"
            SELECT id, username, email, password_hash, bio, image,
                   created_at, updated_at
            FROM users
            WHERE id = $1
            "#
        )
        .bind(id)
        .fetch_optional(&self.db)
        .await?;
        
        Ok(user)
    }
    
    // Find user by email (for login)
    pub async fn find_by_email(&self, email: &str) -> Result<option, sqlx::Error> {
        let user = sqlx::query_as::<_, User>(
            r#"
            SELECT id, username, email, password_hash, bio, image,
                   created_at, updated_at
            FROM users
            WHERE email = $1
            "#
        )
        .bind(email)
        .fetch_optional(&self.db)
        .await?;
        
        Ok(user)
    }
    
    // Update user information (partial update)
    pub async fn update(
        &self,
        id: Uuid,
        username: Option<&str>,
        email: Option<&str>,
        bio: Option<&str>,
        image: Option<&str>
    ) -> Result<option, sqlx::Error> {
        let user = sqlx::query_as::<_, User>(
            r#"
            UPDATE users
            SET username = COALESCE($2, username),
                email = COALESCE($3, email),
                bio = COALESCE($4, bio),
                image = COALESCE($5, image)
            WHERE id = $1
            RETURNING id, username, email, password_hash, bio, image,
                      created_at, updated_at
            "#
        )
        .bind(id)
        .bind(username)
        .bind(email)
        .bind(bio)
        .bind(image)
        .fetch_optional(&self.db)
        .await?;
        
        Ok(user)
    }
}
</option</option</option

Key Points:

Use fetch_optional() to return Option, clearly handling the “not found” case.
COALESCE function implements partial updates: passing Some updates, passing None keeps the original value.
.bind() method prevents SQL injection attacks.

Data Transfer Objects (DTO)

DTO defines the data structure entering and exiting the API, separate from the database model.

Why Separate DTO and Model?

Consider user registration, where the client sends:

{
    "username": "alice",
    "email": "[email protected]",
    "password": "secretpassword"
}

But our User model contains fields not provided by the client:

id: generated by the database.
password_hash: derived from the password hash.
bio/image: optional fields.
created_at/updated_at: set by the database.

Implementing User DTO

In src/schemas/user_schemas.rs:

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use validator::Validate;

// Create user request
#[derive(Debug, Deserialize, Validate)]
pub struct CreateUserRequest {
    #[validate(length(min = 3, max = 50, message = "Username must be between 3 and 50 characters"))]
    pub username: String,
    
    #[validate(email(message = "Invalid email format"))]
    pub email: String,
    
    #[validate(length(min = 8, message = "Password must be at least 8 characters"))]
    pub password: String,
}

// Update user request
#[derive(Debug, Deserialize, Validate)]
pub struct UpdateUserRequest {
    #[validate(length(min = 3, max = 50, message = "Username must be between 3 and 50 characters"))]
    pub username: Option,
    
    #[validate(email(message = "Invalid email format"))]
    pub email: Option,
    
    #[validate(length(max = 500, message = "Bio cannot exceed 500 characters"))]
    pub bio: Option,
    
    #[validate(url(message = "Image must be a valid URL"))]
    pub image: Option,
}

// User response (excluding sensitive information)
#[derive(Debug, Serialize)]
pub struct UserResponse {
    pub id: Uuid,
    pub username: String,
    pub email: String,
    pub bio: Option,
    pub image: Option,
    pub created_at: DateTime,
    pub updated_at: DateTime,
    // Note: no password_hash
}

// Automatic conversion: User -> UserResponse
impl From for UserResponse {
    fn from(user: crate::models::User) -> Self {
        Self {
            id: user.id,
            username: user.username,
            email: user.email,
            bio: user.bio,
            image: user.image,
            created_at: user.created_at,
            updated_at: user.updated_at,
        }
    }
}

Key Features:

CreateUserRequest: contains only the fields required for registration.
UpdateUserRequest: all fields are optional, supporting partial updates.
UserResponse: excludes password_hash, ensuring password security.
From trait: automatically converts database models to API responses.

Integrating into Axum Application

Updating Application State

Modify src/state.rs:

use sqlx::PgPool;
use crate::repositories::UserRepository;

#[derive(Clone)]
pub struct AppState {
    pub db: PgPool,
    pub user_repository: UserRepository,
}

impl AppState {
    pub async fn new(database_url: &str) -> Result {
        // Create database connection pool
        let db = PgPool::connect(database_url).await?;
        
        // Automatically run migrations
        sqlx::migrate!("./migrations").run(&db).await?;
        
        // Create user repository
        let user_repository = UserRepository::new(db.clone());
        
        Ok(Self {
            db,
            user_repository,
        })
    }
}

Updating Main Function

Modify src/main.rs:

use axum::{routing::get, Router};
use std::env;

mod handlers;
mod models;
mod repositories;
mod schemas;
mod state;

use handlers::health::health_check;
use state::AppState;

#[tokio::main]
async fn main() {
    // Load environment variables
    dotenvy::dotenv().ok();
    
    // Get database URL
    let database_url = env::var("DATABASE_URL")
        .expect("DATABASE_URL must be set");
    
    // Create application state (connect to database and run migrations)
    let app_state = AppState::new(&database_url)
        .await
        .expect("Failed to connect to database");
    
    println!("Database connection successful!");
    
    // Create routes
    let app = Router::new()
        .route("/health", get(health_check))
        .with_state(app_state);
    
    // Start server
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000")
        .await
        .unwrap();
    
    println!("Server running at http://localhost:3000");
    axum::serve(listener, app).await.unwrap();
}

Running the Application

Ensure the PostgreSQL container is running:

docker start realworld-db

Run the application:

cargo run

At startup, it will automatically:

Read .env configuration.
Connect to PostgreSQL.
Run database migrations.
Create UserRepository.
Start the server.

Project Structure Overview

src/
├── handlers/           # API endpoint handlers
│   └── health.rs      # Health check
├── models/            # Database entity definitions
│   ├── mod.rs
│   └── user.rs        # User model
├── repositories/      # Data access layer
│   ├── mod.rs
│   └── user_repository.rs  # User data operations
├── schemas/           # API request/response definitions
│   ├── mod.rs
│   └── user_schemas.rs     # User DTO
├── state.rs           # Application state
├── lib.rs             # Module declarations
└── main.rs            # Application entry

This structure provides:

Separation of Concerns: Each module is responsible for a single responsibility.
Clear Dependencies: Repository depends on models, state depends on repositories.
Easy Testing: Each layer can be tested independently.

Conclusion

This article detailed the infrastructure for building a user management system within the Rust Axum framework:

Database Design Decisions: Choosing UUID as the primary key enhances security, using UTC for uniform time handling.
Database Migration: Managing database structure changes through version control ensures team collaboration consistency.
Performance Optimization: Using indexes to speed up queries, using triggers to automatically maintain timestamps.
Repository Pattern: Separating data access logic improves code testability and maintainability.
DTO Pattern: Separating API contracts from database models protects sensitive information.

This solid foundation prepares us for the next article—where we will implement a complete authentication system, including JWT, OAuth, 2FA, Magic Link, email verification, and password reset features.

If you are interested in Rust backend development, following this series will help you master best practices for enterprise-level applications. The next article will be even more practical and in-depth, so let’s look forward to it!

Reference Articles

Axum Backend Series: Models, Migration, DTOs and Repository Pattern: https://blog.0xshadow.dev/posts/backend-engineering-with-axum/axum-model-setup/

Book Recommendations

This book, “The Rust Programming Language” (2nd Edition), is an authoritative learning resource written by the Rust core development team and translated by members of the Chinese Rust community. It is suitable for all software developers who wish to evaluate, get started, improve, and study the Rust language, and is regarded as essential reading for Rust development work.

The book introduces the fundamental concepts of the Rust language to unique practical tools, covering advanced concepts such as ownership, traits, lifetimes, and safety guarantees, as well as practical tools like pattern matching, error handling, package management, functional features, and concurrency mechanisms. It includes three complete project development case studies, guiding readers from zero to developing practical Rust projects.

Notably, this book has been updated to the Rust 2021 version, meeting the systematic learning needs of beginners and serving as a reference guide for experienced developers, making it the best entry point for building solid Rust skills.