Unified Paging Articles

Day 3 – The Virtual Memory Mechanism of Linux

2025-10-18 by boardor

This article introduces how the Linux kernel uses memory, detailing the regions of memory allocation and their functions. 1. Physical Memory Allocation. 1. Physical Memory Allocation Diagram: the Linux kernel’s physical memory allocation. General overview: the entire physical memory is divided into four blocks. Block 1: the Linux kernel program. This is …

Single-GPU Operation for Thousands of Large Models: UC Berkeley’s S-LoRA Method

2025-06-12 by boardor

Originally from PaperWeekly. Author: Dan Jiang. Affiliation: National University of Singapore. Large language models are generally deployed under a “pre-train, then fine-tune” paradigm. However, when a base model is fine-tuned for numerous tasks (such as personalized assistants), the training and serving costs can become very high. Low-Rank Adaptation (LoRA) is a parameter-efficient fine-tuning …

S-LoRA: Enabling Thousands of Large Models on a GPU

2025-05-03 by boardor

Reported by Machine Heart. Editor: Danjiang. Large language models are generally deployed using a “pre-train, then fine-tune” approach. However, when the base model is fine-tuned for numerous tasks (such as personalized assistants), the training and serving costs can become extremely high. Low-Rank Adaptation (LoRA) is a parameter-efficient fine-tuning method, typically used to adapt the base …