Linux – Block Device Drivers

1.What is a Block Device Driver

A block device driver is used to operate storage devices such as hard disks.

1.1 Concept of Block Device Drivers

Devices that support random access to fixed-size chunks of data (blocks, for example 512 bytes each) are called block devices. A block device file is usually accessed by mounting a file system on it, although it can also be read and written directly.

1.2 Sector

The smallest addressable unit of a block device is a sector. The sector size is generally a power of two, and the most common size is 512 bytes. A block is an abstraction used by the file system, which accesses the device only in whole blocks. Physical disk addressing is done at the sector level, while all disk operations issued by the kernel are done at the block level. Because a sector is the smallest addressable unit of the device, a block cannot be smaller than a sector; it can only be a multiple of the sector size.

1.3 Kernel Requirements for Block Size

The block size must be a multiple of the sector size and no larger than the page size, so the block size is usually 512 bytes, 1K, or 4K.
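The rule can be sketched as a small user-space check. This is only an illustration: `block_size_is_valid` and `sectors_per_block` are hypothetical helper names rather than kernel functions, and the sketch assumes the rule means "a whole multiple of the sector size that is no larger than the page size".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when block_size is a whole multiple of
 * sector_size and does not exceed page_size. */
static bool block_size_is_valid(size_t block_size, size_t sector_size,
                                size_t page_size)
{
    assert(sector_size > 0); /* precondition: avoid division by zero */
    if (block_size < sector_size || block_size > page_size)
        return false;
    return block_size % sector_size == 0;
}

/* Hypothetical helper: number of sectors covered by one block. */
static size_t sectors_per_block(size_t block_size, size_t sector_size)
{
    assert(sector_size > 0);
    return block_size / sector_size;
}
```

With 512-byte sectors and a 4096-byte page (an assumption; the page size depends on the architecture), 512, 1024, and 4096 pass the check, while 768 and 8192 do not. A 4096-byte block covers 8 sectors.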

2.Comparison of Block Device Drivers and Character Device Drivers

1. The block device interface is relatively complex and not as clear and easy to use as character devices.

2. Block device drivers have a significant impact on the overall system performance; speed and efficiency are key considerations in designing block device drivers.

3. The system uses buffers and optimized management of access requests (merging and reordering) to improve system performance.


3.Introduction to Related Knowledge of Block Device Drivers

Head: A disk has as many heads as it has surfaces.

Cylinder: Each surface is divided into concentric rings (tracks). The tracks at the same position on all surfaces form a cylinder, so the number of cylinders equals the number of tracks per surface.

Sector: The smallest unit of data access on a track is a sector, and a sector is typically 512 bytes.

1 block = 512 bytes, 1024 bytes, 2048 bytes, or 4096 bytes

1 sector = 512 bytes

Capacity of a block device = heads * cylinders * sectors per track * 512 bytes
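The capacity formula can be checked with a short user-space sketch. The struct and function names here are hypothetical; the numbers in the usage note (4 heads, 16 cylinders, a 1 MiB disk) are the ones used by the example driver later in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 512u

/* Hypothetical user-space description of a disk geometry. */
struct disk_geometry {
    uint32_t heads;
    uint32_t cylinders;
    uint32_t sectors; /* sectors per track */
};

/* capacity = heads * cylinders * sectors * 512 */
static uint64_t geometry_capacity_bytes(const struct disk_geometry *g)
{
    assert(g != NULL);
    return (uint64_t)g->heads * g->cylinders * g->sectors * SECTOR_BYTES;
}

/* Inverse step: sectors per track for a given capacity, head count,
 * and cylinder count. */
static uint32_t sectors_per_track_for(uint64_t capacity_bytes,
                                      uint32_t heads, uint32_t cylinders)
{
    assert(heads > 0 && cylinders > 0);
    return (uint32_t)(capacity_bytes / heads / cylinders / SECTOR_BYTES);
}
```

For a 1 MiB disk with 4 heads and 16 cylinders, `sectors_per_track_for` gives 32, and 4 * 16 * 32 * 512 is 1048576 bytes again.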

4.Framework Diagram of Block Device Drivers

1. Virtual File System (VFS): Hides the specific details of various hardware and provides a unified interface for users to operate different hardware. It sits on top of the concrete file system formats, such as EXT and FAT. User programs operate on devices through VFS, which provides functions such as open, close, write, and read.

2. Disk Cache: A fast cache for the hard disk, used to cache recently accessed file data. If the data can be found in the cache, there is no need to access the hard disk, which is much slower.

3. Mapping Layer: This layer is mainly used to determine the block size of the file system and then calculate how many blocks the requested data contains. It also calls specific file system functions to access the file’s inode and determine the logical address of the requested data on the disk.

4. Generic Block Layer: The Linux kernel views block devices as data spaces composed of several sectors. The read and write requests from the upper layer are constructed into one or more bio structures in the generic block layer.

5. I/O Scheduler Layer: Responsible for scheduling, inserting, buffering, sorting, merging, and distributing block I/O operations from the generic block layer (elevator scheduling algorithm) to make disk operations more efficient.

6. Block Device Driver Layer: At the bottom of the block system architecture, the block device driver accesses the hardware for data based on the sorted requests.

user:
    open     read    write    close
-------------------(io request)-----------------------------------
kernel  |Intermediate Layer: (block_device)
        |    Converts user I/O requests into bio (block I/O) structures.
        |    bios that are contiguous on the disk are merged into a request, and this request
        |    is placed in a queue in the kernel.
        |---------------------------------------------------------
        |driver: gendisk
        |    1. Allocate object
        |    2. Initialize object
        |    3. Initialize a queue  head----request(read)----request(write)---...
        |    //4. Initialize hard disk device
        |    5. Register, unregister
------------------------------------------------------------------
hardware :   Allocated memory (simulating a real device) (1M)
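The merging and sorting of I/O described above can be illustrated with a user-space sketch. It is a deliberate simplification: `struct io_range` and `merge_ranges` are hypothetical names, and the kernel's real merge logic also considers the transfer direction, size limits, and other constraints that are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one I/O: start sector and sector count. */
struct io_range {
    uint64_t start;
    uint32_t count;
};

static int cmp_start(const void *a, const void *b)
{
    const struct io_range *x = a, *y = b;
    return (x->start > y->start) - (x->start < y->start);
}

/* Sorts the ranges by start sector, then merges every range that begins
 * exactly where the previous one ends. The merged ranges are stored at
 * the front of the array; the return value is how many remain. */
static size_t merge_ranges(struct io_range *r, size_t n)
{
    size_t out = 0;

    assert(r != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(r, n, sizeof(*r), cmp_start);
    for (size_t i = 1; i < n; i++) {
        if (r[out].start + r[out].count == r[i].start)
            r[out].count += r[i].count; /* contiguous: extend */
        else
            r[++out] = r[i];            /* gap: start a new range */
    }
    return out + 1;
}
```

Given the ranges (16, 8), (0, 8), (8, 8), and (40, 8), the sketch leaves two ranges: sectors 0 to 23 as one merged range, and the separate range starting at sector 40.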

5.API of Block Device Drivers

1. Structure object of gendisk
    struct gendisk {
        int major;        // Major device number of the block device
        int first_minor;  // Starting minor device number
        int minors;       // Number of minor numbers, which limits the number of partitions
        char disk_name[DISK_NAME_LEN]; // Name of the disk
        struct disk_part_tbl *part_tbl;
        // Pointer to the partition table of the disk
        struct hd_struct part0;
        // Description of the part0 partition
        const struct block_device_operations *fops;
        // Structure of block device operation methods
        struct request_queue *queue;
        // Queue (important)
        void *private_data;
        // Private data
    };

    // Structure of a partition
    struct hd_struct {
        sector_t start_sect; // Starting sector number
        sector_t nr_sects;   // Number of sectors
        int partno;          // Partition number
    };

    // Structure of block device operation methods
    struct block_device_operations {
        int (*open) (struct block_device *, fmode_t);
        void (*release) (struct gendisk *, fmode_t);
        int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
        int (*getgeo)(struct block_device *, struct hd_geometry *);
        // Reports the number of heads, cylinders, and sectors of the disk through hd_geometry
    };


2. Initialization of structure object
    struct gendisk *mydisk;

    struct gendisk *alloc_disk(int minors)
    //void put_disk(struct gendisk *disk)    // Drop a reference to the gendisk
    Function: Allocate memory for gendisk and complete necessary initialization
    Parameters:
        @minors: Number of minor numbers reserved for the disk (whole disk plus partitions)
    Return value: Returns the starting address of the allocated memory on success, NULL on failure

    int register_blkdev(unsigned int major, const char *name)
    //void unregister_blkdev(unsigned int major, const char *name)
    Function: Request the major device number for the device driver
    Parameters:
        @major : 0  : Allocate automatically
                 >0 : Statically specify
        @name  : Name shown by cat /proc/devices
    Return value:
        major=0 : Returns the major device number on success, error code on failure
        major>0 : Returns 0 on success, error code on failure

    void set_capacity(struct gendisk *disk, sector_t size)
    Function: Set the capacity of the disk
    Parameters:
        @size: Capacity in 512-byte sectors

    struct request_queue *blk_mq_init_sq_queue(struct blk_mq_tag_set *set,
            const struct blk_mq_ops *ops, unsigned int queue_depth,
            unsigned int set_flags)
    //void blk_cleanup_queue(struct request_queue *q)
    Function: Helper that sets up a request queue from the given mq ops, queue depth, and flags
    Parameters:
        @set        : The tag set object to initialize. It is used by the upper layer and holds the number of hardware queues, the queue operation method structure, flags, etc.
        @ops        : The operation method structure to be placed in the tag set
        @queue_depth: The queue depth to be placed in the tag set
        @set_flags  : The processing flags to be placed in the tag set, such as BLK_MQ_F_SHOULD_MERGE, BLK_MQ_F_BLOCKING, etc.
    Return value: Returns the queue pointer on success, error code pointer on failure

3. Registering and Unregistering
    void add_disk(struct gendisk *disk)
    // Register
    void del_gendisk(struct gendisk *disk)
    // Unregister
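An "error code pointer" is a pointer value that encodes a negative error number, which is why the driver checks the queue with IS_ERR() and extracts the code with PTR_ERR(). The following user-space sketch imitates that convention so it can be tried outside the kernel. The `sketch_` names are hypothetical and this is not the kernel's own header code; the 4095 limit is assumed to match the kernel's MAX_ERRNO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the kernel's MAX_ERRNO. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative error number as a pointer value. */
static void *sketch_err_ptr(long error)
{
    assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* A pointer is an "error pointer" when it falls in the top
 * SKETCH_MAX_ERRNO values of the address space. */
static bool sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Decode the error number from an error pointer. */
static long sketch_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}
```

Encoding -12 (the value of -ENOMEM on Linux) and decoding it again returns -12, while the address of an ordinary variable and a null pointer are not reported as errors. This is also why a failed `alloc_disk` is tested against NULL but a failed `blk_mq_init_sq_queue` is tested with IS_ERR().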

6.Examples of Block Device Drivers

#include <linux/blk-mq.h>
#include <linux/blkdev.h>
#include <linux/genhd.h>
#include <linux/hdreg.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/vmalloc.h>

#define BLKSIZE (1 * 1024 * 1024)
#define BLKNAME "mydisk"

struct gendisk* mydisk;
int major;
struct request_queue* queue;
struct blk_mq_tag_set tag;
char *dev_addr;

blk_status_t myqueue_handle(struct blk_mq_hw_ctx* ctx,
    const struct blk_mq_queue_data* hd)
{
    // Handle queue read/write.
    // Placeholder only: a working handler must start and complete hd->rq
    // (see the final example in section 8), otherwise I/O to the disk never finishes.
    return BLK_STS_OK;
}

const struct blk_mq_ops qops = {
    .queue_rq = myqueue_handle,
};

int mydisk_open(struct block_device* blkdev, fmode_t mode)
{
    printk("%s:%d\n", __func__, __LINE__);
    return 0;
}

int mydisk_getgeo(struct block_device* blkdev, struct hd_geometry* hd)
{
    printk("%s:%d\n", __func__, __LINE__);
    hd->heads = 4;      // Heads
    hd->cylinders = 16; // Cylinders
    hd->sectors = (BLKSIZE / hd->heads / hd->cylinders / 512); // Sectors
    return 0;
}

void mydisk_close(struct gendisk* disk, fmode_t mode)
{
    printk("%s:%d\n", __func__, __LINE__);
}

struct block_device_operations fops = {
    .open = mydisk_open,
    .getgeo = mydisk_getgeo,
    .release = mydisk_close,
};

static int __init mydisk_init(void)
{
    int ret;

    // 1. Allocate object
    mydisk = alloc_disk(4);
    if (mydisk == NULL) {
        printk("alloc gendisk memory error\n");
        return -ENOMEM;
    }
    // 2. Request the device number and the queue
    major = register_blkdev(0, BLKNAME);
    if (major <= 0) {
        printk("get blk device number error\n");
        ret = -EAGAIN;
        goto err_put_disk;
    }
    queue = blk_mq_init_sq_queue(&tag, &qops, 2, BLK_MQ_F_SHOULD_MERGE);
    if (IS_ERR(queue)) {
        printk("request mq error\n");
        ret = PTR_ERR(queue);
        goto err_unregister;
    }
    // 3. Allocate memory to use as hard disk
    dev_addr = vmalloc(BLKSIZE);
    if (dev_addr == NULL) {
        printk("alloc memory error\n");
        ret = -ENOMEM;
        goto err_cleanup_queue;
    }
    // 4. Initialize object
    mydisk->major = major;
    mydisk->first_minor = 0;
    strcpy(mydisk->disk_name, BLKNAME);
    set_capacity(mydisk, BLKSIZE >> 9);
    mydisk->fops = &fops;
    mydisk->queue = queue;
    // 5. Register
    add_disk(mydisk);
    return 0;

    // Undo the completed steps in reverse order on failure
err_cleanup_queue:
    blk_cleanup_queue(queue);
    blk_mq_free_tag_set(&tag);
err_unregister:
    unregister_blkdev(major, BLKNAME);
err_put_disk:
    put_disk(mydisk);
    return ret;
}

static void __exit mydisk_exit(void)
{
    del_gendisk(mydisk);
    blk_cleanup_queue(queue);
    blk_mq_free_tag_set(&tag);
    vfree(dev_addr);
    unregister_blkdev(major, BLKNAME);
    put_disk(mydisk);
}

module_init(mydisk_init);
module_exit(mydisk_exit);
MODULE_LICENSE("GPL");

7.Structures, Relationships, and Functions Related to Queue Processing

struct request_queue {
    /* Doubly linked list that holds all I/O requests added to the queue */
    struct list_head queue_head;
    struct list_head requeue_list; // Requests waiting to be requeued
    spinlock_t requeue_lock;       // Spinlock protecting requeue_list
    unsigned long nr_requests;     /* Maximum number of requests */
    unsigned long queue_flags;     /* Current request queue status, e.g. QUEUE_FLAG_STOPPED */
    …
};

struct request {
    struct list_head queuelist; /* List element that links the request into a queue */
    struct request_queue *q;    /* Pointer to the request queue storing the current request */
    unsigned int __data_len;    /* Total amount of data required for the current request */
    sector_t __sector;          /* Starting sector of the current request */
    struct bio *bio;            /* First bio of the request */
    struct bio *biotail;        /* Last bio of the request (tail of the bio linked list) */
    …
};

Typically, a request can contain multiple bios, and one bio corresponds to one I/O request. The struct bio layout below follows older kernels; newer kernels keep the current index and remaining size in bi_iter and the operation flags in bi_opf.

struct bio {
    struct bio *bi_next;        /* Pointer to the next bio object */
    unsigned long bi_flags;     /* Status, command, etc. */
    unsigned long bi_rw;        /* Indicates READ/WRITE */
    struct block_device *bi_bdev; /* Pointer to the block device associated with the request */
    unsigned short bi_vcnt;     /* Number of elements in the bi_io_vec array */
    unsigned short bi_idx;      /* Index of the element currently being processed in the bi_io_vec array */
    unsigned int bi_size;       /* Total amount of data to transfer, in bytes (multiple of the sector size) */
    struct bio_vec *bi_io_vec;  /* Points to an array of I/O vectors; each element corresponds to one physical page */
};

struct bio_vec {
    struct page *bv_page;   // Points to the struct page of the page used for the data transfer
    unsigned int bv_len;    // Size of the data to be transferred
    unsigned int bv_offset; // Offset of the data within the page
};

Relationship between request_queue, request, and bio
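The relationship can be sketched in user space: a request points to a linked list of bios, and each bio points to an array of segments. The `sk_` types below are hypothetical, heavily reduced stand-ins for `struct request`, `struct bio`, and `struct bio_vec`, not the kernel definitions. Walking the structure to total the bytes shows how the three levels nest.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for struct bio_vec: one segment within a page. */
struct sk_bio_vec {
    unsigned int len;    /* bytes in this segment */
    unsigned int offset; /* offset within the page */
};

/* Reduced stand-in for struct bio: an array of segments plus a link. */
struct sk_bio {
    struct sk_bio *next;
    const struct sk_bio_vec *vecs;
    unsigned short vcnt;
};

/* Reduced stand-in for struct request: start sector plus a bio list. */
struct sk_request {
    unsigned long long sector;
    struct sk_bio *bio;
};

/* Total bytes described by one bio. */
static unsigned int sk_bio_bytes(const struct sk_bio *b)
{
    unsigned int total = 0;

    assert(b != NULL);
    for (unsigned short i = 0; i < b->vcnt; i++)
        total += b->vecs[i].len;
    return total;
}

/* Total bytes described by a request: the sum over all of its bios. */
static unsigned int sk_request_bytes(const struct sk_request *rq)
{
    unsigned int total = 0;

    assert(rq != NULL);
    for (const struct sk_bio *b = rq->bio; b != NULL; b = b->next)
        total += sk_bio_bytes(b);
    return total;
}
```

A request holding one bio with segments of 4096 and 512 bytes and a second bio with a single 1024-byte segment describes 5632 bytes in total, which corresponds to the role of `__data_len` in the real `struct request`.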


Functions Related to Queue Processing

blk_mq_start_request(rq); // Start processing the request
blk_mq_end_request(rq, BLK_STS_OK); // Finish processing the request
rq_for_each_segment(bvec, rq, iter) // Iterate over every bio_vec segment in the request
void* b_buf = page_address(bvec.bv_page) + bvec.bv_offset; // Convert the page to a kernel virtual (linear) address and add the offset within the page
rq_data_dir(rq) // Get the direction of this read/write from the request: WRITE 1, READ 0
dev_addr + (rq->__sector * 512) // Address in the simulated disk that corresponds to the starting sector
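The copy loop that these calls make possible can be tried in user space before writing the driver. The sketch below assumes a memory-backed disk like `dev_addr`; `struct sk_segment` and `sk_transfer` are hypothetical names standing in for the `bio_vec` segments and the body of the request handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one bio_vec segment: buffer and length. */
struct sk_segment {
    void *buf;
    size_t len;
};

/* Copies each segment to (is_write != 0) or from (is_write == 0) the
 * memory-backed disk, starting at byte offset pos. The copy is clamped
 * at the end of the disk. Returns the number of bytes transferred. */
static size_t sk_transfer(char *disk, size_t disk_size, size_t pos,
                          const struct sk_segment *segs, size_t nsegs,
                          int is_write)
{
    size_t done = 0;

    assert(disk != NULL && segs != NULL);
    for (size_t i = 0; i < nsegs && pos < disk_size; i++) {
        size_t len = segs[i].len;

        if (len > disk_size - pos) /* prevent running past the disk */
            len = disk_size - pos;
        if (is_write)
            memcpy(disk + pos, segs[i].buf, len);
        else
            memcpy(segs[i].buf, disk + pos, len);
        pos += len;
        done += len;
    }
    return done;
}
```

Checking `pos < disk_size` before subtracting keeps the clamp from wrapping around when a transfer starts at or beyond the end of the disk; in that case the sketch transfers nothing.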

8.Final Example of Block Device Drivers

#include <linux/blk-mq.h>
#include <linux/blkdev.h>
#include <linux/genhd.h>
#include <linux/hdreg.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/vmalloc.h>

#define BLKSIZE (1 * 1024 * 1024)
#define BLKNAME "mydisk"

struct gendisk* mydisk;
int major;
struct request_queue* queue;
struct blk_mq_tag_set tag;
char *dev_addr;

blk_status_t myqueue_handle(struct blk_mq_hw_ctx* hctx,
    const struct blk_mq_queue_data* bd)
{
    blk_status_t status = BLK_STS_OK;
    struct request *rq = bd->rq;
    struct bio_vec bvec;
    struct req_iterator iter;
    loff_t pos = blk_rq_pos(rq) << SECTOR_SHIFT;

    // Start processing the request
    blk_mq_start_request(rq);
    // Process every segment of the request
    rq_for_each_segment(bvec, rq, iter)
    {
        unsigned long b_len = bvec.bv_len;
        void* b_buf = page_address(bvec.bv_page) + bvec.bv_offset;
        // Reject accesses that start at or beyond the end of the disk
        if (pos >= BLKSIZE) {
            status = BLK_STS_IOERR;
            goto out;
        }
        // Prevent overflow
        if ((pos + b_len) > BLKSIZE)
            b_len = (unsigned long)(BLKSIZE - pos);
        if (rq_data_dir(rq)) // WRITE
            memcpy(dev_addr + pos, b_buf, b_len);
        else                 // READ
            memcpy(b_buf, dev_addr + pos, b_len);
        pos += b_len;
    }
out:
    // Finish processing the request and report its status
    blk_mq_end_request(rq, status);
    return BLK_STS_OK; // The request was completed above, so always return OK
}

const struct blk_mq_ops qops = {
    .queue_rq = myqueue_handle,
};

int mydisk_open(struct block_device* blkdev, fmode_t mode)
{
    printk("%s:%d\n", __func__, __LINE__);
    return 0;
}

int mydisk_getgeo(struct block_device* blkdev, struct hd_geometry* hd)
{
    printk("%s:%d\n", __func__, __LINE__);
    hd->heads = 4;      // Heads
    hd->cylinders = 16; // Cylinders
    hd->sectors = (BLKSIZE / hd->heads / hd->cylinders / 512); // Sectors
    return 0;
}

void mydisk_close(struct gendisk* disk, fmode_t mode)
{
    printk("%s:%d\n", __func__, __LINE__);
}

struct block_device_operations fops = {
    .open = mydisk_open,
    .getgeo = mydisk_getgeo,
    .release = mydisk_close,
};

static int __init mydisk_init(void)
{
    int ret;

    // 1. Allocate object
    mydisk = alloc_disk(4);
    if (mydisk == NULL) {
        printk("alloc gendisk memory error\n");
        return -ENOMEM;
    }
    // 2. Request the device number and the queue
    major = register_blkdev(0, BLKNAME);
    if (major <= 0) {
        printk("get blk device number error\n");
        ret = -EAGAIN;
        goto err_put_disk;
    }
    queue = blk_mq_init_sq_queue(&tag, &qops, 2, BLK_MQ_F_SHOULD_MERGE);
    if (IS_ERR(queue)) {
        printk("request mq error\n");
        ret = PTR_ERR(queue);
        goto err_unregister;
    }
    // 3. Allocate memory to use as hard disk
    dev_addr = vmalloc(BLKSIZE);
    if (dev_addr == NULL) {
        printk("alloc memory error\n");
        ret = -ENOMEM;
        goto err_cleanup_queue;
    }
    // 4. Initialize object
    mydisk->major = major;
    mydisk->first_minor = 0;
    strcpy(mydisk->disk_name, BLKNAME);
    set_capacity(mydisk, BLKSIZE >> 9);
    mydisk->fops = &fops;
    mydisk->queue = queue;
    // 5. Register
    add_disk(mydisk);
    return 0;

    // Undo the completed steps in reverse order on failure
err_cleanup_queue:
    blk_cleanup_queue(queue);
    blk_mq_free_tag_set(&tag);
err_unregister:
    unregister_blkdev(major, BLKNAME);
err_put_disk:
    put_disk(mydisk);
    return ret;
}

static void __exit mydisk_exit(void)
{
    del_gendisk(mydisk);
    blk_cleanup_queue(queue);
    blk_mq_free_tag_set(&tag);
    vfree(dev_addr);
    unregister_blkdev(major, BLKNAME);
    put_disk(mydisk);
}

module_init(mydisk_init);
module_exit(mydisk_exit);
MODULE_LICENSE("GPL");

9.Testing Block Device Drivers

1. Install the block device driver
    sudo insmod mydisk.ko

2. View the device node
    ls -l /dev/mydisk
    brw-rw---- 1 root disk 252, 0 Jun 22 14:14 /dev/mydisk

3. View disk information
    sudo fdisk -l
    Disk /dev/mydisk: 1 MiB, 1048576 bytes, 2048 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes

4. Partition
    sudo fdisk /dev/mydisk
    m   get help
    d   delete a partition
    p   print the partition table
    n   add a new partition
    w   write the partition table to disk and exit
    q   quit without saving changes
    After the table is written, a /dev/mydisk1 node is created under /dev.

5. Format the partition
    sudo mkfs.ext2 /dev/mydisk1

6. Mount the disk to a directory
    sudo mount -t ext2 /dev/mydisk1 ~/udisk

7. Store files on the disk (from inside the mounted directory)
    cd ~/udisk
    sudo cp ~/work/day10/02mydisk/mydisk.c .

8. Unmount
    cd
    sudo umount udisk

9. Read the data from the block device into the ram.bin file
    sudo cat /dev/mydisk1 > ram.bin

10. Remount to check
    sudo mount -o loop ram.bin ~/udisk
    Go to the udisk directory and check whether the files copied earlier are present. If they are, the driver works.
