Crash Recovery for User-Space Block Drivers

By Jonathan Corbet, August 29, 2022. DeepL-assisted translation. https://lwn.net/Articles/906097/

The 6.0 merge window brought a new user-space block driver mechanism into the kernel. This subsystem, known as “ublk,” uses io_uring to communicate with user-space drivers and achieves impressive performance as a result. There are many interesting things that could be done with ublk, but its current use cases are not entirely clear. The recently posted ublk crash-recovery mechanism makes it clear that real use cases do exist.

If an in-kernel block driver crashes, it is likely to bring the entire kernel down with it. Moving such drivers to user space can, in theory, yield a more robust system, since the kernel can keep running normally after a driver crash. With ublk as found in the 6.0 kernel, though, a driver crash causes the associated device to disappear and all pending I/O requests to fail. From the user's perspective, that outcome is nearly indistinguishable from a complete system crash. As Ziyang Zhang, the author of the patch set, pointed out in the email describing it, some users may find this result disappointing:

This is not a good choice in practice, as users do not want aborted requests, I/O errors, or released devices. They may wish for a recovery mechanism that prevents request abortion and I/O errors. In short, users just want everything to remain unchanged.

The goal of this patch set is to fulfill this wish.

To support crash recovery, a user-space block driver should set the new UBLK_F_USER_RECOVERY flag when configuring its ublk device. There is also an optional flag, UBLK_F_USER_RECOVERY_REISSUE, that controls how recovery is done; it is described below. Beyond setting these flags, the driver's normal operation does not need to change.

If a recoverable ublk driver crashes, the kernel stops the associated I/O request queue so that no further requests are added, then patiently waits for a new driver process to appear. That process may never arrive, but since the driver claimed to be recoverable, the kernel expects it to come back. There is no mechanism to notify anyone when a driver crashes; user space must notice the unexpected termination on its own and start a new driver process.
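Since the kernel sends no notification, something in user space has to watch the driver process. The following is a minimal sketch of such a watcher, not part of ublk itself: the `supervise` function, its parameters, and the stand-in commands are all hypothetical, and a real recovery process would go on to issue the ublk recovery commands rather than simply exit.

```python
import subprocess
import sys


def supervise(start_cmd, recover_cmd, max_restarts=3):
    """Run a driver process; if it exits abnormally, launch a replacement.

    Returns the number of replacement processes that had to be started.
    All names here are illustrative; nothing in this function talks to ublk.
    """
    restarts = 0
    proc = subprocess.Popen(start_cmd)
    while True:
        status = proc.wait()  # blocks until the current driver process exits
        if status == 0:
            return restarts  # clean shutdown, nothing to recover
        if restarts >= max_restarts:
            raise RuntimeError(f"driver kept crashing (last status {status})")
        restarts += 1
        # The exit status is the only crash signal we get, so react to it
        # by starting the process that is supposed to perform the recovery.
        proc = subprocess.Popen(recover_cmd)


# Stand-in commands: the first "driver" dies with status 1, the second is fine.
crashing = [sys.executable, "-c", "raise SystemExit(1)"]
healthy = [sys.executable, "-c", "pass"]
print(supervise(crashing, healthy))  # one replacement was needed
```

A service manager with restart-on-failure support could play the same role; the point is only that detecting the crash is the job of user space.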

The new driver process connects to the ublk subsystem and issues a START_USER_RECOVERY command. That causes ublk to verify that the old driver is truly gone and to clean up after it, which includes dealing with all outstanding I/O requests. Any requests that showed up after the crash and were not accepted by the old driver can simply be queued to the new one. Requests that had already been accepted need more care, though, since the kernel does not know whether they were actually carried out.

Evidently, some ublk back ends cannot handle duplicated writes correctly, so it must be possible to avoid redoing them. That is the purpose of the UBLK_F_USER_RECOVERY_REISSUE flag: if it is set, all outstanding requests are reissued. Otherwise, requests that were accepted by the driver but never completed fail with an error status. That applies even to read requests, which one would normally expect to be harmless to repeat.
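The rules for what happens to an outstanding request can be summarized as a small decision function. This is a toy model of the behavior described above, not kernel code: `Request`, `Fate`, and `fate_after_crash` are hypothetical names, and the flag bit values are assumed to match `linux/ublk_cmd.h` and should be checked against the header on a real system.

```python
from dataclasses import dataclass
from enum import Enum

# Bit values assumed to mirror linux/ublk_cmd.h; verify against your header.
UBLK_F_USER_RECOVERY = 1 << 3
UBLK_F_USER_RECOVERY_REISSUE = 1 << 4


class Fate(Enum):
    REQUEUE = "requeue"  # request is handed to the new driver process
    FAIL = "fail"        # request completes with an error status


@dataclass
class Request:
    tag: int
    accepted: bool  # True if the old driver had already picked it up


def fate_after_crash(req, flags):
    """Decide what happens to an outstanding request once the driver dies."""
    if not flags & UBLK_F_USER_RECOVERY:
        return Fate.FAIL      # 6.0 behavior: all pending I/O fails
    if not req.accepted:
        return Fate.REQUEUE   # never seen by the old driver, safe to queue
    if flags & UBLK_F_USER_RECOVERY_REISSUE:
        return Fate.REQUEUE   # driver declared that replaying is acceptable
    return Fate.FAIL          # may have executed already, so do not repeat


in_flight = Request(tag=7, accepted=True)
print(fate_after_crash(in_flight, UBLK_F_USER_RECOVERY))
print(fate_after_crash(in_flight, UBLK_F_USER_RECOVERY | UBLK_F_USER_RECOVERY_REISSUE))
```

Note that the model does not look at whether a request is a read or a write; without the reissue flag, an accepted but uncompleted read fails just as a write would.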

After starting the recovery process, the new driver should reconnect to each device and issue a new FETCH_REQ command on each one to get I/O requests flowing again. Once all devices are set up, the END_USER_RECOVERY command restarts the request queue, and everything picks up where it left off. With a bit of luck, users may never notice that the block driver crashed and was replaced.
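The ordering of the recovery commands can be illustrated with a toy state machine. This sketch only models the sequence described in this article; the class, its method names, the state names, and the device names are all hypothetical, and it does not issue real ublk commands.

```python
class RecoveryError(Exception):
    """Raised when a recovery step is attempted out of order."""


class ToyUblk:
    """Toy model of the START_USER_RECOVERY / FETCH_REQ / END_USER_RECOVERY order."""

    def __init__(self, devices):
        self.devices = set(devices)
        self.fetching = set(devices)  # devices with FETCH_REQ outstanding
        self.state = "LIVE"

    def driver_crashed(self):
        # The request queue is stopped and the kernel waits for a new driver.
        self.state = "QUIESCED"
        self.fetching.clear()

    def start_user_recovery(self):
        if self.state != "QUIESCED":
            raise RecoveryError("old driver is not gone; nothing to recover")
        self.state = "RECOVERING"

    def fetch_req(self, device):
        if self.state != "RECOVERING":
            raise RecoveryError("FETCH_REQ is only re-issued during recovery")
        if device not in self.devices:
            raise RecoveryError(f"unknown device {device!r}")
        self.fetching.add(device)

    def end_user_recovery(self):
        if self.state != "RECOVERING" or self.fetching != self.devices:
            raise RecoveryError("not every device has been set up yet")
        self.state = "LIVE"  # the request queue is restarted


ublk = ToyUblk(["ublkb0", "ublkb1"])
ublk.driver_crashed()
ublk.start_user_recovery()
for name in ["ublkb0", "ublkb1"]:
    ublk.fetch_req(name)
ublk.end_user_recovery()
print(ublk.state)  # LIVE
```

In this model, ending recovery before every device has been set up is an error, which mirrors the requirement that all devices be ready before the queue restarts.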

The ublk subsystem came from Red Hat and included only a simple file-backed driver, essentially a reimplementation of the loop driver, as an example. At the time, various use cases for the subsystem were vaguely mentioned, but it was unclear how (or whether) it would be used beyond that demo. It looked like an interesting solution in search of a problem.

This recovery mechanism arrived just weeks after ublk was merged, and it came from a different company (Alibaba); that suggests that more complex use cases exist and that ublk is indeed in active use. Recovery mechanisms like this tend to be developed only after a real need has been observed. Hopefully, these real-world use cases will soon be revealed (in the form of code) so that others around the world can benefit from this work.

That information would be valuable in another way: it might offer clues about where Linux development is headed in the coming years. A great deal of work is currently going into blurring the boundary between tasks handled in the kernel and tasks handled in user space, and it shows no sign of slowing down. More mechanisms like ublk can be expected in the future. It would be fascinating to know where these changes will take us, and I hope the answer is not a world where all development work shifts to proprietary user-space drivers.

End of article. LWN articles follow the CC BY-SA 4.0 license.
