What Happens When an Infinite Loop (“Dead Loop”) Occurs in Linux Driver Code?

When a “dead loop” (while(1); or for(;;);) ends up in Linux kernel code, what happens depends on three key factors:

On which CPU the code runs and at what priority (interrupt context or process context);
Whether kernel preemption is enabled;
Whether the CPU is responsible for maintaining system heartbeat (tick, RCU, scheduling clock, IPI, etc.).

Let’s break down the scenarios step by step.

1. Process Context (Most Common Driver Implementation)

Typical form, written directly in a system-call path such as xxx_read()/ioctl()/mmap():
```
while (1)
	cpu_relax();
```
At this point the code runs either as a kernel thread or as a user process executing in kernel mode. It is schedulable, and whether it can be preempted depends on the kernel's preemption model.
Preemption Disabled (CONFIG_PREEMPT_NONE / Older Kernel Versions)

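The looping thread immediately occupies 100% of the CPU and keeps it until the machine is rebooted. kill -9 does not help: a signal is only acted on when the task returns to user space (or checks signal_pending() itself), and the loop never does either.
The timer tick still fires and marks the task as needing a reschedule, but with kernel preemption disabled nothing acts on that request while the task stays in kernel mode, so other threads (including kworker, ksoftirqd, sshd) are completely starved on that CPU.
If the system has only 1 CPU, the entire machine hangs; on a multi-core system only that core is lost while the others continue to work.
If the mouse, keyboard, or network packets rely on the stuck core to process softirqs, it will also “appear” as if the entire machine has crashed.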
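
The starvation described above can be illustrated with a small user-space model (this is not kernel code). It assumes one CPU and a kernel without preemption; the names cpu_model, model_tick, model_cond_resched and model_run_loop are hypothetical, and a bounded loop stands in for while(1):

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only: one CPU, one looping task, no kernel preemption. */
struct cpu_model {
	bool need_resched;     /* stands in for the need-resched flag */
	unsigned long yields;  /* how often the loop gave up the CPU */
};

/* In this model the timer tick can only request a reschedule, not force one. */
static void model_tick(struct cpu_model *cpu)
{
	cpu->need_resched = true;
}

/* Like cond_resched(): yield only if a reschedule has been requested. */
static bool model_cond_resched(struct cpu_model *cpu)
{
	if (!cpu->need_resched)
		return false;
	cpu->need_resched = false;
	cpu->yields++;
	return true;
}

/* Bounded stand-in for the loop; a tick fires every iters_per_tick rounds. */
static unsigned long model_run_loop(struct cpu_model *cpu,
				    unsigned long iterations,
				    unsigned long iters_per_tick,
				    bool with_cond_resched)
{
	assert(iters_per_tick > 0);
	for (unsigned long i = 1; i <= iterations; i++) {
		if (i % iters_per_tick == 0)
			model_tick(cpu);
		if (with_cond_resched)
			model_cond_resched(cpu);
	}
	return cpu->yields;
}
```

Calling model_run_loop(&cpu, 1000, 100, false) leaves the request pending and returns 0 yields, which is the starved case; with the last argument set to true the same loop yields once per tick.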

Preemption Enabled (CONFIG_PREEMPT / PREEMPT_RT Kernel)

If there are no explicit scheduling points inside the loop (cond_resched(), schedule()), the task still occupies the CPU “involuntarily”. CONFIG_PREEMPT_VOLUNTARY is not enough here: it only turns might_sleep() into a scheduling point, so a loop that contains none behaves as in the non-preemptible case.
However, the preemption count (preempt_count) is 0, so every clock interrupt (HZ times per second) runs scheduler_tick(); once the current task has used up its time slice, the tick sets the need-resched flag and the task is preempted on return from the interrupt.
Result:
- A single-core system will not die completely, but sshd/bash and other interactive processes may respond noticeably more slowly, because the looping task is always runnable and consumes every time slice it is given.
- The load average in /proc/loadavg climbs toward 1.0 for one looping task (or toward N if all N cores are looping). It is a damped average, so it rises over a minute or so rather than instantly.
If cond_resched() is added inside the loop, the task yields whenever a reschedule is pending, and the system barely notices the loop, but the loop itself never ends.
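
How quickly the load average reacts can be checked with a short fixed-point calculation. The sketch below is a simplified user-space model of the 1-minute average, updated once per 5-second sample. The constants FSHIFT, FIXED_1 and EXP_1 follow the values used in the kernel's load-average header, while model_calc_load and model_load_after are hypothetical names and the rounding is simplified to truncation:

```c
#include <assert.h>

#define FSHIFT   11               /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)  /* 1.0 in fixed point (2048) */
#define EXP_1    1884UL           /* exp(-5s/60s) in fixed point */

/* One 5-second update of the 1-minute average (simplified: truncating). */
static unsigned long model_calc_load(unsigned long load, unsigned long active)
{
	return (load * EXP_1 + active * (FIXED_1 - EXP_1)) >> FSHIFT;
}

/* Fixed-point load after `samples` updates with nr_running busy tasks. */
static unsigned long model_load_after(unsigned int samples,
				      unsigned long nr_running)
{
	unsigned long load = 0;

	assert(nr_running < 1024); /* keeps the math within 32 bits */
	for (unsigned int i = 0; i < samples; i++)
		load = model_calc_load(load, nr_running * FIXED_1);
	return load;
}
```

Dividing the result by FIXED_1 gives the familiar decimal value. With one looping task, model_load_after(12, 1) (one minute of samples) comes out at roughly 0.63 rather than 1.0, and only after several minutes does the value settle just below 1.0.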

How to “self-rescue”

Multi-core machines: From another core, run echo l > /proc/sysrq-trigger to print the backtrace of all active CPUs, which immediately shows the PC of the dead loop.
Single-core machines: You can only rely on the NMI watchdog (see Section 4) to reboot automatically, or trigger SysRq manually over the serial console.

2. Interrupt Context (hardirq / softirq / tasklet)

Writing while(1); in irqreturn_t irq_handler()

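The handler runs with interrupts disabled on the local CPU, and this interrupt line stays masked or unacknowledged until the handler returns, so that CPU stops servicing every other interrupt as well.
If the stuck CPU is the one that drives the clock interrupt (or the machine has only one CPU), the system loses its tick: the scheduler, RCU, and jiffies all freeze, and the entire machine dies instantly.
If it is a network card interrupt on a multi-core machine, the network card “appears” to be disconnected and the CPU that took the interrupt is lost, while services on the other CPUs can continue, at least until they need that CPU (for example for an IPI or an RCU grace period).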

Dead loop in tasklet / timer callback

Tasklets run in softirq context, where preemption is disabled, so the effect is the same: the CPU is permanently trapped.
Since softirq code cannot sleep, schedule() cannot be called either; the only recovery method is the lockup watchdog described in Section 4.

3. Dead Loop Inside Spin Lock Critical Section

spin_lock(&lock); while (1);

While the lock is held, preemption is disabled (spin_lock() increments preempt_count by 1).
Any subsequent attempt to acquire the same lock (including from interrupt paths) will busy-wait forever, so the stall cascades from CPU to CPU and the system freezes quickly.
This is one of the cheapest ways for a driver to bring down the whole system.
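
Why the waiters never make progress can be shown with a user-space stand-in for a spinlock built on C11 atomics. This is a sketch, not the kernel's spinlock_t: model_spinlock, model_spin_lock_bounded and model_spin_unlock are hypothetical names, and the spin is bounded only so that the example terminates:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for a spinlock; not the kernel's spinlock_t. */
struct model_spinlock {
	atomic_flag locked;
};

/* Try to take the lock, giving up after max_spins attempts. */
static bool model_spin_lock_bounded(struct model_spinlock *l,
				    unsigned long max_spins)
{
	assert(max_spins > 0);
	for (unsigned long i = 0; i < max_spins; i++) {
		/* test-and-set returns the old value: false means we got it */
		if (!atomic_flag_test_and_set_explicit(&l->locked,
						       memory_order_acquire))
			return true;
	}
	/* The holder never released it; an unbounded spin would still be here. */
	return false;
}

/* Release the lock so that the next test-and-set can succeed. */
static void model_spin_unlock(struct model_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Starting from struct model_spinlock l = { ATOMIC_FLAG_INIT };, the first acquisition succeeds and every later one fails until model_spin_unlock() runs. In a real driver the remedy is not a bounded lock but a short critical section that contains no unbounded loop.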

4. NMI Watchdog — The Last Lifeline of the Kernel

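Many kernels are built with CONFIG_LOCKUP_DETECTOR (the x86 variant is known as the NMI watchdog, nmi_watchdog; on arm64 it appears as the hard-lockup detector). It has two parts:

Soft lockup: each CPU keeps a per-CPU timestamp, watchdog_touch_ts, that is refreshed only when a high-priority watchdog task gets to run on that CPU. A per-CPU timer interrupt checks the timestamp; if it has not been refreshed for more than 20 s (twice watchdog_thresh, which defaults to 10 s), the kernel prints the stack trace of the stuck task.
Hard lockup: if the CPU no longer takes even timer interrupts for about watchdog_thresh seconds, an NMI (non-maskable interrupt), or on some configurations another CPU that is still alive, captures the registers of the stuck CPU and prints its stack trace.
Unless configured otherwise, both detectors only report. With softlockup_panic / hardlockup_panic set they call panic(), and with a panic timeout configured the machine then reboots. In that setup even a single-core machine can recover automatically, leaving a message such as

BUG: soft lockup - CPU#0 stuck for 22s! [foo/1234]

in the log. If kdump is set up, the resulting vmcore can be analyzed post-mortem with the crash tool.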
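
The 20-second figure is twice the watchdog_thresh setting (default 10 s). A minimal user-space model of that check is sketched below; times are in seconds, and model_softlockup_thresh and model_is_softlockup are hypothetical names that only mirror the arithmetic, not the kernel's actual implementation:

```c
#include <assert.h>

/* The soft-lockup threshold is twice watchdog_thresh (both in seconds). */
static unsigned long model_softlockup_thresh(unsigned long watchdog_thresh)
{
	return watchdog_thresh * 2;
}

/* Returns how long the CPU has been stuck, or 0 if below the threshold. */
static unsigned long model_is_softlockup(unsigned long touch_ts,
					 unsigned long now,
					 unsigned long watchdog_thresh)
{
	assert(now >= touch_ts);
	if (watchdog_thresh == 0)
		return 0;	/* a threshold of 0 disables the detector */
	if (now - touch_ts > model_softlockup_thresh(watchdog_thresh))
		return now - touch_ts;
	return 0;
}
```

With the default threshold, a timestamp last refreshed at t = 100 s and a check at t = 122 s gives model_is_softlockup(100, 122, 10) == 22, which corresponds to the “stuck for 22s” in the message; a check at t = 120 s still returns 0.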

5. Practical: How to “Safely” Yield the CPU

If the driver indeed needs to poll a hardware bit, please follow the template below:

    /* 1. Polling with scheduling points (sleepable context) */
    while (readl(reg) & BUSY) {
        cpu_relax();     /* Only saves power, no impact on correctness */
        cond_resched();  /* Check if preemption is needed every round */
        if (time_after(jiffies, timeout))
            return -ETIMEDOUT;
    }

    /* 2. Polling in interrupt context (cannot sleep) */
    u64 end = get_cycles() + timeout_cycles;
    while (readl(reg) & BUSY) {
        if (get_cycles() > end)
            return -ETIMEDOUT;
        cpu_relax();     /* Cannot call schedule() or cond_resched() */
    }
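
The timeout logic of the template can be exercised outside the kernel. The sketch below is a user-space model under stated assumptions: fake_dev, fake_dev_busy and model_poll_idle are hypothetical, the fake tick counter advances on every register read, and model_time_after follows the same wraparound-safe idea as the kernel's time_after():

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* True if a is later than b, even when the counter has wrapped around. */
#define model_time_after(a, b) ((long)((b) - (a)) < 0)

/* Hypothetical fake device: reports BUSY for the first busy_reads reads. */
struct fake_dev {
	unsigned long busy_reads;
	unsigned long jiffies;	/* fake tick counter, advanced on every read */
};

/* Stand-in for readl(reg) & BUSY; each read costs one fake tick. */
static bool fake_dev_busy(struct fake_dev *dev)
{
	dev->jiffies++;
	if (dev->busy_reads == 0)
		return false;
	dev->busy_reads--;
	return true;
}

/* Poll until the device is idle or timeout_ticks have elapsed. */
static int model_poll_idle(struct fake_dev *dev, unsigned long timeout_ticks)
{
	unsigned long deadline;

	assert(dev);
	deadline = dev->jiffies + timeout_ticks;	/* may wrap; that is fine */
	while (fake_dev_busy(dev)) {
		if (model_time_after(dev->jiffies, deadline))
			return -ETIMEDOUT;
	}
	return 0;
}
```

A device that clears BUSY after three reads returns 0 even when the counter wraps during the wait, while one that stays busy returns -ETIMEDOUT once the deadline has passed. In kernel code the same pattern is already packaged in linux/iopoll.h as readl_poll_timeout() for sleepable contexts and readl_poll_timeout_atomic() for atomic ones.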

6. Summary in One Sentence

Process context + preemption disabled → that CPU is permanently starved; a multi-core machine can survive on its other cores.
Interrupt context → that CPU can never escape, and if the tick is lost, the entire machine dies.
Preemption enabled → clock interrupts can preempt the loop, but the load will climb.
The lockup watchdog reports the stall after about 20 seconds and, if configured to panic, reboots the machine and leaves a crash report.

Therefore, writing a dead loop in the kernel is not simply “occupying the CPU”; it is a high-risk operation that violates the fundamental assumptions of the scheduler. Any polling loop in a driver must have a timeout and, where sleeping is allowed, a cond_resched() call; otherwise it is a ticking time bomb.