# Kernel panic in `ext4_es_scan` / `kswapd` freezes entire VM (linuxkit 6.12.76 / Docker Desktop 4.73.0)

## Summary

The Linux VM inside Docker Desktop panicked during memory reclaim, freezing every container and leaving the Docker daemon completely unresponsive from the host. The macOS-side `com.docker.backend` process kept running, but every API request returned `HTTP 500 Internal Server Error` because the VM at `192.168.65.7:2376` was unreachable (`no route to host`). Only a full Docker Desktop restart recovered the system.

Root frame: **`rb_erase` invoked from `ext4_es_scan` via `kswapd` → write through an invalid pointer (`0x88468846`, not NULL) → kernel panic**.

## Environment

| Component | Value |
|---|---|
| Docker Desktop | 4.73.0 |
| Docker Engine | 29.4.3 |
| Linux kernel (VM) | `6.12.76-linuxkit #1 SMP PREEMPT_DYNAMIC Thu Apr 30 11:25:59 UTC 2026 x86_64` |
| macOS | 14.8.2 (23J126) Sonoma |
| Host CPU | Intel Xeon E5-1650 v3 @ 3.50 GHz (12 cores) |
| Host RAM | 64 GiB |
| VM allocation at time of crash | 8192 MiB (default) / 2 CPUs / 1 GiB swap |
| Virtualisation backend | Apple Virtualization framework |
| Loaded modules at panic | `shiftfs(O) rosetta(O) grpcfuse(O) fakeowner(O) selfowner(O) vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock` |

## Workload at time of panic

Long-running container stack hosting two WordPress sites behind a Cloudflare tunnel:

- 2× `mysql:8.0`
- 2× `wordpress:php8.3-apache` (with WP-cron sidecar)
- 1√ó `redis:alpine`
- 1√ó `nginx:alpine`
- 1√ó `cloudflare/cloudflared:latest`

Total RSS across containers at the time: ~1.6 GiB. Free memory was not exhausted, so this is not an OOM; it looks like a race or bad-pointer dereference in the ext4 extent-status reclaim path under normal reclaim pressure.

## Timing

- VM start: `2026-05-14 21:38:52 UTC`
- VM uptime at panic: `156104.95 s` (≈ **43.4 hours**)
- VM totally unreachable from host until manual `docker desktop restart`.
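
As a sanity check on the figures above, the wall-clock time of the panic can be estimated from the VM start time plus the uptime stamp. This is a minimal sketch, assuming the kernel log timestamp counts seconds since VM boot and that boot coincides with the reported VM start; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the Timing list above.
vm_start = datetime(2026, 5, 14, 21, 38, 52, tzinfo=timezone.utc)
uptime_s = 156104.95

# Assumption: the kernel log stamp is seconds since boot, and boot == VM start.
panic_at = vm_start + timedelta(seconds=uptime_s)
uptime_hours = uptime_s / 3600

print(f"uptime: {uptime_hours:.1f} h")
print(f"estimated panic time: {panic_at:%Y-%m-%d %H:%M:%S} UTC")
```

Under those assumptions the panic lands at about 17:00 UTC on 2026-05-16, which is consistent with the host-side log excerpt stamped `17:26:13Z` showing the VM already unreachable.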

## Panic trace (verbatim from VM console log)

```
[156104.949323] BUG: unable to handle page fault for address: 0000000088468846
[156104.949401] #PF: supervisor write access in kernel mode
[156104.949500] #PF: error_code(0x0002) - not-present page
[156104.949550] PGD 800000013b695067 P4D 800000013b695067 PUD 0
[156104.949628] Oops: Oops: 0002 [#1] PREEMPT SMP PTI
[156104.949672] CPU: 4 UID: 0 PID: 114 Comm: kswapd0 Tainted: G           O       6.12.76-linuxkit #1
[156104.949710] Tainted: [O]=OOT_MODULE
[156104.949741] RIP: 0010:rb_erase+0x2a2/0x380
[156104.949777] Code: 89 16 48 8b 11 48 89 10 48 89 01 48 83 fa 03 76 6a 48 83 e2 fc 48 3b 4a 10 74 2f 48 89 42 08 48 89 f0 e9 92 fe ff ff 48 8b 07 <48> 89 02 48 83 f8 03 76 1d 48 83 e0 fc 48 3b 78 10 0f 84 ac 00 00
[156104.949847] RSP: 0018:ffff8d58c10f7a18 EFLAGS: 00010246
[156104.950004] RAX: 0000000000000001 RBX: ffff8d58c10f7abc RCX: 0000000000000001
[156104.950054] RDX: 0000000088468846 RSI: 0000000000000000 RDI: ffff8d57fe0a2d20
[156104.950119] RBP: 00000000ffffffff R08: ffff8d59855ccc48 R09: ffffffff994fc953
[156104.950176] R10: 000000000066005c R11: 0000000000000000 R12: ffff8d58c10f7a6c
[156104.950226] R13: ffff8d59855cc880 R14: 0000000000000000 R15: ffff8d57fe0a2d20
[156104.950263] FS:  0000000000000000(0000) GS:ffff8d59f7b00000(0000) knlGS:0000000000000000
[156104.950328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[156104.950366] CR2: 0000000088468846 CR3: 000000011197a006 CR4: 00000000001706b0
[156104.950391] Call Trace:
[156104.950411]  <TASK>
[156104.950444]  es_do_reclaim_extents+0xa6/0xf0
[156104.950531]  es_reclaim_extents+0x5c/0xf0
[156104.950575]  ext4_es_scan+0xa6/0x3c0
[156104.950615]  do_shrink_slab+0x13d/0x340
[156104.950659]  shrink_slab+0xd8/0x3a0
[156104.950698]  ? try_to_shrink_lruvec+0x22d/0x320
[156104.950747]  shrink_one+0x121/0x1f0
[156104.950782]  shrink_node+0xa52/0xbe0
[156104.950810]  balance_pgdat+0x455/0x920
[156104.950848]  ? hrtimer_try_to_cancel.part.0+0x52/0x100
[156104.950891]  ? dequeue_entities+0x2e8/0x6a0
[156104.950933]  kswapd+0x1f8/0x3b0
[156104.950964]  ? __pfx_autoremove_wake_function+0x10/0x10
[156104.950996]  ? __pfx_kswapd+0x10/0x10
[156104.951023]  kthread+0xd2/0x100
[156104.951045]  ? __pfx_kthread+0x10/0x10
[156104.951068]  ret_from_fork+0x34/0x50
[156104.951104]  ? __pfx_kthread+0x10/0x10
[156104.951138]  ret_from_fork_asm+0x1a/0x30
[156104.951172]  </TASK>
[156104.951210] Modules linked in: shiftfs(O) rosetta(O) grpcfuse(O) fakeowner(O) selfowner(O) vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock
[156104.951257] CR2: 0000000088468846
[156104.951288] ---[ end trace 0000000000000000 ]---
[156104.951876] Kernel panic - not syncing: Fatal exception
[156104.952759] Kernel Offset: 0x18000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[156104.952822] ---[ end Kernel panic - not syncing: Fatal exception ]---
```

Full panic file (~233 KB) available on request; paths below.
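
For triage, the key fields of the trace can be pulled out mechanically. The sketch below is illustrative only: `summarize` and its field names are hypothetical, the embedded text is an abbreviated excerpt of the trace above, and the "first page" check is a simplification of how the kernel distinguishes a NULL dereference from any other bad address.

```python
import re

# Abbreviated excerpt of the panic trace above (lines copied verbatim).
TRACE = """\
[156104.949323] BUG: unable to handle page fault for address: 0000000088468846
[156104.949741] RIP: 0010:rb_erase+0x2a2/0x380
[156104.950444]  es_do_reclaim_extents+0xa6/0xf0
[156104.950531]  es_reclaim_extents+0x5c/0xf0
[156104.950575]  ext4_es_scan+0xa6/0x3c0
[156104.950659]  shrink_slab+0xd8/0x3a0
[156104.950698]  ? try_to_shrink_lruvec+0x22d/0x320
[156104.950933]  kswapd+0x1f8/0x3b0
"""

PAGE_SIZE = 4096  # x86_64 base page size


def summarize(trace: str) -> dict:
    """Extract the fault address, faulting symbol and reliable call frames."""
    fault = re.search(r"page fault for address: ([0-9a-f]+)", trace)
    rip = re.search(r"RIP: [0-9a-f]+:([\w.]+)\+0x", trace)
    # Frames prefixed with "? " are unwinder guesses, so they are skipped.
    frames = re.findall(
        r"^\[[ \d.]+\]  ([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+$", trace, re.M
    )
    addr = int(fault.group(1), 16) if fault else None
    return {
        "fault_addr": addr,
        # Simplification: treat only first-page addresses as NULL-like.
        "null_like": addr is not None and addr < PAGE_SIZE,
        "rip_symbol": rip.group(1) if rip else None,
        "frames": frames,
    }


summary = summarize(TRACE)
print(summary)
```

On this excerpt the fault address is `0x88468846`, well outside the first page, so the crash reads as a write through a corrupted pointer rather than a plain NULL dereference; the reliable frames run from `es_do_reclaim_extents` up to `kswapd`.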

## Host-side symptom

`com.docker.backend` continued running but every API call routed to the VM failed:

```
[2026-05-16T17:26:13.419775000Z][com.docker.backend.apiproxy] still dialing 192.168.65.7:2376 after 1.000760104s: connect tcp 192.168.65.7:2376: no route to host
[...]
{"component":"apiproxy","level":"info","msg":"<< GET /containers/json?all=true&filters= Internal Server Error: context deadline exceeded (10.000129545s)"}
{"component":"apiproxy","level":"info","msg":"<< GET /networks Internal Server Error: context deadline exceeded (10.000234719s)"}
```

`docker ps`, `docker info`, and `docker version` (server) all returned `HTTP 500 Internal Server Error for API route…`.
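
To quantify the host-side failure window from the backend log, the timed-out requests can be extracted from the JSON lines. This is a rough sketch, assuming each apiproxy entry is one JSON object per line with the message layout shown in the excerpt above; `timed_out_requests` is a hypothetical helper, not part of any Docker tooling.

```python
import json
import re

# The two JSON log lines quoted above, copied verbatim.
LOG_LINES = [
    '{"component":"apiproxy","level":"info","msg":"<< GET /containers/json?all=true&filters= Internal Server Error: context deadline exceeded (10.000129545s)"}',
    '{"component":"apiproxy","level":"info","msg":"<< GET /networks Internal Server Error: context deadline exceeded (10.000234719s)"}',
]

# Assumed message layout: "<< METHOD PATH Internal Server Error: ... (N.NNNs)"
MSG_RE = re.compile(
    r"<< (\w+) (\S+) Internal Server Error: context deadline exceeded \(([\d.]+)s\)"
)


def timed_out_requests(lines):
    """Return (method, path, seconds) for each request that hit the deadline."""
    rows = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as the "still dialing" entries
        match = MSG_RE.search(entry.get("msg", ""))
        if match:
            rows.append((match.group(1), match.group(2), float(match.group(3))))
    return rows


for method, path, seconds in timed_out_requests(LOG_LINES):
    print(f"{method} {path} timed out after {seconds:.3f}s")
```

Both quoted requests ran for almost exactly 10 s before failing, which suggests a fixed proxy deadline rather than an error returned by the daemon itself.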

## Reproducer

Not deterministic. Observed pattern: the panic hit after **~43 hours** of normal VM uptime running the workload described above. Memory was not exhausted at the time; this appears to be a race in `ext4_es_scan`'s red-black tree traversal during normal `kswapd` reclaim activity.

A similar crash class (a fault in `rb_erase` reached via `es_do_reclaim_extents`) has been reported on mainline kernels. This appears to be a pre-existing ext4 extent-status shrinker race, surfaced by the memory-pressure profile that LinuxKit's default 8 GiB allocation creates when running a persistent multi-container workload.

## What I tried

1. **Manual recovery** (only option once panicked): `pkill -9 -f "Docker Desktop|com.docker.backend"` then `open -a "Docker Desktop"`. Daemon back in ~30 s, all containers restarted cleanly.
2. **Mitigations applied locally to reduce panic odds**:
   - Bumped VM `MemoryMiB` 8192 → 16384 (host has 64 GiB of RAM) to reduce `kswapd` activity.
   - Installed a launchd watchdog that probes the daemon every 2 min and restarts Docker Desktop if unreachable, so a future panic auto-recovers in ~3 min instead of waiting for me to notice.

Neither fixes the kernel bug. Filing this so your team has the panic trace.
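
For anyone wanting a similar stopgap, the watchdog mitigation can be sketched roughly as follows. This is not the exact script in use: the probe command (`docker info`), the 15 s probe timeout, and the helper names are assumptions, and the restart step simply mirrors the manual recovery described above.

```python
import subprocess
from typing import Callable

PROBE_TIMEOUT_S = 15  # assumed value; not stated in this report


def daemon_reachable(timeout_s: float = PROBE_TIMEOUT_S) -> bool:
    """Return True if `docker info` succeeds within the timeout."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Hung daemon or missing CLI both count as unreachable.
        return False
    return result.returncode == 0


def restart_docker_desktop() -> None:
    """Assumed restart action: kill the stale processes, then relaunch the app."""
    subprocess.run(
        ["pkill", "-9", "-f", "Docker Desktop|com.docker.backend"], check=False
    )
    subprocess.run(["open", "-a", "Docker Desktop"], check=False)


def check_once(
    probe: Callable[[], bool] = daemon_reachable,
    restart: Callable[[], None] = restart_docker_desktop,
) -> bool:
    """Run one probe and restart on failure. Returns True if a restart ran."""
    if probe():
        return False
    restart()
    return True
```

A script calling `check_once()` would be scheduled every 120 s, for example by a launchd job using `StartInterval`. With that interval, the worst case is roughly 2 min to detect the outage plus the ~30 s restart, in line with the ~3 min recovery quoted above.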

## Diagnostic files available

If your team would like the raw bundle, the following files are still present:

| File | Contents |
|---|---|
| `~/Library/Containers/com.docker.docker/Data/log/vm/console.log.20260517-112856.549` | VM kernel/console log containing the panic |
| `~/Library/Containers/com.docker.docker/Data/log/host/com.docker.backend.log.20260517-032716.530` | Host-side backend log from the unreachable period |
| `~/Library/Containers/com.docker.docker/Data/log/host/monitor.log.20260517-033122.650` | Host monitor log over the failure window |
| `~/Library/Containers/com.docker.docker/Data/log/host/com.docker.virtualization.log` | VM lifecycle / resource allocation log |

Happy to upload a full `docker desktop diagnose` bundle on request.


Kernel panic in ext4_es_scan / kswapd freezes entire VM (linuxkit 6.12.76 / Docker Desktop 4.73.0) #7877

Kernel panic in `ext4_es_scan` / `kswapd` freezes entire VM (linuxkit 6.12.76 / Docker Desktop 4.73.0)

Summary

Environment

Workload at time of panic

Timing

Panic trace (verbatim from VM console log)

Host-side symptom

Reproducer

What I tried

Diagnostic files available

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development


Docker Desktop	4.73.0
Docker Engine	29.4.3
Linux kernel (VM)	`6.12.76-linuxkit #1 SMP PREEMPT_DYNAMIC Thu Apr 30 11:25:59 UTC 2026 x86_64`
macOS	14.8.2 (23J126) Sonoma
Host CPU	Intel Xeon E5-1650 v3 @ 3.50 GHz (12 cores)
Host RAM	64 GiB
VM allocation at time of crash	8192 MiB (default) / 2 CPUs / 1 GiB swap
Virtualisation backend	Apple Virtualization framework
Loaded modules at panic	`shiftfs(O) rosetta(O) grpcfuse(O) fakeowner(O) selfowner(O) vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock`

File	Contents
`~/Library/Containers/com.docker.docker/Data/log/vm/console.log.20260517-112856.549`	VM kernel/console log containing the panic
`~/Library/Containers/com.docker.docker/Data/log/host/com.docker.backend.log.20260517-032716.530`	Host-side backend log from the unreachable period
`~/Library/Containers/com.docker.docker/Data/log/host/monitor.log.20260517-033122.650`	Host monitor log over the failure window
`~/Library/Containers/com.docker.docker/Data/log/host/com.docker.virtualization.log`	VM lifecycle / resource allocation log

Kernel panic in ext4_es_scan / kswapd freezes entire VM (linuxkit 6.12.76 / Docker Desktop 4.73.0) #7877

Description

Kernel panic in ext4_es_scan / kswapd freezes entire VM (linuxkit 6.12.76 / Docker Desktop 4.73.0)

Summary

Environment

Workload at time of panic

Timing

Panic trace (verbatim from VM console log)

Host-side symptom

Reproducer

What I tried

Diagnostic files available

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Kernel panic in `ext4_es_scan` / `kswapd` freezes entire VM (linuxkit 6.12.76 / Docker Desktop 4.73.0)