feat(linux/wlr): SHM capture fallback + resilient GBM allocation for headless NVIDIA #4946
atassis wants to merge 3 commits into LizardByte:master
When DMA-BUF capture fails (e.g. GBM cannot allocate buffers on headless NVIDIA without an active DRM output), fall back to shared-memory (SHM) capture via wl_shm.

The SHM path creates a memfd-backed wl_shm_pool, receives pixel data from the compositor via wlr-screencopy, and feeds it to the encoder through the existing wlr_ram_t CPU path.

Supported SHM formats:
- 4 bytes per pixel (XRGB8888/ARGB8888): direct memcpy
- 3 bytes per pixel (BGR888): pixel conversion to BGRA8888

Key changes:
- Bind wl_shm interface in screencopy path
- Add create_and_copy_shm() with memfd + mmap allocation
- Refactor create_and_copy_dmabuf() to return bool for fallback
- Cache GBM failure to avoid per-frame retry
- Handle SHM frames in wlr_ram_t::snapshot() via memcpy
- Make EGL init non-fatal (SHM path does not require EGL)
- Force wlr_ram_t on reinit when SHM fallback is active

Tested on headless NVIDIA RTX 5060 Ti with labwc compositor, NVENC HEVC encoding, streaming to Moonlight client.
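For readers unfamiliar with the memfd + mmap allocation mentioned in the key changes, here is a minimal sketch of that step in isolation. It is not the PR's `create_and_copy_shm()`; `shm_buffer_t`, `shm_buffer_create()` and `shm_buffer_destroy()` are hypothetical names, and the Wayland calls that would consume the file descriptor appear only in comments.

```cpp
// Sketch only, Linux-specific. All names here are hypothetical, not Sunshine's.
#include <sys/mman.h>  // memfd_create, mmap, munmap
#include <unistd.h>    // ftruncate, close
#include <cstddef>
#include <cstdint>

struct shm_buffer_t {
  int fd = -1;           // in real code this fd would go to wl_shm_create_pool()
  void *data = nullptr;  // CPU-visible mapping the compositor writes pixels into
  std::size_t size = 0;  // stride * height, in bytes
};

// Allocate an anonymous, memfd-backed region large enough for one frame.
inline bool shm_buffer_create(shm_buffer_t &buf, std::uint32_t stride, std::uint32_t height) {
  const std::size_t size = static_cast<std::size_t>(stride) * height;
  if (size == 0) {
    return false;
  }

  const int fd = memfd_create("wlr-shm-capture", MFD_CLOEXEC);
  if (fd < 0) {
    return false;
  }

  // A fresh memfd has length zero, so it must be grown before mapping.
  if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
    close(fd);
    return false;
  }

  void *data = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (data == MAP_FAILED) {
    close(fd);
    return false;
  }

  buf.fd = fd;
  buf.data = data;
  buf.size = size;
  // Next steps in a real capture path (not shown): wl_shm_create_pool(),
  // wl_shm_pool_create_buffer(), then zwlr_screencopy_frame_v1_copy().
  return true;
}

// Unmap and close; safe to call on an already-destroyed buffer.
inline void shm_buffer_destroy(shm_buffer_t &buf) {
  if (buf.data != nullptr) {
    munmap(buf.data, buf.size);
  }
  if (buf.fd >= 0) {
    close(buf.fd);
  }
  buf = shm_buffer_t {};
}
```

The mapping uses `MAP_SHARED` so that writes made by the compositor through its own mapping of the same fd are visible to the capturing process.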
BGR888 (0x34324742) stores bytes in R,G,B memory order. Previous code copied them straight into BGRA8888, causing red/blue inversion (orange appeared bluish in Moonlight). Swap src positions 0↔2 so B and R land in correct BGRA slots. Also deduplicate SHM format log to fire only once.
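The swap described in this commit can be illustrated with a small standalone routine. This is a sketch under the byte orders stated above (source bytes R,G,B; destination bytes B,G,R,A), not the code from the PR; `rgb_bytes_to_bgra()` and its parameters are hypothetical names, and filling alpha with 0xFF is an assumption.

```cpp
// Sketch only: hypothetical helper, not the function used in the PR.
#include <cstddef>
#include <cstdint>

// Convert an image whose 3-byte pixels are stored as R,G,B in memory
// (what the commit calls BGR888) into 4-byte pixels stored as B,G,R,A.
inline void rgb_bytes_to_bgra(const std::uint8_t *src, std::size_t src_stride,
                              std::uint8_t *dst, std::size_t dst_stride,
                              std::size_t width, std::size_t height) {
  for (std::size_t y = 0; y < height; ++y) {
    const std::uint8_t *s = src + y * src_stride;
    std::uint8_t *d = dst + y * dst_stride;
    for (std::size_t x = 0; x < width; ++x) {
      d[0] = s[2];  // blue comes from source byte 2
      d[1] = s[1];  // green stays in the middle
      d[2] = s[0];  // red comes from source byte 0
      d[3] = 0xFF;  // opaque alpha (assumption: the source has no alpha)
      s += 3;
      d += 4;
    }
  }
}
```

Copying the three bytes straight across instead of swapping positions 0 and 2 would put red into the blue slot, which matches the orange-turned-bluish symptom reported above.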
I also have a follow-up fix that makes GBM allocation work on headless NVIDIA by trying relaxed usage flags (removing The change is small (two files):
Tested on RTX 4060 Ti headless (TrueNAS, no monitor): GBM succeeds with

Should I add this to this PR or open a separate one?
I think it can be included here if we make the PR title more generic.
GBM buffer allocation with GBM_BO_USE_RENDERING|GBM_BO_USE_LINEAR fails on headless NVIDIA render nodes. Try progressively relaxed flag combinations before falling back to SHM.

Also prefer DRM render nodes over primary nodes in CUDA device lookup: primary nodes require DRM master, which is unavailable on headless setups.

Tested on RTX 4060 Ti headless (TrueNAS, no monitor):
- GBM succeeds with GBM_BO_USE_RENDERING flag alone
- VRAM capture path works (zero-copy DMA-BUF → EGL → CUDA → NVENC)
- SHM fallback still catches cases where all GBM combos fail
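The "progressively relaxed flag combinations" idea can be sketched independently of libgbm. In the snippet below the allocator is passed in as a callback so the retry order is visible on its own; `alloc_with_relaxed_flags()` and the `USE_*` constants are hypothetical local stand-ins for `GBM_BO_USE_RENDERING` and `GBM_BO_USE_LINEAR`, and a real callback would presumably wrap `gbm_bo_create()`.

```cpp
// Sketch only: the retry order follows the PR description; names are hypothetical.
#include <cstdint>
#include <functional>

// Local stand-ins for GBM_BO_USE_RENDERING / GBM_BO_USE_LINEAR from <gbm.h>.
constexpr std::uint32_t USE_RENDERING = 1u << 2;
constexpr std::uint32_t USE_LINEAR = 1u << 4;

// Calls try_alloc(flags) with progressively relaxed flags until one succeeds.
// On success, stores the flags that worked in chosen_flags and returns true.
// Returns false if every combination fails (the caller would then use SHM).
inline bool alloc_with_relaxed_flags(const std::function<bool(std::uint32_t)> &try_alloc,
                                     std::uint32_t &chosen_flags) {
  const std::uint32_t candidates[] = {
    USE_RENDERING | USE_LINEAR,  // default flags, tried first
    USE_RENDERING,               // drop LINEAR
    USE_LINEAR,                  // drop RENDERING
    0u,                          // no usage flags at all
  };
  for (const std::uint32_t flags : candidates) {
    if (try_alloc(flags)) {
      chosen_flags = flags;
      return true;
    }
  }
  return false;
}
```

Because the default combination is always tried first, a system where it already works takes the same path as before.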
Added the GBM resilience fix in e54a35d. Updated the PR title and description to cover both changes.
Description
Two improvements for Wayland capture on headless NVIDIA setups (no physical display connected):
1. SHM capture fallback
On headless NVIDIA, GBM buffer allocation can fail because the driver cannot create GBM buffers without an active DRM output. The existing code logged "SHM capture not implemented" and gave up.
This PR implements the SHM fallback:
- Binds the wl_shm interface from the Wayland registry
- Creates a memfd-backed wl_shm_pool and copies frames via SHM
- Caches a gbm_failed flag to avoid retrying every frame
- Makes EGL init non-fatal so wlr_ram_t can operate in SHM-only mode
2. Resilient GBM allocation for headless NVIDIA
Even on headless NVIDIA, GBM can work if we relax the usage flags. This change:
- Tries progressively relaxed flag combinations: RENDERING|LINEAR → RENDERING → LINEAR → none
- Prefers DRM render nodes (renderD*) over primary nodes (card*); primary nodes require DRM master, which is unavailable headless
With both changes, the capture priority is:
The DMA-BUF path is completely unchanged when GBM succeeds with default flags.
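A compact way to picture the fallback behaviour described in this PR (DMA-BUF first, SHM second, with the GBM failure cached so it is not retried every frame) is the sketch below. It is an illustration only: `capture_state_t`, `capture_path_e` and `capture_frame()` are hypothetical names, and the two callbacks stand in for the DMA-BUF and SHM copy routines.

```cpp
// Sketch only: hypothetical types illustrating the fallback order, not the PR's code.
#include <functional>

enum class capture_path_e { dmabuf, shm, none };

struct capture_state_t {
  bool gbm_failed = false;  // set once DMA-BUF allocation has failed
};

// Try DMA-BUF unless it has already failed; otherwise (or on failure) use SHM.
inline capture_path_e capture_frame(capture_state_t &state,
                                    const std::function<bool()> &copy_dmabuf,
                                    const std::function<bool()> &copy_shm) {
  if (!state.gbm_failed) {
    if (copy_dmabuf()) {
      return capture_path_e::dmabuf;
    }
    state.gbm_failed = true;  // cache the failure: later frames skip DMA-BUF
  }
  return copy_shm() ? capture_path_e::shm : capture_path_e::none;
}
```

With this shape, a failing DMA-BUF path costs one failed attempt in total rather than one per frame, and a working DMA-BUF path never touches the SHM branch.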
Tested on
- WLR_BACKENDS=headless
- RENDERING flag, VRAM capture path works end-to-end