
[NSKIP055] linuxd rename()/replace() hangs the kernel on hosted Nanvix #552

@ada-x64

Skip message

NSKIP055: linuxd rename()/replace() hangs the kernel on hosted Nanvix

Root cause

On hosted Nanvix (in both single-process and multi-process modes),
os.rename() and os.replace() calls hang the guest kernel: the syscall
never returns, the VM becomes unresponsive, and the next batch member
never starts. This is the hosted analog of NSKIP021 (FAT VFS rename
hang on standalone), but it is a distinct bug. Standalone rename goes
through nanvix-kernel → rust-fatfs → FAT image, while hosted rename
goes through nanvix-kernel → linuxd RPC → host Linux ext4. The code
paths are different, so the two bugs can be fixed separately.

The hosted failure surface is broader than NSKIP021's: every probe
shape hangs (even sibling renames in /tmp root), whereas NSKIP021 lets
some /tmp root cases through. Affected paths include direct
os.rename/os.replace calls plus indirect callers: shutil.move,
dbm.dumb._commit, shelve (via dbm.dumb), py_compile.compile
(writes __pycache__/*.pyc via atomic rename), and __import__ (same
pyc path — triggered by importing any freshly-written /tmp source
file).
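One of the indirect callers listed above can be exercised in isolation. The following is a minimal sketch, not taken from the issue: it byte-compiles a freshly written source file with py_compile.compile, which per the description above ends in a rename into __pycache__. The helper name compile_fresh_source and the module name fresh_mod.py are hypothetical.

```python
import os
import py_compile
import tempfile


def compile_fresh_source(directory):
    """Write a tiny module into `directory` and byte-compile it explicitly."""
    src = os.path.join(directory, "fresh_mod.py")  # hypothetical module name
    with open(src, "w") as f:
        f.write("VALUE = 1\n")
    # py_compile.compile returns the path of the .pyc file it wrote,
    # by default under <directory>/__pycache__/.
    return py_compile.compile(src, doraise=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print("about to compile", flush=True)
        print("wrote", compile_fresh_source(d))
```

On a regular Linux host this prints the path of the written pyc file. On hosted Nanvix, the expectation from the description above is that the call would not return.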

Inline repro

import os
import tempfile

d = tempfile.mkdtemp()
sub = os.path.join(d, "sub")
os.mkdir(sub)
a = os.path.join(sub, "a")
b = os.path.join(sub, "b")
with open(a, "wb") as f:      # close the file before renaming it
    f.write(b"x")
print("about to os.rename", flush=True)
os.rename(a, b)               # hangs the kernel on hosted Nanvix
print("OK")                   # never reached

Verified probe shapes (all six hang on hosted): rename in /tmp root,
rename in subdir, replace in subdir, rename in __pycache__, replace
with existing destination, rename in nested workspace dirs.
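The six probe shapes can be approximated with one small script. This is a sketch under assumptions, not the probes that were actually run: the function run_probes, the probe names, and the directory layout are hypothetical, and a caller-supplied root directory stands in for /tmp. Each probe prints a marker before the call so that the last line of output identifies the shape that hung.

```python
import os
import tempfile


def _touch(path, data=b"x"):
    with open(path, "wb") as f:
        f.write(data)


def run_probes(root):
    """Run six rename/replace probes under `root`; return the names that completed."""
    completed = []

    def probe(name, directory, op, dest_exists=False):
        os.makedirs(directory, exist_ok=True)
        src = os.path.join(directory, name + ".src")
        dst = os.path.join(directory, name + ".dst")
        _touch(src)
        if dest_exists:
            _touch(dst, b"old")  # destination already present before the call
        print("about to run", name, flush=True)
        op(src, dst)
        assert os.path.exists(dst) and not os.path.exists(src)
        completed.append(name)

    sub = os.path.join(root, "sub")
    probe("rename_root", root, os.rename)
    probe("rename_subdir", sub, os.rename)
    probe("replace_subdir", sub, os.replace)
    probe("rename_pycache", os.path.join(root, "__pycache__"), os.rename)
    probe("replace_existing", sub, os.replace, dest_exists=True)
    probe("rename_nested", os.path.join(root, "ws", "a", "b"), os.rename)
    return completed


if __name__ == "__main__":
    print(run_probes(tempfile.mkdtemp()))
```

On a regular Linux host all six probes complete. A hang on hosted Nanvix would show up as an "about to run" line with no further output.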

The pyc-write code path (__import__ of fresh /tmp/<x>.py) is the
most common trigger in test_importlib's metadata fixtures and any
runtime that imports a source whose mtime is newer than its compiled
pyc — which is why python3 -B is mandatory in the hosted test
harness (see .nanvix/run-tests.py).
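A launcher can check and enforce the no-bytecode setting along these lines. This is a minimal sketch assuming a host-side Python launcher; it is not the contents of .nanvix/run-tests.py, and the function names are hypothetical. sys.dont_write_bytecode is the standard flag that reflects -B and PYTHONDONTWRITEBYTECODE.

```python
import os
import subprocess
import sys


def bytecode_writes_disabled():
    """True when this interpreter will not write .pyc files on import."""
    return bool(sys.dont_write_bytecode)


def run_without_bytecode(script_args):
    """Run a child interpreter with both -B and the environment variable set."""
    env = dict(os.environ)
    env["PYTHONDONTWRITEBYTECODE"] = "1"
    return subprocess.run(
        [sys.executable, "-B", *script_args],
        env=env,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    result = run_without_bytecode(
        ["-c", "import sys; print(sys.dont_write_bytecode)"]
    )
    print(result.stdout.strip())
```

Setting both the flag and the environment variable is redundant for the direct child, but the environment variable also carries over to any interpreters that the child starts.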

Workarounds

  • Per-test: @unittest.skipIf(is_nanvix and not is_nanvix_standalone, "NSKIP055: ..."). Pair with NSKIP021 (is_nanvix_standalone) at any
    test site that exercises rename in either mode.
  • Process-wide: python3 -B or PYTHONDONTWRITEBYTECODE=1
    suppresses the implicit pyc-write rename, but does not address
    explicit os.rename/os.replace calls in test bodies. Not a
    substitute for NSKIP055 guards.
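The per-test guard might look like the following at a test site that exercises rename in both modes. This is a sketch: is_nanvix and is_nanvix_standalone are defined here as hypothetical stand-ins for the project's real flags (their actual import location is not stated above), and the class and test names are made up.

```python
import os
import tempfile
import unittest

# Hypothetical stand-ins; the real flags come from the project's test support code.
is_nanvix = True
is_nanvix_standalone = False


class RenameGuardExample(unittest.TestCase):
    @unittest.skipIf(
        is_nanvix and not is_nanvix_standalone,
        "NSKIP055: linuxd rename()/replace() hangs the kernel on hosted Nanvix",
    )
    @unittest.skipIf(
        is_nanvix_standalone,
        "NSKIP021: FAT VFS rename hang on standalone",
    )
    def test_rename(self):
        with tempfile.TemporaryDirectory() as d:
            a = os.path.join(d, "a")
            b = os.path.join(d, "b")
            with open(a, "wb") as f:
                f.write(b"x")
            os.rename(a, b)
            self.assertTrue(os.path.exists(b))
```

With the stand-in values above the test is reported as skipped with the NSKIP055 reason. With both flags set to False it runs normally.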

Finding the footprint

rg -n 'NSKIP055'

Parent

Tracked in #371.
