test(e2e): reproduce py_binary execve failure under bazel run (runfiles discovery) #1113
Open
gregmagolan wants to merge 1 commit into
…es discovery)

Add a regression case showing a py_binary cannot be run via `bazel run` (or direct exec) when RUNFILES_DIR is not pre-set: the launcher (hermetic_launcher 0.0.9) fails to self-locate its `.runfiles` dir from argv[0] and aborts with "execve failed with errno 2 (return code 1)".

`bazel test` masks the bug because it sets RUNFILES_DIR for the test process; the case unsets the runfiles env vars before invoking the binary to force the launcher's self-location path, matching what `bazel run` and a deployed binary hit. Confirmed: passes with RUNFILES_DIR set, fails without.

Tagged `manual` so it stays out of the default `//...` gate until the launcher fix lands.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
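The env-stripping step this commit message describes can be sketched in Python. This is only an illustration, not the `sh_test` added by the PR: the helper names `env_without_runfiles` and `run_without_runfiles_env` are hypothetical, and the three variable names are the ones listed in the PR description.

```python
import os
import subprocess
import sys

# Runfiles-related variables named in the PR description.
RUNFILES_ENV_VARS = ("RUNFILES_DIR", "RUNFILES_MANIFEST_FILE", "JAVA_RUNFILES")


def env_without_runfiles(environ):
    """Return a copy of environ with the runfiles variables removed."""
    return {k: v for k, v in environ.items() if k not in RUNFILES_ENV_VARS}


def run_without_runfiles_env(argv):
    """Run argv with the runfiles variables unset (hypothetical helper).

    With those variables gone, a launcher has to locate its runfiles on its
    own, which is the code path `bazel test` never exercises.
    """
    return subprocess.run(
        argv,
        env=env_without_runfiles(os.environ),
        capture_output=True,
        text=True,
    )
```

Calling `run_without_runfiles_env([path_to_binary])` and checking the return code mirrors the pass/fail signal of the regression case, assuming `path_to_binary` points at the built launcher.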
✨ Aspect Workflows Tasks📅 Tue Jun 16 08:39:18 UTC 2026 ✅ 7 successful tasks
⏱ Last updated Tue Jun 16 08:43:07 UTC 2026 · 📊 GitHub API quota 281/15,000 (2% used, resets in 1m) |
gregmagolan
added a commit
to gregmagolan/hermetic-launcher
that referenced
this pull request
Jun 16, 2026
…] is relative

When the launcher is exec'd with a relative argv[0] from a cwd unrelated to the binary (notably `bazel run`, which sets cwd inside `<x>.runfiles/_main`, passes argv[0]="bazel-bin/<name>", and does NOT set RUNFILES_DIR), `<argv[0]>.runfiles` resolved against cwd points nowhere. Discovery then fails, the embedded program rlocation is left unresolved, and execve aborts with errno 2, observed downstream as aspect-build/rules_py#1113 ("execve failed with errno 2") when running a py_binary via `bazel run`.

Fix: when argv[0] is not absolute, resolve the real executable path and use it for `<exe>.runfiles` discovery:

- macOS: _NSGetExecutablePath
- Linux (x86_64/aarch64): readlinkat(/proc/self/exe)
- other (s390x/Windows): unchanged (return None -> argv[0] fallback)

RUNFILES_DIR / RUNFILES_MANIFEST_FILE still take precedence when set.

Adds an integration case `relative_argv0_discovery` reproducing `bazel run`'s geometry exactly (relative argv[0] via arg0 override + cwd inside runfiles + no runfiles env), plus `symlinked_program_relative_argv0` for the venv-shaped relative-symlink program target. Both fail before this change and pass after.

NB: the Linux readlinkat paths could not be run locally (macOS host); the macOS path is verified end-to-end. s390x readlinkat is intentionally left unwired.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
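The decision order in this commit message (env var first, then absolute argv[0], then the OS-reported executable path, then the argv[0] fallback) can be sketched in Python. This is not the launcher's code: `os_executable_path` and `runfiles_dir_candidate` are hypothetical names, only RUNFILES_DIR is modeled (RUNFILES_MANIFEST_FILE is omitted for brevity), and only the Linux `/proc/self/exe` lookup is wired.

```python
import os
import sys
from typing import Callable, Mapping, Optional


def os_executable_path() -> Optional[str]:
    """Best-effort path of the running executable, or None when unavailable.

    Only Linux is wired in this sketch; other platforms return None, which
    triggers the argv[0] fallback below.
    """
    if sys.platform.startswith("linux"):
        try:
            return os.readlink("/proc/self/exe")
        except OSError:
            return None
    return None


def runfiles_dir_candidate(
    argv0: str,
    environ: Mapping[str, str],
    resolve_exe: Callable[[], Optional[str]] = os_executable_path,
) -> str:
    """Pick the directory to probe for runfiles (illustrative only)."""
    # An explicitly set RUNFILES_DIR takes precedence.
    if environ.get("RUNFILES_DIR"):
        return environ["RUNFILES_DIR"]
    # An absolute argv[0] can be used directly.
    if os.path.isabs(argv0):
        return argv0 + ".runfiles"
    # Relative argv[0]: prefer the OS-reported path, else fall back to argv[0].
    exe = resolve_exe()
    return (exe if exe is not None else argv0) + ".runfiles"
```

Passing the resolver as a parameter keeps the selection logic testable without depending on the host platform.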
malt3
added a commit
to hermeticbuild/hermetic-launcher
that referenced
this pull request
Jun 16, 2026
`bazel run` and directly-executed deployed binaries exec the launcher with a relative argv[0] (e.g. `bazel-bin/foo`) from a working directory inside the runfiles tree and without RUNFILES_DIR set, so computing `<argv[0]>.runfiles` resolved against the wrong directory and runfiles discovery failed with "Failed to initialize runfiles" / execve errno 2. `bazel test` masked this by pre-setting RUNFILES_DIR. See aspect-build/rules_py#1113.

Derive the executable path from a safe OS launch-path API instead of argv[0], make it absolute (joining the cwd when relative), and do not resolve symlinks:

- Linux (all arches incl. s390x): AT_EXECFN from the ELF auxv
- macOS: _NSGetExecutablePath
- Windows: QueryFullProcessImageNameW

`program_path()` (argv[0]) is removed; `Runfiles::create` now takes `&RuntimeArgs` and calls the new `executable_path()` seam, with a shared `common::absolutize` + per-backend `current_dir` (getcwd).

Add an integration regression test (`run_runfiles_discovery`) that execs the stub with a relative argv[0] from a cwd inside the runfiles tree with the runfiles env vars unset; it fails before this change and passes after.
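The "make it absolute, do not resolve symlinks" step can be illustrated with a small Python sketch. It borrows the name `absolutize` from the commit message but is not a port of `common::absolutize`; `runfiles_dir_for` is a hypothetical helper, and the sketch assumes the launch path and cwd are supplied by the caller as POSIX paths.

```python
import posixpath


def absolutize(launch_path: str, cwd: str) -> str:
    """Make launch_path absolute by joining it onto cwd when it is relative.

    The result is deliberately not normalized and not passed through
    realpath, so symlinks in the path are left exactly as given.
    """
    if posixpath.isabs(launch_path):
        return launch_path
    return posixpath.join(cwd, launch_path)


def runfiles_dir_for(launch_path: str, cwd: str) -> str:
    """Derive the `<exe>.runfiles` directory from the absolutized path."""
    return absolutize(launch_path, cwd) + ".runfiles"
```

Leaving symlinks unresolved matters when the executable is itself a symlink: resolving it would make discovery look next to the link target rather than next to the path the binary was launched through.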
Contributor
|
Can you test this again, but briefly add a |
chpock
added a commit
to chpock/rules_py
that referenced
this pull request
Jun 19, 2026
Upstream issue: aspect-build#1116 Related PR: aspect-build#1113
Summary
A `py_binary` cannot be run via `bazel run` (or as a deployed/direct-exec binary) when `RUNFILES_DIR` is not already set: it aborts with "execve failed with errno 2 (return code 1)". `bazel test` masks it (it sets `RUNFILES_DIR`), so existing e2e coverage doesn't catch it; `bazel run` and deployed binaries do hit it.

Root cause (in the launcher, not this repo)
`bazel run` execs the binary with a relative `argv[0]` (`bazel-bin/<name>`), cwd set inside the runfiles tree (`<x>.runfiles/_main`), and no `RUNFILES_DIR`. The `hermetic_launcher` stub then computes `<argv[0]>.runfiles` relative to cwd, which points nowhere, so runfiles discovery fails, the embedded venv-python rlocation is left unresolved, and `execve` fails with errno 2.

Verified this reproduces on `hermetic_launcher` main (not just the pinned 0.0.9), and that bumping to the latest release 0.0.10 does not fix it. The launcher fix is here: hermeticbuild/hermetic-launcher#44 (resolve the real executable path via `_NSGetExecutablePath` / `readlinkat(/proc/self/exe)` when `argv[0]` is relative).

What changed in this PR
Adds an e2e regression case `cases/run-runfiles-discovery`: a `py_binary` and an `sh_test` that unsets `RUNFILES_DIR` / `RUNFILES_MANIFEST_FILE` / `JAVA_RUNFILES` before invoking the binary, forcing the launcher's self-location path.

Tagged `manual` so it stays out of the default `//...` gate until the launcher fix is released and bumped here.

Remedy / follow-up
- Release the launcher fix (new `hermetic_launcher` version).
- Bump `bazel_dep(name = "hermetic_launcher", ...)` in `MODULE.bazel` to that version.
- Remove the `manual` tag on `cases/run-runfiles-discovery` so it gates `bazel run`.

(Bumping to 0.0.10 was tried and verified NOT to fix it, so no bump is included here yet; it would need the released fix from #44.)
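The path geometry behind the root cause can be reproduced with plain string arithmetic. The sketch below is illustrative only: `/ws` and `app` are made-up example paths, and `naive_runfiles_dir` is a hypothetical stand-in for what a launcher computes when it trusts a relative `argv[0]`.

```python
import posixpath


def naive_runfiles_dir(argv0: str, cwd: str) -> str:
    """Resolve `<argv0>.runfiles` against cwd, as a naive launcher would."""
    return posixpath.normpath(posixpath.join(cwd, argv0 + ".runfiles"))


# Hypothetical layout matching the `bazel run` geometry described in this PR.
EXE = "/ws/bazel-bin/app"
RUNFILES = EXE + ".runfiles"
CWD = RUNFILES + "/_main"  # cwd is inside the runfiles tree

# Relative argv[0]: the candidate lands inside the runfiles tree, where
# nothing exists.
print(naive_runfiles_dir("bazel-bin/app", CWD))
# /ws/bazel-bin/app.runfiles/_main/bazel-bin/app.runfiles

# Absolute argv[0]: cwd no longer matters and the candidate is correct.
print(naive_runfiles_dir(EXE, CWD))
# /ws/bazel-bin/app.runfiles
```

The first candidate is a directory that is never created, which is consistent with the errno 2 (ENOENT) reported once the launcher tries to exec the unresolved program path.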
Test plan
- The new case fails on `main` with `execve errno 2`; the same binary succeeds when `RUNFILES_DIR` is exported, isolating the cause to launcher runfiles resolution.

🤖 Generated with Claude Code