apt: harden sandbox bootstrap against transient archive.ubuntu.com flakes #1284
Merged
archive.ubuntu.com / security.ubuntu.com resolve to a CDN whose edges
propagate new InRelease manifests and Packages.gz files asynchronously
during Canonical's index pushes. A fresh sandbox can fetch InRelease
from one already-synced edge and Packages.gz from another not-yet-synced
edge, producing errors like:
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/noble-updates/main/binary-amd64/Packages.gz
File has unexpected size (2399568 != 2399874). Mirror sync in progress?
apt defaults to Acquire::Retries=0, so a single bad fetch fails the
rollout. With BS=256 x R=8 sandboxes hitting the CDN simultaneously,
this failure occurs reliably during Canonical sync windows.
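
As a rough worked example of why this fails reliably at that fan-out, the sketch below estimates the chance that at least one of the 256 x 8 = 2048 sandboxes exhausts its fetch attempts. The per-attempt failure rate of 0.5% is a made-up illustration, not a measurement. Attempts are assumed independent, which is optimistic during a sync window, and Retries=3 is modelled as up to four attempts.

```python
def rollout_failure_probability(p_fetch_fail: float, sandboxes: int, retries: int) -> float:
    """Probability that at least one sandbox exhausts all of its fetch attempts."""
    attempts = retries + 1  # Acquire::Retries=N is modelled as N retries after the first try
    per_sandbox = p_fetch_fail ** attempts  # every attempt must fail (independence assumed)
    return 1.0 - (1.0 - per_sandbox) ** sandboxes


SANDBOXES = 256 * 8   # BS=256 x R=8 from the commit message
P_FETCH_FAIL = 0.005  # hypothetical per-attempt failure rate during a sync window

if __name__ == "__main__":
    for retries in (0, 3):
        p = rollout_failure_probability(P_FETCH_FAIL, SANDBOXES, retries)
        print(f"Acquire::Retries={retries}: P(at least one sandbox fails) = {p:.6f}")
```

Under these assumptions a rollout with no retries is almost certain to lose at least one sandbox, while three retries make that outcome rare. Real fetch failures during a sync window are correlated, so the improvement in practice is likely smaller than this model suggests.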
Add Acquire::Retries=3 to apt-get update and apt-get install calls in
sandbox bootstrap / image-setup paths (rollout hot path), plus the
matching install calls that race-fetch debs. Task-level test scripts
and one-shot solve.sh helpers are intentionally left unchanged: those
are evaluation artifacts where a loud apt failure is the correct
behavior.
See launchpad bug #1876035 and ubuntu discourse threads for the
upstream issue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
willccbb approved these changes on May 8, 2026
@codex resolve the merge conflict
I checked the branch tied to the PR head SHA ( What I ran
Because there were no conflicts and no required code edits, I did not create a commit and therefore did not open a follow-up PR.
@cursoragent resolve merge conflicts
Resolved the merge conflict and pushed the branch. Summary:
Validation:
Co-authored-by: rasdani <rasdani@users.noreply.github.com>


Failure mode
`archive.ubuntu.com` / `security.ubuntu.com` resolve to a CDN whose edges propagate new `InRelease` manifests and `Packages.gz` files asynchronously when Canonical pushes new package indexes. A fresh sandbox can fetch `InRelease` from one already-synced edge and `Packages.gz` from another not-yet-synced edge, and `apt` aborts:

    E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/noble-updates/main/binary-amd64/Packages.gz
    File has unexpected size (2399568 != 2399874). Mirror sync in progress?

apt's default `Acquire::Retries` is `0`, so one bad fetch fails the rollout. With BS=256 × R=8 sandboxes hitting the CDN simultaneously, we hit this reliably during Canonical sync windows. This took out a recent RL run mid-rollout.

Upstream tracking: Launchpad #1876035. Standard mitigation in CI guides and Docker/CircleCI/moby/vscode is `apt-get -o Acquire::Retries=3`.

What this changes
Add `-o Acquire::Retries=3` to every `apt-get update` / `apt-get install` invoked in sandbox bootstrap / image-setup paths (rollout hot path). The flag is also applied to follow-up `apt-get install` calls that race-fetch debs.

Files touched
- `verifiers/envs/experimental/composable/harnesses/opencode.py`: opencode harness install script (the canonical SWE rollout install path; this is the file that broke the recent run).
- `verifiers/envs/experimental/composable/harnesses/mini_swe_agent.py`: mini-SWE-agent install script.
- `verifiers/envs/experimental/composable/tasksets/swe/multi_swe.py`: multi-SWE per-rollout `apt-get install patch`.
- `verifiers/envs/experimental/opencode_env.py`: `DEFAULT_RUN_COMMAND_TEMPLATE` (per-rollout sandbox bootstrap).
- `verifiers/envs/experimental/opencode_rlm_env.py`: `RLM_RUN_COMMAND_TEMPLATE` (per-rollout sandbox bootstrap).
- `environments/terminus_harbor/terminus_harbor.py`: `post_sandbox_setup` apt call.
- `environments/hello_mcp_harbor/hello_mcp_harbor.py`: sandbox run command.
- `environments/opencode_harbor/opencode_harbor.py`: sandbox run command.
- `assets/templates/browserbase/cua/setup.sh`, `setup-binary.sh`, `Dockerfile.runtime`: CUA browserbase sandbox setup templates.
- `environments/openenv_echo/proj/server/Dockerfile`, `environments/openenv_textarena/proj/server/Dockerfile`: env server image builds.
- `tests/test_opencode_rlm_env.py`: updated assertion to match new install string.

Skipped (intentionally)
- `environments/*/tasks/*/tests/test.sh`, `environments/*/tasks/*/solution/solve.sh`: task-level evaluation/solution scripts. apt failures here should be loud; they don't gate training.
- `scripts/install.sh`: host dev/setup script for human use.
- `verifiers/envs/experimental/composable/tasksets/swe/swe_lego.py`: only contains a comment string mentioning `apt-get update`, not an actual call.

Validation
- `uv run ruff check .` clean.
- `uv run ruff format --check` clean on all touched files.
- `uv run pre-commit run --files <touched>` passes.
- `uv run pytest tests/test_opencode_rlm_env.py tests/test_rlm_composable_env.py tests/test_envs.py tests/test_opencode_harbor.py tests/test_build_script.py`: all apt-related tests pass; the 2 failures in `test_envs.py` are pre-existing OPENAI_API_KEY-not-set smoke tests unrelated to this change.

🤖 Generated with Claude Code
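
For illustration only, the rewrite applied to each touched command string can be sketched as a small string transform. The helper name `harden_apt` and the sample commands are hypothetical; nothing in this description indicates that such a helper exists in the repository.

```python
import re

APT_RETRY_OPTION = "-o Acquire::Retries=3"

# Match `apt-get` when it is not already followed by the retry option and when
# the same shell command segment (up to &, ;, | or newline) runs update/install.
_APT_CALL = re.compile(
    r"\bapt-get\b(?! -o Acquire::Retries=)(?=[^&;|\n]*\b(?:update|install)\b)"
)


def harden_apt(command: str) -> str:
    """Insert the retry option into apt-get update/install calls (idempotent)."""
    return _APT_CALL.sub(f"apt-get {APT_RETRY_OPTION}", command)


if __name__ == "__main__":
    # Hypothetical bootstrap command, not copied from any touched file.
    print(harden_apt("apt-get update && apt-get install -y curl git"))
```

Calls such as `apt-get clean` are left alone because only `update` and `install` fetch from the archive, and running the helper twice does not add the option a second time.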
Note
Low Risk
Low risk change that only hardens `apt-get update/install` invocations with retries, but it touches many sandbox/image bootstrap paths so any typo could impact environment startup.

Overview
Reduces rollout flakiness from transient `archive.ubuntu.com` mirror/CDN sync issues by adding `apt-get -o Acquire::Retries=3` to sandbox bootstrap `apt-get update/install` steps across harness install scripts, environment setup commands, and Docker image builds.

Updates related tests and assertions to match the new `apt-get` command strings, ensuring v1 harness programs (`OpenCode`, `Pi`, `MiniSWEAgent`, `RLM`) and experimental env templates expect the hardened setup behavior.

Reviewed by Cursor Bugbot for commit cef1182. Bugbot is set up for automated code reviews on this repo.
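
A check in the spirit of the updated assertions could look like the sketch below. The function `unhardened_apt_calls` and the sample script are hypothetical and are not the assertions in `tests/test_opencode_rlm_env.py`; the sketch only shows one way to catch an `apt-get update` / `apt-get install` call that was missed.

```python
import re


def unhardened_apt_calls(script: str) -> list[str]:
    """Return apt-get update/install invocations that lack Acquire::Retries."""
    # Each match is one apt-get invocation, cut at the next shell separator.
    calls = re.findall(r"apt-get\b[^&;|\n]*", script)
    return [
        call.strip()
        for call in calls
        if re.search(r"\b(?:update|install)\b", call)
        and "Acquire::Retries=" not in call
    ]


if __name__ == "__main__":
    # Hypothetical bootstrap script with one hardened and one missed call.
    sample = "apt-get -o Acquire::Retries=3 update && apt-get install -y curl"
    print(unhardened_apt_calls(sample))
```

A test could assert that this list is empty for every bootstrap template, which guards against the typo risk called out in the note above without pinning the exact command string.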