Skip to content

#minor: Implement Modal sandbox provider#66

Open
tthuwng wants to merge 1 commit into
mainfrom
ht/modal-adapter
Open

#minor: Implement Modal sandbox provider#66
tthuwng wants to merge 1 commit into
mainfrom
ht/modal-adapter

Conversation

@tthuwng

@tthuwng tthuwng commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

Implements the Modal adapter behind the provider-selection interface that is now on main after #65. This PR is rebased/retargeted from the old jf/provider-selection-clean stack to current main.

Refs vals-ai/valkyrie-ticket-library#21.

What changed

  • Adds modal>=1.4.2 and fills in ModalProviderConfig / ModalSandboxProvider.
  • Implements Modal-backed create_sandbox, get_sandbox, delete_sandbox, list_sandboxes, exec, streaming command, upload_file, and download_file.
  • Keeps Modal sandboxes alive with a long-running entrypoint because Modal terminates sandboxes when the entrypoint exits.
  • Maps common provider behavior to the generic sandbox contract:
    • ImageSource -> Image.from_registry(...)
    • labels -> Modal tags
    • env vars -> Modal sandbox env
    • vCPU/memory -> Modal CPU/MiB settings
    • auto_stop_interval -> Modal idle_timeout
    • command timeout -> shell timeout, matching Daytona's exit-code behavior
    • stderr merged into stdout, matching Daytona's combined PTY output behavior
  • Uses Modal's current sandbox.filesystem APIs for upload/download instead of deprecated sandbox.open / sandbox.mkdir file APIs.
  • Rejects SnapshotSource before connecting to Modal because Daytona snapshots do not have a Modal equivalent.
  • Adds one generic sandbox resource capability: Resources.enable_docker (default false). When true, the Modal provider passes experimental_options={"enable_docker": True}. Daytona remains generic/unchanged; benchmarks still own their Docker-capable image, dockerd startup, compose flow, and cleanup.
  • Documents Modal provider usage and compatibility in README.md.

Compatibility with Valkyrie provider-selection work

Jarett's Valkyrie provider-selection PR resolves provider config from AWS Secrets Manager as a JSON object and calls sandbox_provider_config_from_mapping({**secret, "type": provider_type}).

This PR supports that path for Modal by accepting Secrets Manager-style keys:

{
  "MODAL_TOKEN_ID": "...",
  "MODAL_TOKEN_SECRET": "...",
  "MODAL_ENVIRONMENT": "..."
}

The lowercase request-body form remains supported too: token_id, token_secret, and environment. environment is optional and should only be set when that Modal environment exists.

For benchmarks that need nested Docker, the benchmark service can return:

Resources(vcpu=..., memory=..., disk=..., enable_docker=True)

Valkyrie's provider-selection flow passes task_data.resources into SandboxCreateRequest, so this stays provider-uniform instead of adding a VCB-specific provider type or Modal-only config blob.

Simplification pass

After review, this PR removed non-essential Modal-specific surface area:

  • Removed the unwired Modal header config path; real request selection already comes through sandbox_provider config / Secrets Manager mapping.
  • Removed private upload/download pass-through wrappers and decorates the public methods directly.
  • Removed redundant command state and duplicate not-found mapping.
  • Removed obsolete file/open/mkdir fake test scaffolding and dead header tests.
  • Kept the provider contract generic; no type: "vcb" special case and no CBS-owned dockerd startup.

Decisions documented

Modal credentials ownership path

Hosted benchmark services should normally let the service deployment own MODAL_TOKEN_ID / MODAL_TOKEN_SECRET. In practice that means the registry/deployment environment owns Modal credentials rather than every Valkyrie request carrying them.

The adapter still supports optional request-body or Secrets-Manager-sourced credentials for development, self-hosted use, or explicit overrides:

  • request body: {"type": "modal", "token_id": "...", "token_secret": "...", "environment": "..."}
  • AWS Secrets Manager payload via Valkyrie: MODAL_TOKEN_ID, MODAL_TOKEN_SECRET, optional MODAL_ENVIRONMENT

Docker-in-sandbox support

Docker-in-sandbox is not enabled by default.

Benchmarks that require dockerd / nested Docker, such as VCB or ProgramBench-style flows, opt in through the generic Resources.enable_docker field. The provider only grants the underlying sandbox capability. The benchmark service is still responsible for using a Docker-capable image, starting dockerd with provider-compatible flags, running compose, and cleaning up containers/volumes.

This keeps the CBS contract uniform across providers and avoids benchmark-specific provider types.

Retry semantics parity with Daytona

Transient Modal connection errors now map to SandboxConnectionError and are retried up to 3 attempts with a fixed 2s wait on provider operations and process/file startup paths. Non-transient Modal errors are not retried, and nonzero command exits still surface as SandboxCommandError with the original exit code.

Validation

Local validation on this branch after the latest generic Docker opt-in pass:

  • uv run ruff check . — passed
  • uv run pytest tests/test_modal_sandbox.py tests/test_client.py -q — 52 passed
  • uv run pytest -q — 233 passed, 1 Starlette/httpx deprecation warning
  • uv run basedpyright . — 0 errors, 0 warnings

Coverage added/verified:

  • Modal config parsing from lowercase request body and Jarett/Valkyrie-compatible Secrets Manager shape.
  • Modal adapter create/get/delete/list mappings against a faked Modal SDK.
  • generic Resources.enable_docker parsing from retrieved task metadata.
  • Modal create mapping passes experimental_options={"enable_docker": True} only when resources.enable_docker is true.
  • exec/stream command behavior, timeout/cwd wrapping, output merging, file upload/download.
  • retry behavior for transient Modal connection failures.
  • not-found mapping for removed/missing sandboxes.
  • existing WebSocket setup path accepts sandbox_provider: {"type": "modal"} and routes through the provider-selection contract.

Live Modal smoke status:

  • Configured a local Modal CLI profile named valsai without printing token values.
  • Ran a real Modal provider smoke: create sandbox, exec env-var command, upload file, download file, list by tag, delete sandbox.
  • Ran a real benchmark-service integration smoke through BenchmarkServiceApp websockets: create Modal sandbox, call /ws/setup-task with sandbox_provider: {"type": "modal"}, upload/exec inside the benchmark setup hook, call /ws/evaluate-instance, download/exec inside the benchmark evaluation hook, and delete the sandbox.
  • Ran a real Modal nested-Docker smoke through this provider path with Resources(enable_docker=True): create docker:27-dind sandbox, start dockerd --iptables=false --ip-masq=false, run docker run --rm hello-world, and delete the sandbox.
  • Ran a one-task VCB runtime smoke on a Modal Docker-enabled sandbox using breathing_exercise_app: load the real VCB task spec, run VibeCodeBenchSetup, materialize a generated app, start it through VCB AppManager with docker compose up -d --build, verify http://localhost:18080 returns 200 with the task marker, stop compose, and delete the Modal sandbox. This validates VCB setup + Docker runtime on Modal; it intentionally does not claim full Supabase/browser-use LLM grading coverage.
  • All smokes passed with sandbox cleanup confirmed.

GitHub CI on this branch:

  • Re-running after this latest push.

@tthuwng tthuwng requested a review from JarettForzano June 10, 2026 19:01
@assert-app

assert-app Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review on Assert →

@socket-security

socket-security Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedmodal@​1.5.076100100100100

View full report

@tthuwng tthuwng force-pushed the ht/modal-adapter branch from 1b510be to 8bdb173 Compare June 23, 2026 00:21
@tthuwng tthuwng changed the base branch from jf/provider-selection-clean to main June 23, 2026 00:21
@tthuwng tthuwng marked this pull request as ready for review June 23, 2026 00:21

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

Open in Devin Review

@tthuwng tthuwng force-pushed the ht/modal-adapter branch 4 times, most recently from eb0585b to 9ee5336 Compare June 23, 2026 18:43
Fill in the Modal adapter behind the provider-selection interface on current main.\n\n- Add Modal SDK dependency and provider config for env-owned or request/header-provided credentials.\n- Implement create/get/delete/list, exec/streaming command, and file transfer operations.\n- Reject snapshot sources explicitly, document Docker/disk limitations, and keep deployment-owned Modal credentials as the default path.\n- Add provider-level retry coverage for transient Modal connection errors and not-found mapping for removed sandboxes.\n- Add focused Modal provider tests plus README usage/compatibility notes.
@tthuwng tthuwng force-pushed the ht/modal-adapter branch from 9ee5336 to 3b54d37 Compare June 23, 2026 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant