
Add experimental capnweb RPC transport #538

Draft
aron-cf wants to merge 13 commits into main from capnweb-rpc


@aron-cf aron-cf commented Mar 30, 2026

Add capnweb RPC transport for container communication

The Sandbox SDK currently offers two transports — HTTP and WebSocket — for
communication between the Durable Object and the Bun container runtime. Both
serialize every operation as an HTTP request; WebSocket just multiplexes them
over a single connection to reduce sub-request count.

This PR introduces a third transport built on
capnweb, a Cap'n Proto-inspired
RPC library that runs over WebSocket. capnweb replaces request/response
serialization with typed RPC calls, which brings two immediate benefits:

  • Native streaming with backpressure: ReadableStream values flow
    directly through the RPC layer without SSE encoding or base64 overhead.
  • Direct service calls: the container exposes a typed SandboxRPCAPI
    that calls services directly, bypassing the HTTP handler/router layer entirely.

When SANDBOX_TRANSPORT=capnweb, all sandbox operations — commands, files,
processes, ports, git, code interpreter, desktop, backups, and file watching —
go through native RPC. The HTTP handler/router layer is not involved.

The first operation to exploit native streaming is writeFile, which now
accepts string | ReadableStream<Uint8Array>. Passing a stream pipes bytes
to disk with automatic backpressure, removing the previous 32 MiB buffering
constraint.
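
A minimal sketch of what the widened signature implies for a consumer. Only the `string | ReadableStream<Uint8Array>` parameter type comes from this PR; `writeFileSketch`, `ChunkSink`, and `chunkedStream` are hypothetical names used for illustration, and the sink stands in for whatever appends bytes to disk.

```typescript
type FileContent = string | ReadableStream<Uint8Array>;

// Hypothetical sink: receives one chunk at a time (for example, an append to disk).
type ChunkSink = (chunk: Uint8Array) => void | Promise<void>;

// Accepts either a string or a stream and forwards bytes to the sink.
// Returns the number of bytes written.
async function writeFileSketch(content: FileContent, sink: ChunkSink): Promise<number> {
  if (typeof content === "string") {
    const bytes = new TextEncoder().encode(content);
    await sink(bytes);
    return bytes.byteLength;
  }
  const reader = content.getReader();
  let total = 0;
  try {
    for (;;) {
      // The next chunk is requested only after the sink has finished with the
      // previous one, so a slow sink slows the producer down (backpressure).
      const result = await reader.read();
      if (result.done) break;
      await sink(result.value);
      total += result.value.byteLength;
    }
  } finally {
    reader.releaseLock();
  }
  return total;
}

// Hypothetical producer that creates chunks lazily, on demand.
function chunkedStream(chunkCount: number, chunkSize: number): ReadableStream<Uint8Array> {
  let sent = 0;
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (sent === chunkCount) {
        controller.close();
        return;
      }
      controller.enqueue(new Uint8Array(chunkSize));
      sent += 1;
    },
  });
}
```

Because the stream branch never holds more than one chunk at a time, the total size is bounded by the sink rather than by memory, which is the property the paragraph above relies on.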

Note

The changeset is huge because the SandboxAPI has been implemented again in a separate file. This duplication is intended to make it easier to remove the http and websocket transports if this experiment is successful.

To run locally with a Worker:

npm i https://pkg.pr.new/cloudflare/sandbox-sdk/@cloudflare/sandbox@538

Update your Dockerfile to use the sandbox image built from this PR:

FROM cloudflare/sandbox:0.0.0-pr-538-93fc939

Then:

npx wrangler dev --var SANDBOX_TRANSPORT=capnweb

How it works

Container side: The Bun server gains a /capnweb WebSocket upgrade
endpoint. Each connection gets its own capnweb RPC session backed by
SandboxRPCAPI extends RpcTarget, which has a typed method per operation.
Service errors are thrown as exceptions; capnweb propagates them back to the
caller automatically.
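
A rough sketch of the container-side shape described above, under stated assumptions. In the PR the class extends capnweb's RpcTarget; here an empty local `RpcTarget` class stands in so the snippet runs on its own. `CommandService`, `SandboxRpcSketch`, and the `exec` method are hypothetical and not the PR's actual API.

```typescript
// Stand-in so the snippet is self-contained; the PR's class extends capnweb's RpcTarget.
class RpcTarget {}

interface ExecResult {
  exitCode: number;
  stdout: string;
}

// Hypothetical service shape; the real container services are not shown here.
interface CommandService {
  run(command: string): Promise<ExecResult>;
}

class SandboxRpcSketch extends RpcTarget {
  private readonly commands: CommandService;

  constructor(commands: CommandService) {
    super();
    this.commands = commands;
  }

  // One typed method per operation. It calls the service directly, with no
  // HTTP request or response objects in between.
  async exec(command: string): Promise<ExecResult> {
    if (command.trim() === "") {
      // Failures are plain exceptions; the RPC layer is expected to carry
      // them back to the caller.
      throw new Error("command must not be empty");
    }
    return this.commands.run(command);
  }
}
```

The point of the design is that argument and return types are checked at compile time on both sides, instead of being re-parsed from a request body.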

SDK side: ContainerConnection manages the capnweb WebSocket session and
exposes a typed rpc() handle — the client-side mirror of SandboxRPCAPI.
RPCSandboxClient wraps ContainerConnection and exposes the same
sub-client fields as the existing SandboxClient (commands, files,
processes, ports, git, utils, interpreter, backup, desktop,
watch), so sandbox.ts switches between the two without any changes at
the call sites.
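
A simplified sketch of why the call sites need no changes: both clients satisfy the same structural type, so the code that uses them depends only on that shape. All names below (`ClientShape`, `ExistingClientSketch`, `RpcClientSketch`, `createClient`, `listFiles`) are hypothetical, and the real clients expose many more sub-clients than the single `commands` field shown.

```typescript
type TransportMode = "http" | "websocket" | "capnweb";

interface CommandsApi {
  execute(command: string): Promise<string>;
}

// The shared field-level shape that both clients expose.
interface ClientShape {
  commands: CommandsApi;
}

// Stand-in for the existing client used by the http and websocket modes.
class ExistingClientSketch implements ClientShape {
  commands: CommandsApi = { execute: async (command) => `existing:${command}` };
}

// Stand-in for the RPC-backed client: same fields, different plumbing.
class RpcClientSketch implements ClientShape {
  commands: CommandsApi = { execute: async (command) => `rpc:${command}` };
}

// The only place that knows which implementation is in use.
function createClient(mode: TransportMode): ClientShape {
  return mode === "capnweb" ? new RpcClientSketch() : new ExistingClientSketch();
}

// A call site: written once against ClientShape, unaware of the transport.
async function listFiles(client: ClientShape): Promise<string> {
  return client.commands.execute("ls");
}
```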

Transport selection is controlled by the existing SANDBOX_TRANSPORT
environment variable, now accepting "capnweb" in addition to "http" and
"websocket".

Changes

| # | Commit | Key files |
|---|--------|-----------|
| 1 | Add capnweb dependency | package.json ×2 |
| 2 | Expose SessionManager in DI container | container.ts |
| 3 | Add native RPC API and /capnweb endpoint | sandbox-api.ts, server.ts |
| 4 | Streaming file writes + shared type update | file-service.ts, types.ts |
| 5 | Add ContainerConnection and RPCSandboxClient | container-connection.ts, rpc-sandbox-client.ts |
| 6 | Wire capnweb into Sandbox DO | sandbox.ts, sandbox-client.ts, local-mount-sync.ts |
| 7 | Unit tests | container-connection.test.ts |
| 8 | E2E tests | e2e/capnweb-transport.test.ts |
| 9 | Remove HTTP bridge dead code | sandbox-api.ts, container-connection.ts, sandbox-client.ts |

Testing

Unit tests cover ContainerConnection in isolation — initial state,
safe/repeated disconnect, connection failure propagation, and concurrent
connect() deduplication.
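
For readers unfamiliar with the term, connect() deduplication means that overlapping callers share one in-flight attempt. The sketch below shows one common way to get that behaviour; `DedupedConnector` and its `open` callback are hypothetical and are not taken from ContainerConnection's source.

```typescript
class DedupedConnector {
  private inflight: Promise<void> | null = null;
  private connected = false;
  private readonly open: () => Promise<void>;

  // `open` performs the actual connection attempt (hypothetical callback).
  constructor(open: () => Promise<void>) {
    this.open = open;
  }

  connect(): Promise<void> {
    // Already connected: nothing to do.
    if (this.connected) return Promise.resolve();
    // An attempt is in progress: every caller shares the same promise.
    if (this.inflight) return this.inflight;

    this.inflight = this.open()
      .then(() => {
        this.connected = true;
      })
      .finally(() => {
        // Clear the slot on success and on failure, so a failed attempt
        // can be retried by the next connect() call.
        this.inflight = null;
      });
    return this.inflight;
  }
}
```

With this shape, a failure rejects the shared promise, so every concurrent caller sees the same error, which matches the "connection failure propagation" case listed above.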

Existing unit tests are unchanged. The capnweb path does not touch any
existing HTTP or WebSocket code.

E2E tests deploy a real worker with SANDBOX_TRANSPORT=capnweb and
exercise command execution, streaming output, file write/read, directory
listing, and session isolation through the full DO → container RPC path.

Follow-up

The HTTP transport, domain client classes (CommandClient, FileClient,
etc.), HTTP handlers, and router remain in place for the http and
websocket transport modes. A follow-up PR can remove them (~8 000 lines)
once those modes are fully retired.

See #539 for a first pass.


Open with Devin


changeset-bot bot commented Mar 30, 2026

⚠️ No Changeset found

Latest commit: 93fc939

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


Agent added 9 commits March 30, 2026 11:37
Introduces SandboxRPCAPI extending capnweb's RpcTarget, which calls
container services directly without going through the HTTP router.

Wires a /capnweb WebSocket upgrade path in the Bun server and
initialises a capnweb RPC session per connection. An HTTP bridge
(httpFetch/httpFetchStream) is also exposed so the SDK can route
its existing transport calls over capnweb during the transitional
stages before all methods call native RPC directly.

Adds FileService.writeFileStream() which pipes a ReadableStream<Uint8Array>
directly to disk via Bun's file writer, avoiding buffering the full file
in memory.

Updates the writeFile signature in ISandbox and ExecutionSession (shared
types) to accept string | ReadableStream<Uint8Array>, enabling callers to
pass a stream without a separate API method.

Extends TransportMode to include 'capnweb'. Implements CapnwebTransport
extending BaseTransport — a capnweb WebSocket RPC transport that connects
either via a direct WebSocket or through a Durable Object stub.fetch()
upgrade, matching the pattern used by WebSocketTransport.

Wires CapnwebTransport into createTransport() and exports it from the
transport barrel alongside HttpTransport and WebSocketTransport.

ContainerConnection manages a single capnweb RPC session and exposes a
typed ContainerRPCAPI interface — the client-side mirror of the container's
SandboxRPCAPI.

RPCSandboxClient implements the same field-level interface as SandboxClient
(.commands, .files, .processes, etc.) so sandbox.ts can switch between
the two transports without changes at the call sites.

Reads SANDBOX_TRANSPORT=capnweb, constructs a ContainerConnection and
RPCSandboxClient, and connects to /capnweb inside the container. The
connection is torn down alongside the client on sandbox sleep.

SandboxClient.writeFileStream() is added for the capnweb-only streaming
path; sandbox.writeFile() routes to it when passed a ReadableStream.

LocalMountSyncManager's client field is widened to accept the union
type SandboxClient | RPCSandboxClient.

Covers factory wiring, initial state, safe disconnect (including multiple
calls), concurrent connect() deduplication, and connection failure
propagation. Uses mocked internals for connection-dependent paths,
following the pattern established in ws-transport.test.ts.

Exercises command execution, streaming output, file write/read, directory
listing, and session isolation end-to-end through the capnweb bridge.
Tests run against a live Cloudflare container and also serve as regression
coverage for core operations when run with the default HTTP transport.

Agent added 4 commits March 30, 2026 11:44
Delete the httpFetch/httpFetchStream bridge methods from SandboxRPCAPI
and the ContainerRPCAPI interface, and drop the router dependency from
SandboxRPCAPIDeps. These methods were a transitional shim for
CapnwebTransport to route requests through the HTTP handler layer;
they are unreachable now that RPCSandboxClient calls native RPC directly
for every operation.

Remove CapnwebTransport entirely. When SANDBOX_TRANSPORT=capnweb, sandbox.ts
creates a ContainerConnection and RPCSandboxClient directly, so
CapnwebTransport is never instantiated. Remove it from the transport
factory, barrel exports, and its unit test file.

Remove the dead capnweb branch from createSandboxClient() and the
writeFileStream method from SandboxClient, both of which were only
reachable via the now-removed transport.

pkg-pr-new bot commented Mar 30, 2026


npm i https://pkg.pr.new/cloudflare/sandbox-sdk/@cloudflare/sandbox@538

commit: 93fc939

@github-actions

🐳 Docker Images Published

| Variant | Image |
|---------|-------|
| Default | cloudflare/sandbox:0.0.0-pr-538-93fc939 |
| Python | cloudflare/sandbox:0.0.0-pr-538-93fc939-python |
| OpenCode | cloudflare/sandbox:0.0.0-pr-538-93fc939-opencode |
| Musl | cloudflare/sandbox:0.0.0-pr-538-93fc939-musl |
| Desktop | cloudflare/sandbox:0.0.0-pr-538-93fc939-desktop |

Usage:

FROM cloudflare/sandbox:0.0.0-pr-538-93fc939

Version: 0.0.0-pr-538-93fc939


📦 Standalone Binary

For arbitrary Dockerfiles:

COPY --from=cloudflare/sandbox:0.0.0-pr-538-93fc939 /container-server/sandbox /sandbox
ENTRYPOINT ["/sandbox"]

Download via GitHub CLI:

gh run download 23743039600 -n sandbox-binary

Extract from Docker:

docker run --rm cloudflare/sandbox:0.0.0-pr-538-93fc939 cat /container-server/sandbox > sandbox && chmod +x sandbox


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.


// Ensure parent directory exists
const dir = targetPath.substring(0, targetPath.lastIndexOf('/'));
if (dir) {
await exec(`mkdir -p ${dir}`);

🔴 Missing shellEscape on directory path in writeFileStream allows command injection

In writeFileStream, the dir variable derived from targetPath is interpolated directly into the shell command mkdir -p ${dir} without calling shellEscape(). Every other method in this file that passes a path to a shell command uses shellEscape() (see lines 150, 427, 559, 560, 656, 657, 746, 846, 943, 1063, 1348 in packages/sandbox-container/src/services/file-service.ts). The security.validatePath() check at line 1133 only validates that the path is non-empty and has no null bytes — it does NOT sanitize shell metacharacters. A path containing shell metacharacters (e.g. spaces, semicolons, backticks, $()) could lead to unintended command execution.

Suggested change
- await exec(`mkdir -p ${dir}`);
+ await exec(`mkdir -p ${shellEscape(dir)}`);
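
To illustrate why the escape matters, here is a minimal sketch of POSIX-style single-quote escaping. `shellEscapeSketch` and `mkdirCommand` are hypothetical; the repository's own shellEscape() may be implemented differently.

```typescript
// Hypothetical helper, not the repository's shellEscape().
// Wraps the value in single quotes so the shell treats it as one literal word.
// A literal single quote is written as '\'' (close quote, escaped quote, reopen).
function shellEscapeSketch(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Builds the mkdir command with the directory passed as a single quoted argument.
function mkdirCommand(dir: string): string {
  return `mkdir -p ${shellEscapeSketch(dir)}`;
}
```

With this helper, `mkdirCommand("/tmp/x; rm -rf ~")` produces `mkdir -p '/tmp/x; rm -rf ~'`, so the semicolon is part of the directory name rather than a command separator.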

Comment on lines +426 to +428
this.ws = ws;
this.stub = newWebSocketRpcSession<ContainerRPCAPI>(ws);
this.connected = true;

🔴 ContainerConnection has no WebSocket close handler, preventing auto-reconnection

In doConnect(), after the WebSocket is established and this.connected is set to true, no close or error event listener is registered on the WebSocket. If the WebSocket closes (e.g., container restart, network error), this.connected remains true and this.stub remains non-null. Consequently, isConnected() at packages/sandbox/src/container-connection.ts:345-347 returns true with a dead connection, and rpc() at line 339 skips reconnection and returns the stale stub. All subsequent RPC calls will fail with cryptic errors instead of triggering automatic reconnection.

Prompt for agents
In packages/sandbox/src/container-connection.ts, inside the doConnect() method, after line 424 where ws.accept() is called and before setting this.ws = ws on line 426, register a close event handler on the WebSocket that resets the connection state. For Cloudflare Workers WebSocket API, use ws.addEventListener('close', ...) or ws.addEventListener('error', ...). The handler should set this.connected = false and this.stub = null so that the next call to rpc() triggers a fresh reconnection via connect(). Example:

ws.addEventListener('close', () => {
  this.connected = false;
  this.stub = null;
  this.ws = null;
  this.logger.debug('ContainerConnection WebSocket closed');
});

ws.addEventListener('error', () => {
  this.connected = false;
  this.stub = null;
  this.ws = null;
});

Place this before line 426 (this.ws = ws) so the handler is registered before the connection is marked as active.
