AI Workbench is a self-hosted product surface for building, inspecting, and operating retrieval-backed AI applications on DataStax Astra. It gives teams one place to manage workspaces, knowledge bases, chunking / embedding / reranking services, document ingest, API keys, and retrieval experiments.
Under the product UI is a stable HTTP runtime. The default TypeScript
runtime ships in the same Docker image as the UI; alternative
language-native runtimes ("green boxes") live under
runtimes/ and expose the same /api/v1/* contract.
- Workspace command center. Workspaces isolate knowledge bases, execution services, documents, jobs, credentials, and API keys.
- Knowledge bases as first-class objects. A KB owns its Astra collection end-to-end and binds the chunking + embedding + (optional) reranking services that produce its content. The collection is auto-provisioned on create.
- Knowledge operations. Ingest raw text or files into a KB, track sync/async job state, and let the KB's bound services drive chunking and embedding.
- Retrieval playground. Run text, vector, hybrid, and rerank searches in the browser against real workspace data.
- Production-friendly controls. Start in memory, switch to file storage for single-node deployments, or use Astra Data API tables for a durable control plane.
- Technical spine intact. One /api/v1/* contract, language-native runtimes, direct Astra SDK usage, secrets by reference, and fixture-enforced conformance.
                ┌───────────────────────────────────────────┐
                │               WorkBench UI                │
                │                                           │
                │   /playground   /ingest   (chats, MCP)    │
                └─────────────────────┬─────────────────────┘
                                      │
                                 BACKEND_URL
                                      │
         ┌────────────────────────────┼────────────────────────────┐
         ▼                            ▼                            ▼
┌──────────────────┐         ┌──────────────────┐         ┌──────────────────┐
│ TS runtime       │         │ Python runtime   │    …    │ Other language   │
│ (default,        │         │ (FastAPI)        │         │ runtimes         │
│ embedded         │         │                  │         │                  │
│ with UI)         │         │                  │         │                  │
│                  │         │                  │         │                  │
│ /api/v1/*        │         │ /api/v1/*        │         │ /api/v1/*        │
└────────┬─────────┘         └────────┬─────────┘         └────────┬─────────┘
         │                            │                            │
         └────────────────── same HTTP contract ───────────────────┘
                                      │
                                      ▼  (per-runtime Astra SDK)
                    ┌──────────────────────────────────┐
                    │  Astra Data API                  │
                    │  Tables (control plane):         │
                    │    wb_workspaces                 │
                    │    wb_config_knowledge_          │
                    │      bases_by_workspace          │
                    │    wb_config_chunking/           │
                    │      embedding/reranking         │
                    │      _service_by_workspace       │
                    │    wb_rag_documents_             │
                    │      by_knowledge_base           │
                    │  Collections (data plane):       │
                    │    wb_vectors_<kb_id>            │
                    │    (one per knowledge base)      │
                    └──────────────────────────────────┘
See docs/architecture.md for the full model.
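
Because every runtime exposes the same /api/v1/* contract, client code only needs the base URL to differ. The following is a minimal sketch of that idea, not code from the repository: `apiUrl` is a hypothetical helper, and the second base URL is an assumed address for an alternative runtime.

```typescript
// Hypothetical helper: joins a runtime base URL with /api/v1 path segments.
function apiUrl(backendUrl: string, ...segments: string[]): string {
  const base = backendUrl.replace(/\/+$/, ""); // drop trailing slashes
  const path = segments.map((s) => encodeURIComponent(s)).join("/");
  return `${base}/api/v1/${path}`;
}

// Default TS runtime address from the quickstart.
const tsRuntime = "http://localhost:8080";
// Assumed address for an alternative runtime; adjust to your deployment.
const otherRuntime = "http://localhost:8000/";

console.log(apiUrl(tsRuntime, "workspaces"));
// http://localhost:8080/api/v1/workspaces
console.log(apiUrl(otherRuntime, "workspaces", "ws-1", "knowledge-bases"));
// http://localhost:8000/api/v1/workspaces/ws-1/knowledge-bases
```

Only the base URL changes between the two calls; the path segments come straight from the route tables.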
# Install root devDeps (Biome) + TS runtime deps.
npm ci && npm run install:ts
# Boot with the default in-memory control plane.
npm run dev # http://localhost:8080
# Hit it.
curl http://localhost:8080/healthz # {"status":"ok"}
curl http://localhost:8080/docs # Scalar-rendered API reference

The npm run * scripts at root delegate into
runtimes/typescript/. You can also cd
into that directory and work there directly.
Switching to an Astra-backed control plane is a YAML change —
see docs/configuration.md. If you have
the astra CLI installed
and a profile configured, the runtime can auto-fill
ASTRA_DB_APPLICATION_TOKEN / ASTRA_DB_API_ENDPOINT at startup —
see docs/astra-cli.md.
All routes documented at /docs (Scalar UI) and
/api/v1/openapi.json (machine-readable).
| Method | Path | Purpose |
|---|---|---|
| GET | / | Service banner |
| GET | /healthz | Liveness probe |
| GET | /readyz | Readiness + workspace count |
| GET | /version | Build metadata |
| GET | /docs | Scalar-rendered API reference |
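
A deploy script or test harness will often wait on the readiness probe before sending traffic. The sketch below shows one way to do that. It assumes /readyz answers HTTP 200 once the runtime is ready and something else before that, which the table above does not state; `waitUntilReady` and `Probe` are illustrative names.

```typescript
// One probe attempt: resolves to the HTTP status code, or null if the request failed.
type Probe = () => Promise<number | null>;

// Polls `probe` until it reports 200 or `maxAttempts` is used up.
// Returns the number of attempts taken, or -1 if the service never became ready.
async function waitUntilReady(
  probe: Probe,
  maxAttempts = 30,
  delayMs = 1000,
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await probe().catch(() => null); // treat a thrown error as "not ready"
    if (status === 200) return attempt;
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return -1;
}

// Probe for the quickstart address (assumes a global fetch, e.g. Node 18+).
const readyzProbe: Probe = async () => {
  const res = await fetch("http://localhost:8080/readyz");
  return res.status;
};

// Usage:
// waitUntilReady(readyzProbe).then((n) => console.log(n > 0 ? "ready" : "timed out"));
```

Passing the probe in as a function keeps the retry loop testable without a running server.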
| Method | Path | Purpose |
|---|---|---|
| GET / POST | /api/v1/workspaces | List / create workspaces |
| GET / PATCH / DELETE | /api/v1/workspaces/{w} | Workspace CRUD (DELETE cascades) |
| POST | /api/v1/workspaces/{w}/test-connection | Run a live workspace connection check |
| GET / POST | /api/v1/workspaces/{w}/knowledge-bases | List / create knowledge bases (POST auto-provisions the underlying vector collection) |
| GET / PATCH / DELETE | /api/v1/workspaces/{w}/knowledge-bases/{kb} | KB CRUD (DELETE drops the collection + cascades RAG documents) |
| GET / POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents | List / register a document in a KB |
| GET / PATCH / DELETE | /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents/{d} | Document metadata CRUD (DELETE cascades chunks in the KB's collection) |
| GET | /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents/{d}/chunks | List the chunks under a document |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest | Sync ingest (chunk → embed → upsert → register Document) |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest?async=true | Same pipeline, returns 202 + job pointer |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/records | Upsert vector or text records (text → server-side $vectorize when supported, otherwise client-side embed) |
| DELETE | /api/v1/workspaces/{w}/knowledge-bases/{kb}/records/{rid} | Delete one record |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/search | KB-scoped search (vector / text, optional hybrid + rerank) |
| GET / POST / DELETE | /api/v1/workspaces/{w}/chunking-services | Chunking-service CRUD |
| GET / POST / DELETE | /api/v1/workspaces/{w}/embedding-services | Embedding-service CRUD |
| GET / POST / DELETE | /api/v1/workspaces/{w}/reranking-services | Reranking-service CRUD |
| GET | /api/v1/workspaces/{w}/jobs/{jobId} | Poll an async-ingest job |
| GET | /api/v1/workspaces/{w}/jobs/{jobId}/events | SSE stream of job updates until terminal state |
| GET / POST | /api/v1/workspaces/{w}/chats | List / create Bobbie chats |
| GET / PATCH / DELETE | /api/v1/workspaces/{w}/chats/{c} | Chat CRUD (DELETE cascades messages) |
| GET | /api/v1/workspaces/{w}/chats/{c}/messages | Chat history, oldest-first |
| POST | /api/v1/workspaces/{w}/chats/{c}/messages | Send a message; sync reply with retrieval-grounded HF chat-completion |
| POST | /api/v1/workspaces/{w}/chats/{c}/messages/stream | Same flow as SSE: user-message + token deltas + terminal done/error |
| GET / POST / PATCH / DELETE | /api/v1/workspaces/{w}/agents | User-defined agent CRUD (Bobbie + your own personas) |
| GET / POST / PATCH / DELETE | /api/v1/workspaces/{w}/agents/{a}/conversations | Per-agent conversation CRUD |
| POST | /api/v1/workspaces/{w}/mcp | Model Context Protocol façade (optional, mcp.enabled: true); exposes the workspace as MCP tools for external agents |
| GET / POST | /api/v1/workspaces/{w}/api-keys | List / issue workspace API keys |
| DELETE | /api/v1/workspaces/{w}/api-keys/{keyId} | Revoke a workspace API key |
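
The two SSE routes above (job events and chat streaming) return standard Server-Sent Events framing. As a rough illustration of consuming such a stream, the sketch below parses an already-buffered SSE body and joins the token deltas. The event names `token` and `done` and the `{"delta": ...}` payload are assumptions made for this example; check /api/v1/openapi.json for the actual wire format.

```typescript
interface SseEvent {
  event: string; // "message" when the frame carries no event: field
  data: string;
}

// Parses a complete SSE body into events. A real client would feed chunks
// incrementally; this version keeps the framing rules easy to read.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Frames are separated by a blank line; normalize CRLF first.
  for (const frame of text.replace(/\r\n/g, "\n").split("\n\n")) {
    let event = "message";
    const data: string[] = [];
    for (const line of frame.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // blank line or comment
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Frames without any data: line are not dispatched.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Hypothetical stream body; event names and payload shape are assumed.
const sample =
  'event: user-message\ndata: {"id":"m1"}\n\n' +
  'event: token\ndata: {"delta":"Hel"}\n\n' +
  'event: token\ndata: {"delta":"lo"}\n\n' +
  "event: done\ndata: {}\n\n";

const reply = parseSse(sample)
  .filter((e) => e.event === "token")
  .map((e) => (JSON.parse(e.data) as { delta: string }).delta)
  .join("");
console.log(reply); // Hello
```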
| Method | Path | Purpose |
|---|---|---|
| GET | /auth/config | Tells the UI which credential surfaces to render |
| GET | /auth/login | 302 to the IdP's authorization endpoint (PKCE) |
| GET | /auth/callback | Exchanges code for tokens, sets signed session cookie |
| GET | /auth/me | Current session subject + access-token expiresAt + canRefresh |
| POST | /auth/refresh | Silent refresh: swaps the cookie's refresh_token at the IdP without a redirect |
| POST | /auth/logout | Clears the session cookie |
See docs/auth.md for the threat model and rollout
phases.
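
To make the /auth/me and /auth/refresh pairing concrete, here is a small sketch of how a client might decide what to do with a session. Only the `expiresAt` and `canRefresh` field names come from the table above; the field types, the `subject` field name, the 60-second skew, and the `nextSessionAction` helper are assumptions for illustration.

```typescript
// Assumed shape of the /auth/me response.
interface AuthMe {
  subject: string;
  expiresAt: string; // assumed to be an ISO-8601 timestamp
  canRefresh: boolean;
}

// "ok": keep using the session; "refresh": POST /auth/refresh;
// "login": send the browser to GET /auth/login.
type SessionAction = "ok" | "refresh" | "login";

function nextSessionAction(me: AuthMe, nowMs: number, skewMs = 60000): SessionAction {
  const expiresMs = Date.parse(me.expiresAt);
  // An unparseable timestamp yields NaN; treat that as already expired.
  const stillValid = !Number.isNaN(expiresMs) && expiresMs - nowMs > skewMs;
  if (stillValid) return "ok";
  return me.canRefresh ? "refresh" : "login";
}

const me: AuthMe = {
  subject: "user-1",
  expiresAt: "2030-01-01T00:00:00Z",
  canRefresh: true,
};
// 30 seconds before expiry falls inside the skew window, so a refresh is due.
console.log(nextSessionAction(me, Date.parse("2029-12-31T23:59:30Z"))); // refresh
```

Refreshing slightly before expiry avoids a request failing mid-flight with an expired token.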
| Document | Purpose |
|---|---|
| docs/overview.md | Product overview, workflows, quickstart path |
| docs/architecture.md | System model, components, data flow |
| docs/api-spec.md | HTTP API contract narrative |
| docs/auth.md | /api/v1/* auth middleware, OIDC login, silent refresh, threat model |
| docs/configuration.md | workbench.yaml schema reference |
| docs/production.md | Production hardening checklist |
| docs/workspaces.md | Workspace model, scoping, cascade semantics |
| docs/green-boxes.md | Multi-runtime "green box" architecture |
| docs/playground.md | Playground UX, text/vector dispatch, hybrid + rerank, ingest dialog |
| docs/chat.md | Chat with Bobbie: HuggingFace-backed, multi-KB-grounded, SSE token streaming |
| docs/agents.md | User-defined agents: extend Bobbie with custom personas + per-agent conversation CRUD |
| docs/mcp.md | Model Context Protocol façade: expose a workspace as MCP tools for external agents |
| docs/astra-cli.md | astra-cli auto-detection of Astra credentials at runtime startup |
| docs/conformance.md | Cross-runtime contract testing |
| docs/cross-replica-jobs.md | Design note for cross-replica job pub/sub + in-flight resume (proposed) |
| docs/roadmap.md | Phased delivery plan and open questions |
| docs/examples/workbench.yaml | Annotated sample config |
| apps/web/README.md | Web UI quickstart, bundle layout, test commands |
| runtimes/README.md | Index of language-native runtimes |
| conformance/README.md | Conformance harness overview |
| site/README.md | Landing page + docs site (VitePress, deployed to GitHub Pages on every push that touches docs/**) |
ai-workbench/
├── package.json                    # Root orchestration + Biome
├── biome.json                      # Shared lint/format config
├── runtimes/                       # N language-native runtimes (green boxes)
│   ├── README.md
│   ├── typescript/                 # Default runtime, embedded with the UI
│   │   ├── src/
│   │   │   ├── root.ts             # Process entry point
│   │   │   ├── app.ts              # Hono app factory
│   │   │   ├── config/             # workbench.yaml loader + schema
│   │   │   ├── control-plane/      # Backend-agnostic store (memory/file/astra)
│   │   │   ├── drivers/            # Vector-store drivers (mock/astra)
│   │   │   ├── astra-client/       # astra-db-ts wrapper for wb_* tables
│   │   │   ├── secrets/            # SecretResolver + env/file providers
│   │   │   └── routes/
│   │   │       ├── operational.ts
│   │   │       └── api-v1/
│   │   ├── tests/                  # Vitest suite (460+ tests)
│   │   ├── scripts/
│   │   ├── examples/workbench.yaml
│   │   ├── package.json
│   │   ├── tsconfig.json
│   │   └── Dockerfile
│   ├── python/                     # Python runtime (FastAPI, scaffold)
│   │   ├── src/workbench/
│   │   ├── tests/
│   │   ├── pyproject.toml
│   │   └── README.md
│   └── java/                       # Java runtime (Spring Boot, scaffold)
│       ├── src/main/java/com/datastax/aiworkbench/
│       ├── src/test/java/com/datastax/aiworkbench/
│       ├── build.gradle.kts
│       └── README.md
├── conformance/                    # Cross-runtime contract harness
│   ├── scenarios.json
│   ├── scenarios.md
│   ├── fixtures/                   # Expected normalized HTTP responses
│   ├── mock-astra/                 # Deterministic Astra stand-in
│   ├── normalize.mjs
│   └── runner.mjs
└── docs/                           # Narrative documentation
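
The conformance/ directory above compares each runtime's responses against normalized fixtures. The sketch below illustrates the general idea of that kind of normalization; it is not the contents of normalize.mjs. The list of volatile keys and the placeholder format are assumptions chosen for the example.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical set of fields that legitimately differ between runs or runtimes.
const VOLATILE_KEYS = new Set(["id", "createdAt", "updatedAt"]);

// Replaces volatile values with placeholders and sorts object keys so that
// two semantically equal responses serialize identically.
function normalize(value: Json): Json {
  if (Array.isArray(value)) return value.map(normalize);
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    for (const [key, child] of entries) {
      out[key] = VOLATILE_KEYS.has(key) ? `<${key}>` : normalize(child);
    }
    return out;
  }
  return value;
}

function matchesFixture(actual: Json, fixture: Json): boolean {
  return JSON.stringify(normalize(actual)) === JSON.stringify(normalize(fixture));
}

// Two made-up responses that differ only in volatile fields and key order.
const fromTs: Json = { id: "a1", name: "docs", createdAt: "2024-01-01T00:00:00Z" };
const fromPy: Json = { createdAt: "2024-02-02T00:00:00Z", name: "docs", id: "b2" };
console.log(matchesFixture(fromTs, fromPy)); // true
```

Comparing normalized output rather than raw responses lets each runtime generate its own identifiers and timestamps while still being held to the same contract.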
TBD.