datastax/ai-workbench

AI Workbench

AI Workbench is a self-hosted product surface for building, inspecting, and operating retrieval-backed AI applications on DataStax Astra. It gives teams one place to manage workspaces, knowledge bases, chunking / embedding / reranking services, document ingest, API keys, and retrieval experiments.

Under the product UI is a stable HTTP runtime. The default TypeScript runtime ships in the same Docker image as the UI; alternative language-native runtimes ("green boxes") live under runtimes/ and expose the same /api/v1/* contract.

At a glance

  • Workspace command center. Workspaces isolate knowledge bases, execution services, documents, jobs, credentials, and API keys.
  • Knowledge bases as first-class objects. A KB owns its Astra collection end-to-end and binds the chunking + embedding + (optional) reranking services that produce its content. The collection is auto-provisioned on create.
  • Knowledge operations. Ingest raw text or files into a KB, track sync/async job state, and let the KB's bound services drive chunking and embedding.
  • Retrieval playground. Run text, vector, hybrid, and rerank searches in the browser against real workspace data.
  • Production-friendly controls. Start in memory, switch to file storage for single-node deployments, or use Astra Data API tables for a durable control plane.
  • Technical spine intact. One /api/v1/* contract, language-native runtimes, direct Astra SDK usage, secrets by reference, and fixture-enforced conformance.

Architecture

                   ┌───────────────────────────────────────────┐
                   │               Workbench UI                │
                   │                                           │
                   │   /playground   /ingest   (chats, MCP)    │
                   └─────────────────────┬─────────────────────┘
                                         │
                                    BACKEND_URL
                                         │
            ┌────────────────────────────┼────────────────────────────┐
            ▼                            ▼                            ▼
   ┌──────────────────┐       ┌──────────────────┐         ┌──────────────────┐
   │  TS runtime      │       │  Python runtime  │   …     │  Other language  │
   │  (default,       │       │  (FastAPI)       │         │  runtimes        │
   │   embedded       │       │                  │         │                  │
   │   with UI)       │       │                  │         │                  │
   │                  │       │                  │         │                  │
   │  /api/v1/*       │       │  /api/v1/*       │         │  /api/v1/*       │
   └────────┬─────────┘       └────────┬─────────┘         └────────┬─────────┘
            │                          │                            │
            └──────────────── same HTTP contract ───────────────────┘
                                       │
                                       ▼ (per-runtime Astra SDK)
                        ┌──────────────────────────────────┐
                        │   Astra Data API                 │
                        │     Tables (control plane):      │
                        │       wb_workspaces              │
                        │       wb_config_knowledge_       │
                        │         bases_by_workspace       │
                        │       wb_config_chunking/        │
                        │         embedding/reranking      │
                        │         _service_by_workspace    │
                        │       wb_rag_documents_          │
                        │         by_knowledge_base        │
                        │     Collections (data plane):    │
                        │       wb_vectors_<kb_id>         │
                        │       (one per knowledge base)   │
                        └──────────────────────────────────┘

See docs/architecture.md for the full model.
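
Because every runtime in the diagram exposes the same /api/v1/* contract, a client should only need a different base URL to talk to a different runtime. The TypeScript sketch below illustrates that idea under stated assumptions: apiUrl and vectorCollectionName are hypothetical helpers (not part of the codebase), port 9000 is an invented example, and the collection name simply mirrors the wb_vectors_<kb_id> pattern from the diagram; the real runtime may sanitize the ID differently.

```typescript
// Hypothetical helpers for illustration; not part of the AI Workbench codebase.

/** Join a runtime base URL with an /api/v1 path, tolerating stray slashes. */
function apiUrl(backendUrl: string, path: string): string {
  const base = backendUrl.replace(/\/+$/, "");
  const suffix = path.replace(/^\/+/, "");
  return `${base}/api/v1/${suffix}`;
}

/** Data-plane collection name for a knowledge base, mirroring wb_vectors_<kb_id>. */
function vectorCollectionName(kbId: string): string {
  return `wb_vectors_${kbId}`;
}

// The call shape is identical for every runtime; only the base URL differs.
const defaultRuntime = apiUrl("http://localhost:8080", "workspaces");
const otherRuntime = apiUrl("http://localhost:9000/", "/workspaces"); // assumed port
console.log(defaultRuntime); // http://localhost:8080/api/v1/workspaces
console.log(otherRuntime);   // http://localhost:9000/api/v1/workspaces
```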

Quickstart

# Install root devDeps (Biome) + TS runtime deps.
npm ci && npm run install:ts

# Boot with the default in-memory control plane.
npm run dev                            # http://localhost:8080

# Hit it.
curl http://localhost:8080/healthz     # {"status":"ok"}
curl http://localhost:8080/docs        # Scalar-rendered API reference

The npm run * scripts at root delegate into runtimes/typescript/. You can also cd into that directory and work there directly.

Switching to an Astra-backed control plane is a YAML change — see docs/configuration.md. If you have the astra CLI installed and a profile configured, the runtime can auto-fill ASTRA_DB_APPLICATION_TOKEN / ASTRA_DB_API_ENDPOINT at startup — see docs/astra-cli.md.
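
As a rough illustration of the credential auto-fill described above, the sketch below reports which of the two named variables are still unset. It is an assumption that both are needed for an Astra-backed control plane in every configuration, and missingAstraEnv is a hypothetical helper, not the runtime's actual startup logic.

```typescript
// Hypothetical pre-flight check; the variable names come from the paragraph above.
const ASTRA_ENV_VARS = ["ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_API_ENDPOINT"] as const;

/** Return the names of Astra variables that are unset or empty in the given environment. */
function missingAstraEnv(env: Record<string, string | undefined>): string[] {
  return ASTRA_ENV_VARS.filter((name) => !env[name]);
}

// In Node this could be called as missingAstraEnv(process.env).
const missing = missingAstraEnv({ ASTRA_DB_API_ENDPOINT: "https://example.invalid" });
console.log(missing); // [ 'ASTRA_DB_APPLICATION_TOKEN' ]
```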

Current HTTP surface

All routes are documented at /docs (Scalar UI) and /api/v1/openapi.json (machine-readable).

Operational (unversioned)

| Method | Path | Purpose |
|---|---|---|
| GET | / | Service banner |
| GET | /healthz | Liveness probe |
| GET | /readyz | Readiness + workspace count |
| GET | /version | Build metadata |
| GET | /docs | Scalar-rendered API reference |
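
A script that starts the runtime may want to wait for the liveness probe before issuing requests. The sketch below is one way to do that in TypeScript. The {"status":"ok"} body is taken from the Quickstart; the helper names, retry counts, and the injected FetchLike type are assumptions made for illustration.

```typescript
// Sketch only: helper names and retry defaults are hypothetical.

/** Minimal shape of a fetch-style function, so the sketch does not depend on DOM typings. */
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

/** True when a parsed /healthz body looks like {"status":"ok"}. */
function isHealthy(body: unknown): boolean {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as { status?: unknown }).status === "ok"
  );
}

/** Poll GET /healthz until it reports ok or the attempts run out. */
async function waitForHealthy(
  baseUrl: string,
  fetchFn: FetchLike,
  attempts = 10,
  delayMs = 500,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetchFn(`${baseUrl}/healthz`);
      if (res.ok && isHealthy(await res.json())) return true;
    } catch {
      // The runtime may still be booting (connection refused); fall through and retry.
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

On Node 18 or later, the global fetch can be passed as fetchFn, for example waitForHealthy("http://localhost:8080", fetch).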

/api/v1/*

| Method | Path | Purpose |
|---|---|---|
| GET / POST | /api/v1/workspaces | List / create workspaces |
| GET / PATCH / DELETE | /api/v1/workspaces/{w} | Workspace CRUD (DELETE cascades) |
| POST | /api/v1/workspaces/{w}/test-connection | Run a live workspace connection check |
| GET / POST | /api/v1/workspaces/{w}/knowledge-bases | List / create knowledge bases (POST auto-provisions the underlying vector collection) |
| GET / PATCH / DELETE | /api/v1/workspaces/{w}/knowledge-bases/{kb} | KB CRUD (DELETE drops the collection + cascades RAG documents) |
| GET / POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents | List / register a document in a KB |
| GET / PATCH / DELETE | /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents/{d} | Document metadata CRUD (DELETE cascades chunks in the KB's collection) |
| GET | /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents/{d}/chunks | List the chunks under a document |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest | Sync ingest (chunk → embed → upsert → register Document) |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest?async=true | Same pipeline, returns 202 + job pointer |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/records | Upsert vector or text records (text → server-side $vectorize when supported, otherwise client-side embed) |
| DELETE | /api/v1/workspaces/{w}/knowledge-bases/{kb}/records/{rid} | Delete one record |
| POST | /api/v1/workspaces/{w}/knowledge-bases/{kb}/search | KB-scoped search (vector / text, optional hybrid + rerank) |
| GET / POST / DELETE | /api/v1/workspaces/{w}/chunking-services | Chunking-service CRUD |
| GET / POST / DELETE | /api/v1/workspaces/{w}/embedding-services | Embedding-service CRUD |
| GET / POST / DELETE | /api/v1/workspaces/{w}/reranking-services | Reranking-service CRUD |
| GET | /api/v1/workspaces/{w}/jobs/{jobId} | Poll an async-ingest job |
| GET | /api/v1/workspaces/{w}/jobs/{jobId}/events | SSE stream of job updates until terminal state |
| GET / POST | /api/v1/workspaces/{w}/chats | List / create Bobbie chats |
| GET / PATCH / DELETE | /api/v1/workspaces/{w}/chats/{c} | Chat CRUD (DELETE cascades messages) |
| GET | /api/v1/workspaces/{w}/chats/{c}/messages | Chat history, oldest-first |
| POST | /api/v1/workspaces/{w}/chats/{c}/messages | Send a message; sync reply with retrieval-grounded HF chat-completion |
| POST | /api/v1/workspaces/{w}/chats/{c}/messages/stream | Same flow as SSE: user-message + token deltas + terminal done/error |
| GET / POST / PATCH / DELETE | /api/v1/workspaces/{w}/agents | User-defined agent CRUD (Bobbie + your own personas) |
| GET / POST / PATCH / DELETE | /api/v1/workspaces/{w}/agents/{a}/conversations | Per-agent conversation CRUD |
| POST | /api/v1/workspaces/{w}/mcp | Model Context Protocol façade (optional, mcp.enabled: true); exposes the workspace as MCP tools for external agents |
| GET / POST | /api/v1/workspaces/{w}/api-keys | List / issue workspace API keys |
| DELETE | /api/v1/workspaces/{w}/api-keys/{keyId} | Revoke a workspace API key |
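
The async ingest route and the job route listed above combine into a submit-then-poll flow. The sketch below shows one possible client-side shape for it. Only the paths are taken from the table; the helper names, the injected getJson function, and the retry defaults are hypothetical. Because the job payload and the 202 "job pointer" are not specified here, the sketch keeps the payload as unknown and lets the caller decide what counts as terminal.

```typescript
// Hypothetical client helpers; paths come from the route table, everything else is assumed.

/** Build the ingest URL for a knowledge base; async mode appends ?async=true. */
function ingestUrl(base: string, workspace: string, kb: string, asyncMode: boolean): string {
  const w = encodeURIComponent(workspace);
  const k = encodeURIComponent(kb);
  const url = `${base}/api/v1/workspaces/${w}/knowledge-bases/${k}/ingest`;
  return asyncMode ? `${url}?async=true` : url;
}

/** Build the polling URL for an async-ingest job. */
function jobUrl(base: string, workspace: string, jobId: string): string {
  const w = encodeURIComponent(workspace);
  return `${base}/api/v1/workspaces/${w}/jobs/${encodeURIComponent(jobId)}`;
}

/**
 * Poll a job URL until the caller-supplied predicate reports a terminal state.
 * getJson is injected so the sketch stays independent of any HTTP library.
 */
async function pollJob(
  getJson: (url: string) => Promise<unknown>,
  url: string,
  isTerminal: (job: unknown) => boolean,
  maxPolls = 60,
  delayMs = 1000,
): Promise<unknown> {
  for (let i = 0; i < maxPolls; i++) {
    const job = await getJson(url);
    if (isTerminal(job)) return job;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`job did not reach a terminal state after ${maxPolls} polls`);
}
```

Subscribing to the /events SSE route is an alternative to polling when the client can hold a streaming connection open.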

/auth/* (browser OIDC login, optional)

| Method | Path | Purpose |
|---|---|---|
| GET | /auth/config | Tells the UI which credential surfaces to render |
| GET | /auth/login | 302 to the IdP's authorization endpoint (PKCE) |
| GET | /auth/callback | Exchanges code for tokens, sets signed session cookie |
| GET | /auth/me | Current session subject + access-token expiresAt + canRefresh |
| POST | /auth/refresh | Silent refresh: swaps the cookie's refresh_token at the IdP without a redirect |
| POST | /auth/logout | Clears the session cookie |

See docs/auth.md for the threat model and rollout phases.
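
The expiresAt and canRefresh fields returned by /auth/me suggest how a browser client might decide when to call POST /auth/refresh. The sketch below is one such decision rule, not the UI's actual logic. The field names come from the table above; their types (epoch milliseconds or an ISO 8601 string), the SessionInfo interface, and the 60-second skew are assumptions.

```typescript
// Sketch: field names come from the /auth/me row; their types are assumed.

interface SessionInfo {
  expiresAt: number | string; // assumed: epoch milliseconds or an ISO 8601 timestamp
  canRefresh: boolean;
}

/** Decide whether the client should call POST /auth/refresh now. */
function shouldRefresh(session: SessionInfo, nowMs: number, skewMs = 60000): boolean {
  if (!session.canRefresh) return false;
  const expiresMs = new Date(session.expiresAt).getTime();
  if (Number.isNaN(expiresMs)) return false; // unparseable expiry: leave the session alone
  // Refresh once the token is within skewMs of expiring (or already expired).
  return expiresMs - nowMs <= skewMs;
}
```

A client could evaluate shouldRefresh(me, Date.now()) on a timer and fall back to /auth/login when canRefresh is false and the token has expired.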

Documentation

| Document | Purpose |
|---|---|
| docs/overview.md | Product overview, workflows, quickstart path |
| docs/architecture.md | System model, components, data flow |
| docs/api-spec.md | HTTP API contract narrative |
| docs/auth.md | /api/v1/* auth middleware, OIDC login, silent refresh, threat model |
| docs/configuration.md | workbench.yaml schema reference |
| docs/production.md | Production hardening checklist |
| docs/workspaces.md | Workspace model, scoping, cascade semantics |
| docs/green-boxes.md | Multi-runtime "green box" architecture |
| docs/playground.md | Playground UX, text/vector dispatch, hybrid + rerank, ingest dialog |
| docs/chat.md | Chat with Bobbie: HuggingFace-backed, multi-KB-grounded, SSE token streaming |
| docs/agents.md | User-defined agents: extend Bobbie with custom personas + per-agent conversation CRUD |
| docs/mcp.md | Model Context Protocol façade: expose a workspace as MCP tools for external agents |
| docs/astra-cli.md | astra-cli auto-detection of Astra credentials at runtime startup |
| docs/conformance.md | Cross-runtime contract testing |
| docs/cross-replica-jobs.md | Design note for cross-replica job pub/sub + in-flight resume (proposed) |
| docs/roadmap.md | Phased delivery plan and open questions |
| docs/examples/workbench.yaml | Annotated sample config |
| apps/web/README.md | Web UI quickstart, bundle layout, test commands |
| runtimes/README.md | Index of language-native runtimes |
| conformance/README.md | Conformance harness overview |
| site/README.md | Landing page + docs site (VitePress, deployed to GitHub Pages on every push that touches docs/**) |

Project layout

ai-workbench/
├── package.json                      # Root orchestration + Biome
├── biome.json                        # Shared lint/format config
├── runtimes/                         # N language-native runtimes (green boxes)
│   ├── README.md
│   ├── typescript/                   # Default runtime — embedded with the UI
│   │   ├── src/
│   │   │   ├── root.ts               # Process entry point
│   │   │   ├── app.ts                # Hono app factory
│   │   │   ├── config/               # workbench.yaml loader + schema
│   │   │   ├── control-plane/        # Backend-agnostic store (memory/file/astra)
│   │   │   ├── drivers/              # Vector-store drivers (mock/astra)
│   │   │   ├── astra-client/         # astra-db-ts wrapper for wb_* tables
│   │   │   ├── secrets/              # SecretResolver + env/file providers
│   │   │   └── routes/
│   │   │       ├── operational.ts
│   │   │       └── api-v1/
│   │   ├── tests/                    # Vitest suite (460+ tests)
│   │   ├── scripts/
│   │   ├── examples/workbench.yaml
│   │   ├── package.json
│   │   ├── tsconfig.json
│   │   └── Dockerfile
│   ├── python/                       # Python runtime (FastAPI, scaffold)
│   │   ├── src/workbench/
│   │   ├── tests/
│   │   ├── pyproject.toml
│   │   └── README.md
│   └── java/                         # Java runtime (Spring Boot, scaffold)
│       ├── src/main/java/com/datastax/aiworkbench/
│       ├── src/test/java/com/datastax/aiworkbench/
│       ├── build.gradle.kts
│       └── README.md
├── conformance/                      # Cross-runtime contract harness
│   ├── scenarios.json
│   ├── scenarios.md
│   ├── fixtures/                     # Expected normalized HTTP responses
│   ├── mock-astra/                   # Deterministic Astra stand-in
│   ├── normalize.mjs
│   └── runner.mjs
└── docs/                             # Narrative documentation
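
The conformance/ entries above (fixtures of normalized responses, a normalizer, and a runner) point to a compare-after-normalizing approach. The sketch below illustrates that general technique in TypeScript. It is not the repository's normalize.mjs or runner.mjs; the function names, the "<volatile>" placeholder, and the choice of volatile keys are all assumptions.

```typescript
// Hypothetical sketch of fixture-based conformance checking.

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

/** Replace values under volatile keys with a stable placeholder and sort object keys, recursively. */
function normalize(value: Json, volatileKeys: ReadonlySet<string>): Json {
  if (Array.isArray(value)) return value.map((v) => normalize(v, volatileKeys));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      out[key] = volatileKeys.has(key) ? "<volatile>" : normalize(value[key], volatileKeys);
    }
    return out;
  }
  return value;
}

/** A response matches when its normalized form serializes identically to the normalized fixture. */
function matchesFixture(actual: Json, fixture: Json, volatileKeys: ReadonlySet<string>): boolean {
  return (
    JSON.stringify(normalize(actual, volatileKeys)) ===
    JSON.stringify(normalize(fixture, volatileKeys))
  );
}

// Example: IDs and timestamps differ per run, so they are treated as volatile (assumed keys).
const volatile = new Set(["id", "createdAt"]);
console.log(
  matchesFixture(
    { id: "a1", name: "kb", createdAt: "2024-01-01" },
    { name: "kb", id: "<volatile>", createdAt: "<volatile>" },
    volatile,
  ),
); // true
```

Sorting keys before serializing keeps the comparison independent of the field order each runtime happens to emit.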

License

TBD.

About

A modern workbench for document ingestion powered by Astra DB
