diff --git a/README.md b/README.md index 3834054..1399bf3 100644 --- a/README.md +++ b/README.md @@ -2,8 +2,9 @@ AI Workbench is a self-hosted product surface for building, inspecting, and operating retrieval-backed AI applications on **DataStax Astra**. -It gives teams one place to manage workspaces, catalogs, vector stores, -document ingest, saved queries, API keys, and retrieval experiments. +It gives teams one place to manage workspaces, knowledge bases, +chunking / embedding / reranking services, document ingest, API keys, +and retrieval experiments. Under the product UI is a stable HTTP runtime. The default TypeScript runtime ships in the same Docker image as the UI; alternative @@ -12,11 +13,15 @@ language-native runtimes ("green boxes") live under ## At a glance -- **Workspace command center.** Workspaces isolate catalogs, vector - stores, documents, saved queries, jobs, credentials, and API keys. -- **Knowledge operations.** Ingest raw text or files into catalogs, - track sync/async job state, and bind content to the vector store that - powers retrieval. +- **Workspace command center.** Workspaces isolate knowledge bases, + execution services, documents, jobs, credentials, and API keys. +- **Knowledge bases as first-class.** A KB owns its Astra collection + end-to-end and binds the chunking + embedding + (optional) + reranking services that produce its content. The collection is + auto-provisioned on create. +- **Knowledge operations.** Ingest raw text or files into a KB, + track sync/async job state, and let the KB's bound services drive + chunking and embedding. - **Retrieval playground.** Run text, vector, hybrid, and rerank searches in the browser against real workspace data. - **Production-friendly controls.** Start in memory, switch to file @@ -51,17 +56,21 @@ language-native runtimes ("green boxes") live under └──────────────── same HTTP contract ───────────────────┘ │ ▼ (per-runtime Astra SDK) - ┌─────────────────────────────┐ - │ Astra Data API │ - │ Tables (control plane): │ - │ wb_workspaces │ - │ wb_catalog_by_ws │ - │ wb_vector_store_by_ws │ - │ wb_documents_by_cat │ - │ Collections (data │ - │ plane): one per │ - │ vector store │ - └─────────────────────────────┘ + ┌──────────────────────────────────┐ + │ Astra Data API │ + │ Tables (control plane): │ + │ wb_workspaces │ + │ wb_config_knowledge_ │ + │ bases_by_workspace │ + │ wb_config_chunking/ │ + │ embedding/reranking │ + │ _service_by_workspace │ + │ wb_rag_documents_ │ + │ by_knowledge_base │ + │ Collections (data plane): │ + │ wb_vectors_ │ + │ (one per knowledge base) │ + └──────────────────────────────────┘ ``` See [`docs/architecture.md`](docs/architecture.md) for the full model. 
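To make the table/collection split concrete, here is a minimal TypeScript sketch of the relationship the diagram implies. Field names mirror the JSON shapes in `docs/api-spec.md`; the helper is illustrative — the real naming and provisioning logic lives in the runtime, not here.

```ts
// Sketch only — shapes mirror docs/api-spec.md, not runtime source.
// A KB row (control plane) binds the services that produce its content
// and names exactly one data-plane collection.
interface KnowledgeBase {
  workspaceId: string;
  knowledgeBaseId: string;
  name: string;
  chunkingServiceId: string;          // required binding
  embeddingServiceId: string;         // required binding (sizes the collection)
  rerankingServiceId: string | null;  // optional binding
  vectorCollection: string;           // the one Astra collection this KB owns
}

// Hypothetical helper for the "wb_vectors_" naming in the diagram; per the
// API spec the generated name strips hyphens from the id. Creating a KB
// auto-provisions this collection; deleting the KB drops it.
function defaultCollectionName(knowledgeBaseId: string): string {
  return `wb_vectors_${knowledgeBaseId.replace(/-/g, "")}`;
}
```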
@@ -109,26 +118,21 @@ All routes documented at `/docs` (Scalar UI) and
| `GET / POST` | `/api/v1/workspaces` | List / create workspaces |
| `GET / PUT / DELETE` | `/api/v1/workspaces/{w}` | Workspace CRUD (DELETE cascades) |
| `POST` | `/api/v1/workspaces/{w}/test-connection` | Resolve configured workspace credential refs |
-| `GET / POST` | `/api/v1/workspaces/{w}/catalogs` | List / create catalogs |
-| `GET / PUT / DELETE` | `/api/v1/workspaces/{w}/catalogs/{c}` | Catalog CRUD (DELETE cascades to documents + saved queries) |
-| `GET / POST` | `/api/v1/workspaces/{w}/catalogs/{c}/documents` | List / create document metadata |
-| `GET / PUT / DELETE` | `/api/v1/workspaces/{w}/catalogs/{c}/documents/{d}` | Document metadata CRUD (DELETE cascades chunks via the bound vector store) |
-| `GET` | `/api/v1/workspaces/{w}/catalogs/{c}/documents/{d}/chunks` | List the chunks under a document (id, chunkIndex, text, payload) |
-| `POST` | `/api/v1/workspaces/{w}/catalogs/{c}/documents/search` | Catalog-scoped search (vector / text, optional hybrid + rerank) |
-| `POST` | `/api/v1/workspaces/{w}/catalogs/{c}/ingest` | Sync ingest (chunk → embed → upsert → register Document) |
-| `POST` | `/api/v1/workspaces/{w}/catalogs/{c}/ingest?async=true` | Same pipeline, returns 202 + job pointer |
-| `GET / POST` | `/api/v1/workspaces/{w}/catalogs/{c}/queries` | List / create saved queries |
-| `GET / PUT / DELETE` | `/api/v1/workspaces/{w}/catalogs/{c}/queries/{q}` | Saved-query CRUD |
-| `POST` | `/api/v1/workspaces/{w}/catalogs/{c}/queries/{q}/run` | Replay a saved query through catalog-scoped search |
+| `GET / POST` | `/api/v1/workspaces/{w}/knowledge-bases` | List / create knowledge bases (POST auto-provisions the underlying vector collection) |
+| `GET / PUT / DELETE` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}` | KB CRUD (DELETE drops the collection + cascades RAG documents) |
+| `GET / POST` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/documents` | List / register a document in a KB |
+| `GET / PUT / DELETE` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/documents/{d}` | Document metadata CRUD (DELETE cascades chunks in the KB's collection) |
+| `GET` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/documents/{d}/chunks` | List the chunks under a document |
+| `POST` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest` | Sync ingest (chunk → embed → upsert → register Document) |
+| `POST` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest?async=true` | Same pipeline, returns 202 + job pointer |
+| `POST` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/records` | Upsert vector or text records (text → server-side `$vectorize` when supported, otherwise client-side embed) |
+| `DELETE` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/records/{rid}` | Delete one record |
+| `POST` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/search` | KB-scoped search (vector / text, optional hybrid + rerank) |
+| `GET / POST / PUT / DELETE` | `/api/v1/workspaces/{w}/chunking-services` | Chunking-service CRUD (item ops at `/{serviceId}`) |
+| `GET / POST / PUT / DELETE` | `/api/v1/workspaces/{w}/embedding-services` | Embedding-service CRUD (item ops at `/{serviceId}`) |
+| `GET / POST / PUT / DELETE` | `/api/v1/workspaces/{w}/reranking-services` | Reranking-service CRUD (item ops at `/{serviceId}`) |
| `GET` | `/api/v1/workspaces/{w}/jobs/{jobId}` | Poll an async-ingest job |
| `GET` | `/api/v1/workspaces/{w}/jobs/{jobId}/events` | SSE stream of job updates until terminal state |
-| `GET / POST` | `/api/v1/workspaces/{w}/vector-stores` | List / create vector-store descriptors (POST provisions the collection too) |
-| `GET` | 
`/api/v1/workspaces/{w}/vector-stores/discoverable` | List data-plane collections not yet wrapped in a descriptor | -| `POST` | `/api/v1/workspaces/{w}/vector-stores/adopt` | Wrap an existing collection in a descriptor without re-provisioning | -| `GET / PUT / DELETE` | `/api/v1/workspaces/{w}/vector-stores/{v}` | Descriptor CRUD (DELETE drops the collection) | -| `POST` | `/api/v1/workspaces/{w}/vector-stores/{v}/records` | Upsert vector or text records (text → server-side `$vectorize` when supported, otherwise client-side embed) | -| `DELETE` | `/api/v1/workspaces/{w}/vector-stores/{v}/records/{rid}` | Delete one | -| `POST` | `/api/v1/workspaces/{w}/vector-stores/{v}/search` | Vector or text search; supports `hybrid`, `lexicalWeight`, `rerank` | | `GET / POST` | `/api/v1/workspaces/{w}/api-keys` | List / issue workspace API keys | | `DELETE` | `/api/v1/workspaces/{w}/api-keys/{keyId}` | Revoke a workspace API key | diff --git a/apps/web/README.md b/apps/web/README.md index c7d4c29..9ffde93 100644 --- a/apps/web/README.md +++ b/apps/web/README.md @@ -6,13 +6,13 @@ consumes `/api/v1/workspaces` on the default TypeScript runtime. ## Status **Shipped.** First-run onboarding wizard, workspace list / detail / -edit / destructive delete, full CRUD over catalogs, vector-store -descriptors, and workspace-scoped API keys. Async ingest from the -browser (file upload → chunk → embed → upsert) with live progress -streamed via SSE. Saved-query CRUD per catalog, runnable from the UI. -Playground for ad-hoc text / vector / hybrid / rerank queries against -any vector store. OIDC login + silent refresh and paste-a-token -fallback are both wired through the same auth layer. +edit / destructive delete, full CRUD over knowledge bases, +chunking / embedding / reranking services, and workspace-scoped API +keys. Async ingest from the browser (file upload → chunk → embed → +upsert) with live progress streamed via SSE. Playground for ad-hoc +text / vector / hybrid / rerank queries against any knowledge base. +OIDC login + silent refresh and paste-a-token fallback are both wired +through the same auth layer. HCD and OpenRAG kinds are visible in the onboarding picker but intentionally non-selectable ("Coming soon" badge) — the runtime @@ -95,38 +95,27 @@ navigation shows the shared loader while the chunk streams. |---|---| | `/` | Workspaces list. Redirects to `/onboarding` when empty. | | `/onboarding` | Two-step wizard — pick a backend kind, then fill details. HCD / OpenRAG tiles render but are non-selectable. | -| `/workspaces/:uid` | Detail + edit + destructive delete (type-to-confirm). Hosts the catalogs, vector-stores, and API-keys panels for this workspace. | -| `/workspaces/:uid/catalogs/:catalogId` | Catalog explorer — sortable / filterable document table with file-type badges, sizes, statuses, and a click-through detail dialog. Multi-file / folder ingest queue lives here. | -| `/playground` | Ad-hoc text / vector / hybrid / rerank queries against a workspace's vector stores. See [`docs/playground.md`](../../docs/playground.md). | +| `/workspaces/:uid` | Detail + edit + destructive delete (type-to-confirm). Hosts the knowledge-bases, services, and API-keys panels for this workspace. | +| `/workspaces/:uid/knowledge-bases/:kbId` | Knowledge-base explorer — sortable / filterable document table with file-type badges, sizes, statuses, and a click-through detail dialog. Multi-file / folder ingest queue lives here. 
| +| `/playground` | Ad-hoc text / vector / hybrid / rerank queries against a workspace's knowledge bases. See [`docs/playground.md`](../../docs/playground.md). | The workspace detail page composes four panels (collapsible cards): | Panel | What it does | |---|---| -| Catalogs | List + create + delete catalogs. Each row expands to a quick-look document preview and houses the saved-queries section. The "Open" button on every row jumps to the catalog explorer for the full table; "Ingest" pops the multi-file / folder upload queue. | -| Vector stores | List + create + delete vector-store descriptors. Create flow provisions the underlying collection on the bound driver. | +| Knowledge bases | List + create + delete knowledge bases. Create flow auto-provisions the underlying vector collection sized to the bound embedding service. The "Open" button on every row jumps to the KB explorer for the full document table; "Ingest" pops the multi-file / folder upload queue. | +| Services | List + create + delete chunking, embedding, and reranking service definitions. Services are reusable across knowledge bases in the same workspace. | | API keys | List + issue + revoke workspace-scoped `wb_live_*` keys. Fresh keys are shown once, then masked. | | Detail / edit | The kind-aware edit form (kind is read-only after create) and the destructive delete dialog. | -The catalog explorer adds: +The KB explorer adds: - A document table with sortable columns (name, size, chunks, status, ingestedAt) and an inline filename/source-id filter. - Color-coded `FileTypeBadge` (Markdown violet, structured-data emerald, tabular amber, code blue, etc.) and pill-shaped `DocumentStatusBadge` (animated glyph for in-flight states). -- Per-row trash button that pops a confirm dialog and runs the cascade-delete: the bound vector store's chunks are wiped before the document row is dropped, so deleted documents don't surface in playground searches. +- Per-row trash button that pops a confirm dialog and runs the cascade-delete: the KB's chunks are wiped before the document row is dropped, so deleted documents don't surface in playground searches. - Click-through metadata dialog showing the full Document record, the failure message verbatim when status is `failed`, **and** the chunks the runtime extracted (chunk index, id, and snippet text — text comes from the reserved `chunkText` payload key the ingest pipeline stamps). - An ingest queue dialog accepting drag-drop, multi-file picker, or a folder picker (`webkitdirectory`). Files run sequentially through async ingest with a per-row live progress bar — sequential rather than parallel so embedding-provider rate limits stay predictable and a misbehaving file doesn't tank the others. -The vector-stores panel on the workspace detail also exposes an -**Adopt existing** button. It opens a dialog listing collections -that already live in the workspace's data plane but aren't yet -wrapped in a workbench descriptor (created by another tool, by -hand, by an older workbench install whose state was wiped). One -click adopts the collection — the runtime reads the live vector / -lexical / rerank options off the data plane and stamps a matching -descriptor without re-provisioning. Mock workspaces always see -the empty state since the mock driver has no notion of "external" -collections. - ## Stack - **Vite + React 19 + TypeScript** — standard modern baseline. @@ -137,7 +126,7 @@ collections. 
- **React Hook Form + Zod** for forms; the same Zod schemas that describe API shapes drive form validation, so the UI and backend can't disagree about request shape. -- **React Router** for the five routes (`/`, `/onboarding`, `/workspaces/:uid`, `/workspaces/:uid/catalogs/:catalogId`, `/playground`). +- **React Router** for the five routes (`/`, `/onboarding`, `/workspaces/:uid`, `/workspaces/:uid/knowledge-bases/:kbId`, `/playground`). - **Sonner** for toasts. - **Lucide React** for icons. @@ -162,11 +151,10 @@ apps/web/ │ │ └── utils.ts ← cn() + formatDate() │ ├── hooks/ │ │ ├── useWorkspaces.ts ← list/get/create/update/delete -│ │ ├── useCatalogs.ts ← catalog CRUD -│ │ ├── useDocuments.ts ← per-catalog document list -│ │ ├── useVectorStores.ts ← vector-store descriptor CRUD +│ │ ├── useKnowledgeBases.ts ← knowledge-base CRUD +│ │ ├── useServices.ts ← chunking/embedding/reranking service CRUD +│ │ ├── useDocuments.ts ← per-KB document list │ │ ├── useIngest.ts ← async ingest + SSE progress -│ │ ├── useSavedQueries.ts ← saved-query CRUD + /run │ │ ├── usePlaygroundSearch.ts ← /search dispatch + result hits │ │ ├── useApiKeys.ts ← workspace API-key mutations │ │ ├── useAuthToken.ts ← reactive bearer-token hook @@ -190,22 +178,19 @@ apps/web/ │ │ ├── TestConnectionPanel.tsx │ │ ├── ApiKeysPanel.tsx │ │ ├── CreateApiKeyDialog.tsx -│ │ ├── CatalogsPanel.tsx ← catalog list + per-row docs preview -│ │ ├── CreateCatalogDialog.tsx +│ │ ├── KnowledgeBasesPanel.tsx ← KB list + per-row docs preview +│ │ ├── CreateKnowledgeBaseDialog.tsx +│ │ ├── ServicesPanel.tsx ← chunking/embedding/reranking services │ │ ├── DocumentTable.tsx ← sortable doc table for the explorer │ │ ├── DocumentDetailDialog.tsx │ │ ├── DocumentStatusBadge.tsx │ │ ├── FileTypeBadge.tsx -│ │ ├── IngestQueueDialog.tsx ← multi-file / folder ingest queue -│ │ ├── SavedQueriesSection.tsx -│ │ ├── VectorStoresPanel.tsx -│ │ ├── CreateVectorStoreDialog.tsx -│ │ └── AdoptCollectionDialog.tsx ← discover + adopt existing collections +│ │ └── IngestQueueDialog.tsx ← multi-file / folder ingest queue │ └── pages/ │ ├── WorkspacesPage.tsx │ ├── OnboardingPage.tsx │ ├── WorkspaceDetailPage.tsx -│ ├── CatalogExplorerPage.tsx +│ ├── KnowledgeBaseExplorerPage.tsx │ └── PlaygroundPage.tsx ``` @@ -218,7 +203,8 @@ apps/web/ `provider:path` shape inline and drops empty rows before submit. The runtime rejects raw secrets with `400` anyway. - **Destructive delete requires typing the workspace name.** Cascade - is real — catalogs, vector-store collections, and documents all go. + is real — knowledge bases, their underlying vector collections, + service definitions, and documents all go. - **Empty state → onboarding redirect.** First-run users never see a bare "no workspaces" screen; they land directly in the wizard. - **List order is deterministic.** The runtime sorts by `createdAt` @@ -246,11 +232,11 @@ apps/web/ | `npm test` | Unit + component tests under `src/**/*.{test,spec}.{ts,tsx}` (vitest + jsdom + RTL). Fast — no browser. | | `npm run test:watch` | Same in watch mode. | | `npm run test:coverage` | Same as `npm test` but with v8 coverage. **Gates `src/lib/**` at lines: 50, statements: 50, branches: 80, functions: 20.** Components are exercised end-to-end through Playwright; locking thresholds on them prematurely pushes toward shallow tests. | -| `npm run test:e2e` | Playwright golden-path spec. 
Builds the runtime + SPA, boots the runtime against the bundled `examples/workbench.yaml` (memory backend, auth disabled), drives Chromium through the onboarding → vector-store → upsert → playground flow. Reuses an existing `:8080` server in dev; CI starts a fresh one. |
+| `npm run test:e2e` | Playwright golden-path spec. Builds the runtime + SPA, boots the runtime against the bundled `examples/workbench.yaml` (memory backend, auth disabled), drives Chromium through the onboarding → services → knowledge-base → upsert → playground flow. Reuses an existing `:8080` server in dev; CI starts a fresh one. |
| `npm run test:e2e:ui` | Same in Playwright's UI mode for debugging. |
| `npm run e2e:install` | One-time: `playwright install chromium --with-deps`. |

-E2E specs deliberately stay on the **vector** lane. The route's `resolveQuery()` always builds an `Embedder` for any text query (so hybrid search has a vector handle); with a mock embedding descriptor the production embedder factory throws `embedding_unavailable`. Vector input bypasses that path entirely. Adding text-search coverage to the E2E suite needs either a real provider key in CI or a runtime override that lets a fake embedder run alongside production code — both deferred.
+E2E specs deliberately stay on the **vector** lane. The route's `resolveQuery()` always builds an `Embedder` for any text query (so hybrid search has a vector handle); with a mock embedding-service config the production embedder factory throws `embedding_unavailable`. Vector input bypasses that path entirely. Adding text-search coverage to the E2E suite needs either a real provider key in CI or a runtime override that lets a fake embedder run alongside production code — both deferred.

## House rules

diff --git a/docs/api-spec.md b/docs/api-spec.md
index 0c0f1c8..e5265fb 100644
--- a/docs/api-spec.md
+++ b/docs/api-spec.md
@@ -42,8 +42,9 @@ Every nested resource carries its parent UIDs in the path:

```
/api/v1/workspaces/{workspaceUid}
-/api/v1/workspaces/{workspaceUid}/catalogs/{catalogUid}
-/api/v1/workspaces/{workspaceUid}/vector-stores/{vectorStoreUid}
+/api/v1/workspaces/{workspaceUid}/knowledge-bases/{knowledgeBaseUid}
+/api/v1/workspaces/{workspaceUid}/knowledge-bases/{knowledgeBaseUid}/documents/{documentUid}
+/api/v1/workspaces/{workspaceUid}/{chunking,embedding,reranking}-services/{serviceUid}
```

A request whose path references a non-existent workspace returns
@@ -94,19 +95,16 @@ human-readable and may change. Currently emitted:

| 413 | `payload_too_large` | `/api/v1/workspaces/*` request body exceeded the runtime's 1 MB JSON body limit. 
| | 404 | `not_found` | Unknown route | | 404 | `workspace_not_found` | Workspace UID doesn't exist | -| 404 | `catalog_not_found` | Catalog UID doesn't exist in workspace | -| 404 | `vector_store_not_found` | Vector-store UID doesn't exist in workspace | -| 404 | `document_not_found` | Document UID doesn't exist in the catalog | +| 404 | `knowledge_base_not_found` | Knowledge-base UID doesn't exist in workspace | +| 404 | `document_not_found` | Document UID doesn't exist in the knowledge base | +| 404 | `chunking_service_not_found` / `embedding_service_not_found` / `reranking_service_not_found` | Service UID doesn't exist in workspace | | 404 | `job_not_found` | Job ID doesn't exist in the workspace | -| 404 | `saved_query_not_found` | Saved query UID doesn't exist in the catalog | -| 409 | `conflict` | Create with an already-taken UID | -| 409 | `catalog_not_bound_to_vector_store` | Catalog-scoped search against a catalog whose `vectorStore` is `null` | +| 409 | `conflict` | Create with an already-taken UID, or service deletion refused while a KB still references it | | 501 | `hybrid_not_supported` | Caller asked for hybrid search on a workspace kind whose driver doesn't implement `searchHybrid` | | 501 | `rerank_not_supported` | Caller asked for rerank on a workspace kind whose driver doesn't implement `rerank` | -| 409 | `catalog_not_bound_to_vector_store` | Catalog-scoped search, ingest, or saved-query run against a catalog whose `vectorStore` is `null` | -| 400 | `dimension_mismatch` | Supplied vector length doesn't match the vector-store descriptor | -| 400 | `embedding_unavailable` | Text search/upsert fallback could not build an embedder for the descriptor | -| 400 | `embedding_dimension_mismatch` | Embedder output dimension doesn't match the descriptor | +| 400 | `dimension_mismatch` | Supplied vector length doesn't match the KB's bound embedding service | +| 400 | `embedding_unavailable` | Text search/upsert fallback could not build an embedder for the KB's bound embedding service | +| 400 | `embedding_dimension_mismatch` | Embedder output dimension doesn't match the bound embedding service | | 422 | `workspace_misconfigured` | Workspace is missing endpoint, token, keyspace, or similar driver-required config | | 500 | `internal_error` | Unhandled exception | | 503 | `control_plane_unavailable` | Backing store is unreachable | @@ -264,7 +262,7 @@ omitted. `kind` is one of `astra | hcd | openrag | mock`. (`mock` stays a first-class option for CI and offline work.) Once set, `kind` is immutable — changing it would orphan any already-provisioned -vector-store collections. +KB collections. `endpoint` is the workspace's data-plane URL (for `astra` / `hcd`, the Astra Data API endpoint). Accepts either a literal URL or a @@ -299,14 +297,15 @@ Patch one or more of `name`, `endpoint`, `credentialsRef`, ### `DELETE /api/v1/workspaces/{workspaceUid}` -Cascades to the workspace's catalogs, vector-store descriptors, and -documents. Before removing the control-plane rows, the runtime drops -each underlying vector-store collection through the workspace's driver. +Cascades to the workspace's knowledge bases, execution services, +RAG documents, and API keys. Before removing the control-plane +rows, the runtime drops each KB's underlying Astra collection +through the workspace's driver. 
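As a sketch of that ordering — data plane first, then control-plane rows — with purely illustrative store/driver handles (the real ones live in the runtime):

```ts
// Illustrative only: minimal stand-ins for the runtime's store and driver.
interface KbRow { vectorCollection: string }
interface VectorDriver { dropCollection(name: string): Promise<void> }
interface ControlPlane {
  listKnowledgeBases(workspaceUid: string): Promise<KbRow[]>;
  deleteWorkspaceCascade(workspaceUid: string): Promise<void>; // KBs, services, docs, keys
}

async function deleteWorkspace(
  store: ControlPlane,
  driver: VectorDriver,
  workspaceUid: string,
): Promise<void> {
  // Collections are dropped first, as documented above; one plausible
  // reason for the order is that a crash mid-cascade still leaves rows
  // behind that a retry can discover and finish.
  for (const kb of await store.listKnowledgeBases(workspaceUid)) {
    await driver.dropCollection(kb.vectorCollection);
  }
  await store.deleteWorkspaceCascade(workspaceUid);
}
```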
- **204** — deleted - **404** `workspace_not_found` -- **503** `driver_unavailable` — workspace has vector stores but no - registered driver to drop their collections +- **503** `driver_unavailable` — workspace has knowledge bases but + no registered driver to drop their collections ### `POST /api/v1/workspaces/{workspaceUid}/test-connection` @@ -404,79 +403,97 @@ no-op that still returns `204`. --- -## `/api/v1/workspaces/{workspaceUid}/catalogs` +## `/api/v1/workspaces/{workspaceUid}/{chunking,embedding,reranking}-services` + +Workspace-scoped execution services. Knowledge bases compose one +chunking + one embedding + (optionally) one reranking service at +create time. The three surfaces share an identical CRUD shape; only +the body fields differ. ### `GET` -List catalogs in the workspace. +List services in the workspace. -- **200** — paginated `Catalog` records +- **200** — paginated `ChunkingService` / `EmbeddingService` / + `RerankingService` records (sorted by `createdAt` ascending, + `*ServiceId` as tie-breaker) - **404** `workspace_not_found` -A `Catalog`: - -```json -{ - "workspace": "…", - "uid": "…", - "name": "support", - "description": null, - "vectorStore": "…", - "createdAt": "…", - "updatedAt": "…" -} -``` - ### `POST` -Create a catalog. `vectorStore` is optional and refers to a vector -store in the same workspace (N:1 — multiple catalogs may share a -single vector store). +Create a service. The runtime generates a UID if `uid` is omitted. +Required fields by kind: -**Request** +| Kind | Required | +|---|---| +| chunking | `name`, `engine` | +| embedding | `name`, `provider`, `modelName`, `embeddingDimension` | +| reranking | `name`, `provider`, `modelName` | + +Optional fields cover endpoint config (`endpointBaseUrl`, +`endpointPath`, `requestTimeoutMs`, `authType`, `credentialRef`), +provider/engine tuning, and supported language/content tags. See +the OpenAPI spec for the full per-kind shape. ```json -{ "name": "support", "vectorStore": "" } +{ + "name": "openai-3-small", + "provider": "openai", + "modelName": "text-embedding-3-small", + "embeddingDimension": 1536, + "distanceMetric": "cosine", + "endpointBaseUrl": "https://api.openai.com/v1", + "credentialRef": "env:OPENAI_API_KEY", + "supportedLanguages": ["en", "fr"], + "supportedContent": ["text"] +} ``` -- **201** — the created `Catalog` +`supportedLanguages` and `supportedContent` arrive as arrays and are +returned deduplicated + sorted on the wire. (Astra-row layer keeps +them as `SET`; the converter normalises at the boundary.) + +- **201** — the created record (with the generated `*ServiceId`) +- **400** `validation_error` — schema failure - **404** `workspace_not_found` -- **404** `vector_store_not_found` — `vectorStore` points at a missing descriptor - **409** `conflict` — `uid` collision -### `GET /{catalogUid}` / `PUT /{catalogUid}` / `DELETE /{catalogUid}` +### `GET /{serviceId}` / `PUT /{serviceId}` / `DELETE /{serviceId}` -Fetch / patch / delete. `DELETE` cascades to the catalog's documents. +Fetch / patch / delete. `PUT` accepts every field from create +(all optional). Strict bodies — unknown keys return `400`. + +`DELETE` is **refused with `409 conflict` while any KB still +references the service**. Drop or rebind the dependent KBs first. +The error message names the offending KB so operators can navigate +straight to it. --- -## `/api/v1/workspaces/{workspaceUid}/vector-stores` +## `/api/v1/workspaces/{workspaceUid}/knowledge-bases` ### `GET` -List vector-store descriptors in the workspace. 
+List knowledge bases in the workspace. -- **200** — paginated `VectorStore` records +- **200** — paginated `KnowledgeBase` records - **404** `workspace_not_found` -A `VectorStore` descriptor: +A `KnowledgeBase`: ```json { - "workspace": "…", - "uid": "…", - "name": "support-vectors", - "vectorDimension": 1536, - "vectorSimilarity": "cosine", - "embedding": { - "provider": "openai", - "model": "text-embedding-3-small", - "endpoint": null, - "dimension": 1536, - "secretRef": "env:OPENAI_API_KEY" - }, - "lexical": { "enabled": false, "analyzer": null, "options": {} }, - "reranking": { "enabled": false, "provider": null, "model": null, "endpoint": null, "secretRef": null }, + "workspaceId": "…", + "knowledgeBaseId": "…", + "name": "support-docs", + "description": "customer support knowledge base", + "status": "active", + "embeddingServiceId": "…", + "chunkingServiceId": "…", + "rerankingServiceId": null, + "language": "en", + "vectorCollection": "wb_vectors_", + "lexical": { "enabled": false, "analyzer": null, "options": {} }, "createdAt": "…", "updatedAt": "…" } @@ -484,93 +501,50 @@ A `VectorStore` descriptor: ### `POST` -Create a descriptor **and** provision the underlying Data API -Collection via the workspace's driver. Transactional — if collection -provisioning fails, the descriptor row is rolled back so the control -plane and data plane never drift. - -`vectorSimilarity` defaults to `cosine`; `lexical` and `reranking` -default to `{ enabled: false, ... }` if omitted. +Create a KB **and** auto-provision its underlying Astra collection. +Transactional — if collection provisioning fails, the KB row is +rolled back so the control plane and data plane never drift. -**Required fields:** `name`, `vectorDimension`, `embedding`. +`vectorCollection` is generated as `wb_vectors_` (hyphen- +stripped) by default; supply your own to adopt a pre-existing +collection. -- **201** — the created `VectorStore` (collection now exists) -- **404** `workspace_not_found` -- **409** `conflict` -- **422** `workspace_misconfigured` — workspace is missing `endpoint` or `credentialsRef.token` required by its driver -- **503** `driver_unavailable` — no driver registered for the workspace's `kind` - -### `GET /discoverable` - -List collections that exist in the workspace's data plane but aren't -yet wrapped in a workbench descriptor — useful for adopting -collections created by another tool, by hand, or by an older -workbench install whose control-plane state was lost. Returns `[]` -for drivers that don't expose external collections (the mock -driver). +**Request** ```json -[ - { - "name": "legacy_openai_coll", - "vectorDimension": 1536, - "vectorSimilarity": "cosine", - "embedding": { "provider": "openai", "model": "text-embedding-3-small" }, - "lexicalEnabled": true, - "rerankEnabled": false, - "rerankProvider": null, - "rerankModel": null - } -] +{ + "name": "support-docs", + "description": "customer support", + "embeddingServiceId": "…", + "chunkingServiceId": "…", + "rerankingServiceId": null, + "language": "en" +} ``` -- **200** — list of `AdoptableCollection`s (already-adopted - collections are filtered out) -- **404** `workspace_not_found` +`embeddingServiceId` and `chunkingServiceId` are required. Both +must reference services that exist in the same workspace. 
+ +- **201** — the created `KnowledgeBase` (collection now exists) +- **404** `workspace_not_found` / `embedding_service_not_found` / + `chunking_service_not_found` / `reranking_service_not_found` +- **409** `conflict` — `uid` collision +- **422** `workspace_misconfigured` — workspace is missing + `endpoint` or `credentialsRef.token` required by its driver - **503** `driver_unavailable` — no driver registered for the workspace's `kind` -### `POST /adopt` - -Wrap an existing data-plane collection in a workbench descriptor -without re-provisioning it. The route reads the live collection's -vector / lexical / rerank options off the data plane and stamps a -descriptor matching them; the descriptor's `name` equals the -collection's name (which is already a valid Astra identifier by -construction). - -**Request:** +### `GET /{knowledgeBaseUid}` / `PUT /{knowledgeBaseUid}` / `DELETE /{knowledgeBaseUid}` -```json -{ "collectionName": "legacy_openai_coll" } -``` +`GET` reads the record. `PUT` accepts a partial — `name`, +`description`, `status`, `rerankingServiceId`, `language`, `lexical` +are mutable; **`embeddingServiceId` and `chunkingServiceId` are +immutable post-create** and the schema is `.strict()`, so accidentally +including them in a body returns `400`. `DELETE` drops the underlying +Astra collection first, then the KB row, then cascades RAG document +rows. -- **201** — the created `VectorStore` descriptor -- **404** `collection_not_found` — the named collection isn't on - the data plane (or the driver no longer reports it) -- **409** `collection_already_adopted` — a descriptor with that name - already exists in this workspace -- **503** `adopt_not_supported` — driver doesn't expose - `listAdoptable` - -Vectorless / vector-only collections (no `$vectorize` service -configured) get a placeholder `embedding: { provider: "external", -model: "external", … }` — clients still need to supply vectors at -upsert / search time. Create a new vector store when you need a -different provider or dimension; descriptors intentionally mirror the -underlying collection and are immutable after creation. - -### `GET /{vectorStoreUid}` / `PUT /{vectorStoreUid}` / `DELETE /{vectorStoreUid}` - -`GET` reads the descriptor. `PUT` accepts an empty patch and returns -the existing descriptor, but rejects any field changes with -`409 conflict` because dimensions, similarity, embedding, lexical, -rerank, and collection naming are physical collection properties. -`DELETE` drops the underlying Data API Collection **and** removes the descriptor. -If any catalog still references the vector store, `DELETE` returns -`409 conflict`; clear or move those catalog bindings first. - -### `POST /{vectorStoreUid}/records` — upsert records +### `POST /{knowledgeBaseUid}/records` — upsert records **Request** — each record carries exactly one of `vector` or `text`: @@ -587,15 +561,15 @@ If any catalog still references the vector store, `DELETE` returns - `records` — 1..500 items per request. - `id` is the application's identifier; re-upsert replaces the prior value. -- `vector.length` must equal the descriptor's `vectorDimension`. +- `vector.length` must equal the bound embedding service's + `embeddingDimension`. - **Text dispatch** mirrors search: the route tries `driver.upsertByText()` for all-text batches (Astra `$vectorize` inserts for collections with a service block). On `NotSupportedError` the runtime embeds each text record via the - vector store's `embedding` config and retries through plain - `upsert`. 
Mixed batches always embed client-side so the whole
-  batch stays in one transactional call. See
-  [`docs/playground.md`](playground.md).
+  KB's bound embedding service and retries through plain `upsert`.
+  Mixed batches always embed client-side so the whole batch stays
+  in one transactional call.

**Response 200**

@@ -603,62 +577,48 @@
{ "upserted": 2 }
```

-- **400** `validation_error` — a record has neither or both of `vector`/`text`
-- **400** `dimension_mismatch` — at least one vector has the wrong length
-- **400** `embedding_unavailable` — text records + descriptor's embedding config can't be resolved
-- **400** `embedding_dimension_mismatch` — provider returned a vector whose length doesn't match the descriptor
-- **404** `workspace_not_found` / `vector_store_not_found`
-
-### `DELETE /{vectorStoreUid}/records/{recordId}`
+- **400** `validation_error` — record has neither/both of `vector`/`text`
+- **400** `dimension_mismatch` — vector length doesn't match the
+  bound embedding service's `embeddingDimension`
+- **400** `embedding_unavailable` / `embedding_dimension_mismatch`
+- **404** `workspace_not_found` / `knowledge_base_not_found`

-Delete a single record. `recordId` is the application's `id` (not a
-UUID — any non-empty string).
+### `DELETE /{knowledgeBaseUid}/records/{recordId}`

-**Response 200**
+Delete a single record. `recordId` is the application's `id` (any
+non-empty string).

```json
-{ "deleted": true }  // or false, if the record wasn't present
+{ "deleted": true }
```
+
+`deleted` is `false` when no record with that `id` was present.

-### `POST /{vectorStoreUid}/search` — vector or text search
-
-**Request** — exactly one of `vector` or `text`:
+### `POST /{knowledgeBaseUid}/search` — vector or text search

-```json
-{
-  "vector": [0.01, -0.02, ...],
-  "topK": 10,
-  "filter": { "tag": "keep" },
-  "includeEmbeddings": false
-}
-```
+**Request** — exactly one of `vector` or `text`, plus optional
+`hybrid` / `lexicalWeight` / `rerank`:

```json
{
-  "text": "winter sweater in blue",
-  "topK": 10
+  "text": "how do refunds work?",
+  "topK": 5,
+  "filter": { "section": "billing" },
+  "hybrid": true,
+  "lexicalWeight": 0.3,
+  "rerank": true
}
```

- `topK` defaults to 10, clamped to `[1, 1000]`.
-- `filter` is shallow-equal on payload keys. Backends with richer
-  filter languages may accept more; the portable subset is
-  shallow-equal.
-- `includeEmbeddings: true` returns the stored vector on each hit.
-
-**Text dispatch**: the route tries the driver's `searchByText()`
-first — for Astra collections whose descriptor names a supported
-vectorize provider (`openai`, `azureOpenAI`, `cohere`, `jinaAI`,
-`mistral`, `nvidia`, `voyageAI`) and carries a `secretRef`, the
-driver opens a collection handle with the resolved API key as
-`embeddingApiKey` and issues `find(sort: { $vectorize: text })`.
-The runtime never sees or transmits the vector. Legacy
-collections (no `service` block) return a "vectorize not
-configured" error; the driver catches it and rethrows as
-`NotSupportedError`, after which the runtime falls back to a
-client-side embedding (built from the vector store's `embedding`
-config via the Vercel AI SDK) and runs a normal vector search.
-See [`docs/playground.md`](playground.md) for the mental model.
+- `filter` is shallow-equal on payload keys.
+- `hybrid: true` runs the driver's vector + lexical lane (defaults
+  to the KB's `lexical.enabled`). Requires `text`.
+- `rerank: true` reorders hits through the KB's bound reranking
+  service. 
Defaults to `true` when `rerankingServiceId` is non-null. + Requires `text`. + +The route synthesises a driver-facing descriptor from the KB plus +its bound services (see `kb-descriptor.ts`) so the dispatch layer +stays unchanged. **Response 200** — array of hits, sorted by `score` descending: @@ -669,7 +629,8 @@ See [`docs/playground.md`](playground.md) for the mental model. ] ``` -Score semantics match the descriptor's `vectorSimilarity`: +Score semantics match the bound embedding service's +`distanceMetric`: | Metric | Score | |---|---| @@ -677,37 +638,32 @@ Score semantics match the descriptor's `vectorSimilarity`: | `dot` | Raw dot product; unbounded | | `euclidean` | `1 / (1 + distance)` so higher = closer | -- **400** `validation_error` — neither or both of `vector`/`text` -- **400** `dimension_mismatch` — supplied vector length mismatched -- **400** `embedding_unavailable` — text search but the vector - store's `embedding` config can't be resolved (missing secret, - unknown provider, ...) -- **400** `embedding_dimension_mismatch` — provider returned a - vector whose length doesn't match the store's declared dim -- **404** `workspace_not_found` / `vector_store_not_found` +- **400** `validation_error` — neither/both of `vector`/`text`, + or `hybrid`/`rerank` without `text` +- **400** `dimension_mismatch` / `embedding_unavailable` / + `embedding_dimension_mismatch` +- **404** `workspace_not_found` / `knowledge_base_not_found` +- **501** `hybrid_not_supported` / `rerank_not_supported` ---- +### `GET /{knowledgeBaseUid}/documents` -## `/api/v1/workspaces/{workspaceUid}/catalogs/{catalogUid}/documents` +List RAG documents in the KB. -Document **metadata** CRUD. A `Document` is a named entry in a -catalog — the metadata row the in-process ingest pipeline attaches -vectors to. `PUT` updates metadata only; content changes go through -`POST /ingest` (sync) or `POST /ingest?async=true` (returns 202 with -a job pointer), both documented further down. +- **200** — paginated `RagDocument` records +- **404** `workspace_not_found` / `knowledge_base_not_found` -A `Document`: +A `RagDocument`: ```json { - "workspace": "…", - "catalogUid": "…", - "documentUid": "…", + "workspaceId": "…", + "knowledgeBaseId": "…", + "documentId": "…", "sourceDocId": null, "sourceFilename": "readme.md", "fileType": "text/markdown", "fileSize": 1024, - "md5Hash": null, + "contentHash": "sha256:…", "chunkTotal": null, "ingestedAt": null, "updatedAt": "…", @@ -717,61 +673,45 @@ A `Document`: } ``` -`status` is one of `pending | chunking | embedding | writing | ready | -failed`. The in-process ingest pipeline (sync + async) is the -canonical writer of `status` / `errorMessage` / `chunkTotal` / -`ingestedAt`. Clients can also set these directly via `PUT` so an -external ingest driver can own the lifecycle if it prefers. +`status` is one of `pending | chunking | embedding | writing | ready +| failed`. The KB ingest pipeline is the canonical writer of +`status` / `errorMessage` / `chunkTotal` / `ingestedAt`. Clients +can also set these directly via `PUT` if they own the lifecycle +externally. -### `GET` +### `POST /{knowledgeBaseUid}/documents` -List documents in the catalog. - -- **200** — paginated `Document` records -- **404** `workspace_not_found` / `catalog_not_found` - -### `POST` - -Register a document in the catalog. - -**Request** — all fields optional except uniqueness of `uid` within -the catalog: +Register a document in the KB without running the ingest pipeline. 
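For instance, an external ingest driver that owns the lifecycle might register the row, run its own pipeline elsewhere, and report status via `PUT`. A minimal sketch — the base URL and ids below are placeholders, the routes and fields are the documented ones:

```ts
// Hypothetical client calls against the documented routes; ids are made up.
const base =
  "http://localhost:8080/api/v1/workspaces/ws-123/knowledge-bases/kb-456";
const headers = { "content-type": "application/json" };

const doc = await fetch(`${base}/documents`, {
  method: "POST",
  headers,
  body: JSON.stringify({ sourceFilename: "readme.md", fileType: "text/markdown" }),
}).then((r) => r.json()); // 201 — status defaults to "pending"

// ...external chunk / embed / upsert runs somewhere else...

await fetch(`${base}/documents/${doc.documentId}`, {
  method: "PUT",
  headers,
  body: JSON.stringify({ status: "ready", chunkTotal: 12 }),
});
```

**Request**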
```json { "sourceFilename": "readme.md", "fileType": "text/markdown", "fileSize": 1024, + "contentHash": "sha256:…", "metadata": { "source": "upload" } } ``` -- **201** — the created `Document` (`status` defaults to `pending`, - `metadata` defaults to `{}`) -- **404** `workspace_not_found` / `catalog_not_found` -- **409** `conflict` — `uid` collision within the same catalog - -### `GET /{documentUid}` / `PUT /{documentUid}` / `DELETE /{documentUid}` +- **201** — the created `RagDocument` (`status` defaults to + `pending`, `metadata` defaults to `{}`) +- **404** `workspace_not_found` / `knowledge_base_not_found` +- **409** `conflict` — `uid` collision within the same KB -Fetch / patch / delete. `PUT` accepts every field from the create body -(all optional) and updates only the fields present. Cross-catalog -access — requesting a document from a catalog it does not belong to — -returns `404 document_not_found`. +### `GET /{knowledgeBaseUid}/documents/{documentUid}` / `PUT /{documentUid}` / `DELETE /{documentUid}` -`DELETE` cascades into the bound vector store: the document's chunks -(matched by `payload.documentUid`) are removed before the document -row is dropped, so a successful delete leaves no traces in -catalog-scoped search. Drivers exposing `deleteRecords` use a single -bulk call; older drivers fall back to a `listRecords` + per-row -delete loop. Catalogs with no `vectorStore` binding skip the cascade -and only drop the row. +Fetch / patch / delete. `PUT` accepts every field from create (all +optional). `DELETE` cascades into the KB's collection: chunks +matched by `payload.documentUid` are removed before the row is +dropped, so a successful delete leaves no traces in KB-scoped +search. Drivers exposing `deleteRecords` use a single bulk call; +older drivers fall back to a `listRecords` + per-row delete loop. -### `GET /{documentUid}/chunks` +### `GET /{knowledgeBaseUid}/documents/{documentUid}/chunks` Lists the chunks the ingest pipeline extracted from this document. -Reads raw records out of the catalog's bound vector store filtered -on `documentUid`, sorts by the `chunkIndex` payload key, and -returns: +Reads raw records out of the KB's collection filtered on +`documentUid`, sorts by the `chunkIndex` payload key, and returns: ```json [ @@ -780,7 +720,7 @@ returns: "chunkIndex": 0, "text": "First paragraph about apples.", "payload": { - "catalogUid": "…", + "knowledgeBaseUid": "…", "documentUid": "…", "chunkIndex": 0, "chunkText": "First paragraph about apples.", @@ -795,104 +735,19 @@ Query params: - `limit` (1–1000, default 1000) — caps the number of chunks returned. -The ingest pipeline stamps the chunk's text into the reserved -`chunkText` payload key, so the response always carries the source -text — even on collections with no `$vectorize` round-trip. -Records ingested before the `chunkText` key landed return -`text: null`. - - **200** — array of chunks, sorted by `chunkIndex` ascending -- **404** `workspace_not_found` / `catalog_not_found` / - `document_not_found` / `vector store_not_found` -- **409** `catalog_not_bound_to_vector_store` +- **404** `workspace_not_found` / `knowledge_base_not_found` / + `document_not_found` - **501** `list_records_not_supported` — driver doesn't expose `listRecords` -### `POST /search` - -Catalog-scoped vector / text search. Delegates to the vector store -bound at `catalog.vectorStore`, merging `catalogUid = catalog.uid` -into the effective filter so records outside the catalog are -invisible. 
- -**Request** — identical envelope to -`POST /vector-stores/{vectorStoreUid}/search`. Either `vector` OR `text` is -required; never both. - -```json -{ - "text": "how do refunds work?", - "topK": 5, - "filter": { "section": "billing" }, - "hybrid": true, - "lexicalWeight": 0.3, - "rerank": true -} -``` - -**Response** — `200` array of `SearchHit`, highest score first. - -**Scope merging.** The server sets `filter.catalogUid` to the path's -catalog UID unconditionally. Any caller-supplied `catalogUid` is -overridden — a search can never escape its catalog. Other filter -keys merge normally. - -**Hybrid + rerank lanes.** - -- `hybrid: true` runs the driver's combined vector + lexical lane. - Defaults to the bound store's `lexical.enabled`. Requires `text` — - the lexical signal has nothing to score against without it. - `lexicalWeight` (0..1, default 0.5) tunes how much the lexical - score contributes vs. the vector score. -- `rerank: true` post-processes the retrieval hits through the - driver's reranker. Defaults to the bound store's - `reranking.enabled`. Also requires `text`. - -Drivers can support either, both, or neither. - -- `mock` — supports both when the descriptor's `embedding.provider` - is `"mock"`. Hybrid and rerank are two separate phases in the - dispatcher. -- `astra` — supports hybrid natively via `findAndRerank` (astra- - db-ts's built-in API). Requires the descriptor to opt into both - `lexical.enabled: true` **and** `reranking.enabled: true` — the - collection is provisioned with a lexical index and reranker - service at create time. Standalone `rerank` is **not** exposed on - Astra because the Data API combines retrieval + reranking in one - call; callers that want rerank set `hybrid: true`. `lexicalWeight` - is ignored on Astra — the reranker owns the blend. A - `rerank: true` request against an Astra workspace therefore - returns 501 unless paired with `hybrid: true`. - -**Errors** - -- **400** `validation_error` — `vector` / `text` presence rules, - including "hybrid: true requires text" and "rerank: true requires - text" -- **400** `embedding_unavailable` — the fallback embedder could not be - built (text path only) -- **400** `embedding_dimension_mismatch` — provider returned a vector - whose length doesn't match the bound store's declared dim -- **404** `workspace_not_found` / `catalog_not_found` -- **404** `vector_store_not_found` — the binding exists but the - referenced store no longer does (stale binding) -- **409** `catalog_not_bound_to_vector_store` — `catalog.vectorStore` - is `null` -- **501** `hybrid_not_supported` / `rerank_not_supported` — the - workspace kind's driver doesn't implement the requested lane - -Text records written through `POST /ingest` carry a `catalogUid` -stamp on every chunk payload — that's what lets this route scope -correctly. The route also works against any records that carry a -matching `catalogUid` regardless of how they arrived. - -### `POST /ingest` +### `POST /{knowledgeBaseUid}/ingest` Synchronous end-to-end ingest. Chunks the input text, embeds every -chunk (server-side via `$vectorize` where the bound store supports -it, otherwise client-side via the descriptor's `embedding` config), -upserts the chunks into the bound vector store, and creates a -`Document` metadata row with `status: ready` + `chunkTotal`. 
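The stage order and the failure semantics documented below reduce to a short sketch; the handles and the chunk-id scheme here are hypothetical, not the runtime's ingest code:

```ts
// Illustrative pipeline skeleton, not runtime source.
interface IngestDeps {
  chunk(text: string): string[];
  embed(chunks: string[]): Promise<number[][]>; // client-side fallback lane
  upsert(records: { id: string; vector: number[]; payload: object }[]): Promise<void>;
  patchDocument(patch: { status: string; chunkTotal?: number; errorMessage?: string }): Promise<void>;
}

async function ingestSync(
  deps: IngestDeps,
  knowledgeBaseUid: string,
  documentUid: string,
  text: string,
) {
  try {
    const chunks = deps.chunk(text);
    const vectors = await deps.embed(chunks);
    await deps.upsert(
      chunks.map((chunkText, chunkIndex) => ({
        id: `${documentUid}:${chunkIndex}`, // id scheme is made up here
        vector: vectors[chunkIndex],
        // Reserved payload keys the runtime always stamps:
        payload: { knowledgeBaseUid, documentUid, chunkIndex, chunkText },
      })),
    );
    await deps.patchDocument({ status: "ready", chunkTotal: chunks.length });
    return { documentUid, chunkTotal: chunks.length };
  } catch (err) {
    // Mark the row failed before re-raising, per the failure semantics below.
    await deps.patchDocument({ status: "failed", errorMessage: String(err) });
    throw err;
  }
}
```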
+chunk through the KB's bound embedding service (server-side via +`$vectorize` where the driver supports it, otherwise client-side), +upserts the chunks into the KB's collection, and creates a +`RagDocument` row with `status: ready` + `chunkTotal`. **Request** @@ -905,11 +760,11 @@ upserts the chunks into the bound vector store, and creates a } ``` -All fields except `text` are optional. `chunker` overrides the -runtime defaults for this call only. `metadata` is merged onto every -chunk's payload; the reserved keys `catalogUid`, `documentUid`, and -`chunkIndex` are always set by the runtime and will override any -caller-supplied values. `text` is capped at 200,000 characters. +`chunker` overrides the runtime defaults for this call only. +`metadata` is merged onto every chunk's payload; the reserved keys +`knowledgeBaseUid`, `documentUid`, `chunkIndex`, and `chunkText` are +always set by the runtime and override any caller-supplied values. +`text` is capped at 200,000 characters. **Response 201** @@ -920,39 +775,22 @@ caller-supplied values. `text` is capped at 200,000 characters. } ``` -**Chunk payloads.** Every chunk upserted to the vector store carries: +**Chunk payloads.** Every chunk upserted carries: -- `catalogUid` — the catalog's UID (used by `/documents/search`) -- `documentUid` — the UID of the `Document` row this ingest created +- `knowledgeBaseUid` — the KB's UID (used by `/search`) +- `documentUid` — the UID of the `RagDocument` row this ingest created - `chunkIndex` — 0-based position within the source document -- `chunker.id` — the chunker impl that produced the slice - (`recursive-char:1` today) +- `chunkText` — the chunk's raw text (read back through `/chunks`) - Plus every caller-supplied `metadata` key -**Errors** - -- **400** `validation_error` — missing/empty `text`, bad chunker - config, or the Zod schema otherwise fails -- **400** `embedding_unavailable` — client-side embedding fallback - could not build an embedder (missing secret, etc.) -- **400** `embedding_dimension_mismatch` — embedder dimension - disagrees with the bound store -- **404** `workspace_not_found` / `catalog_not_found` -- **404** `vector_store_not_found` — stale binding (catalog points at - a deleted store) -- **409** `catalog_not_bound_to_vector_store` — `catalog.vectorStore` - is `null` - **Failure semantics.** When chunking or upsert throws, the -`Document` row is marked `status: failed` with `errorMessage` before -the error is re-raised. Operators can inspect the row via -`GET /documents/{documentUid}`. +`RagDocument` row is marked `status: failed` with `errorMessage` +before the error is re-raised. -### `POST /ingest?async=true` +### `POST /{knowledgeBaseUid}/ingest?async=true` -Same request body as the sync variant. The pipeline runs in the -background; the response returns immediately with a job pointer so -the UI doesn't block on long uploads. +Same body. The pipeline runs in the background; the response +returns immediately with a job pointer. **Response 202** @@ -962,7 +800,7 @@ the UI doesn't block on long uploads. "workspace": "…", "jobId": "…", "kind": "ingest", - "catalogUid": "…", + "knowledgeBaseUid": "…", "documentUid": "…", "status": "pending", "processed": 0, @@ -976,19 +814,13 @@ the UI doesn't block on long uploads. } ``` -Errors are the same set as the sync path — validation / -embedding / not-found / 409. A 4xx means the request was rejected -outright; nothing was enqueued and no job row exists. 
- -Once a job is running, failures are captured into the job record -(`status: failed`, `errorMessage` populated) and the document row -(also `status: failed`). The HTTP response has already been sent by -then. +Errors are the same set as the sync path. A 4xx means the request +was rejected outright; nothing was enqueued and no job row exists. -**Progress callbacks.** The background worker reports -`{processed, total}` via `JobStore.update`. Today it fires once -before upsert (`processed: 0`) and once after (`processed: total`); -later slices can emit per-batch updates without a contract change. +Once the job is running, failures are captured into the job record +(`status: failed`, `errorMessage` populated) and the document row. +The `runKbIngestJob` worker resolves the KB descriptor on every +call so renames or service swaps mid-flight don't drift. --- @@ -1028,13 +860,15 @@ persistent job backends. | `workspace` | uuid | Owning workspace | | `jobId` | uuid | | | `kind` | `"ingest"` | Discriminator — more kinds arrive with more async ops | -| `catalogUid` | uuid or null | Set for ingest jobs | +| `knowledgeBaseUid` | uuid or null | Set for ingest jobs | | `documentUid` | uuid or null | Set for ingest jobs | | `status` | `"pending"` \| `"running"` \| `"succeeded"` \| `"failed"` | Terminal: succeeded, failed | | `processed` | int | Units completed | | `total` | int or null | Units expected (null if unknown) | | `result` | object or null | Kind-specific summary on success (ingest: `{ chunks: N }`) | | `errorMessage` | string or null | Populated on `failed` | +| `leasedBy` | string or null | Replica id holding the lease on a `running` job (cross-replica resume) | +| `leasedAt` | iso-8601 or null | Last heartbeat from the lease holder | | `createdAt` | iso-8601 | | | `updatedAt` | iso-8601 | | @@ -1057,106 +891,26 @@ resume-worker promotes it to `failed`. Callers that need restart-resume today should treat any `running` job older than a heartbeat threshold as failed and resubmit. ---- - -## `/api/v1/workspaces/{workspaceUid}/catalogs/{catalogUid}/queries` - -Saved search recipes scoped to a catalog. Each `SavedQuery` carries a -`text` plus optional `topK` and `filter`, and is replayed through the -catalog-scoped search path by `POST /{queryUid}/run`. - -Deleting a workspace or catalog cascades to its saved queries (every -backend — memory, file, astra). - -A `SavedQuery`: - -```json -{ - "workspace": "…", - "catalogUid": "…", - "queryUid": "…", - "name": "refunds", - "description": "billing questions", - "text": "how do refunds work?", - "topK": 5, - "filter": { "section": "billing" }, - "createdAt": "…", - "updatedAt": "…" -} -``` - -Text-only by design — saved vectors are rarely the right abstraction -and serialize heavily. Callers wanting vector-form queries write the -search body directly against `POST /documents/search`. - -### `GET` - -List saved queries in the catalog. - -- **200** — paginated `SavedQuery` records -- **404** `workspace_not_found` / `catalog_not_found` - -### `POST` - -Create a saved query. `uid` is optional. - -```json -{ - "name": "refunds", - "description": "billing questions", - "text": "how do refunds work?", - "topK": 5, - "filter": { "section": "billing" } -} -``` - -- **201** — the created `SavedQuery` -- **404** `workspace_not_found` / `catalog_not_found` -- **409** `conflict` — `uid` collision within the same catalog - -### `GET /{queryUid}` / `PUT /{queryUid}` / `DELETE /{queryUid}` - -Fetch / patch / delete. 
`PUT` accepts every field from create (all -optional). Deleting a non-existent query returns -`404 saved_query_not_found`. - -### `POST /{queryUid}/run` - -Execute a saved query and return the hits. The catalog's UID is -merged into the effective filter — a saved filter carrying a -different `catalogUid` is silently overridden, so a saved query can -never escape its catalog. - -**Response 200** — array of `SearchHit` (same shape as -`/documents/search`). - -**Errors** - -- **400** `embedding_unavailable` / `embedding_dimension_mismatch` - (client-side embedding fallback path) -- **404** `workspace_not_found` / `catalog_not_found` / - `saved_query_not_found` / `vector_store_not_found` -- **409** `catalog_not_bound_to_vector_store` - ---- ## Planned routes These do not exist yet. Shapes may shift before they land. -The Phase 2 routes (saved queries CRUD + `/run`, async ingest, jobs -poll + SSE) and the Phase 3 playground dispatch (text/vector via the -existing `POST .../search` route) shipped in #53–#60 and are -documented above. - -### Phase 4+ — Chats, MCP +### Stage 2 — agents, conversations, messages -Reserved: +The schema for `wb_agentic_*` and `wb_config_llm_service_*` / +`wb_config_mcp_tools_*` is provisioned at boot but not yet wired +through the runtime. The route shapes are reserved: -- `/api/v1/workspaces/{w}/chats/…` -- `/api/v1/workspaces/{w}/mcp/…` +- `/api/v1/workspaces/{w}/llm-services` — CRUD +- `/api/v1/workspaces/{w}/mcp-tools` — CRUD +- `/api/v1/workspaces/{w}/agents` — CRUD; an agent composes one LLM + + a list of MCP tools + a list of knowledge bases +- `/api/v1/workspaces/{w}/agents/{a}/conversations` — CRUD; nested + `messages` resource +- `/api/v1/workspaces/{w}/agents/{a}/run` — execution loop -Contracts finalized as those phases approach. +See [`roadmap.md`](roadmap.md) for the phase plan. --- diff --git a/docs/architecture.md b/docs/architecture.md index e58fe74..44ad4a1 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,43 +1,49 @@ # Architecture AI Workbench is a polyglot HTTP runtime sitting in front of Astra DB. -It exposes a stable `/api/v1/*` contract for workspaces, document -catalogs, vector-store descriptors, and (in later phases) documents, -ingestion, and search. Each **language-native implementation of the -runtime** is a "green box"; the default TypeScript green box is -embedded with the UI, and alternatives live under -[`runtimes/`](../runtimes/README.md). +It exposes a stable `/api/v1/*` contract for workspaces, knowledge +bases, execution services (chunking / embedding / reranking), +documents, ingestion, and search. Each **language-native +implementation of the runtime** is a "green box"; the default +TypeScript green box is embedded with the UI, and alternatives live +under [`runtimes/`](../runtimes/README.md). ## Design principles -1. **One HTTP contract, N runtimes.** Workspaces, catalogs, and - vector-store descriptors are defined by the HTTP API — not by any - one runtime's internals. Every language green box honors the same - contract, enforced by +1. **One HTTP contract, N runtimes.** Workspaces, knowledge bases, + execution services, and RAG documents are defined by the HTTP API + — not by any one runtime's internals. Every language green box + honors the same contract, enforced by [fixture-based conformance tests](./conformance.md). 2. **Thin, boring runtime core.** The runtime is an HTTP server + a pluggable control-plane store. Complexity lives in pluggable - services (chunking, embedding, reranking in later phases). 
+ services bound to a knowledge base (chunking, embedding, + reranking). 3. **Workspaces are runtime data, not config.** `workbench.yaml` picks which control-plane backend to use; workspaces themselves are mutable records managed via the HTTP API. -4. **Driver-based control plane.** `memory` for CI and demos, `file` +4. **A KB owns its collection end-to-end.** Creating a knowledge + base auto-provisions the underlying Astra collection + (`wb_vectors_`), sized to the bound embedding service's + dimension; deleting the KB drops the collection. The control + plane and data plane never diverge. +5. **Driver-based control plane.** `memory` for CI and demos, `file` for single-node self-hosted, `astra` for production. Same contract. -5. **Astra-native where real.** The `astra` backend uses +6. **Astra-native where real.** The `astra` backend uses [`@datastax/astra-db-ts`](https://github.com/datastax/astra-db-ts) directly. The Python runtime uses [`astrapy`](https://github.com/datastax/astrapy). No wrapper libraries in between. -6. **Secrets by reference.** Credentials live behind +7. **Secrets by reference.** Credentials live behind `SecretRef` pointers (`env:FOO` / `file:/path`) resolved at use time by a pluggable provider. No raw secrets in config, records, or logs. -7. **Immutable records.** Every update returns a new object. The +8. **Immutable records.** Every update returns a new object. The in-memory backend holds `Map`; the file backend rewrites atomically; the astra backend does `$set` updates through the Data API. -8. **Contract-first for new surfaces.** The HTTP API is versioned +9. **Contract-first for new surfaces.** The HTTP API is versioned (`/api/v1/…`) and documented in [`api-spec.md`](api-spec.md) and the generated OpenAPI at `/api/v1/openapi.json`. @@ -93,21 +99,30 @@ All three pass the same shared contract suite in ### Vector-store drivers (`runtimes/typescript/src/drivers/`) Data-plane counterparts to the control-plane store. Where -`ControlPlaneStore` owns **descriptors**, the `VectorStoreDriver` -owns **actual vectors** on a per-workspace backend. +`ControlPlaneStore` owns **records** (workspaces, KBs, services, +RAG documents), the `VectorStoreDriver` owns **actual vectors** in +the per-KB Astra collection. 
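+
+A rough sketch of that split, with simplified and partly invented
+signatures (the real interface, with the exact types, is the first
+row of the table below):
+
+```ts
+// Illustrative shapes only; see vector-store.ts for the real ones.
+interface VectorRecord {
+  id: string;
+  vector: number[];
+  payload?: Record<string, unknown>;
+}
+
+interface SearchHit {
+  id: string;
+  score: number;
+  payload?: Record<string, unknown>;
+}
+
+interface VectorStoreDriver {
+  // Required surface: every driver implements these.
+  createCollection(name: string): Promise<void>;
+  dropCollection(name: string): Promise<void>;
+  upsert(records: readonly VectorRecord[]): Promise<void>;
+  deleteRecord(id: string): Promise<void>;
+  search(vector: readonly number[], topK: number): Promise<SearchHit[]>;
+
+  // Optional capabilities: a driver that lacks one surfaces it to
+  // callers as 501 (hybrid_not_supported / rerank_not_supported, …).
+  searchByText?(text: string, topK: number): Promise<SearchHit[]>;
+  upsertByText?(records: readonly { id: string; text: string }[]): Promise<void>;
+  searchHybrid?(text: string, topK: number): Promise<SearchHit[]>;
+  rerank?(text: string, hits: readonly SearchHit[]): Promise<SearchHit[]>;
+  listRecords?(filter: Record<string, unknown>): Promise<VectorRecord[]>;
+  deleteRecords?(filter: Record<string, unknown>): Promise<number>;
+}
+```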
| File | Purpose | |---|---| -| [`vector-store.ts`](../runtimes/typescript/src/drivers/vector-store.ts) | Driver interface — `createCollection`, `dropCollection`, `upsert`, `deleteRecord`, `search`, plus optional `searchByText`, `upsertByText`, `searchHybrid`, `rerank`, `listAdoptable` (adopt-existing), `listRecords` (chunks under a document), `deleteRecords` (delete-document cascade) | +| [`vector-store.ts`](../runtimes/typescript/src/drivers/vector-store.ts) | Driver interface — `createCollection`, `dropCollection`, `upsert`, `deleteRecord`, `search`, plus optional `searchByText`, `upsertByText`, `searchHybrid`, `rerank`, `listRecords` (chunks under a document), `deleteRecords` (delete-document cascade) | | [`mock/store.ts`](../runtimes/typescript/src/drivers/mock/store.ts) | In-memory driver; used by workspaces with `kind: "mock"` and by the conformance suite | | [`astra/store.ts`](../runtimes/typescript/src/drivers/astra/store.ts) | Data API Collections via `astra-db-ts`; per-workspace `DataAPIClient` cache, lazy init | | [`registry.ts`](../runtimes/typescript/src/drivers/registry.ts) | Dispatches based on `workspace.kind`; unknown kinds surface as `503 driver_unavailable` | | [`factory.ts`](../runtimes/typescript/src/drivers/factory.ts) | Wires the registry at startup from the `SecretResolver` | -`POST /api/v1/workspaces/{w}/vector-stores` is the transactional -entry point: it writes the descriptor, calls the driver to create -the collection, and rolls back the descriptor on failure so the -control plane and data plane never diverge. +The route layer in +[`api-v1/kb-descriptor.ts`](../runtimes/typescript/src/routes/api-v1/kb-descriptor.ts) +materialises a driver-facing descriptor on the fly from a KB plus +its bound embedding/reranking services. Drivers and the search / +upsert dispatch surfaces consume this synthesised shape unchanged — +they don't need to know KBs exist. + +`POST /api/v1/workspaces/{w}/knowledge-bases` is the transactional +entry point: it writes the KB row, calls the driver to create the +collection, and rolls back the row on failure so the control plane +and data plane never diverge. `DELETE` reverses this — drop the +collection first, then the row. Both drivers pass the same 8-assertion [driver contract suite](../runtimes/typescript/tests/drivers/contract.ts). The Astra @@ -117,7 +132,7 @@ gated on `ASTRA_DB_*` env vars and lives in a follow-up. ### Astra client (`runtimes/typescript/src/astra-client/`) -Thin layer over `astra-db-ts` scoped to the four `wb_*` tables: +Thin layer over `astra-db-ts` scoped to the `wb_*` tables: - [`table-definitions.ts`](../runtimes/typescript/src/astra-client/table-definitions.ts) — Data API Table DDL. @@ -129,7 +144,7 @@ Thin layer over `astra-db-ts` scoped to the four `wb_*` tables: narrow structural interface used by the astra store (lets tests inject fakes). - [`client.ts`](../runtimes/typescript/src/astra-client/client.ts) — `openAstraClient()`: - creates the four tables idempotently at init and returns a + creates the tables idempotently at init and returns a `TablesBundle`. The Python runtime has a symmetric internal layer that wraps @@ -154,8 +169,13 @@ talking to workspace-scoped backends. 
|---|---|---| | [`operational.ts`](../runtimes/typescript/src/routes/operational.ts) | (unversioned) | `/`, `/healthz`, `/readyz`, `/version` | | [`api-v1/workspaces.ts`](../runtimes/typescript/src/routes/api-v1/workspaces.ts) | `/api/v1/workspaces` | Workspace CRUD | -| [`api-v1/catalogs.ts`](../runtimes/typescript/src/routes/api-v1/catalogs.ts) | `/api/v1/workspaces/{w}/catalogs` | Catalog CRUD | -| [`api-v1/vector-stores.ts`](../runtimes/typescript/src/routes/api-v1/vector-stores.ts) | `/api/v1/workspaces/{w}/vector-stores` | Descriptor CRUD | +| [`api-v1/knowledge-bases.ts`](../runtimes/typescript/src/routes/api-v1/knowledge-bases.ts) | `/api/v1/workspaces/{w}/knowledge-bases` | KB CRUD (POST auto-provisions collection) | +| [`api-v1/kb-data-plane.ts`](../runtimes/typescript/src/routes/api-v1/kb-data-plane.ts) | `…/knowledge-bases/{kb}/{records,search}` | Upsert / delete record / search | +| [`api-v1/kb-documents.ts`](../runtimes/typescript/src/routes/api-v1/kb-documents.ts) | `…/knowledge-bases/{kb}/{documents,ingest}` | Document metadata, sync + async ingest, chunk listing | +| [`api-v1/kb-descriptor.ts`](../runtimes/typescript/src/routes/api-v1/kb-descriptor.ts) | — | `resolveKb()` — synthesises a driver-facing descriptor from a KB + bound services | +| [`api-v1/{chunking,embedding,reranking}-services.ts`](../runtimes/typescript/src/routes/api-v1/) | `…/{chunking,embedding,reranking}-services` | Service CRUD | +| [`api-v1/jobs.ts`](../runtimes/typescript/src/routes/api-v1/jobs.ts) | `/api/v1/workspaces/{w}/jobs` | Job poll + SSE stream | +| [`api-v1/api-keys.ts`](../runtimes/typescript/src/routes/api-v1/api-keys.ts) | `/api/v1/workspaces/{w}/api-keys` | Per-workspace API-key management | | [`api-v1/helpers.ts`](../runtimes/typescript/src/routes/api-v1/helpers.ts) | — | Error mapping (invoked from app-level `onError`) | Route handlers validate with Zod (via `@hono/zod-openapi`) and @@ -166,28 +186,64 @@ envelope. ## Data model -Four `wb_*` Data API tables backed by CQL-style schemas. The exact -DDL lives in +Data API tables backed by CQL-style schemas. 
The exact DDL lives in [`runtimes/typescript/src/astra-client/table-definitions.ts`](../runtimes/typescript/src/astra-client/table-definitions.ts); here's the logical shape: ``` -wb_workspaces PK (uid) - uid, name, url, kind, credentials_ref, keyspace, created_at, updated_at - -wb_catalog_by_workspace PK ((workspace), uid) - name, description, vector_store, created_at, updated_at +wb_workspaces PK (uid) + uid, name, endpoint, kind, credentials_ref, keyspace, + created_at, updated_at -wb_vector_store_by_workspace PK ((workspace), uid) - name, vector_dimension, vector_similarity, - embedding_{provider,model,endpoint,dimension,secret_ref}, +wb_config_knowledge_bases_by_workspace PK ((workspace_id), knowledge_base_id) + name, description, status, + embedding_service_id, chunking_service_id, reranking_service_id, + language, vector_collection, lexical_{enabled,analyzer,options}, - reranking_{enabled,provider,model,endpoint,secret_ref}, created_at, updated_at -wb_documents_by_catalog PK ((workspace, catalog_uid), document_uid) - source_*, file_*, md5_hash, chunk_total, ingested_at, updated_at, +wb_config_chunking_service_by_workspace PK ((workspace_id), chunking_service_id) + name, description, status, + engine, engine_version, strategy, + {min,max}_chunk_size, chunk_unit, + overlap_size, overlap_unit, preserve_structure, + language, max_payload_size_kb, + enable_ocr, extract_tables, extract_figures, reading_order, + endpoint_*, request_timeout_ms, auth_type, credential_ref, + created_at, updated_at + +wb_config_embedding_service_by_workspace PK ((workspace_id), embedding_service_id) + name, description, status, + provider, model_name, embedding_dimension, distance_metric, + max_batch_size, max_input_tokens, + supported_languages SET, supported_content SET, + endpoint_*, request_timeout_ms, auth_type, credential_ref, + created_at, updated_at + +wb_config_reranking_service_by_workspace PK ((workspace_id), reranking_service_id) + name, description, status, + provider, engine, model_name, model_version, + max_candidates, scoring_strategy, + score_normalized, return_scores, max_batch_size, + supported_languages SET, supported_content SET, + endpoint_*, request_timeout_ms, auth_type, credential_ref, + created_at, updated_at + +wb_rag_documents_by_knowledge_base PK ((workspace_id, knowledge_base_id), document_id) + source_*, file_*, content_hash, chunk_total, + ingested_at, updated_at, status, error_message, metadata + +wb_rag_documents_by_knowledge_base_and_status (secondary index, by status) +wb_rag_documents_by_content_hash (dedup lookup) + +wb_jobs_by_workspace PK ((workspace), job_id) + kind, knowledge_base_uid, document_uid, status, + processed, total, result_json, error_message, + leased_by, leased_at, ingest_input_json, + created_at, updated_at + +wb_api_key_by_workspace, wb_api_key_lookup (per-workspace tokens) ``` **`kind`** on workspaces is one of `astra | hcd | openrag | mock`. It @@ -196,11 +252,26 @@ later, when a single runtime routes requests to different data-plane backends per workspace). The runtime's own control plane is separate — chosen via `workbench.yaml`. -**`wb_vector_store_by_workspace` is a DESCRIPTOR row**, not the -vector data. The actual Data API Collection holding vectors is a -separate object, provisioned transactionally by the workspace's -vector-store driver (see the *Vector-store drivers* section above) -when the descriptor is created. 
+
+**Knowledge bases own their collection.** `vector_collection` on
+the KB row is the auto-provisioned Astra collection name
+(`wb_vectors_`, hyphen-stripped). The actual vector data
+lives in that Data API Collection, provisioned transactionally
+when the KB is created and dropped when it's deleted.
+
+**Reserved chunk-payload keys.** The KB-scoped ingest pipeline
+stamps `knowledgeBaseUid`, `documentUid`, `chunkIndex`, and
+`chunkText` onto every chunk's payload so KB-scoped search and the
+chunk listing endpoint can filter / display them without a
+secondary lookup.
+
+**Stage 2 schema.** Five additional tables —
+`wb_config_llm_service_by_workspace`,
+`wb_config_mcp_tools_by_workspace`,
+`wb_agentic_agents_by_workspace`,
+`wb_agentic_conversations_by_agent`,
+`wb_agentic_messages_by_conversation` — are provisioned at boot but
+are not yet wired through the runtime. They land with the agent
+execution loop (roadmap Stage 2).

 ## Isolation and scoping

@@ -210,14 +281,23 @@ when the descriptor is created.
   returning nested resources. Requests against a non-existent
   workspace return `404 workspace_not_found`.
 - Cascade delete:
-  - `DELETE /api/v1/workspaces/{w}` → drops the workspace, its
-    catalogs, its vector-store descriptors, its documents, and the
-    underlying vector-store collections.
-  - `DELETE /api/v1/workspaces/{w}/catalogs/{c}` → drops the
-    catalog and its documents.
-- **Catalog → vector-store binding is N:1** (multiple catalogs may
-  share one underlying collection). This was a deliberate relaxation
-  from an earlier draft's strict 1:1 constraint.
+  - `DELETE /api/v1/workspaces/{w}` → drops the workspace, all
+    knowledge bases (and their underlying collections), all
+    execution services, all RAG documents, all API keys.
+  - `DELETE /api/v1/workspaces/{w}/knowledge-bases/{kb}` → drops
+    the underlying Astra collection first, then the KB row, then
+    cascades RAG document rows.
+- **Service → KB binding is N:1.** A KB binds exactly one
+  embedding service, one chunking service, and (optionally) one
+  reranking service. Multiple KBs can share the same service. A
+  service deletion is refused (409) while any KB still references
+  it.
+- **Service references are immutable post-create.** The
+  `embeddingServiceId` and `chunkingServiceId` on a KB are pinned
+  at creation time — vectors and chunks on disk are bound to the
+  models that produced them. Re-embedding requires a new KB; the
+  PUT schema is `.strict()` so accidentally including those keys
+  in an update body returns 400.

 ## Request flow (reference)

@@ -248,12 +328,14 @@ Client ──► POST /api/v1/workspaces body={name, kind}
          c.json(record, 201)
 ```

-The catalog ingest pipeline (Phase 2b — shipped) extends the same
-shape with calls to a `Chunker`, an `Embedder`, and the catalog's
-bound vector store, plus a `Document` row that tracks ingest
-status. Synchronous and async (`?async=true`) variants live at
-`POST /catalogs/{c}/ingest`; the async path returns 202 with a job
-pointer and updates progress through the `JobStore` until terminal.
+The KB ingest pipeline extends the same shape with calls to a
+`Chunker`, an `Embedder`, and the KB's auto-provisioned vector
+collection (resolved through `resolveKb`), plus a `RagDocument`
+row that tracks ingest status. Synchronous and async
+(`?async=true`) variants live at
+`POST /knowledge-bases/{kb}/ingest`; the async path returns 202
+with a job pointer and updates progress through the `JobStore`
+until terminal.
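+
+A minimal caller-side sketch of the async variant. The raw-text
+ingest body and the `jobId` field on the 202 pointer are assumptions
+here; check the generated OpenAPI doc at `/api/v1/openapi.json` for
+the exact wire shapes:
+
+```ts
+const base = "http://localhost:8080/api/v1";
+
+// Submit an async ingest, then poll the job to a terminal state.
+async function ingestAndWait(w: string, kb: string, text: string) {
+  const res = await fetch(
+    `${base}/workspaces/${w}/knowledge-bases/${kb}/ingest?async=true`,
+    {
+      method: "POST",
+      headers: { "content-type": "application/json" },
+      body: JSON.stringify({ text }), // body shape assumed for illustration
+    },
+  );
+  if (res.status !== 202) throw new Error(`ingest rejected: ${res.status}`);
+  const { jobId } = (await res.json()) as { jobId: string };
+
+  for (;;) {
+    const job = await fetch(`${base}/workspaces/${w}/jobs/${jobId}`)
+      .then((r) => r.json());
+    if (job.status === "succeeded") return job.result; // ingest: { chunks: N }
+    if (job.status === "failed") throw new Error(job.errorMessage ?? "failed");
+    await new Promise((r) => setTimeout(r, 500)); // or use the SSE events route
+  }
+}
+```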
## Conformance diff --git a/docs/configuration.md b/docs/configuration.md index 9df2429..810f0b0 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -4,9 +4,9 @@ Runtime behavior is driven by a single YAML file, conventionally named `workbench.yaml`. The runtime loads it at startup and validates it against a strict schema. -**Workspaces, catalogs, and vector stores are not in config.** They're -runtime data, mutable via the HTTP API. `workbench.yaml` decides two -things: +**Workspaces, knowledge bases, and execution services are not in +config.** They're runtime data, mutable via the HTTP API. +`workbench.yaml` decides two things: 1. Where that data is persisted (the **control-plane backend**). 2. Optionally, which **seed workspaces** to load into the memory @@ -83,7 +83,7 @@ Production deployments should start from ### `controlPlane` -Picks where workspaces, catalogs, vector-store descriptors, and +Picks where workspaces, knowledge bases, execution services, and RAG documents are persisted. Discriminated on `driver`. #### `memory` (default) @@ -132,7 +132,7 @@ multi-writer-safe. |-------|------|----------|-------| | `endpoint` | URL | yes | Astra Data API endpoint | | `tokenRef` | SecretRef | yes | Pointer to the application token (`env:…` / `file:…`) | -| `keyspace` | string | no (default `workbench`) | Keyspace hosting the four `wb_*` tables | +| `keyspace` | string | no (default `workbench`) | Keyspace hosting the `wb_*` control-plane tables | | `jobPollIntervalMs` | int (50–60000) | `500` | Cross-replica job-subscriber poll interval in ms. Each subscribed `(workspace, jobId)` pair is re-read at this cadence so SSE clients on a different replica from the worker still see updates. Same-replica updates fan out instantly; the poller is a no-op when no one is subscribed. Raise for cost-sensitive deployments where second-scale staleness is fine; lower for hot SSE paths. Astra-only — `memory` and `file` are single-replica by definition. | | `jobsResume` | object | off | Cross-replica orphan-sweeper config. See below. 
| diff --git a/docs/conformance.md b/docs/conformance.md index 9a2f176..aa73749 100644 --- a/docs/conformance.md +++ b/docs/conformance.md @@ -24,9 +24,10 @@ conformance/ ├── scenarios.md ← narrative counterpart ├── fixtures/ ← expected normalized responses │ ├── workspace-crud-basic.json -│ ├── catalog-under-workspace.json -│ ├── vector-store-definition.json -│ └── vector-store-upsert-and-search.json +│ ├── workspace-kind-is-immutable.json +│ ├── workspace-credentials-must-be-secret-ref.json +│ ├── workspace-test-connection-mock.json +│ └── workspace-api-key-lifecycle.json ├── mock-astra/ │ └── server.ts ← stand-in Astra endpoint (Node) ├── normalize.mjs ← shape-agnostic placeholder scrubber @@ -71,38 +72,33 @@ Current scenarios: | Slug | Covers | |---|---| | `workspace-crud-basic` | Workspace POST / GET / PUT / DELETE lifecycle | -| `catalog-under-workspace` | Catalogs scoped per workspace | -| `vector-store-definition` | Vector-store descriptor create + read | -| `vector-store-upsert-and-search` | Phase 1b data plane — upsert, search with payload filter, single-record delete (and re-delete noop) | -| `catalog-vector-store-reference-integrity` | Catalog bindings must reference existing same-workspace vector stores; referenced vector stores delete with `409 conflict` | -| `document-crud-basic` | Document metadata CRUD + cross-catalog isolation | | `workspace-kind-is-immutable` | Workspace `kind` cannot be changed after creation | | `workspace-credentials-must-be-secret-ref` | Raw credential values are rejected before reaching the SecretResolver | | `workspace-test-connection-mock` | Mock workspace connection probe response shape | | `workspace-api-key-lifecycle` | API-key issue, list, revoke, list lifecycle | -| `catalog-ingest-basic` | Sync ingest — chunk + embed + upsert + Document row, plus `409 catalog_not_bound_to_vector_store` on an unbound catalog | -| `catalog-scoped-document-search` | Search merges `catalogUid` into the filter; foreign-catalog records stay invisible; unbound catalogs return 409 | -| `catalog-saved-queries` | Saved-query CRUD + post-delete 404 | -| `vector-store-text-dispatch-mock` | Driver-native `searchByText` on a `mock` workspace with `embedding.provider: mock` | -| `vector-store-hybrid-and-rerank-mock` | `hybrid: true` + `rerank: true` + `lexicalWeight` lanes; `400 validation_error` for hybrid with a vector body | -| `catalog-async-ingest-202` | 202 wire shape for `?async=true` (job snapshot at creation time is deterministic; eventual completion stays in runtime tests) | - -The runtime additionally tests the following routes through its -own Vitest suite (timing- or driver-method-dependent, so they -don't fit the cross-runtime fixture model): - -- `GET /catalogs/{c}/documents/{d}/chunks` — driver-side + +The corpus shrank during the catalog → knowledge-base refactor: every +prior catalog / vector-store fixture was retired, and the +knowledge-base equivalents have not yet been authored. They will land +back as the new fixture set bakes in. Until then, the runtime +exercises every KB / services / ingest / search route through its +Vitest suite (`tests/knowledge-bases.test.ts`, `tests/ingest/`, +plus the route-level tests under `tests/`). 
+ +Routes that stay runtime-only by design (timing- or +driver-method-dependent): + +- `GET /knowledge-bases/{kb}/documents/{d}/chunks` — driver-side `listRecords` filtered by `documentUid` -- `GET /vector-stores/discoverable` + `POST /vector-stores/adopt` - — driver-side `listAdoptable`, mocked -- `DELETE /catalogs/{c}/documents/{d}` — chunk-cascade via +- `DELETE /knowledge-bases/{kb}/documents/{d}` — chunk-cascade via driver `deleteRecords` These move into conformance once a second runtime starts implementing them and the fixture format proves stable across drivers. -More land as chat and MCP routes ship. +More land as KB scenarios are reauthored and as chat / MCP routes +ship. ## Fixtures @@ -246,7 +242,7 @@ The conformance harness above runs against the deterministic harness lives at [`runtimes/typescript/scripts/smoke-astra.ts`](../runtimes/typescript/scripts/smoke-astra.ts) that boots the runtime in-process against a **real** Astra Data API -and exercises the full workspace → vector-store → catalog → +and exercises the full workspace → services → knowledge-base → sync ingest → async ingest → search → cleanup pipeline. Run locally with: diff --git a/docs/cross-replica-jobs.md b/docs/cross-replica-jobs.md index f75bae8..c19ef47 100644 --- a/docs/cross-replica-jobs.md +++ b/docs/cross-replica-jobs.md @@ -32,11 +32,11 @@ terminal state. ## Today's behavior The async-ingest path lives in -[`runtimes/typescript/src/routes/api-v1/documents.ts`](../runtimes/typescript/src/routes/api-v1/documents.ts): +[`runtimes/typescript/src/routes/api-v1/kb-documents.ts`](../runtimes/typescript/src/routes/api-v1/kb-documents.ts): -1. `POST /catalogs/{c}/ingest?async=true` calls `jobs.create(...)`, - spawns `void runAsyncIngest({...})`, and returns 202 to the - caller with the job pointer. +1. `POST /knowledge-bases/{kb}/ingest?async=true` calls + `jobs.create(...)`, spawns `void runAsyncIngest({...})`, and + returns 202 to the caller with the job pointer. 2. The detached worker drives chunking → embedding → upsert, updating the job record via `jobsStore.update(...)` along the way. Failure modes flip the record to `failed` with a sanitized diff --git a/docs/green-boxes.md b/docs/green-boxes.md index 63f87ff..ac07d06 100644 --- a/docs/green-boxes.md +++ b/docs/green-boxes.md @@ -21,7 +21,7 @@ picks which one to target at deploy time via `BACKEND_URL`. 
| Runtime | Location | Status | Astra SDK | |---|---|---|---| -| **TypeScript** (default) | [`runtimes/typescript/`](../runtimes/typescript/) | Operational through Phase 3 + auth (UI, playground, API keys, OIDC login + silent refresh, vector/text search, hybrid + rerank, sync/async ingest with pipeline resume after orphan reclaim, durable JobStore with cross-replica subscription polling + lease/heartbeat + orphan sweeper, saved queries, chunks listing, document delete cascade, adopt-existing-collection flow) | `@datastax/astra-db-ts` | +| **TypeScript** (default) | [`runtimes/typescript/`](../runtimes/typescript/) | Operational through Phase 3 + auth (UI, playground, API keys, OIDC login + silent refresh, knowledge bases with auto-provisioned collections, chunking / embedding / reranking services, vector/text search, hybrid + rerank, sync/async ingest with pipeline resume after orphan reclaim, durable JobStore with cross-replica subscription polling + lease/heartbeat + orphan sweeper, chunks listing, document delete cascade) | `@datastax/astra-db-ts` | | **Python** | [`runtimes/python/`](../runtimes/python/) | FastAPI scaffold — routes return 501 until implemented | `astrapy` (pending) | | **Java** | [`runtimes/java/`](../runtimes/java/) | Spring Boot scaffold — routes return 501 until implemented | `astra-db-java` (pending) | @@ -44,9 +44,11 @@ Every green box serves: | `GET /docs` | OpenAPI reference UI | | `GET /api/v1/openapi.json` | Machine-readable OpenAPI 3.1 doc | | `(CRUD)` `/api/v1/workspaces[/{uid}]` | Workspace lifecycle | -| `(CRUD)` `/api/v1/workspaces/{w}/catalogs[/{uid}]` | Catalog lifecycle | -| `(CRUD)` `/api/v1/workspaces/{w}/vector-stores[/{uid}]` | Descriptor lifecycle (POST also provisions the collection) | -| `POST / DELETE / POST` | `/api/v1/workspaces/{w}/vector-stores/{v}/records`, `.../records/{rid}`, `.../search` | Data plane — upsert, delete, vector search | +| `(CRUD)` `/api/v1/workspaces/{w}/{chunking,embedding,reranking}-services[/{uid}]` | Service-definition lifecycle | +| `(CRUD)` `/api/v1/workspaces/{w}/knowledge-bases[/{uid}]` | KB lifecycle (POST auto-provisions the underlying vector collection; DELETE drops it) | +| `POST / DELETE / POST` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/records`, `.../records/{rid}`, `.../search` | Data plane — upsert, delete, vector / hybrid search | +| `(CRUD)` `/api/v1/workspaces/{w}/knowledge-bases/{kb}/documents[/{uid}]` | Document metadata + chunks listing under a KB | +| `POST` | `/api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest[?async=true]` | Sync / async ingest pipeline | Full contract details: [`api-spec.md`](api-spec.md). diff --git a/docs/overview.md b/docs/overview.md index cf80d6b..e637d5a 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -2,9 +2,10 @@ AI Workbench is a self-hosted control center for building and operating retrieval-backed AI applications on DataStax Astra. It gives teams one -place to connect workspaces, organize source material, create vector -stores, ingest documents, test search behavior, and keep the same -workflow portable across runtime implementations. +place to connect workspaces, register chunking / embedding / reranking +services, compose them into knowledge bases, ingest documents, test +search behavior, and keep the same workflow portable across runtime +implementations. The goal is not to make operators think about runtimes first. 
The goal is to help a team get from "we have documents and embeddings" to "we can @@ -17,13 +18,14 @@ together a one-off admin app for every project. to manage. - **Connect Astra-backed stores** while keeping credentials outside records and config. -- **Model catalogs** around the content domains your application queries. +- **Define execution services** (chunking, embedding, reranking) once + per workspace and bind them into knowledge bases. +- **Spin up knowledge bases** that auto-provision an Astra collection + sized to the bound embedding service. - **Ingest documents** through sync or async flows with job status and server-sent progress updates. - **Test retrieval quality** in the browser with text, vector, hybrid, - and rerank search paths. -- **Save repeatable queries** so useful checks do not live only in a - developer's scratch file. + and rerank search paths against a chosen knowledge base. - **Run the same HTTP contract** from the default TypeScript runtime or another language-native runtime as the project evolves. @@ -40,7 +42,7 @@ runtime and UI: | Need | Workbench surface | |---|---| | Bring up a retrieval environment quickly | One Docker image with the UI and default runtime | -| Keep project data isolated | Workspace-scoped catalogs, vector stores, documents, jobs, and API keys | +| Keep project data isolated | Workspace-scoped knowledge bases, services, documents, jobs, and API keys | | Avoid storing secrets in records | `SecretRef` pointers such as `env:OPENAI_API_KEY` and `file:/path` | | Inspect search behavior | Playground for text, vector, hybrid, and rerank queries | | Move from demo to production | Memory, file, and Astra-backed control-plane stores | @@ -51,9 +53,10 @@ runtime and UI: AI Workbench has three connected surfaces: 1. **Workspace management.** Create and configure the spaces that own - catalogs, vector stores, documents, saved queries, jobs, and API keys. -2. **Knowledge operations.** Ingest content, track status, bind catalogs - to vector stores, and keep the operational state visible. + knowledge bases, execution services, documents, jobs, and API keys. +2. **Knowledge operations.** Compose chunking + embedding + reranking + services into a knowledge base, ingest content into it, track job + status, and keep the operational state visible. 3. **Retrieval playground.** Try real searches against real workspace data before wiring the same API into an application. @@ -70,8 +73,9 @@ npm run dev ``` Then open the bundled UI at `http://localhost:8080`, create a workspace, -add a vector store, ingest content from the workspace detail page, and -use the playground to inspect the results. +register at least one chunking + embedding service, create a knowledge +base that binds them, ingest content from the workspace detail page, +and use the playground to inspect the results. The generated API reference is available from the running runtime at `http://localhost:8080/docs`, and the machine-readable contract is diff --git a/docs/playground.md b/docs/playground.md index f870a64..9d1d60e 100644 --- a/docs/playground.md +++ b/docs/playground.md @@ -1,10 +1,11 @@ # Playground The playground is a browser scratchpad for running ad-hoc vector -and text queries against a workspace's vector stores. It's the -"aha moment" path for the product — after onboarding a workspace -and upserting data (via API or an external ingester), open -[`/playground`](../apps/web/README.md) to see what the store +and text queries against a workspace's knowledge bases. 
It's the +"aha moment" path for the product — after onboarding a workspace, +registering a chunking + embedding service, creating a knowledge +base that binds them, and ingesting some content, open +[`/playground`](../apps/web/README.md) to see what the KB actually returns. No persistence. Nothing is saved between queries. If you want a @@ -13,15 +14,15 @@ repeatable run, script it against the same HTTP API the UI uses. ## UI flow 1. Pick a workspace. -2. Pick one of its vector stores. The form unlocks. +2. Pick one of its knowledge bases. The form unlocks. 3. **Text tab** — type a query. The runtime embeds it (see [Dispatch](#dispatch) below) and runs an ANN search. Useful - when the store's `embedding` block points at a provider the + when the KB's bound embedding service points at a provider the runtime can reach (OpenAI today). 4. **Vector tab** — paste a raw vector. The runtime sends it straight through to the driver. Useful for debugging, for - stores with no `embedding` config, or when you want to sanity- - check a specific coordinate. + KBs whose embedding service the runtime can't currently reach, + or when you want to sanity-check a specific coordinate. 5. **Top-K** (1–25) and an **optional filter** (JSON object, shallow-equal over payload) round out the knobs. 6. Hit Run. Results land in a table; each row expands to show the @@ -29,7 +30,7 @@ repeatable run, script it against the same HTTP API the UI uses. ## Dispatch -`POST /api/v1/workspaces/{w}/vector-stores/{vs}/search` accepts +`POST /api/v1/workspaces/{w}/knowledge-bases/{kb}/search` accepts either `{ vector }` or `{ text }` (exactly one). When the request carries a vector it goes straight to `driver.search()`. Text queries pick one of two paths: @@ -40,7 +41,7 @@ queries pick one of two paths: (e.g. Astra's `$vectorize`). Nothing about the vector reaches the runtime. 2. **Client-side embedding** — otherwise, the runtime builds an - `Embedder` from the vector store's `embedding` config, embeds + `Embedder` from the KB's bound embedding-service config, embeds the text locally via the Vercel AI SDK, then does a normal vector search. @@ -64,14 +65,15 @@ Vercel AI SDK: ```ts interface Embedder { readonly id: string; // e.g. "openai:text-embedding-3-small" - readonly dimension: number; // matched against the vector store's declared dim + readonly dimension: number; // matched against the KB's declared dim embed(text: string): Promise; embedMany(texts: readonly string[]): Promise; } ``` -The factory (`EmbedderFactory.forConfig(config)`) takes a vector -store's `EmbeddingConfig` and returns an `Embedder`. It resolves +The factory (`EmbedderFactory.forConfig(config)`) takes an +embedding-service `EmbeddingConfig` (resolved from the KB's +`embeddingServiceId`) and returns an `Embedder`. It resolves the `secretRef` through the existing `SecretResolver`, then dispatches on `provider`. Today: OpenAI. Adding another provider (Cohere, Voyage, Bedrock, …) is one `npm install @ai-sdk/` @@ -81,22 +83,22 @@ Errors surface as `EmbedderUnavailableError` (`400 embedding_unavailable`) when the config is missing a secret or names an unsupported provider, and `embedding_dimension_mismatch` (`400`) when the provider returns a vector whose length doesn't -match the vector store's declared dimension. +match the KB's declared dimension. ## Astra vectorize Astra's Data API can do the embedding itself when a collection is created with a `vector.service` block. 
The driver detects this -path from the descriptor's `embedding` config: when the provider +path from the KB's embedding-service config: when the provider is one of `openai`, `azureOpenAI`, `cohere`, `jinaAI`, `mistral`, `nvidia`, `voyageAI` (allowlist in [`drivers/astra/vectorize.ts`](../runtimes/typescript/src/drivers/astra/vectorize.ts)) **and** a `secretRef` is configured, the driver: -1. At `createCollection` time, attaches +1. At KB-create / `createCollection` time, attaches `{ provider, modelName }` to the collection's `vector.service`. - New collections under this runtime get server-side embedding by - default. + New KB collections under this runtime get server-side embedding + by default. 2. At `searchByText` time, resolves the embedding secret, opens the collection handle with `embeddingApiKey: `, and runs `find(sort: { $vectorize: text })`. The runtime never @@ -125,9 +127,9 @@ Upsert uses the same dispatch: - `{id, vector, payload}` → `driver.upsert` (unchanged) - `{id, text, payload}` → `driver.upsertByText` first (Astra `$vectorize` on insertMany, mock driver's pseudo-embed when - the descriptor opts in). On `NotSupportedError` — unsupported - provider or legacy collection — the route embeds client-side - via the Vercel AI SDK and retries through plain `upsert`. + the KB opts in). On `NotSupportedError` — unsupported provider + or legacy collection — the route embeds client-side via the + Vercel AI SDK and retries through plain `upsert`. - Mixed batches → client-embed the text records, combine with the vector records, one transactional `upsert` call. (Splitting across `upsertByText` + `upsert` would break transactional @@ -135,8 +137,9 @@ Upsert uses the same dispatch: ## Hybrid + rerank toggles -The query form exposes two optional toggles when the bound vector -store has the relevant capabilities enabled on its descriptor: +The query form exposes two optional toggles when the bound knowledge +base has the relevant capabilities enabled (lexical configured on +the KB, reranking service bound): - **Hybrid** — flips `hybrid: true` on the search request. The driver runs a combined vector + lexical lane. On `astra` this @@ -151,20 +154,21 @@ store has the relevant capabilities enabled on its descriptor: request body. Step is `0.05`. Honored on `mock`; ignored on `astra` (the reranker owns the blend, so any value the slider sends is dropped server-side). -- **Rerank** — flips `rerank: true`. On `mock` this is a - standalone post-processing phase over the retrieval hits. On - `astra` standalone rerank is **not** exposed — pair `rerank` - with `hybrid: true` to get the combined Astra path; otherwise - the API returns 501. - -Both toggles default to the bound store's descriptor-level -`lexical.enabled` / `reranking.enabled`. Drivers that lack the -relevant method return 501 (`hybrid_not_supported` / -`rerank_not_supported`); the UI surfaces these as a toast. +- **Rerank** — flips `rerank: true`. Requires the KB to have a + `rerankingServiceId` bound. On `mock` this is a standalone + post-processing phase over the retrieval hits. On `astra` + standalone rerank is **not** exposed — pair `rerank` with + `hybrid: true` to get the combined Astra path; otherwise the + API returns 501. + +Both toggles default to the bound KB's `lexical.enabled` / +`rerankingServiceId != null`. Drivers that lack the relevant +method return 501 (`hybrid_not_supported` / `rerank_not_supported`); +the UI surfaces these as a toast. ## Hits are chunks, not documents -The vector store indexes at the chunk level. 
A document ingested +The KB indexes at the chunk level. A document ingested with three paragraphs becomes three chunks; a search query can return all three as separate hits. The results table reflects that shape directly: each row shows the chunk's `chunkIndex` (its @@ -173,49 +177,39 @@ shape directly: each row shows the chunk's `chunkIndex` (its row to expand the full payload and score. To browse chunks **under** a specific document — for inspection, -not search — open the catalog explorer's document detail dialog -(click any row in the documents table). The detail dialog lists -the chunks under that document directly, sorted by `chunkIndex`, -sourced from `GET /catalogs/{c}/documents/{d}/chunks`. +not search — open the KB documents view and click any row in the +documents table. The detail dialog lists the chunks under that +document directly, sorted by `chunkIndex`, sourced from +`GET /knowledge-bases/{kb}/documents/{d}/chunks`. -## Catalog ingest from the workspace UI +## Knowledge base ingest from the workspace UI Ingest now has a dedicated UI surface, complementing the data-plane `POST .../records` upsert path: -- **Workspace detail → Catalogs → Ingest** (or **Open** → catalog - explorer → **Ingest**) opens a multi-file / folder queue. Drop +- **Workspace detail → Knowledge Bases → Ingest** (or **Open** → KB + detail → **Ingest**) opens a multi-file / folder queue. Drop files (or pick a folder via the directory picker) and they - ingest sequentially through the bound vector store. The queue - accepts plain-text documents, data, config, and source files such - as Markdown, YAML, TOML, JSON, CSV, logs, SQL, and TypeScript. - Each row shows live progress for the active file and terminal - status for everything before it. + ingest sequentially through the KB's bound chunking + embedding + services. The queue accepts plain-text documents, data, config, + and source files such as Markdown, YAML, TOML, JSON, CSV, logs, + SQL, and TypeScript. Each row shows live progress for the active + file and terminal status for everything before it. - Async ingest jobs stream progress via the SSE `GET .../jobs/{jobId}/events` endpoint until a terminal state. The dialog renders the live `processed/total` counter and surfaces the final `status` + `errorMessage`. The playground stays a scratchpad — no ingest in the playground -itself. Use the workspace UI to populate a catalog, then come back +itself. Use the workspace UI to populate a KB, then come back to the playground to query it. ## Document delete cascade -The catalog explorer's per-row trash button removes a document -**and** its chunks. The runtime resolves the catalog → bound -vector store and runs `deleteRecords` on the driver before -dropping the document row, so deleted documents stop surfacing in -catalog-scoped search hits immediately. - -## Saved queries - -Saved queries live under a **catalog**, not the playground. CRUD -+ `POST /{q}/run` ship under -`/api/v1/workspaces/{w}/catalogs/{c}/queries`; the workspace UI -exposes a panel to create/edit/run them. The playground itself -intentionally stays stateless — it's the scratchpad, saved -queries are the "I want to keep this around" bucket. +The KB documents view's per-row trash button removes a document +**and** its chunks. The runtime runs `deleteRecords` on the KB's +driver before dropping the document row, so deleted documents stop +surfacing in KB-scoped search hits immediately. 
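+
+The same flow in sketch form (hypothetical handler shape; the real
+route lives in `kb-documents.ts`):
+
+```ts
+// Order matters: cascade the chunks first, then drop the metadata
+// row, so search stops returning the document's chunks immediately.
+async function deleteDocument(
+  driver: { deleteRecords(filter: Record<string, unknown>): Promise<number> },
+  documents: { delete(documentUid: string): Promise<void> }, // hypothetical row store
+  documentUid: string,
+): Promise<void> {
+  await driver.deleteRecords({ documentUid }); // chunks in the KB's collection
+  await documents.delete(documentUid); // then the document row
+}
+```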
## Future extensions diff --git a/docs/roadmap.md b/docs/roadmap.md index f0872bc..0cadc1d 100644 --- a/docs/roadmap.md +++ b/docs/roadmap.md @@ -8,13 +8,14 @@ runnable artifact and a stable slice of the HTTP contract. | Phase | Scope | Status | |---|---|---| | 0 | Runtime bootstrap + docs | ✅ Shipped | -| 1a | Control-plane CRUD (`/api/v1/workspaces`, `/catalogs`, `/vector-stores`) | ✅ Shipped | -| 1b | Vector-store data plane (provisioning, upsert, search) | ✅ Shipped | -| 2a | Document metadata CRUD (`/catalogs/{c}/documents`) | ✅ Shipped | -| 2b | Ingest + catalog-scoped search + saved queries + cross-replica jobs + adopt + document chunks/delete cascade | ✅ Shipped | +| 1a | Control-plane CRUD (`/api/v1/workspaces`, `/catalogs`, `/vector-stores`) | ✅ Shipped (later refactored — see Phase KB) | +| 1b | Vector-store data plane (provisioning, upsert, search) | ✅ Shipped (later refactored — see Phase KB) | +| 2a | Document metadata CRUD (`/catalogs/{c}/documents`) | ✅ Shipped (later refactored — see Phase KB) | +| 2b | Ingest + catalog-scoped search + saved queries + cross-replica jobs + adopt + document chunks/delete cascade | ✅ Shipped (saved queries / adopt retired in Phase KB) | | 2c | Server-side embedding (Astra `$vectorize`) for search + upsert | ✅ Shipped | | 3 | Playground + UI | ✅ Shipped | | Auth | Middleware, API keys, OIDC verifier, browser login, silent refresh | ✅ Shipped (1–3c); 4 (RBAC) planned | +| KB | Catalogs + vector-store descriptors → knowledge bases + chunking/embedding/reranking services | ✅ Shipped | | 4+ | Chats, MCP | Reserved | ## Phase 0 — Bootstrap ✅ @@ -269,11 +270,53 @@ workspace UI rather than the playground itself): the workspace UI. The playground itself remains a stateless scratchpad by design. +## Phase KB — Knowledge bases & execution services ✅ + +Refactored the catalog / vector-store / saved-query model into a +single first-class concept: the **knowledge base**. A KB owns its +Astra collection end-to-end and binds the chunking + embedding + +(optional) reranking services that produce its content. + +Shipped: + +- **Knowledge bases.** New `wb_config_knowledge_bases_by_workspace` + table. KB create transactionally provisions the underlying + `wb_vectors_` collection through the workspace's driver, + using the bound embedding service to determine vector dimensions + and similarity. KB delete drops the collection and cascades RAG + documents. +- **Execution services.** Three new tables — + `wb_config_chunking_service_by_workspace`, + `wb_config_embedding_service_by_workspace`, + `wb_config_reranking_service_by_workspace`. Multiple KBs may + share a service definition; deleting an in-use embedding / + chunking service is blocked with `409 conflict`. +- **Service immutability for vector-determining bindings.** A KB's + `embeddingServiceId` and `chunkingServiceId` are pinned at create + time (the collection's dimensions follow the embedding service); + `rerankingServiceId` stays mutable. +- **`resolveKb` synthesis layer.** Existing driver / dispatch / + ingest code keeps a vector-store-shaped descriptor view by + resolving a KB + its bound services on demand, so the data-plane + surface stayed stable across the refactor. 
+- **Routes.** All catalog / vector-store / saved-query routes + retired in favor of: + - `/api/v1/workspaces/{w}/{chunking,embedding,reranking}-services` + - `/api/v1/workspaces/{w}/knowledge-bases[/{kb}]` + - `.../knowledge-bases/{kb}/{records,search,documents,ingest}` +- **UI.** Catalogs panel + vector-stores panel removed; replaced + with `KnowledgeBasesPanel` and `ServicesPanel`. Playground now + picks a KB rather than a vector-store descriptor. + +Saved queries and the adopt-existing-collection flow were retired +in this phase — the new shape doesn't need them, and re-adding +either would land cleaner under the new model than as a port. + ## Phase 4+ — Chats, MCP Reserved for integrating: -- A chat harness that runs against a workspace's catalogs. +- A chat harness that runs against a workspace's knowledge bases. - An MCP server view of the workspace for external agents. Contracts will be defined as those phases approach. diff --git a/docs/workspaces.md b/docs/workspaces.md index 731b99e..c0a57ff 100644 --- a/docs/workspaces.md +++ b/docs/workspaces.md @@ -1,8 +1,9 @@ # Workspaces A **workspace** is the unit of isolation in AI Workbench — a named -tenant that owns its own catalogs, vector-store descriptors, -documents, saved queries, and async-ingest jobs. +tenant that owns its own knowledge bases, execution services +(chunking / embedding / reranking), RAG documents, async-ingest jobs, +and API keys. Workspaces are **runtime records**, not config. They're created via `POST /api/v1/workspaces`, fetched via `GET /api/v1/workspaces/{uid}`, @@ -39,11 +40,11 @@ DELETE /api/v1/workspaces/{uid} → cascade delete `DELETE` cascades to: -- Every catalog under the workspace. -- Every vector-store descriptor under the workspace, after dropping its - underlying collection through the workspace's driver. -- Every document under any of those catalogs. -- Every saved query under any of those catalogs. +- Every knowledge base under the workspace, after dropping each KB's + underlying vector collection through the workspace's driver. +- Every RAG document registered against any of those knowledge bases. +- Every chunking, embedding, and reranking service definition under the + workspace. - Every async-ingest job record scoped to the workspace. - Every workspace API key issued from the workspace. @@ -51,11 +52,11 @@ DELETE /api/v1/workspaces/{uid} → cascade delete - A request carrying workspace UID `A` can never read or mutate resources in workspace `B`. Nested routes call - `ControlPlaneStore.listCatalogs(workspace)` / `…getCatalog(workspace, - uid)` etc. and the store asserts the workspace exists before - returning anything. + `ControlPlaneStore.listKnowledgeBases(workspace)` / + `…getKnowledgeBase(workspace, uid)` etc. and the store asserts the + workspace exists before returning anything. - Logs carry `requestId`. Structured OTel attributes (workspaceUid, - catalogUid, jobId) are on the cross-cutting observability + knowledgeBaseUid, jobId) are on the cross-cutting observability workstream — see [`roadmap.md`](roadmap.md). ### `kind` @@ -77,10 +78,10 @@ service. **`kind` is immutable after creation.** `PUT /api/v1/workspaces/{uid}` rejects a `kind` field with `400`. Changing a workspace's kind would -orphan any vector-store collections already provisioned on the -original backend — there's no safe way to transparently migrate them, -so the runtime doesn't try. Delete and recreate the workspace if the -backend needs to change. 
+orphan any KB collections already provisioned on the original backend +— there's no safe way to transparently migrate them, so the runtime +doesn't try. Delete and recreate the workspace if the backend needs to +change. ### `name` and `endpoint` @@ -89,7 +90,7 @@ backend needs to change. display the name but disambiguate by uid when needed. - `endpoint` is the **data-plane URL** for this workspace's backend. For `astra` / `hcd` workspaces it's the Astra Data API endpoint - the vector-store driver dials (`https://-.apps.astra.datastax.com`). + the KB driver dials (`https://-.apps.astra.datastax.com`). Each Astra DB has its own endpoint — put one workspace per DB to route correctly. - `endpoint` accepts either a **literal URL** or a **SecretRef** @@ -123,37 +124,48 @@ Every value in the map must match the `:` shape — `400`. The runtime resolves refs through its `SecretResolver` at the moment the workspace's backend needs to be contacted. -## Catalogs and vector stores +## Knowledge bases and execution services A workspace owns: -- **Vector-store descriptors** — the `wb_vector_store_by_workspace` - rows. Each declares dimensions, similarity, embedding config, - lexical config, reranking config. These are *descriptors*, not the - vector data itself — the underlying Data API Collection is - provisioned transactionally by the workspace's vector-store driver - when the descriptor is created. -- **Catalogs** — named document collections, each optionally - `vectorStore`-bound to one of the workspace's descriptors. +- **Knowledge bases** — the `wb_config_knowledge_bases_by_workspace` + rows. Each KB pins an embedding service (which determines the + dimensions and similarity metric of its vector collection) and a + chunking service, and may optionally bind a reranking service. A + KB's underlying Astra collection (`wb_vectors_`) is + provisioned transactionally when the KB is created and dropped when + it is deleted. +- **Execution services** — three families of `wb_config_*_service_by_workspace` + rows describing the chunking, embedding, and reranking + implementations available to KBs in this workspace. + +### Knowledge base ↔ service binding (N:1) + +**Multiple knowledge bases may share one service definition.** A KB +holds: + +- `embeddingServiceId` (required, **immutable** after KB create — the + vector collection's dimensions are pinned at provisioning time) +- `chunkingServiceId` (required, immutable) +- `rerankingServiceId` (optional, mutable — reranking is applied at + query time and can be added/removed without affecting stored + vectors) -### Catalog ↔ vector-store binding (N:1) - -**Multiple catalogs may share one vector store.** This was a -deliberate relaxation from an earlier draft's strict 1:1 constraint. The store enforces: -- A catalog's `vectorStore` field (if non-null) must reference a - vector store in the same workspace. -- `DELETE` a vector store is blocked with `409 conflict` while any - catalog references it. Clear or move the catalog binding first, then - delete the vector store. +- A KB's `embeddingServiceId` and `chunkingServiceId` must reference + services in the same workspace. +- `DELETE` on an embedding or chunking service is blocked with + `409 conflict` while any KB references it. Reassign or delete the + KBs first, then delete the service. 
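+
+Both guard rails are easy to check from a scratch script
+(illustrative `fetch` calls against a local runtime; the uids are
+placeholders):
+
+```ts
+const base = "http://localhost:8080/api/v1";
+
+async function demoGuardRails(w: string, kb: string, embed: string) {
+  // 1. Pinned bindings: a KB update that carries embeddingServiceId
+  //    is rejected outright (the PUT schema is strict).
+  const put = await fetch(`${base}/workspaces/${w}/knowledge-bases/${kb}`, {
+    method: "PUT",
+    headers: { "content-type": "application/json" },
+    body: JSON.stringify({ name: "renamed", embeddingServiceId: embed }),
+  });
+  console.log(put.status); // 400
+
+  // 2. In-use services are protected: deleting an embedding service
+  //    still referenced by a KB is refused.
+  const del = await fetch(
+    `${base}/workspaces/${w}/embedding-services/${embed}`,
+    { method: "DELETE" },
+  );
+  console.log(del.status); // 409 while any KB references it
+}
+```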
The relationship: ``` -workspace ──► catalog ──► vector-store descriptor (N:1) - │ - └──► documents +workspace ──► knowledge base ──► chunking service (N:1) + │ ──► embedding service (N:1) + │ ──► reranking service (N:1, optional) + └──► RAG documents ``` ## Seeding workspaces for local dev @@ -178,21 +190,35 @@ current count of workspaces, not a list. Listing is at `GET ## Example session -Create a mock workspace, add a catalog, list: +Create a mock workspace, register a chunking + embedding service, +create a KB binding them, list: ```bash WS_BODY='{"name":"demo","kind":"mock"}' WS_UID=$(curl -s -X POST http://localhost:8080/api/v1/workspaces \ -H "content-type: application/json" -d "$WS_BODY" | jq -r .uid) -CAT_BODY='{"name":"support"}' -curl -s -X POST http://localhost:8080/api/v1/workspaces/$WS_UID/catalogs \ - -H "content-type: application/json" -d "$CAT_BODY" +CHUNK_BODY='{"name":"default-chunker","provider":"mock"}' +CHUNK_UID=$(curl -s -X POST \ + http://localhost:8080/api/v1/workspaces/$WS_UID/chunking-services \ + -H "content-type: application/json" -d "$CHUNK_BODY" | jq -r .uid) + +EMBED_BODY='{"name":"default-embedder","provider":"mock","dimensions":1536,"similarity":"cosine"}' +EMBED_UID=$(curl -s -X POST \ + http://localhost:8080/api/v1/workspaces/$WS_UID/embedding-services \ + -H "content-type: application/json" -d "$EMBED_BODY" | jq -r .uid) + +KB_BODY=$(jq -n --arg c "$CHUNK_UID" --arg e "$EMBED_UID" \ + '{name:"support",chunkingServiceId:$c,embeddingServiceId:$e}') +curl -s -X POST \ + http://localhost:8080/api/v1/workspaces/$WS_UID/knowledge-bases \ + -H "content-type: application/json" -d "$KB_BODY" -curl -s http://localhost:8080/api/v1/workspaces/$WS_UID/catalogs +curl -s http://localhost:8080/api/v1/workspaces/$WS_UID/knowledge-bases ``` -Delete the workspace — the catalog goes with it: +Delete the workspace — the KB, its collection, the services, and any +documents go with it: ```bash curl -X DELETE http://localhost:8080/api/v1/workspaces/$WS_UID diff --git a/runtimes/README.md b/runtimes/README.md index 71e5f75..c8bef99 100644 --- a/runtimes/README.md +++ b/runtimes/README.md @@ -22,7 +22,7 @@ Astra Data API (via language-native SDK: astrapy, astra-db-ts, …) | Runtime | Path | Status | Astra SDK | |---|---|---|---| -| TypeScript | [`typescript/`](./typescript/) | Operational through Phase 3 + auth (UI, playground, API keys, OIDC login + silent refresh, vector/text search, hybrid + rerank, sync/async ingest with pipeline resume after orphan reclaim, durable JobStore with cross-replica subscription polling + lease/heartbeat + orphan sweeper, saved queries, chunks listing, document delete cascade, adopt-existing-collection flow) | `@datastax/astra-db-ts` | +| TypeScript | [`typescript/`](./typescript/) | Operational through Phase 3 + auth (UI, playground, API keys, OIDC login + silent refresh, knowledge bases with auto-provisioned collections, chunking / embedding / reranking services, vector/text search, hybrid + rerank, sync/async ingest with pipeline resume after orphan reclaim, durable JobStore with cross-replica subscription polling + lease/heartbeat + orphan sweeper, chunks listing, document delete cascade) | `@datastax/astra-db-ts` | | Python | [`python/`](./python/) | Scaffold — routes return 501 until implemented | `astrapy` (pending) | | Java | [`java/`](./java/) | Scaffold (Spring Boot) — routes return 501 until implemented | `astra-db-java` (pending) | diff --git a/runtimes/java/README.md b/runtimes/java/README.md index bb6201b..6227b83 100644 --- 
a/runtimes/java/README.md +++ b/runtimes/java/README.md @@ -60,13 +60,14 @@ corresponds to one step in [`../../conformance/scenarios.md`](../../conformance/scenarios.md). Suggested order: -1. `POST /api/v1/workspaces` — scenario 1 step 1. Add an `astra` package - that wraps `astra-db-java` for the `wb_*` tables, and wire it into - `WorkspaceController`. -2. `GET` / `PUT` / `DELETE` for workspaces — completes scenario 1. -3. Catalog routes — completes scenario 2. -4. Vector-store descriptor routes — completes scenario 3. -5. Vector-store data plane + documents — scenarios 4, 5. +1. `POST /api/v1/workspaces` — scenario `workspace-crud-basic` step 1. + Add an `astra` package that wraps `astra-db-java` for the `wb_*` + tables, and wire it into `WorkspaceController`. +2. `GET` / `PUT` / `DELETE` for workspaces — completes the workspace + scenarios. +3. Chunking / embedding / reranking service CRUD. +4. Knowledge-base CRUD with auto-provisioned vector collections. +5. KB data plane + documents + ingest. Every time you flip a conformance test green, remove its `@Disabled` annotation in @@ -151,11 +152,11 @@ runtimes/java/ │ │ │ ├── web/ │ │ │ │ └── RequestIdFilter.java ← X-Request-Id │ │ │ ├── model/ ← records mirroring TS types -│ │ │ └── routes/ +│ │ │ └── routes/ ← scaffold; align with TS routes when implemented │ │ │ ├── OperationalController.java ← working: /healthz, /readyz, /version, / │ │ │ ├── WorkspaceController.java ← 501 stubs -│ │ │ ├── CatalogController.java ← 501 stubs -│ │ │ ├── VectorStoreController.java ← 501 stubs +│ │ │ ├── ServicesController.java ← chunking/embedding/reranking — 501 stubs +│ │ │ ├── KnowledgeBaseController.java ← 501 stubs │ │ │ └── DocumentController.java ← 501 stubs │ │ └── resources/ │ │ └── application.yml diff --git a/runtimes/python/README.md b/runtimes/python/README.md index 8e6160f..127f449 100644 --- a/runtimes/python/README.md +++ b/runtimes/python/README.md @@ -69,12 +69,15 @@ step in [`../../conformance/scenarios.md`](../../conformance/scenarios.md). Suggested order: -1. `POST /api/v1/workspaces` — scenario 1 step 1. Plumb astrapy into - [`workbench/astra.py`](./src/workbench/) (new file) and wire the - route in [`workbench/routes/workspaces.py`](./src/workbench/routes/workspaces.py). -2. `GET` / `PUT` / `DELETE` for workspaces — completes scenario 1. -3. Catalog routes — completes scenario 2. -4. Vector-store descriptor routes — completes scenario 3. +1. `POST /api/v1/workspaces` — scenario `workspace-crud-basic` step 1. + Plumb astrapy into [`workbench/astra.py`](./src/workbench/) (new + file) and wire the route in + [`workbench/routes/workspaces.py`](./src/workbench/routes/workspaces.py). +2. `GET` / `PUT` / `DELETE` for workspaces — completes the workspace + scenarios. +3. Chunking / embedding / reranking service CRUD. +4. Knowledge-base CRUD with auto-provisioned vector collections. +5. KB data plane — records, search, documents, ingest. Each time you flip a conformance test green, remove its `@pytest.mark.xfail` decorator in @@ -157,10 +160,10 @@ runtimes/python/ │ ├── config.py ← env-var resolution (ASTRA_*, WORKBENCH_*) │ ├── errors.py ← ApiError + subclasses + HTTP mapping │ ├── models.py ← Pydantic models mirroring TS types -│ └── routes/ +│ └── routes/ ← scaffold; align with TS routes when implemented │ ├── workspaces.py -│ ├── catalogs.py -│ ├── vector_stores.py +│ ├── services.py ← chunking / embedding / reranking +│ ├── knowledge_bases.py │ └── documents.py └── tests/ ├── conftest.py ← FastAPI + mock-astra wiring