From 8925bf6fd7e6876577762785d6b664c33156c261 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Fri, 26 Jun 2026 22:32:03 -0400 Subject: [PATCH 01/16] Add design spec: DuckDB CPU attribution via dedicated executor MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Specs the rework of the DuckDB CPU metric to address review feedback on PR #5642 (caller-side measurement misses parallel worker-thread CPU, blocks async, and relies on incidental ref-struct thread-affinity). Approach: own DuckDB's execution threads (worker pool via external_threads + a dispatcher pool) and sum their CPU as an observable counter — correct for parallel work, async- and thread-safe by construction. Two-repo split: Kurrent.Quack first (executor + native task-scheduler interop + cross-thread CPU read), then KurrentDB (call-site migration + metric registration). Co-Authored-By: Claude Opus 4.8 --- ...026-06-26-duckdb-cpu-attribution-design.md | 148 ++++++++++++++++++ 1 file changed, 148 insertions(+) create mode 100644 docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md diff --git a/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md b/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md new file mode 100644 index 00000000000..7001182d0b6 --- /dev/null +++ b/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md @@ -0,0 +1,148 @@ +# DuckDB CPU Attribution — Design Spec + +- **Status:** Proposed (for review) +- **Date:** 2026-06-26 +- **Supersedes:** the caller-side CPU metric in PR #5642 (`kurrentdb.duckdb.cpu.seconds`, `source=caller`) +- **Spans two repositories:** Kurrent.Quack (executor + native interop) and KurrentDB (call-site migration + metric registration) + +## 1. Background & problem + +We want operators to be able to answer: **"What fraction of this node's CPU is DuckDB vs the rest of KurrentDB?"** Process- and system-level CPU are already exported (`kurrentdb_proc_cpu`, `kurrentdb_sys_cpu`); the missing piece is the DuckDB share so the two can be compared. + +PR #5642 shipped a first attempt: `DuckDBCpuMetrics.Measure(activity)` returns a `ref struct` scope that reads the calling thread's CPU (`clock_gettime(CLOCK_THREAD_CPUTIME_ID)` on Linux/macOS, `GetThreadTimes` on Windows) at construction and again at `Dispose`, recording the delta into a counter. It is wrapped around the synchronous DuckDB sections of commit, checkpoint, index reads, and query setup. + +Code review identified three **fundamental** flaws, all rooted in one assumption — *"measure the calling thread across a synchronous span"*: + +1. **Parallel work is invisible.** DuckDB executes parallelizable queries (scans, filters, aggregations, sorts) on its own morsel-scheduler worker threads. `CLOCK_THREAD_CPUTIME_ID` on the calling thread never sees those threads, so the metric undercounts most severely exactly when DuckDB is busiest. This is not an edge case — it is the common case for analytical queries. +2. **It blocks async.** The measurement requires start and end on the same thread within one synchronous region. If these operations ever become genuinely asynchronous (await points), the scope cannot span them — so the metric would actively prevent a desirable refactor. +3. **Thread-affinity of `ref struct` is incidental, not guaranteed.** Today a `ref struct` cannot cross an `await` or be captured, so it stays on one thread in practice. The language does not *guarantee* this, and the C# `ref struct` rules are actively loosening. If start and end ever run on different threads, the per-thread CPU delta is silently wrong — no error, just bad data. + +No refinement of the `ref struct` rescues this; the measurement mechanism must change. + +## 2. Goals & non-goals + +**Goals** +- A **correct total** of DuckDB CPU consumed on the node, exported as a standard OpenTelemetry metric on `/metrics`. +- Correct in the presence of DuckDB's internal parallelism. +- Independent of caller threading and of whether callers are synchronous or asynchronous. +- Free of any "two readings on the same thread" assumption. + +**Non-goals** +- **Per-query or per-activity attribution.** Measuring at the thread level means we cannot say whether a given worker's morsel belongs to a "commit" or a "query". We deliberately trade the (badly-estimated) activity breakdown of the old metric for a correct total. If per-query attribution is wanted later, DuckDB's own query profiling (`CPU_TIME`) is the right tool and is out of scope here (see Appendix). +- Changing what counts as "DuckDB work" — it remains everything executed through the DuckDB engine. + +## 3. Chosen approach: a dedicated DuckDB executor + +Stop measuring the caller. Instead, **own the threads DuckDB executes on, and measure those threads.** + +A new **`DuckDBExecutor`** becomes the single place DuckDB runs. KurrentDB submits DuckDB operations to it and awaits results; the executor runs them on threads it creates, names, and measures. Because every byte of DuckDB CPU — parallel and serial — lands on owned threads, summing those threads' CPU yields the correct total, with no dependence on the caller's thread or sync/async shape. + +This is the "own the worker pool" direction; it is the only option that captures parallel work, covers all activities (queries, commits, checkpoints, background), and is inherently async- and thread-safe. + +## 4. Architecture + +The executor owns two **distinct** sets of named threads: + +- **Workers** (`N = SET threads`): each thread loops `duckdb_execute_tasks_state(sharedState)`, draining DuckDB's task/morsel queue. DuckDB is configured `SET threads = N; SET external_threads = N`, so these owned threads constitute DuckDB's *entire* parallel execution pool. +- **Dispatchers**: a bounded pool that runs the *issuing* side of an operation — the blocking `duckdb_query` / chunk-fetch / appender-flush call that drives the root pipeline. A caller submits an operation; a dispatcher executes it. + +**The separation is the load-bearing invariant.** A thread blocked inside a query/flush call cannot also pump the task queue. If every thread were busy issuing operations, no thread would process morsels and DuckDB would deadlock. Workers and dispatchers must therefore be separate sets. The corollary is the deadlock-freedom guarantee in §8: because morsels are always drained by workers (never blocked by dispatchers), in-flight operations always make progress regardless of dispatcher saturation. + +``` + submit(op) results + KurrentDB call sites ───────────▶ Dispatcher pool ───────▶ awaiting callers + (await Execute) (issue + drive root (any thread) + pipeline; owned, + named, measured) + │ + │ enqueues tasks + ▼ + DuckDB task queue + │ + ▼ + Worker pool (N threads + looping execute_tasks_state; + owned, named, measured) + + metric = Σ CPU(all owned threads) ← sampled on scrape +``` + +## 5. Execution model & API + +**API (Quack):** a single async entry point, approximately: + +```csharp +ValueTask Execute(DuckDBConnection conn, Func op, CancellationToken ct); +``` + +The caller awaits; the op is enqueued; a dispatcher runs `op(conn)`; the result completes the `ValueTask`; the caller resumes on its own context (which no longer matters for measurement). + +**Connection affinity is preserved for free.** DuckDB forbids *concurrent* use of one connection, not use from different threads. KurrentDB already gives each unit of work its own connection — the shared write connection is used serially by the index processor; reads use per-request pooled connections — so two concurrent ops are never submitted for the same connection. A dispatcher simply borrows the connection for the duration of the op. + +## 6. Call-site migration (KurrentDB) + +The following sites change from inline synchronous DuckDB calls to `await executor.Execute(...)`: + +- `DefaultIndexProcessor.Commit` and `UserIndexProcessor.Commit` / `Checkpoint` (appender flush). +- The reader path — `SecondaryIndexReaderBase.GetDbRecordsForwards/Backwards` and the category / event-type / user readers. These already sit under an async `ReadForwards` / `ReadBackwards`, so the async fits naturally. +- `QueryEngine.ExecuteAsync` and `GetArrowSchema`. +- The shutdown checkpoint in `DuckDBConnectionPoolLifetime.StopAsync`. + +**The one genuinely tricky site is streaming reads.** `QueryEngine`'s consumer pulls chunks in a `TryRead` loop, and each `TryFetch` is a DuckDB call that must run on an owned thread. Resolution: run the whole consume loop *inside one `Execute` submission* (the loop executes on a dispatcher), rather than marshalling each fetch individually — one submission per query, every fetch owned. Treated as its own work item. + +Most of the blast radius is converting a few `void`/synchronous DuckDB methods to async on call paths that are *already* async (`ReadForwards`, `ExecuteAsync`, the subscription loop). + +## 7. The metric + +- **Instrument:** an OpenTelemetry **observable counter** `kurrentdb.duckdb.cpu.seconds` (monotonic CPU-seconds), with an optional `role=worker|dispatcher` tag for diagnostics. The DuckDB CPU fraction on a dashboard is `rate(kurrentdb_duckdb_cpu_seconds_total[1m]) / rate(kurrentdb_proc_cpu[1m])` (the existing process-CPU metric named in §1). +- **Sampling:** on each scrape, sum every owned thread's *cumulative* CPU, read **by thread handle** (cross-thread, not "current thread"): + - **Linux:** `pthread_getcpuclockid(thread, &clockid)` then `clock_gettime(clockid)`. + - **Windows:** `GetThreadTimes(handle, …)` (kernel + user time) for any owned thread handle. + - **macOS:** `thread_info(mach_thread, THREAD_BASIC_INFO, …)`. Dev-only platform; degrade to no-op if unavailable (macOS does not implement `pthread_getcpuclockid`). + This generalizes the existing `ThreadCpuTime` shim from "the calling thread" to "any owned thread", and — crucially — reads correctly *even while a worker is parked inside `duckdb_execute_tasks_state`*, because the kernel keeps accounting that thread's CPU regardless. With ~8–16 owned threads sampled every 15s, the cost is negligible. +- **Ownership split:** Quack owns the threads and their handles and exposes their per-thread CPU (e.g., an enumeration of CPU-seconds per owned thread). KurrentDB registers the OTel observable counter that reads it and adds the `KurrentDB.DuckDB` meter to `metricsconfig.json`. OTel/config concerns stay in KurrentDB. +- **Testability:** keep an injectable CPU-time source (as the current implementation has) so the summing and tagging are deterministically unit-testable without relying on real per-thread accounting. + +## 8. Error handling, cancellation & lifecycle + +- **Op failures:** an exception thrown by `op(conn)` on a dispatcher is captured and faulted onto the returned `ValueTask`, surfacing to the awaiting caller exactly as a synchronous throw does today. +- **Cancellation:** the token wired through `Execute` triggers `connection.Interrupt` on the running query (today's `InterruptQueryOnCancellation` behavior); the op throws, the `ValueTask` faults, and the dispatcher is freed for the next op. DuckDB's `Interrupt` → `OperationCanceledException` mapping is retained. +- **No deadlock under load:** if every dispatcher is busy, further submissions queue (bounded) and wait. They cannot deadlock, because morsel processing happens on the worker pool, which is never blocked by dispatchers — so in-flight ops always complete and free dispatchers, draining the queue. +- **Thread lifecycle:** the executor owns its threads and task state (the creator owns teardown). Startup spawns workers (each looping `execute_tasks_state`) and dispatchers. Shutdown order: stop accepting new ops → drain in-flight → `duckdb_finish_execution(state)` so workers return from the native loop → join all threads → run the final checkpoint → dispose connections. The executor subsumes today's `DuckDBConnectionPoolLifetime` shutdown checkpoint. +- **Worker loss:** an uncaught native failure in a worker is logged and surfaced; remaining workers still drain the queue (degraded parallelism, not a hang). Dead threads are not resurrected — a crashing DuckDB worker indicates larger trouble. +- **Configuration:** `threads` (worker count; default to the existing core/RAM heuristic already used for `memory_limit`) and dispatcher count, both overridable. + +## 9. Testing + +**Quack (deterministic unit tests):** +- Injectable clock → CPU summing across owned threads is exact; the `role` tag is correct. +- **Headline test proving concern #1 is solved:** run a parallelizable query, then assert *total DuckDB CPU exceeds wall-clock elapsed*. That is only possible if multiple worker threads are counted — precisely what the caller-side approach could never show (it was bounded by single-thread wall time). +- **Async/thread-independence (concerns #2, #3):** drive an op whose continuation resumes on a different thread; assert the measurement is unaffected. +- **No deadlock under saturation:** submit more concurrent ops than dispatchers; assert all complete. +- **Lifecycle:** clean start/stop, `finish_execution` joins every thread, no leaked threads, shutdown checkpoint runs. +- **Failure & cancellation:** an op that throws faults the awaiting caller; a cancelled op interrupts DuckDB and frees its dispatcher. + +**KurrentDB (functional safety net):** the existing SecondaryIndexing integration tests — reads, subscriptions, FlightSQL, query engine — must pass unchanged through the executor; that proves the call-site migration did not alter behavior. Plus a smoke check that `kurrentdb.duckdb.cpu.seconds` appears on `/metrics`. + +## 10. Repository split, rollout & disposition of PR #5642 + +- **Kurrent.Quack (first):** add the `DuckDBExecutor` (worker + dispatcher pools), the native task-scheduler interop (`duckdb_create_task_state` / `execute_tasks_state` / `finish_execution` / `destroy_task_state`), the cross-thread per-OS CPU read, and the per-thread CPU enumeration. Ship as a new Quack version. +- **KurrentDB (second):** consume the new Quack version; migrate the DuckDB call sites (§6) to the executor; register the `kurrentdb.duckdb.cpu.seconds` observable counter and add the `KurrentDB.DuckDB` meter to `metricsconfig.json`; document the metric in `docs/server/diagnostics/metrics.md`. +- **PR #5642:** **remove** the caller-side CPU measurement (`DuckDBCpuMetrics`, `ThreadCpuTime`, the scope instrumentation and its tests). Per the review, we will not merge a metric whose headline value is wrong for parallel work. The `KurrentDB.DuckDB` meter name and the docs scaffolding may be retained as the landing point for this design. (If #5642 has other unrelated value it can keep it; otherwise it closes in favor of this work.) + +## 11. Risks & open questions + +- **Dispatcher pool sizing.** Too few dispatchers throttle concurrent reads; too many add scheduling overhead against a fixed worker pool. Needs a sensible default and load testing. (Does not affect correctness of the metric, only read latency.) +- **Worker / external-threads interaction under concurrent queries.** Multiple in-flight queries share the single worker pool via DuckDB's global task scheduler. Behavior is expected-standard but must be load- and soak-tested before this becomes the default execution path — it changes DuckDB's execution model from internal to external threads. +- **macOS per-thread CPU.** `pthread_getcpuclockid` is unsupported on macOS; the `thread_info`/mach path must be implemented or the metric degrades to no-op on macOS (acceptable: macOS is a development platform only). +- **Magnitude of the old blind spot.** The headline test (total CPU > wall-clock) will, for the first time, quantify how much CPU the previous caller-side metric was missing — useful validation that the rework was warranted. +- **Scope of the async migration.** Converting the streaming reader to run its consume loop on a dispatcher is the largest single change; it must preserve current read semantics (ordering, cancellation, snapshot capture). + +## Appendix: verified facts (DuckDB 1.5, as shipped) + +- **Task-scheduler C API present** in the shipped `libduckdb` 1.5 binaries (Linux/Windows/macOS): `duckdb_create_task_state`, `duckdb_execute_tasks`, `duckdb_execute_tasks_state`, `duckdb_execute_n_tasks_state`, `duckdb_finish_execution`, `duckdb_task_state_is_finished`, `duckdb_destroy_task_state`, `duckdb_execution_is_finished`. +- **`duckdb_execute_tasks_state` semantics** (DuckDB C API docs): "Execute DuckDB tasks on this thread. The thread will keep on executing tasks forever, until `duckdb_finish_execution` is called on the state. Multiple threads can share the same `duckdb_task_state`." This is exactly the owned-worker-pool primitive. +- **`external_threads` + `threads`** caps total parallelism on DuckDB's global task scheduler; combined with externally-provided threads it replaces the internal pool. +- **DuckDB threads do not name themselves**, so a sampler cannot reliably identify DuckDB's internal workers from outside — which is *why* owning (and naming) the threads is necessary, and why a thread-enumeration sampler was rejected. +- **DuckDB profiling `CPU_TIME`** (the per-query alternative, deliberately out of scope) "measures the CPU time spent on a query, specifically accumulating operator timings; it does not account for parsing or planning." It is per-query, requires profiling to be enabled (overhead), and does not cover non-query work (appender flush / commit / checkpoint). The C symbol `duckdb_get_profiling_info` is present in 1.5. +- **Kurrent.Quack today** exposes neither the task scheduler nor profiling; its `Threading` namespace is the buffered appender only. Both require new Quack work. From de1a91623e366f8f819bfcf673a3efbbccbaab26 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Wed, 1 Jul 2026 18:43:13 -0400 Subject: [PATCH 02/16] Address spec review: #5642 wording, dashboard formula, typo - #5642 is open (not merged): "shipped" -> "proposed", especially since the caller-side metric is being pulled per this spec. - kurrentdb_proc_cpu is an ObservableUpDownCounter (gauge), so rate() is wrong; compare the DuckDB CPU-seconds rate against the gauge directly. - "expected-standard" -> "expected to be standard". Co-Authored-By: Claude Opus 4.8 --- .../specs/2026-06-26-duckdb-cpu-attribution-design.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md b/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md index 7001182d0b6..27952cfd2a4 100644 --- a/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md +++ b/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md @@ -9,7 +9,7 @@ We want operators to be able to answer: **"What fraction of this node's CPU is DuckDB vs the rest of KurrentDB?"** Process- and system-level CPU are already exported (`kurrentdb_proc_cpu`, `kurrentdb_sys_cpu`); the missing piece is the DuckDB share so the two can be compared. -PR #5642 shipped a first attempt: `DuckDBCpuMetrics.Measure(activity)` returns a `ref struct` scope that reads the calling thread's CPU (`clock_gettime(CLOCK_THREAD_CPUTIME_ID)` on Linux/macOS, `GetThreadTimes` on Windows) at construction and again at `Dispose`, recording the delta into a counter. It is wrapped around the synchronous DuckDB sections of commit, checkpoint, index reads, and query setup. +PR #5642 (open, not merged) proposed a first attempt: `DuckDBCpuMetrics.Measure(activity)` returns a `ref struct` scope that reads the calling thread's CPU (`clock_gettime(CLOCK_THREAD_CPUTIME_ID)` on Linux/macOS, `GetThreadTimes` on Windows) at construction and again at `Dispose`, recording the delta into a counter. It is wrapped around the synchronous DuckDB sections of commit, checkpoint, index reads, and query setup. Code review identified three **fundamental** flaws, all rooted in one assumption — *"measure the calling thread across a synchronous span"*: @@ -94,7 +94,7 @@ Most of the blast radius is converting a few `void`/synchronous DuckDB methods t ## 7. The metric -- **Instrument:** an OpenTelemetry **observable counter** `kurrentdb.duckdb.cpu.seconds` (monotonic CPU-seconds), with an optional `role=worker|dispatcher` tag for diagnostics. The DuckDB CPU fraction on a dashboard is `rate(kurrentdb_duckdb_cpu_seconds_total[1m]) / rate(kurrentdb_proc_cpu[1m])` (the existing process-CPU metric named in §1). +- **Instrument:** an OpenTelemetry **observable counter** `kurrentdb.duckdb.cpu.seconds` (monotonic CPU-seconds), with an optional `role=worker|dispatcher` tag for diagnostics. On a dashboard, `rate(kurrentdb_duckdb_cpu_seconds_total[1m])` yields DuckDB CPU in cores; compare it against the process-CPU signal in the same unit. Note `kurrentdb_proc_cpu` (§1) is a gauge (an `ObservableUpDownCounter` of instantaneous CPU usage), so it must **not** be wrapped in `rate()` — divide by the gauge directly once both are expressed as cores (e.g. `rate(kurrentdb_duckdb_cpu_seconds_total[1m]) / kurrentdb_proc_cpu`, adjusting for the gauge's scaling). - **Sampling:** on each scrape, sum every owned thread's *cumulative* CPU, read **by thread handle** (cross-thread, not "current thread"): - **Linux:** `pthread_getcpuclockid(thread, &clockid)` then `clock_gettime(clockid)`. - **Windows:** `GetThreadTimes(handle, …)` (kernel + user time) for any owned thread handle. @@ -133,7 +133,7 @@ Most of the blast radius is converting a few `void`/synchronous DuckDB methods t ## 11. Risks & open questions - **Dispatcher pool sizing.** Too few dispatchers throttle concurrent reads; too many add scheduling overhead against a fixed worker pool. Needs a sensible default and load testing. (Does not affect correctness of the metric, only read latency.) -- **Worker / external-threads interaction under concurrent queries.** Multiple in-flight queries share the single worker pool via DuckDB's global task scheduler. Behavior is expected-standard but must be load- and soak-tested before this becomes the default execution path — it changes DuckDB's execution model from internal to external threads. +- **Worker / external-threads interaction under concurrent queries.** Multiple in-flight queries share the single worker pool via DuckDB's global task scheduler. Behavior is expected to be standard, but must be load- and soak-tested before this becomes the default execution path — it changes DuckDB's execution model from internal to external threads. - **macOS per-thread CPU.** `pthread_getcpuclockid` is unsupported on macOS; the `thread_info`/mach path must be implemented or the metric degrades to no-op on macOS (acceptable: macOS is a development platform only). - **Magnitude of the old blind spot.** The headline test (total CPU > wall-clock) will, for the first time, quantify how much CPU the previous caller-side metric was missing — useful validation that the rework was warranted. - **Scope of the async migration.** Converting the streaming reader to run its consume loop on a dispatcher is the largest single change; it must preserve current read semantics (ordering, cancellation, snapshot capture). From f4eab4738c89351109acb9fc5cfcb463e571791a Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 13:52:10 -0400 Subject: [PATCH 03/16] Add implementation plan: DuckDB executor integration + /metrics wiring Co-Authored-By: Claude Fable 5 --- .../2026-07-01-duckdb-executor-integration.md | 577 ++++++++++++++++++ 1 file changed, 577 insertions(+) create mode 100644 docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md diff --git a/docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md b/docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md new file mode 100644 index 00000000000..cc664bf73ad --- /dev/null +++ b/docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md @@ -0,0 +1,577 @@ +# DuckDB Executor Integration (KurrentDB) Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Route all KurrentDB DuckDB work through the Kurrent.Quack `DuckDBExecutor` (owned worker/dispatcher threads) and expose `kurrentdb.duckdb.cpu.seconds` as an observable counter on `/metrics`, replacing the caller-side metric proposed in PR #5642 per the approved spec (`docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md`, on the `spec/duckdb-cpu-attribution` branch / PR #5655). + +**Architecture:** A new `DuckDBExecutorLifetime` owns database open (path + `memory_limit` + `threads`/`external_threads` via the executor) and replaces `DuckDBConnectionPoolLifetime`'s shared-RW-pool + per-Kestrel-connection READ_ONLY pool model entirely. Processors keep long-lived connections (obtained from the executor) with their `BufferedView` appenders, but every DuckDB *call* runs on a dispatcher thread — pinned-connection calls via a small Quack addendum (`ExecuteOn`), everything else via `Execute`. The metric is an OpenTelemetry `ObservableCounter` summing `executor.SampleCpu()` per role at scrape time. + +**Tech Stack:** .NET 10, Kurrent.Quack `DuckDBExecutor` (feature branch `feat/duckdb-executor` in `/Users/tony/Documents/Kurrent.Quack`, not yet published — consumed via a local package feed until the prerelease exists), System.Diagnostics.Metrics + OTel Prometheus (existing pipeline). + +## Global Constraints + +- **Two repos:** Task 1 edits `/Users/tony/Documents/Kurrent.Quack` (branch `feat/duckdb-executor` — Tony owns push/PR; commits are fine, do NOT push). All other tasks edit `/Users/tony/Documents/KurrentDB` (branch `feat/duckdb-executor-integration`, off `origin/master` @ `b3c6d1c06`). +- **dotnet:** the PATH `dotnet` is SDK 8 and cannot build net10.0. Every dotnet command in BOTH repos: `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet ...` +- **Tests:** Quack: xunit.v3, `dotnet test test/Kurrent.Quack.Tests/Kurrent.Quack.Tests.csproj -- --filter-class "*Name*"`. KurrentDB: `dotnet test src/KurrentDB.SecondaryIndexing.Tests/KurrentDB.SecondaryIndexing.Tests.csproj` (xunit.v3/MTP, same `--filter-class` syntax after `--`). +- **Platform:** local machine is macOS — `ThreadCpuClock` reads 0 there by design; CPU *values* are CI-verified on Linux. All behavioral tests run fully on macOS. +- **KurrentDB conventions (CLAUDE.md):** no `= null` defaults on required params; no silent `?? fallback`; named booleans at call sites; tabs; file-scoped namespaces per project norms; per-event logs at Verbose, commits at Debug. +- **Executor API (final, from the Quack branch):** `DuckDBExecutor(string connectionString, int workerCount, int dispatcherCount)`; `ValueTask Execute(Func op, CancellationToken ct)`; `IReadOnlyList SampleCpu()` with `readonly record struct CpuSample(string Role, double CpuSeconds)` (roles `"worker"`/`"dispatcher"`); `ValueTask DisposeAsync()`; ctor throws `InvalidOperationException` if `current_setting('threads'/'external_threads')` didn't take effect; dispose-interrupt → `ObjectDisposedException`, caller-token interrupt → `OperationCanceledException`. +- **Gating:** Tasks 1–8 are executable NOW (no published package needed — Task 2 builds a local feed). Task 9 is GATED on the published Kurrent.Quack prerelease and flips the dependency; the KurrentDB PR cannot go green in CI before Task 9. +- Commit messages end with: `Co-Authored-By: Claude Fable 5 ` + +## Design decisions locked by exploration (origin/master @ b3c6d1c06) + +- **No per-connection setup hook is needed.** Both `IDuckDBSetup` implementations (`IndexingDbSchema`, SchemaRegistry's `SchemaDbSchema`) are `OneTimeOnly = true`; the `_repeated` mechanism has zero implementations. One-time setups run once via `executor.Execute` at lifetime construction — same semantics as today (they run once on one shared-pool connection and their effects are visible database-wide). +- **`SchemaDbSchema` must keep running** through the new lifetime (SchemaRegistry registers it into the same `IEnumerable`; its tables live in `kurrent.ddb`). +- **The per-Kestrel-connection READ_ONLY pool mechanism is deleted end-to-end**: `ConnectionScopedDuckDBConnectionPool`, the `ConnectionInterceptor` registration, `DuckDbConnectionPoolMiddleware` + `UseDuckDb()`, the `Pool` property on `ReadIndexEventsForward/Backward`, the `pool` params on `Enumerator.ReadIndexForwards/Backwards/IndexSubscription`, the `GetRequiredService()` in `Streams.Read.cs:210`, `SecondaryIndexReaderBase.GetPool`, and MiniNode's manual feature middleware (`MiniNode.cs:273-277`). Keep Kestrel `UseConnectionInterceptors()` in ClusterVNodeApp — FlightSQL's own `ConnectionState` interceptor still needs it. +- **Streaming queries block a dispatcher by design** (spec §6): `QueryEngine.ExecuteAsync` runs its whole consume loop inside one `Execute` op, blocking that dispatcher on the consumer (`.AsTask().GetAwaiter().GetResult()`); `dispatcherCount` is the concurrency knob. Quack's `WorkItem` already registers `InterruptQueryOnCancellation` and maps interrupts, so QueryEngine sheds its own registration and interrupt mapping. +- **Thread-count defaults:** workers `Math.Clamp(Environment.ProcessorCount / 2, 2, 16)`, dispatchers `Math.Clamp(Environment.ProcessorCount / 2, 2, 8)`, overridable via configuration keys `KurrentDB:DuckDB:WorkerThreads` / `KurrentDB:DuckDB:DispatcherThreads` (initial defaults pending the spec §11 soak). +- **`TryIndex` row-appends stay on the subscription thread.** They write to the in-memory buffered chunk, not the task scheduler; their CPU is negligible and documented as excluded. Flush/commit/checkpoint/reads/queries — all actual DuckDB calls — run on dispatchers. +- **PR #5642 closes** (spec §10): everything on that branch was the caller-side metric; this plan re-creates the two keepable lines (meter name in `metricsconfig.json`, serviceName-aware registration) fresh. + +--- + +### Task 1: Quack addendum — `OpenConnection` + pinned-connection `ExecuteOn` [Quack repo, NOW] + +**Files:** +- Modify: `src/Kurrent.Quack/DuckDBDispatcherPool.cs` +- Modify: `src/Kurrent.Quack/DuckDBExecutor.cs` +- Test: `test/Kurrent.Quack.Tests/DuckDBExecutorTests.cs` + +**Interfaces:** +- Consumes: existing `DuckDBDispatcherPool` internals (`_queue`, `_slotLocks`, `_active`, `_disposing`, `WorkItem`), `DuckDBExecutor._pool`. +- Produces: `DuckDBAdvancedConnection DuckDBExecutor.OpenConnection()` (caller-owned, non-pooled, from the executor's pool — caller disposes); `ValueTask DuckDBExecutor.ExecuteOn(DuckDBAdvancedConnection connection, Func op, CancellationToken ct)` and the same on `DuckDBDispatcherPool` — runs `op` on a dispatcher thread against the CALLER's connection (published to the interrupt slot; caller guarantees the connection is not used concurrently, which the processors do). + +- [ ] **Step 1: Failing test** (append to `DuckDBExecutorTests`): + +```csharp + [Fact] + public async Task ExecuteOnRunsOnDispatcherThreadWithTheProvidedConnection() { + await using var executor = new DuckDBExecutor(CreateRandomConnectionString(), workerCount: 2, dispatcherCount: 2); + using var pinned = executor.OpenConnection(); + + var (threadName, same) = await executor.ExecuteOn(pinned, conn => (Thread.CurrentThread.Name, ReferenceEquals(conn, pinned)), CancellationToken.None); + + Assert.StartsWith("duckdb-dispatcher-", threadName); + Assert.True(same); // the op ran against the caller's connection, not a rented one + } + + [Fact] + public async Task ExecuteOnAfterDisposeFailsWithObjectDisposed() { + var executor = new DuckDBExecutor(CreateRandomConnectionString(), workerCount: 2, dispatcherCount: 1); + var pinned = executor.OpenConnection(); + await executor.DisposeAsync(); + await Assert.ThrowsAsync(async () => + await executor.ExecuteOn(pinned, _ => 0, CancellationToken.None)); + pinned.Dispose(); + } +``` + +- [ ] **Step 2: Verify fail** — `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet test test/Kurrent.Quack.Tests/Kurrent.Quack.Tests.csproj -- --filter-class "*DuckDBExecutorTests*"` → compile error (`OpenConnection`/`ExecuteOn` missing). + +- [ ] **Step 3: Implement.** In `DuckDBDispatcherPool`, generalize the work item to carry an optional pinned connection and split `RunWithConnection`: + +```csharp + public ValueTask Execute(Func op, CancellationToken ct) + => Submit(new WorkItem(pinned: null, op, ct)); + + /// Runs the operation on a dispatcher thread against the caller-provided connection. The caller + /// must guarantee the connection is not used concurrently and owns its lifetime. + public ValueTask ExecuteOn(DuckDBAdvancedConnection connection, Func op, CancellationToken ct) { + ArgumentNullException.ThrowIfNull(connection); + return Submit(new WorkItem(connection, op, ct)); + } + + private ValueTask Submit(WorkItem item) { + if (_disposing || !_queue.Writer.TryWrite(item)) + item.Fail(new ObjectDisposedException(nameof(DuckDBDispatcherPool))); + return new ValueTask(item, item.Version); + } + + // RunWithConnection keeps its exact body but gains a sibling that skips the rent: + private void RunWithPinnedConnection(int index, DuckDBAdvancedConnection connection, Action body) { + lock (_slotLocks[index]) { + if (_disposing) + throw new ObjectDisposedException(nameof(DuckDBDispatcherPool)); // fail before any user code runs + Volatile.Write(ref _active[index], connection); + } + try { + body(connection); + } finally { + lock (_slotLocks[index]) Volatile.Write(ref _active[index], null); + } + } +``` + +`WorkItem` gains the `DuckDBAdvancedConnection? pinned` primary-ctor parameter; its `Run` picks the path (everything else — the catch filters, `InterruptQueryOnCancellation`, exactly-once completion — is unchanged): + +```csharp + public void Run(DuckDBDispatcherPool owner, int index) { + try { + ct.ThrowIfCancellationRequested(); + if (pinned is null) + owner.RunWithConnection(index, Body); + else + owner.RunWithPinnedConnection(index, pinned, Body); + } catch (DuckDBException e) when (e.ErrorType == DuckDBErrorType.Interrupt && ct.IsCancellationRequested) { + _core.SetException(new OperationCanceledException(ct)); + } catch (DuckDBException e) when (e.ErrorType == DuckDBErrorType.Interrupt && owner._disposing) { + _core.SetException(new ObjectDisposedException(nameof(DuckDBDispatcherPool))); + } catch (Exception e) { + _core.SetException(e); + } + + void Body(DuckDBAdvancedConnection connection) { + using var _ = connection.InterruptQueryOnCancellation(ct); + _core.SetResult(op(connection)); + } + } +``` + +In `DuckDBExecutor`: + +```csharp + /// Opens a caller-owned, non-pooled connection to the executor's database. The caller disposes it; + /// run all DuckDB calls on it via so they execute on measured dispatcher threads. + public DuckDBAdvancedConnection OpenConnection() => _pool.Open(); + + public ValueTask ExecuteOn(DuckDBAdvancedConnection connection, Func op, CancellationToken ct) + => _dispatchers.ExecuteOn(connection, op, ct); +``` + +- [ ] **Step 4: Verify pass** (focused), then full Quack suite: expect **78 = 76 passed + 2 skipped** on macOS. +- [ ] **Step 5: Commit** on `feat/duckdb-executor`: `feat: OpenConnection + pinned-connection ExecuteOn (KurrentDB integration addendum)` — do NOT push (Tony owns the remote). + +--- + +### Task 2: Local package feed [Quack repo, NOW — build step, no commit] + +- [ ] **Step 1:** `cd /Users/tony/Documents/Kurrent.Quack && DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet pack Kurrent.Quack.slnx -c Release -p:MinVerVersionOverride=0.0.0-local.1 -o .local-packages` +- [ ] **Step 2:** Verify: `ls .local-packages` → `Kurrent.Quack.0.0.0-local.1.nupkg` and `Kurrent.Quack.Arrow.0.0.0-local.1.nupkg` (both must exist — KurrentDB references both and `ConnectionHelpers` moved from Arrow to core, so they must move in lockstep). +- [ ] **Step 3:** No commit. Re-run this task's Step 1 with `-local.2` etc. whenever Task 1's code changes. + +--- + +### Task 3: KurrentDB consumes the local packages [NOW] + +**Files:** +- Create: `NuGet.config` (repo root, `/Users/tony/Documents/KurrentDB/NuGet.config`) — **not committed**: add the literal line `NuGet.config` to `.git/info/exclude`. +- Modify: `src/Directory.Packages.props:71-72` (Kurrent.Quack + Kurrent.Quack.Arrow versions). + +- [ ] **Step 1:** Write `NuGet.config`: + +```xml + + + + + + +``` + +(No `` — nuget.org stays via defaults.) Then `echo "NuGet.config" >> .git/info/exclude`. + +- [ ] **Step 2:** In `src/Directory.Packages.props` change both lines to `Version="0.0.0-local.1"`: + +```xml + + +``` + +- [ ] **Step 3:** Restore + build the affected projects: `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet build src/KurrentDB.SecondaryIndexing/KurrentDB.SecondaryIndexing.csproj` → succeeds (nothing uses the new API yet; this proves the feed works). +- [ ] **Step 4:** Commit ONLY `src/Directory.Packages.props`: `chore: consume Kurrent.Quack executor build (local feed placeholder version)` — the commit message must note the version is a placeholder flipped in the final task. + +--- + +### Task 4: `DuckDBExecutorLifetime` replaces `DuckDBConnectionPoolLifetime` [NOW] + +**Files:** +- Create: `src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs` +- Modify: `src/KurrentDB.Core/DuckDB/InjectionExtensions.cs` (rewrite `AddDuckDb`; delete the interceptor/middleware wiring and `UseDuckDb`) +- Delete: `src/KurrentDB.Core/DuckDB/DuckDBConnectionPoolLifetime.cs`, `src/KurrentDB.Core/DuckDB/DuckDbConnectionPoolMiddleware.cs` (contains `ConnectionScopedDuckDBConnectionPool`) +- Modify: `src/KurrentDB.Core/ClusterVNodeStartup.cs:141` (remove `app.UseDuckDb();`) and `:317` (new `AddDuckDb` args) +- Modify: `src/KurrentDB.Core.Testing/Helpers/MiniNode.cs:273-277` (delete the manual `ConnectionScopedDuckDBConnectionPool` middleware block; keep `Node.Startup.ConfigureServices/Configure` calls) + +**Interfaces:** +- Consumes: `DuckDBExecutor` (Quack), `IDuckDBSetup` (existing), `TFChunkDbConfig` (`InMemDb`, `Path`). +- Produces: `sealed class DuckDBExecutorLifetime : Disposable, IHostedService` with `DuckDBExecutor Executor { get; }`; DI: `AddDuckDb(this IServiceCollection services, string serviceName, int workerCount, int dispatcherCount)` registering the lifetime (hosted) + `DuckDBExecutor` singleton (`sp => sp.GetRequiredService().Executor`). **`DuckDBConnectionPool` is no longer registered** — remaining consumers migrate in Tasks 5–7 (the solution will not build again until Task 7 completes; run per-project builds as directed per task). + +- [ ] **Step 1: Implementation** (this task is infrastructure — its test is Task 8's fixture suite plus the per-project build gates; write the code first): + +```csharp +// src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs +// (KurrentDB license header) +using System; +using System.Collections.Generic; +using System.IO; +using System.Threading; +using System.Threading.Tasks; +using DotNext; +using Kurrent.Quack; +using KurrentDB.Core.TransactionLog.Chunks; +using KurrentDB.DuckDB; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Logging.Abstractions; + +namespace KurrentDB.Core.DuckDB; + +// Owns the DuckDB executor: database open (with thread + memory settings applied at open), +// one-time schema setups, the shutdown checkpoint, and executor disposal. +public sealed class DuckDBExecutorLifetime : Disposable, IHostedService { + private readonly ILogger _log; + [CanBeNull] private string _tempPath; + + public DuckDBExecutor Executor { get; } + + public DuckDBExecutorLifetime( + TFChunkDbConfig config, + IEnumerable setups, + int workerCount, + int dispatcherCount, + [CanBeNull] ILogger log) { + _log = log ?? NullLogger.Instance; + + var path = config.InMemDb ? GetTempPath() : $"{config.Path}/kurrent.ddb"; + var memoryMib = (int)(GC.GetGCMemoryInfo().TotalAvailableMemoryBytes / 1024 / 1024 * 0.25); // 25% of RAM, as before + var connectionString = $"Data Source={path};memory_limit={memoryMib}MB"; + + Executor = new DuckDBExecutor(connectionString, workerCount, dispatcherCount); + _log.LogInformation("DuckDB executor started at {path}: {workers} workers, {dispatchers} dispatchers, memory_limit {memory}MB", + path, workerCount, dispatcherCount, memoryMib); + + // One-time setups (IndexingDbSchema, SchemaDbSchema) — effects are database-wide, so running + // them once on any executor-owned connection preserves today's semantics exactly. + Executor.Execute(connection => { + foreach (var setup in setups) + setup.Execute(connection); + return 0; + }, CancellationToken.None).AsTask().GetAwaiter().GetResult(); + + return; + + string GetTempPath() { + _tempPath = Path.GetTempFileName(); + File.Delete(_tempPath); // DuckDB refuses a pre-existing empty file; recreate at the same path + return _tempPath; + } + } + + public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask; + + public async Task StopAsync(CancellationToken cancellationToken) { + _log.LogDebug("Checkpointing DuckDB"); + await Executor.Execute(connection => { + connection.Checkpoint(); + return 0; + }, cancellationToken); + } + + protected override void Dispose(bool disposing) { + if (disposing) { + Executor.DisposeAsync().AsTask().GetAwaiter().GetResult(); + if (_tempPath != null) { + try { + File.Delete(_tempPath); + } catch (IOException) { + // let the OS clean it up + } + } + } + base.Dispose(disposing); + } +} +``` + +`InjectionExtensions.cs` becomes (whole file; deletes the interceptor, middleware, `UseDuckDb`, and the file-class provider): + +```csharp +// (license header) +using System; +using System.Collections.Generic; +using System.Diagnostics.Metrics; +using Kurrent.Quack; +using KurrentDB.Core.TransactionLog.Chunks; +using KurrentDB.DuckDB; +using Microsoft.Extensions.DependencyInjection; +using Microsoft.Extensions.Logging; + +namespace KurrentDB.Core.DuckDB; + +public static class InjectionExtensions { + public static IServiceCollection AddDuckDb(this IServiceCollection services, string serviceName, int workerCount, int dispatcherCount) { + services.AddSingleton(sp => new DuckDBExecutorLifetime( + sp.GetRequiredService(), + sp.GetServices(), + workerCount, + dispatcherCount, + sp.GetService>())); + services.AddHostedService(sp => sp.GetRequiredService()); + services.AddSingleton(sp => sp.GetRequiredService().Executor); + services.AddSingleton(new DuckDBCpuMetrics(new Meter(DuckDBCpuMetrics.MeterName, "1.0.0"), serviceName)); // Task 5b wires the instrument + return services; + } +} +``` + +(Task 5b defines `DuckDBCpuMetrics`; to keep this task compiling on its own, add the class in THIS task as an empty shell `public class DuckDBCpuMetrics { public const string MeterName = "KurrentDB.DuckDB"; public DuckDBCpuMetrics(Meter meter, string serviceName) { } }` — Task 5b fills it. This is the one intentionally-thin seam between the two tasks.) + +`ClusterVNodeStartup.cs`: delete line 141 (`app.UseDuckDb();` and its comment); replace line 317 with: + +```csharp + services.AddDuckDb( + _metricsConfiguration.ServiceName, + workerCount: _configuration.GetValue("KurrentDB:DuckDB:WorkerThreads", Math.Clamp(Environment.ProcessorCount / 2, 2, 16)), + dispatcherCount: _configuration.GetValue("KurrentDB:DuckDB:DispatcherThreads", Math.Clamp(Environment.ProcessorCount / 2, 2, 8))); +``` + +`MiniNode.cs`: delete the `_webHost.Use(async (ctx, next) => { ... ConnectionScopedDuckDBConnectionPool ... });` block (lines 273-277) entirely; `Node.Startup.Configure(_webHost)` stays. + +- [ ] **Step 2: Build gate** — `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet build src/KurrentDB.Core/KurrentDB.Core.csproj` → **expected to FAIL** only at `Streams.Read.cs:210` (`GetRequiredService`) and `Enumerator.*` pool params — those are Task 6's files. If failures appear anywhere else, fix them here. Do NOT commit yet if Core doesn't build: Tasks 4–6 form one atomic commit train; commit at the end of this task only if Core builds after Step 3. +- [ ] **Step 3: To keep Task 4 independently committable**, apply the *mechanical deletion halves* of Task 6 that live in Core in this same commit: remove the `pool` parameter and `Pool` message property plumbing (`Enumerator.ReadIndex.cs:32,43-44,59`, `Enumerator.IndexSubscription.cs:32,50,~380`, `ClientMessage.IndexReads.cs:45,59,97,111`, `Streams.Read.cs:201-219` pool resolution and the three `pool:` arguments). Core now builds: verify with the Step-2 command → 0 errors. +- [ ] **Step 4: Commit** — `feat: DuckDB executor lifetime owns DB open; remove per-connection pool plumbing` + +--- + +### Task 5: Migrate SecondaryIndexing processors to pinned connections [NOW] + +**Files:** +- Modify: `src/KurrentDB.SecondaryIndexing/Indexes/ISecondaryIndexProcessor.cs` (`Commit()` → `ValueTask CommitAsync(CancellationToken)`) +- Modify: `src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs` +- Modify: `src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexProcessor.cs` +- Modify: `src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngine.cs`, `UserIndexEngineSubscription.cs` (pass `DuckDBExecutor`; `db.Rent` sites → `Execute`) +- Modify: `src/KurrentDB.SecondaryIndexing/Subscriptions/DefaultIndexSubscription.cs:96-101` and `UserIndexSubscription.cs` (await `CommitAsync`) +- Modify: `src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexBuilder.cs` (dispose path) + +**Interfaces:** +- Consumes: `DuckDBExecutor.OpenConnection()`, `ExecuteOn`, `Execute` (Task 1), DI `DuckDBExecutor` (Task 4). +- Produces: processors constructed with `DuckDBExecutor executor` instead of `DuckDBConnectionPool db`; `ValueTask CommitAsync(CancellationToken ct)` on `ISecondaryIndexProcessor` (and `UserIndexProcessor.CheckpointAsync`); `CaptureSnapshot` signatures unchanged (still take the caller's connection — readers own that, Task 6). + +The transformation, shown complete on `DefaultIndexProcessor` (apply the identical pattern to `UserIndexProcessor` — its extra members are listed after): + +- [ ] **Step 1:** Constructor: replace `DuckDBConnectionPool db` with `DuckDBExecutor executor`; store `private readonly DuckDBExecutor _executor;`. Replace `_connection = db.Open();` with `_connection = executor.OpenConnection();`. Replace the ctor-time read (`ReadLastIndexedRecord()` uses `_connection`) with a dispatcher-run call: + +```csharp + var (lastPosition, lastTimestamp) = executor + .ExecuteOn(_connection, ReadLastIndexedRecord, CancellationToken.None) + .AsTask().GetAwaiter().GetResult(); // ctor is synchronous; runs once at startup +``` + +and change `private (TFPos, DateTimeOffset) ReadLastIndexedRecord()` to take the connection: `private static (TFPos, DateTimeOffset) ReadLastIndexedRecord(DuckDBAdvancedConnection connection)` (same body, `connection.` instead of `_connection.`). + +- [ ] **Step 2:** Commit → async, flush on a dispatcher against the pinned connection: + +```csharp + public async ValueTask CommitAsync(CancellationToken ct) { + if (IsDisposingOrDisposed || !Interlocked.FalseToTrue(ref _committing)) + return; + + try { + using var duration = Tracker.StartCommitDuration(); + await _executor.ExecuteOn(_connection, c => { + _appender.Flush(); // appender is bound to _connection; runs on a measured dispatcher thread + return 0; + }, ct); + } catch (Exception e) { + _log.LogError(e, "Failed to commit records to index at log position {LogPosition}", LastIndexedPosition); + throw; + } finally { + Volatile.Write(ref _committing, false); + } + } +``` + +`Dispose(bool)` calls the blocking form once — `CommitAsync(CancellationToken.None).AsTask().GetAwaiter().GetResult();` — before unregister/dispose (dispose stays synchronous, as today). + +- [ ] **Step 3:** `UserIndexProcessor` additionally: `Checkpoint(position, timestamp)` → `ValueTask CheckpointAsync(TFPos position, DateTime timestamp, CancellationToken ct)` wrapping `UserIndexSql.SetCheckpoint(_connection, args)` in `_executor.ExecuteOn(_connection, ...)`; ctor `GetLastKnownRecord()` gets the same ExecuteOn treatment as Step 1; `_sql.CreateUserIndex(_connection)` in the ctor also runs via `ExecuteOn` (it's a DuckDB DDL call). +- [ ] **Step 4:** `UserIndexEngine`/`UserIndexEngineSubscription`: replace the `DuckDBConnectionPool db` parameters with `DuckDBExecutor executor` end-to-end; the two `db.Rent(out var connection)` sites (lines 109, 268) become `await executor.Execute(connection => { DeleteUserIndexTable(connection, name); return 0; }, ct)` (the catch-up cleanup loops over deleted indexes inside one Execute op). `StartUserIndex` passes `executor` into the processor and reader ctors. +- [ ] **Step 5:** Subscriptions: `DefaultIndexSubscription.ProcessEvents` batch commit becomes `await indexProcessor.CommitAsync(token);`; same in `UserIndexSubscription` (both its batch commit and its checkpoint call → `await ...CheckpointAsync(...)`). +- [ ] **Step 6: Build gate** — `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet build src/KurrentDB.SecondaryIndexing/KurrentDB.SecondaryIndexing.csproj` → FAILS only in readers/QueryEngine/StatsService (Task 6/7 files) if at all; processors, engine, subscriptions compile. (If the project can't build partially, proceed to Task 6 and commit Tasks 5+6 together — note it in the commit message.) +- [ ] **Step 7: Commit** — `feat: index processors run DuckDB work on executor dispatchers (pinned connections)` + +--- + +### Task 6: Migrate readers [NOW] + +**Files:** +- Modify: `src/KurrentDB.SecondaryIndexing/Indexes/SecondaryIndexReaderBase.cs` +- Modify: `src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexReader.cs`, `Category/CategoryIndexReader.cs`, `EventType/EventTypeIndexReader.cs`, `User/UserIndexReader.cs` +- Modify: `src/KurrentDB.SecondaryIndexing/SecondaryIndexingPlugin.cs` (DI: readers/QueryEngine get `DuckDBExecutor`; drop pool registrations) + +**Interfaces:** +- Consumes: `DuckDBExecutor.Execute` (rented connection per read op). +- Produces: `SecondaryIndexReaderBase(DuckDBExecutor executor, IReadIndex index)`; abstract `GetDbRecordsForwards/Backwards` signatures change from `(DuckDBConnectionPool db, string? id, long startPosition, int maxCount, bool excludeFirst)` to `(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst)` — the base does the executor call; subclasses receive an open connection and keep their `CaptureSnapshot(connection)` + query bodies minus the `Rent` wrapper. + +- [ ] **Step 1:** Base class: replace `GetPool`/`msg.Pool` usage (already deleted from the messages in Task 4 Step 3) with: + +```csharp + async ValueTask<(long, IReadOnlyList)> GetEventsForwards(long startPosition) { + var indexPrepares = await executor.Execute( + connection => GetDbRecordsForwards(connection, id, startPosition, msg.MaxCount, msg.ExcludeStart), + token); + var events = await reader.ReadRecords(indexPrepares, true, token); + return (indexPrepares.Count, events); + } +``` + +(same shape for backwards). Delete `GetPool` entirely. + +- [ ] **Step 2:** Each reader override drops its `using (db.Rent(out var connection))` line and keeps the body: + +```csharp + protected override List GetDbRecordsForwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { + var records = new List(maxCount); + using (processor.CaptureSnapshot(connection)) { + // existing ExecuteQuery<...>(...).CopyTo(records) body unchanged + } + return records; + } +``` + +- [ ] **Step 3:** Plugin DI (`SecondaryIndexingPlugin.ConfigureServices`): reader/`QueryEngine`/processor registrations resolve `DuckDBExecutor` (from Core DI) instead of `DuckDBConnectionPool`. +- [ ] **Step 4: Build gate** — SecondaryIndexing project builds except QueryEngine/StatsService (Task 7). Commit (or fold with Task 5 per its Step 6 note) — `feat: index readers execute on DuckDB dispatchers` + +--- + +### Task 7: Migrate QueryEngine, Rewriter, StatsService [NOW] + +**Files:** +- Modify: `src/KurrentDB.SecondaryIndexing/Query/QueryEngine.cs`, `QueryEngine.Rewriter.cs` +- Modify: `src/KurrentDB.SecondaryIndexing/Stats/StatsService.cs` + +**Interfaces:** +- Consumes: `DuckDBExecutor.Execute`; Quack's built-in cancellation-interrupt + interrupt→OCE mapping (so QueryEngine's own `InterruptQueryOnCancellation` and `DuckDBException(Interrupt)` catch are DELETED — Quack raises `OperationCanceledException` for token-interrupts already). +- Produces: `QueryEngine(DefaultIndexProcessor defaultIndex, UserIndexEngine userIndex, DuckDBExecutor executor)`. + +- [ ] **Step 1:** `ExecuteAsync` — the whole rent/snapshot/prepare/consume/cleanup pipeline moves inside ONE `Execute` op (spec §6: streaming blocks a dispatcher by design; `dispatcherCount` bounds concurrent streams): + +```csharp + public async ValueTask ExecuteAsync(ReadOnlyMemory preparedQuery, + TConsumer consumer, QueryExecutionOptions options, CancellationToken token) + where TConsumer : IQueryResultConsumer { + var parsedQuery = new PreparedQuery(preparedQuery.Span); + if (options.CheckIntegrity) + CheckIntegrity(in parsedQuery); + + await executor.Execute(connection => { + var snapshots = new PoolingBufferWriter { Capacity = parsedQuery.ViewCount + 1 }; + var statement = default(PreparedStatement); + var reader = default(QueryResultReader); + try { + CaptureSnapshots(in parsedQuery, connection, snapshots, token); + statement = new(connection, parsedQuery.Query); + consumer.Bind(new QueryBinder(in statement)); + reader = new(in statement, consumer.UseStreaming); + consumer.ConsumeAsync(reader, token).AsTask().GetAwaiter().GetResult(); // dispatcher blocks: by design + reader.ThrowOnError(); + return 0; + } finally { + reader?.Dispose(); + statement.Dispose(); + Disposable.Dispose(snapshots.WrittenMemory.Span); + snapshots.Dispose(); + } + }, token); + } +``` + +(`InterruptQueryOnCancellation` and the `catch (DuckDBException … Interrupt)` are gone — Quack owns both.) `GetArrowSchema` gets the same wrap (no consume loop). `QueryEngine.Rewriter.cs`'s two `sharedPool.Rent` sites become `executor.Execute(connection => …)` — `PrepareQuery` becomes `ValueTask> PrepareQueryAsync(...)`; update its callers (`FlightSqlServer.PlainQuery/PreparedStmt`, `QueryService` in `src/KurrentDB/Components/Query/QueryService.cs`, and the `ReadTests` integration test) to await it. + +- [ ] **Step 2:** `StatsService` — each method's `using var connection = _pool.Open(); using var snapshot = _defaultIndex.CaptureSnapshot(connection); …query…` becomes: + +```csharp + return await _executor.Execute(connection => { + using var snapshot = _defaultIndex.CaptureSnapshot(connection); + // existing query body unchanged + }, ct); +``` + +(methods become async `ValueTask<…>`; update their callers in the UI stats components — `src/KurrentDB/Components/Stats/*.razor.cs` / `UiStatsService` — mechanically to await.) + +- [ ] **Step 3: Build gate** — full solution now builds: `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet build src/KurrentDB.sln` (or the host project `src/KurrentDB/KurrentDB.csproj` if the sln is too slow) → 0 errors. +- [ ] **Step 4: Commit** — `feat: QueryEngine and stats run on DuckDB dispatchers; Quack owns query cancellation` + +--- + +### Task 5b (fold into whichever of Tasks 4–7 lands last before tests): the metric [NOW] + +**Files:** +- Modify: `src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs` (the Task-4 shell becomes real) +- Modify: `src/KurrentDB/metricsconfig.json:5-13` (add `"KurrentDB.DuckDB"` to `Meters`) +- Test: `src/KurrentDB.SecondaryIndexing.Tests/Diagnostics/DuckDBCpuMetricsTests.cs` (new) + +- [ ] **Step 1: Failing test:** + +```csharp +// namespace KurrentDB.SecondaryIndexing.Tests.Diagnostics; xunit.v3 +public class DuckDBCpuMetricsTests { + [Fact] + public void observable_counter_reports_per_role_cpu_sums() { + using var meter = new Meter("test"); + var samples = new List { + new("worker", 1.5), new("worker", 0.5), new("dispatcher", 0.25), + }; + _ = new DuckDBCpuMetrics(meter, "kurrentdb", () => samples); + + List<(double Value, string Role)> observed = []; + using var listener = new MeterListener(); + listener.InstrumentPublished = (i, l) => { + if (i.Meter == meter && i.Name == "kurrentdb.duckdb.cpu.seconds") l.EnableMeasurementEvents(i); + }; + listener.SetMeasurementEventCallback((_, value, tags, _) => { + foreach (var t in tags) if (t.Key == "role") observed.Add((value, (string)t.Value!)); + }); + listener.Start(); + listener.RecordObservableInstruments(); + + Assert.Contains(observed, m => m is { Role: "worker", Value: 2.0 }); + Assert.Contains(observed, m => m is { Role: "dispatcher", Value: 0.25 }); + } +} +``` + +- [ ] **Step 2:** Implementation (`DuckDBCpuMetrics`): ctor `(Meter meter, string serviceName, Func> sampleCpu)`: + +```csharp + meter.CreateObservableCounter($"{serviceName}.duckdb.cpu.seconds", Observe, + description: "CPU time consumed by DuckDB's executor threads, in seconds"); + + IEnumerable> Observe() { + double workers = 0, dispatchers = 0; + foreach (var s in sampleCpu()) + if (s.Role == "worker") workers += s.CpuSeconds; else dispatchers += s.CpuSeconds; + yield return new(workers, new KeyValuePair("role", "worker")); + yield return new(dispatchers, new KeyValuePair("role", "dispatcher")); + } +``` + +`AddDuckDb` wires it: `services.AddSingleton(sp => new DuckDBCpuMetrics(new Meter(DuckDBCpuMetrics.MeterName, "1.0.0"), serviceName, () => sp.GetRequiredService().Executor.SampleCpu()));` plus `services.AddSingleton(...)`? No — the metrics object must be *instantiated*; resolve it eagerly from the lifetime's hosted `StartAsync` or register it as an activated singleton: `services.AddActivatedSingleton(...)` (net8+ API) so the instrument exists without a consumer. metricsconfig.json gains `"KurrentDB.DuckDB",` after `"KurrentDB.SecondaryIndexes",`. + +- [ ] **Step 3:** Verify test passes; commit — `feat: kurrentdb.duckdb.cpu.seconds observable counter over executor CPU samples` + +--- + +### Task 8: Test infrastructure + full verification [NOW] + +**Files:** +- Modify: `src/KurrentDB.SecondaryIndexing.Tests/Fixtures/DuckDbIntegrationTest.cs` (construct a `DuckDBExecutor` instead of a raw pool; processors/readers built from it) +- Modify: any fixture/test constructing processors/readers directly (`Indexes/DefaultIndexProcessorTests.cs`, `Indexes/DefaultIndexReaderTests/IndexTestBase.cs`, `IntegrationTests/ReadTests.cs` for `PrepareQueryAsync`) +- No changes needed to `SecondaryIndexingFixture`/`ClusterVNodeApp` (they boot the real node → new DI path exercises everything). + +- [ ] **Step 1:** `DuckDbIntegrationTest` swaps `new DuckDBConnectionPool(...)` for `new DuckDBExecutor($"Data Source={dbPath};", workerCount: 2, dispatcherCount: 2)` and runs the schema setup via `Executor.Execute(...)`; expose `protected readonly DuckDBExecutor Executor;` (tests that rented connections directly now go through `Executor.Execute`). +- [ ] **Step 2:** Run the full SecondaryIndexing suite: `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet test src/KurrentDB.SecondaryIndexing.Tests/KurrentDB.SecondaryIndexing.Tests.csproj` → all green (count will match master's suite plus the new metrics test). +- [ ] **Step 3:** Live smoke (recipe from dev-environment memory): build the host, run `--insecure --node-port 2119 --replication-port 1119 --enable-atom-pub-over-http` with `KURRENTDB__SECONDARYINDEXING__OPTIONS__COMMITBATCHSIZE=1`, write one event via AtomPub, then `curl -s http://localhost:2119/metrics | grep duckdb_cpu` → expect `kurrentdb_duckdb_cpu_seconds_total{...role="worker"...}` and `role="dispatcher"` series with **non-zero dispatcher values** (macOS reads 0 — so on macOS assert only presence of the series; values are CI/Linux). Also verify node startup log shows the executor line and clean shutdown checkpoints. +- [ ] **Step 4:** Commit — `test: fixtures build on DuckDB executor; live /metrics smoke verified` + +--- + +### Task 9: Flip to the published package + PR [GATED on Tony's Quack prerelease] + +- [ ] **Step 1:** Delete `NuGet.config` (and its `.git/info/exclude` line); set `src/Directory.Packages.props` Kurrent.Quack + Kurrent.Quack.Arrow to the real prerelease version (from Tony's tag); restore + full build + full SecondaryIndexing suite. +- [ ] **Step 2:** Commit — `chore: consume published Kurrent.Quack `; push branch; open PR against master titled "Route DuckDB through the owned-thread executor; add kurrentdb.duckdb.cpu.seconds" with the spec + Quack PR linked; verify CI (the SecondaryIndexing suite on Linux exercises real CPU values end-to-end). +- [ ] **Step 3:** Close PR #5642 with a comment: superseded by this PR per the approved spec (§10) — the caller-side measurement is not being merged; the meter name + config landed here instead. + +## Self-review notes + +- Spec coverage: §3–§7 (executor ownership, call-site migration incl. streaming-inside-one-op, metric) → Tasks 4–7, 5b; §8 lifecycle (shutdown checkpoint via executor, dispose order) → Task 4; §10 disposition of #5642 → Task 9; §11 dispatcher sizing → config knobs + defaults in Task 4 (soak still owed before GA — out of plan scope, tracked in memory). +- The `CommitAsync`/`CheckpointAsync`/`PrepareQueryAsync`/`StatsService` async ripples are enumerated with their caller lists; implementers must chase compiler errors within the named files only — any ripple beyond them is a finding to report, not silently fix. +- Known intentionally-accepted losses: READ_ONLY access-mode isolation for reads (all connections now RW; reads are SELECTs); per-Kestrel-connection pool isolation (dispatcherCount now bounds read concurrency); `TryIndex` buffered row-append CPU (unmeasured, negligible, documented). From cbace1d953e2d5f0acce21123c7b8a01b663bf23 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 14:26:26 -0400 Subject: [PATCH 04/16] docs(plan): note arm64 local build must not use /p:Platform=x64 Co-Authored-By: Claude Fable 5 --- docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md b/docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md index cc664bf73ad..b1f1ebb639a 100644 --- a/docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md +++ b/docs/superpowers/plans/2026-07-01-duckdb-executor-integration.md @@ -12,6 +12,7 @@ - **Two repos:** Task 1 edits `/Users/tony/Documents/Kurrent.Quack` (branch `feat/duckdb-executor` — Tony owns push/PR; commits are fine, do NOT push). All other tasks edit `/Users/tony/Documents/KurrentDB` (branch `feat/duckdb-executor-integration`, off `origin/master` @ `b3c6d1c06`). - **dotnet:** the PATH `dotnet` is SDK 8 and cannot build net10.0. Every dotnet command in BOTH repos: `DOTNET_ROOT=$HOME/.dotnet $HOME/.dotnet/dotnet ...` +- **⚠️ Build command on THIS machine (arm64 Mac):** do NOT pass `/p:Platform=x64` (CLAUDE.md's canonical CI command) — it makes the Roslyn source-generator load fail with `CS8034 ... assembly architecture is not compatible`. Build locally with `-c Release --framework=net10.0` ONLY (native arm64). x64 is a CI concern, not a local one. - **Tests:** Quack: xunit.v3, `dotnet test test/Kurrent.Quack.Tests/Kurrent.Quack.Tests.csproj -- --filter-class "*Name*"`. KurrentDB: `dotnet test src/KurrentDB.SecondaryIndexing.Tests/KurrentDB.SecondaryIndexing.Tests.csproj` (xunit.v3/MTP, same `--filter-class` syntax after `--`). - **Platform:** local machine is macOS — `ThreadCpuClock` reads 0 there by design; CPU *values* are CI-verified on Linux. All behavioral tests run fully on macOS. - **KurrentDB conventions (CLAUDE.md):** no `= null` defaults on required params; no silent `?? fallback`; named booleans at call sites; tabs; file-scoped namespaces per project norms; per-event logs at Verbose, commits at Debug. From fdc87d350487d7db1e68a74b73e5e7a6911c020f Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 14:26:27 -0400 Subject: [PATCH 05/16] chore: consume Kurrent.Quack executor build (local feed placeholder version) Version 0.0.0-local.1 is a local-feed placeholder (NuGet.config, git-excluded) pointing at a dotnet-pack of the Kurrent.Quack feat/duckdb-executor branch. Flipped to the published prerelease in the final integration task once the Quack PR is merged and tagged. Kurrent.Quack.Arrow moves in lockstep (ConnectionHelpers lives in core, referenced by Arrow). Co-Authored-By: Claude Fable 5 --- src/Directory.Packages.props | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/Directory.Packages.props b/src/Directory.Packages.props index 2928d24f570..cf508cf375f 100644 --- a/src/Directory.Packages.props +++ b/src/Directory.Packages.props @@ -68,8 +68,8 @@ - - + + From 67bf821b6c0fba4e825576bd200110cb01be7906 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 14:38:08 -0400 Subject: [PATCH 06/16] feat: DuckDB executor lifetime owns DB open; remove per-connection pool plumbing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces DuckDBConnectionPoolLifetime (shared pool + per-Kestrel-connection READ_ONLY pool via ConnectionInterceptor + RequestServices-swapping middleware) with DuckDBExecutorLifetime, which owns a single Kurrent.Quack DuckDBExecutor that runs all DuckDB work on its own worker/dispatcher threads. This is the foundation for measuring total DuckDB CPU via DuckDBExecutor.SampleCpu(). AddDuckDb now takes (serviceName, workerCount, dispatcherCount) and no longer registers DuckDBConnectionPool or the Kestrel interceptor/middleware. Also removes the Core-side pool plumbing that would otherwise dangle: the Pool property on ReadIndexEventsForward/Backward, the pool ctor params on Enumerator.ReadIndexForwards/Backwards/IndexSubscription, the RequestServices.GetRequiredService() resolution in Streams.Read.cs, and four additional pool: null / .Pool call sites found by the build gate (PublisherReadExtensions, PublisherSubscribeExtensions, SubscriptionsService's long-poll retry cloning). KurrentDB.Core builds green. KurrentDB.SecondaryIndexing intentionally does not build yet (its readers/QueryEngine/StatsService still reference DuckDBConnectionPool) — migrating them is Tasks 5-7. Co-Authored-By: Claude Fable 5 --- .../Helpers/MiniNode.cs | 6 - .../Bus/Extensions/PublisherReadExtensions.cs | 2 - .../PublisherSubscribeExtensions.cs | 1 - src/KurrentDB.Core/ClusterVNodeStartup.cs | 8 +- .../DuckDB/DuckDBConnectionPoolLifetime.cs | 134 ------------------ src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs | 14 ++ .../DuckDB/DuckDBExecutorLifetime.cs | 83 +++++++++++ .../DuckDB/DuckDbConnectionPoolMiddleware.cs | 35 ----- .../DuckDB/InjectionExtensions.cs | 63 ++------ .../Messages/ClientMessage.IndexReads.cs | 5 - .../Services/SubscriptionsService.cs | 2 +- .../Enumerator.IndexSubscription.cs | 5 - .../Enumerators/Enumerator.ReadIndex.cs | 5 - .../Services/Transport/Grpc/Streams.Read.cs | 21 ++- 14 files changed, 125 insertions(+), 259 deletions(-) delete mode 100644 src/KurrentDB.Core/DuckDB/DuckDBConnectionPoolLifetime.cs create mode 100644 src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs create mode 100644 src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs delete mode 100644 src/KurrentDB.Core/DuckDB/DuckDbConnectionPoolMiddleware.cs diff --git a/src/KurrentDB.Core.Testing/Helpers/MiniNode.cs b/src/KurrentDB.Core.Testing/Helpers/MiniNode.cs index 4aa8c1824d2..d817dcf13b4 100644 --- a/src/KurrentDB.Core.Testing/Helpers/MiniNode.cs +++ b/src/KurrentDB.Core.Testing/Helpers/MiniNode.cs @@ -22,7 +22,6 @@ using KurrentDB.Core.Bus; using KurrentDB.Core.Certificates; using KurrentDB.Core.Configuration.Sources; -using KurrentDB.Core.DuckDB; using KurrentDB.Core.Messages; using KurrentDB.Core.Services.Monitoring; using KurrentDB.Core.Services.Storage; @@ -270,11 +269,6 @@ public MiniNode(string pathname, builder.Services.AddSerilog(); Node.Startup.ConfigureServices(builder.Services); _webHost = builder.Build(); - _webHost.Use(async (ctx, next) => { - var factory = ctx.RequestServices.GetRequiredService(); - ctx.Features.Set(new(factory)); - await next(); - }); Node.Startup.Configure(_webHost); _started = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously); _adminUserCreated = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously); diff --git a/src/KurrentDB.Core/Bus/Extensions/PublisherReadExtensions.cs b/src/KurrentDB.Core/Bus/Extensions/PublisherReadExtensions.cs index f39c94db551..336d932a613 100644 --- a/src/KurrentDB.Core/Bus/Extensions/PublisherReadExtensions.cs +++ b/src/KurrentDB.Core/Bus/Extensions/PublisherReadExtensions.cs @@ -255,7 +255,6 @@ IAsyncEnumerator GetEnumerator() { user: SystemAccounts.System, requiresLeader: false, expiryStrategy: DefaultExpiryStrategy.Instance, - pool: null, cancellationToken: cancellationToken ) : new Enumerator.ReadIndexBackwards( @@ -266,7 +265,6 @@ IAsyncEnumerator GetEnumerator() { user: SystemAccounts.System, requiresLeader: false, expiryStrategy: DefaultExpiryStrategy.Instance, - pool: null, cancellationToken: cancellationToken ); } diff --git a/src/KurrentDB.Core/Bus/Extensions/PublisherSubscribeExtensions.cs b/src/KurrentDB.Core/Bus/Extensions/PublisherSubscribeExtensions.cs index c45f54710b3..3e5cd332410 100644 --- a/src/KurrentDB.Core/Bus/Extensions/PublisherSubscribeExtensions.cs +++ b/src/KurrentDB.Core/Bus/Extensions/PublisherSubscribeExtensions.cs @@ -80,7 +80,6 @@ public static async IAsyncEnumerable SubscribeToIndex(this IPublis checkpoint: position, user: SystemAccounts.System, requiresLeader: false, - pool: null, cancellationToken: cancellationToken ); diff --git a/src/KurrentDB.Core/ClusterVNodeStartup.cs b/src/KurrentDB.Core/ClusterVNodeStartup.cs index a4676a82735..71a9990d00d 100644 --- a/src/KurrentDB.Core/ClusterVNodeStartup.cs +++ b/src/KurrentDB.Core/ClusterVNodeStartup.cs @@ -137,9 +137,6 @@ public void Configure(WebApplication app) { app.UseAuthorization(); app.UseAntiforgery(); - // provides a lazy DuckDB connection pool unique to the connection - app.UseDuckDb(); - // allow all subsystems to register their legacy controllers before calling MapLegacyHttp foreach (var component in _plugableComponents) component.ConfigureApplication(app, _configuration); @@ -314,7 +311,10 @@ public void ConfigureServices(IServiceCollection services) { services.AddGrpcReflection(); #endif - services.AddDuckDb(); + services.AddDuckDb( + _metricsConfiguration.ServiceName, + workerCount: _configuration.GetValue("KurrentDB:DuckDB:WorkerThreads", Math.Clamp(Environment.ProcessorCount / 2, 2, 16)), + dispatcherCount: _configuration.GetValue("KurrentDB:DuckDB:DispatcherThreads", Math.Clamp(Environment.ProcessorCount / 2, 2, 8))); // Ask the node itself to add DI registrations _configureNodeServices(services); diff --git a/src/KurrentDB.Core/DuckDB/DuckDBConnectionPoolLifetime.cs b/src/KurrentDB.Core/DuckDB/DuckDBConnectionPoolLifetime.cs deleted file mode 100644 index d6f3f6ff81e..00000000000 --- a/src/KurrentDB.Core/DuckDB/DuckDBConnectionPoolLifetime.cs +++ /dev/null @@ -1,134 +0,0 @@ -// Copyright (c) Kurrent, Inc and/or licensed to Kurrent, Inc under one or more agreements. -// Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). - -using System; -using System.Collections.Generic; -using System.IO; -using System.Linq; -using System.Threading; -using System.Threading.Tasks; -using DotNext; -using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; -using KurrentDB.Core.TransactionLog.Chunks; -using KurrentDB.DuckDB; -using Microsoft.Extensions.Hosting; -using Microsoft.Extensions.Logging; -using Microsoft.Extensions.Logging.Abstractions; - -namespace KurrentDB.Core.DuckDB; - -// Manages the lifetime of the Shared pool -// Also produces additional pools on demand that the caller should dispose. -public class DuckDBConnectionPoolLifetime : Disposable, IHostedService { - private readonly string _path; - private readonly IReadOnlyList _repeated; - private readonly ILogger _log; - [CanBeNull] private string _tempPath; - - public DuckDBConnectionPool Shared { get; } - - public DuckDBConnectionPoolLifetime( - TFChunkDbConfig config, - IEnumerable setups, - [CanBeNull] ILogger log) { - - _path = config.InMemDb ? GetTempPath() : $"{config.Path}/kurrent.ddb"; - _log = log ?? NullLogger.Instance; - - var once = new List(); - var repeated = new List(); - foreach (var duckDBSetup in setups) { - if (duckDBSetup.OneTimeOnly) { - once.Add(duckDBSetup); - } else { - repeated.Add(duckDBSetup); - } - } - _repeated = repeated; - - Shared = CreatePool(isReadOnly: false, log: true); - using (Shared.Rent(out var connection)) { - foreach (var s in once) - s.Execute(connection); - } - - return; - - string GetTempPath() { - _tempPath = Path.GetTempFileName(); - File.Delete(_tempPath); - return _tempPath; - } - } - - public DuckDBConnectionPool CreatePool() => CreatePool(isReadOnly: true, log: false); // no writes go through here so set read only - - private ConnectionPoolWithFunctions CreatePool(bool isReadOnly, bool log) { - var availableRamMib = CalculateRam(); - var duckDbRamMib = (int)(availableRamMib * 0.25); - var settings = new Dictionary { - ["memory_limit"] = $"{duckDbRamMib}MB", // total, not per connection - ["access_mode"] = isReadOnly ? "READ_ONLY" : "READ_WRITE", - }; - var pool = new ConnectionPoolWithFunctions($"Data Source={_path};{GetParamsString()}", _repeated); - - if (log) - _log.LogInformation("Created DuckDB connection pool at {path} with {settings}", _path, settings); - return pool; - - static long CalculateRam() { - var totalRam = GC.GetGCMemoryInfo().TotalAvailableMemoryBytes; - return totalRam / 1024 / 1024; - } - - string GetParamsString() { - var list = settings.Keys.Select(x => $"{x}={settings[x]}"); - return string.Join(";", list); - } - } - - public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask; - - public Task StopAsync(CancellationToken cancellationToken) { - _log.LogDebug("Checkpointing DuckDB connection"); - var connection = Shared.Open(); - try { - connection.Checkpoint(); - } catch (Exception ex) { - return Task.FromException(ex); - } finally { - connection.Dispose(); - } - - return Task.CompletedTask; - } - - protected override void Dispose(bool disposing) { - if (disposing) { - Shared.Dispose(); - if (_tempPath != null) { - try { - File.Delete(_tempPath); - } catch (IOException) { - // let the file stay and be cleaned up by the OS - } - } - } - - base.Dispose(disposing); - } - - private class ConnectionPoolWithFunctions(string connectionString, IReadOnlyList setup) : DuckDBConnectionPool(connectionString) { - protected override void Initialize(DuckDBAdvancedConnection connection) { - base.Initialize(connection); - for (var i = 0; i < setup.Count; i++) { - try { - setup[i].Execute(connection); - } catch (Exception) { - // it happens for some reason, investigating - } - } - } - } -} diff --git a/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs new file mode 100644 index 00000000000..e52d151fd25 --- /dev/null +++ b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs @@ -0,0 +1,14 @@ +// Copyright (c) Kurrent, Inc and/or licensed to Kurrent, Inc under one or more agreements. +// Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). + +using System.Diagnostics.Metrics; + +namespace KurrentDB.Core.DuckDB; + +// Shell for now — Task 5b wires the instrument that reports DuckDBExecutor.SampleCpu() readings. +public class DuckDBCpuMetrics { + public const string MeterName = "KurrentDB.DuckDB"; + + public DuckDBCpuMetrics(Meter meter, string serviceName) { + } +} diff --git a/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs new file mode 100644 index 00000000000..8e83c99b13a --- /dev/null +++ b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs @@ -0,0 +1,83 @@ +// Copyright (c) Kurrent, Inc and/or licensed to Kurrent, Inc under one or more agreements. +// Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). + +using System; +using System.Collections.Generic; +using System.IO; +using System.Threading; +using System.Threading.Tasks; +using DotNext; +using Kurrent.Quack; +using KurrentDB.Core.TransactionLog.Chunks; +using KurrentDB.DuckDB; +using Microsoft.Extensions.Hosting; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Logging.Abstractions; + +namespace KurrentDB.Core.DuckDB; + +// Owns the DuckDB executor: database open (with thread + memory settings applied at open), +// one-time schema setups, the shutdown checkpoint, and executor disposal. +public sealed class DuckDBExecutorLifetime : Disposable, IHostedService { + private readonly ILogger _log; + [CanBeNull] private string _tempPath; + + public DuckDBExecutor Executor { get; } + + public DuckDBExecutorLifetime( + TFChunkDbConfig config, + IEnumerable setups, + int workerCount, + int dispatcherCount, + [CanBeNull] ILogger log) { + _log = log ?? NullLogger.Instance; + + var path = config.InMemDb ? GetTempPath() : $"{config.Path}/kurrent.ddb"; + var memoryMib = (int)(GC.GetGCMemoryInfo().TotalAvailableMemoryBytes / 1024 / 1024 * 0.25); // 25% of RAM, as before + var connectionString = $"Data Source={path};memory_limit={memoryMib}MB"; + + Executor = new DuckDBExecutor(connectionString, workerCount, dispatcherCount); + _log.LogInformation("DuckDB executor started at {path}: {workers} workers, {dispatchers} dispatchers, memory_limit {memory}MB", + path, workerCount, dispatcherCount, memoryMib); + + // One-time setups (IndexingDbSchema, SchemaDbSchema) — effects are database-wide, so running + // them once on any executor-owned connection preserves today's semantics exactly. + Executor.Execute(connection => { + foreach (var setup in setups) + setup.Execute(connection); + return 0; + }, CancellationToken.None).AsTask().GetAwaiter().GetResult(); + + return; + + string GetTempPath() { + _tempPath = Path.GetTempFileName(); + File.Delete(_tempPath); // DuckDB refuses a pre-existing empty file; recreate at the same path + return _tempPath; + } + } + + public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask; + + public async Task StopAsync(CancellationToken cancellationToken) { + _log.LogDebug("Checkpointing DuckDB"); + await Executor.Execute(connection => { + connection.Checkpoint(); + return 0; + }, cancellationToken); + } + + protected override void Dispose(bool disposing) { + if (disposing) { + Executor.DisposeAsync().AsTask().GetAwaiter().GetResult(); + if (_tempPath != null) { + try { + File.Delete(_tempPath); + } catch (IOException) { + // let the OS clean it up + } + } + } + base.Dispose(disposing); + } +} diff --git a/src/KurrentDB.Core/DuckDB/DuckDbConnectionPoolMiddleware.cs b/src/KurrentDB.Core/DuckDB/DuckDbConnectionPoolMiddleware.cs deleted file mode 100644 index 0828a345d43..00000000000 --- a/src/KurrentDB.Core/DuckDB/DuckDbConnectionPoolMiddleware.cs +++ /dev/null @@ -1,35 +0,0 @@ -// Copyright (c) Kurrent, Inc and/or licensed to Kurrent, Inc under one or more agreements. -// Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). - -using System; -using System.Threading; -using Kurrent.Quack.ConnectionPool; - -namespace KurrentDB.Core.DuckDB; - -/// -/// Represents a DuckDBConnectionPool scoped to a single kestrel connection -/// -/// -/// The underlying pool is constructed lazily. -/// Dispose must be called after GetPool is no longer being called. -/// -public sealed class ConnectionScopedDuckDBConnectionPool(DuckDBConnectionPoolLifetime factory) : IDisposable { - volatile DuckDBConnectionPool _pool; - - public DuckDBConnectionPool GetPool() { - if (_pool is not { } pool) { - pool = factory.CreatePool(); - if (Interlocked.CompareExchange(ref _pool, pool, null) is { } existing) { - pool.Dispose(); - pool = existing; - } - } - - return pool; - } - - public void Dispose() { - Interlocked.Exchange(ref _pool, null)?.Dispose(); - } -} diff --git a/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs b/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs index 2e9228ff705..23251d3d700 100644 --- a/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs +++ b/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs @@ -2,60 +2,27 @@ // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). using System; -using System.Threading.Tasks; -using Kurrent.Quack.ConnectionPool; +using System.Collections.Generic; +using System.Diagnostics.Metrics; +using Kurrent.Quack; +using KurrentDB.Core.TransactionLog.Chunks; using KurrentDB.DuckDB; -using Microsoft.AspNetCore.Builder; -using Microsoft.AspNetCore.Connections; -using Microsoft.AspNetCore.Http; -using Microsoft.AspNetCore.Http.Features; -using Microsoft.AspNetCore.Server.Kestrel.Core; using Microsoft.Extensions.DependencyInjection; +using Microsoft.Extensions.Logging; namespace KurrentDB.Core.DuckDB; public static class InjectionExtensions { - public static IServiceCollection AddDuckDb(this IServiceCollection services) { - services.AddSingleton(); - services.AddHostedService(sp => sp.GetRequiredService()); - services.AddSingleton(sp => sp.GetRequiredService().Shared); - services.AddSingleton(); - services.AddSingleton(CreatePoolPerConnectionInterceptor); + public static IServiceCollection AddDuckDb(this IServiceCollection services, string serviceName, int workerCount, int dispatcherCount) { + services.AddSingleton(sp => new DuckDBExecutorLifetime( + sp.GetRequiredService(), + sp.GetServices(), + workerCount, + dispatcherCount, + sp.GetService>())); + services.AddHostedService(sp => sp.GetRequiredService()); + services.AddSingleton(sp => sp.GetRequiredService().Executor); + services.AddSingleton(new DuckDBCpuMetrics(new Meter(DuckDBCpuMetrics.MeterName, "1.0.0"), serviceName)); // Task 5b wires the instrument return services; } - - private static ConnectionInterceptor CreatePoolPerConnectionInterceptor(IServiceProvider provider) - => provider.InjectPoolPerConnectionAsync; - - private static async Task InjectPoolPerConnectionAsync(this IServiceProvider services, - ConnectionDelegate next, - ConnectionContext context) { - // pool is disposed when the connection closes - var poolFactory = services.GetRequiredService(); - using var pool = new ConnectionScopedDuckDBConnectionPool(poolFactory); - context.Features.Set(pool); - await next(context); - // guaranteed no request handlers are running when the pool wrapper is disposed - } - - public static IApplicationBuilder UseDuckDb(this IApplicationBuilder app) { - app.UseMiddleware(); - return app; - } -} - -file class DuckDbConnectionPoolMiddleware : IMiddleware { - public Task InvokeAsync(HttpContext context, RequestDelegate next) { - context.RequestServices = new DuckDBConnectionPoolProvider( - context.Features, - context.RequestServices); - return next(context); - } - - sealed class DuckDBConnectionPoolProvider(IFeatureCollection features, IServiceProvider inner) : IServiceProvider { - public object GetService(Type serviceType) => - serviceType == typeof(DuckDBConnectionPool) - ? features.Get().GetPool() - : inner.GetService(serviceType); - } } diff --git a/src/KurrentDB.Core/Messages/ClientMessage.IndexReads.cs b/src/KurrentDB.Core/Messages/ClientMessage.IndexReads.cs index 7fdac2b0f3d..2c9a28a7a0d 100644 --- a/src/KurrentDB.Core/Messages/ClientMessage.IndexReads.cs +++ b/src/KurrentDB.Core/Messages/ClientMessage.IndexReads.cs @@ -5,7 +5,6 @@ using System.Collections.Generic; using System.Security.Claims; using System.Threading; -using Kurrent.Quack.ConnectionPool; using KurrentDB.Core.Data; using KurrentDB.Core.Messaging; @@ -42,7 +41,6 @@ public partial class ReadIndexEventsForward( long? validationTfLastCommitPosition, ClaimsPrincipal user, bool replyOnExpired, - DuckDBConnectionPool pool, TimeSpan? longPollTimeout = null, DateTime? expires = null, CancellationToken cancellationToken = default) @@ -56,7 +54,6 @@ public partial class ReadIndexEventsForward( public readonly string IndexName = indexName; public readonly long? ValidationTfLastCommitPosition = validationTfLastCommitPosition; public readonly TimeSpan? LongPollTimeout = longPollTimeout; - public readonly DuckDBConnectionPool Pool = pool; public override string ToString() => $"{base.ToString()}, " + @@ -94,7 +91,6 @@ public partial class ReadIndexEventsBackward( long? validationTfLastCommitPosition, ClaimsPrincipal user, bool replyOnExpired, - DuckDBConnectionPool pool, TimeSpan? longPollTimeout = null, DateTime? expires = null, CancellationToken cancellationToken = default) @@ -108,7 +104,6 @@ public partial class ReadIndexEventsBackward( public readonly string IndexName = indexName; public readonly long? ValidationTfLastCommitPosition = validationTfLastCommitPosition; public readonly TimeSpan? LongPollTimeout = longPollTimeout; - public readonly DuckDBConnectionPool Pool = pool; public override string ToString() => $"{base.ToString()}, " + diff --git a/src/KurrentDB.Core/Services/SubscriptionsService.cs b/src/KurrentDB.Core/Services/SubscriptionsService.cs index dc4f613530c..c5cf39191e3 100644 --- a/src/KurrentDB.Core/Services/SubscriptionsService.cs +++ b/src/KurrentDB.Core/Services/SubscriptionsService.cs @@ -308,7 +308,7 @@ private static Message CloneReadRequestWithNoPollFlag(Message originalRequest) { allReq.PreparePosition, allReq.MaxCount, allReq.ResolveLinkTos, allReq.RequireLeader, allReq.ValidationTfLastCommitPosition, allReq.User, allReq.ReplyOnExpired), ClientMessage.ReadIndexEventsForward indexReq => new ClientMessage.ReadIndexEventsForward(indexReq.InternalCorrId, indexReq.CorrelationId, indexReq.Envelope, indexReq.IndexName, indexReq.CommitPosition, indexReq.PreparePosition, indexReq.ExcludeStart, indexReq.MaxCount, indexReq.RequireLeader, indexReq.ValidationTfLastCommitPosition, indexReq.User, - indexReq.ReplyOnExpired, indexReq.Pool), + indexReq.ReplyOnExpired), _ => throw new Exception($"Unexpected read request of type {originalRequest.GetType()} for long polling: {originalRequest}.") }; } diff --git a/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.IndexSubscription.cs b/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.IndexSubscription.cs index b4b1b40fc89..67bcc578d00 100644 --- a/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.IndexSubscription.cs +++ b/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.IndexSubscription.cs @@ -7,7 +7,6 @@ using System.Threading; using System.Threading.Channels; using System.Threading.Tasks; -using Kurrent.Quack.ConnectionPool; using KurrentDB.Common.Utils; using KurrentDB.Core.Bus; using KurrentDB.Core.Data; @@ -29,7 +28,6 @@ public sealed class IndexSubscription : IAsyncEnumerator { private readonly IPublisher _bus; private readonly ClaimsPrincipal _user; private readonly bool _requiresLeader; - private readonly DuckDBConnectionPool _pool; private readonly CancellationTokenSource _cts; private readonly Channel _channel; private readonly Channel<(ulong SequenceNumber, ResolvedEvent? ResolvedEvent, TFPos? Checkpoint)> _liveEvents; @@ -47,7 +45,6 @@ public IndexSubscription(IPublisher bus, string indexName, ClaimsPrincipal user, bool requiresLeader, - [CanBeNull] DuckDBConnectionPool pool, CancellationToken cancellationToken) { _expiryStrategy = expiryStrategy; _subscriptionId = Guid.NewGuid(); @@ -55,7 +52,6 @@ public IndexSubscription(IPublisher bus, _indexName = Ensure.NotNullOrEmpty(indexName); _user = user; _requiresLeader = requiresLeader; - _pool = pool; _cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken); _channel = Channel.CreateBounded(DefaultCatchUpChannelOptions); _liveEvents = Channel.CreateBounded<(ulong, ResolvedEvent?, TFPos?)>(DefaultLiveChannelOptions); @@ -330,7 +326,6 @@ private void ReadPage(TFPos startPos, IEnvelope envelope, CancellationToken ct) validationTfLastCommitPosition: null, user: _user, replyOnExpired: true, - pool: _pool, expires: _expiryStrategy.GetExpiry() ?? ReadRequestMessage.NeverExpires, cancellationToken: ct)); } diff --git a/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.ReadIndex.cs b/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.ReadIndex.cs index 89fbaff8510..5631b371bcc 100644 --- a/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.ReadIndex.cs +++ b/src/KurrentDB.Core/Services/Transport/Enumerators/Enumerator.ReadIndex.cs @@ -7,7 +7,6 @@ using System.Threading; using System.Threading.Channels; using System.Threading.Tasks; -using Kurrent.Quack.ConnectionPool; using KurrentDB.Common.Utils; using KurrentDB.Core.Bus; using KurrentDB.Core.Data; @@ -29,7 +28,6 @@ public sealed class ReadIndexForwards( ClaimsPrincipal user, bool requiresLeader, IExpiryStrategy expiryStrategy, - [CanBeNull] DuckDBConnectionPool pool, CancellationToken cancellationToken) : ReadIndex(bus, indexName, position, maxCount, user, requiresLeader, expiryStrategy, cancellationToken) { protected override ReadIndexEventsForward CreateRequest( @@ -43,7 +41,6 @@ Func onMessage IndexName, commitPosition, preparePosition, excludeStart, (int)Math.Min(DefaultIndexReadSize, MaxCount), RequiresLeader, null, User, replyOnExpired: true, - pool: pool, expires: ExpiryStrategy.GetExpiry() ?? ReadRequestMessage.NeverExpires, cancellationToken: CancellationToken); } @@ -56,7 +53,6 @@ public sealed class ReadIndexBackwards( ClaimsPrincipal user, bool requiresLeader, IExpiryStrategy expiryStrategy, - [CanBeNull] DuckDBConnectionPool pool, CancellationToken cancellationToken) : ReadIndex(bus, indexName, position, maxCount, user, requiresLeader, expiryStrategy, cancellationToken) { protected override ReadIndexEventsBackward CreateRequest( @@ -70,7 +66,6 @@ Func onMessage IndexName, commitPosition, preparePosition, excludeStart, (int)Math.Min(DefaultIndexReadSize, MaxCount), RequiresLeader, null, User, replyOnExpired: true, - pool: pool, expires: ExpiryStrategy.GetExpiry() ?? ReadRequestMessage.NeverExpires, cancellationToken: CancellationToken); } diff --git a/src/KurrentDB.Core/Services/Transport/Grpc/Streams.Read.cs b/src/KurrentDB.Core/Services/Transport/Grpc/Streams.Read.cs index cc6584f456a..324fa0b17c6 100644 --- a/src/KurrentDB.Core/Services/Transport/Grpc/Streams.Read.cs +++ b/src/KurrentDB.Core/Services/Transport/Grpc/Streams.Read.cs @@ -11,7 +11,6 @@ using EventStore.Client.Streams; using Google.Protobuf; using Grpc.Core; -using Kurrent.Quack.ConnectionPool; using KurrentDB.Core.Data; using KurrentDB.Core.Metrics; using KurrentDB.Core.Services; @@ -20,7 +19,6 @@ using KurrentDB.Core.Services.Transport.Enumerators; using KurrentDB.Core.Services.Transport.Grpc; using Microsoft.AspNetCore.Http; -using Microsoft.Extensions.DependencyInjection; using static EventStore.Client.Streams.ReadResp.Types; using static EventStore.Plugins.Authorization.Operations.Streams; using CountOptionOneofCase = EventStore.Client.Streams.ReadReq.Types.Options.CountOptionOneofCase; @@ -78,7 +76,6 @@ public override async Task Read( countOptionsCase, readDirection, filterOptionsCase, - httpContext, context.CancellationToken); async void DisposeEnumerator() => await enumerator.DisposeAsync(); @@ -109,7 +106,6 @@ private IAsyncEnumerator CreateEnumerator( CountOptionOneofCase countOptionsCase, ReadDirection readDirection, FilterOptionOneofCase filterOptionsCase, - HttpContext httpContext, CancellationToken cancellationToken) { return (streamOptionsCase, countOptionsCase, readDirection, filterOptionsCase) switch { (StreamOptionOneofCase.Stream, @@ -199,7 +195,7 @@ private IAsyncEnumerator CreateEnumerator( }; IAsyncEnumerator GetFilterOrIndexEnumerator( - Func> getIndexEnumerator, + Func> getIndexEnumerator, Func> getReadAllEnumerator ) { var filter = request.Options.Filter; @@ -207,11 +203,10 @@ IAsyncEnumerator GetFilterOrIndexEnumerator( if (filter.FilterCase == ReadReq.Types.Options.Types.FilterOptions.FilterOneofCase.StreamIdentifier && string.IsNullOrEmpty(filter.StreamIdentifier.Regex)) { var indexName = filter.StreamIdentifier.Prefix.FirstOrDefault(SystemStreams.IsIndexStream); - var pool = httpContext.RequestServices.GetRequiredService(); if (indexName != null) { return filter.StreamIdentifier.Prefix.Count > 1 ? throw RpcExceptions.InvalidArgument("Index reads only work with one index name and cannot be combined with stream prefixes or other indexes") - : getIndexEnumerator(indexName, pool); + : getIndexEnumerator(indexName); } } @@ -220,9 +215,9 @@ IAsyncEnumerator GetFilterOrIndexEnumerator( IAsyncEnumerator GetReadAllForwardsFilteredEnumerator() => GetFilterOrIndexEnumerator( - (indexName, pool) => new Enumerator.ReadIndexForwards( + indexName => new Enumerator.ReadIndexForwards( _publisher, indexName, request.Options.All.ToPosition(), - request.Options.Count, user, requiresLeader, _expiryStrategy, pool, cancellationToken + request.Options.Count, user, requiresLeader, _expiryStrategy, cancellationToken ), filter => new Enumerator.ReadAllForwardsFiltered( _publisher, @@ -239,9 +234,9 @@ IAsyncEnumerator GetReadAllForwardsFilteredEnumerator() => IAsyncEnumerator GetReadAllBackwardsFilteredEnumerator() => GetFilterOrIndexEnumerator( - (indexName, pool) => new Enumerator.ReadIndexBackwards( + indexName => new Enumerator.ReadIndexBackwards( _publisher, indexName, request.Options.All.ToPosition(), - request.Options.Count, user, requiresLeader, _expiryStrategy, pool, cancellationToken + request.Options.Count, user, requiresLeader, _expiryStrategy, cancellationToken ), filter => new Enumerator.ReadAllBackwardsFiltered( _publisher, @@ -258,9 +253,9 @@ IAsyncEnumerator GetReadAllBackwardsFilteredEnumerator() => IAsyncEnumerator GetAllSubscriptionFilteredEnumerator() => GetFilterOrIndexEnumerator( - (indexName, pool) => new Enumerator.IndexSubscription( + indexName => new Enumerator.IndexSubscription( _publisher, _expiryStrategy, request.Options.All.ToSubscriptionPosition(), - indexName, user, requiresLeader, pool, cancellationToken + indexName, user, requiresLeader, cancellationToken ), filter => new Enumerator.AllSubscriptionFiltered( _publisher, From eb0c363297fb9f802cd02b574d95f4bc7e3265bb Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 15:21:27 -0400 Subject: [PATCH 07/16] feat: kurrentdb.duckdb.cpu.seconds observable counter over executor CPU samples Turns the DuckDBCpuMetrics shell into a real ObservableCounter named "{serviceName}.duckdb.cpu.seconds" that sums DuckDBExecutor.SampleCpu() per role (worker/dispatcher). The metric is created eagerly by DuckDBExecutorLifetime (a hosted service, so instantiated at node startup) over the executor it owns, so the instrument exists even without a consumer. Adds "KurrentDB.DuckDB" to metricsconfig.json Meters and a TDD unit test asserting the per-role sums. Co-Authored-By: Claude Fable 5 --- src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs | 21 ++++++++++-- .../DuckDB/DuckDBExecutorLifetime.cs | 7 ++++ .../DuckDB/InjectionExtensions.cs | 6 +++- .../Diagnostics/DuckDBCpuMetricsTests.cs | 33 +++++++++++++++++++ src/KurrentDB/metricsconfig.json | 1 + 5 files changed, 65 insertions(+), 3 deletions(-) create mode 100644 src/KurrentDB.SecondaryIndexing.Tests/Diagnostics/DuckDBCpuMetricsTests.cs diff --git a/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs index e52d151fd25..611886dfad1 100644 --- a/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs +++ b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs @@ -1,14 +1,31 @@ // Copyright (c) Kurrent, Inc and/or licensed to Kurrent, Inc under one or more agreements. // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). +#nullable enable + +using System; +using System.Collections.Generic; using System.Diagnostics.Metrics; +using Kurrent.Quack; namespace KurrentDB.Core.DuckDB; -// Shell for now — Task 5b wires the instrument that reports DuckDBExecutor.SampleCpu() readings. +// Reports the cumulative CPU time consumed by the DuckDB executor's worker and dispatcher threads as an +// observable counter, tagged by role. This is the whole point of routing DuckDB work through the executor: +// all DuckDB CPU is now attributable to threads the executor owns, so a single SampleCpu() reading sums them. public class DuckDBCpuMetrics { public const string MeterName = "KurrentDB.DuckDB"; - public DuckDBCpuMetrics(Meter meter, string serviceName) { + public DuckDBCpuMetrics(Meter meter, string serviceName, Func> sampleCpu) { + meter.CreateObservableCounter($"{serviceName}.duckdb.cpu.seconds", Observe, + description: "CPU time consumed by DuckDB's executor threads, in seconds"); + + IEnumerable> Observe() { + double workers = 0, dispatchers = 0; + foreach (var s in sampleCpu()) + if (s.Role == "worker") workers += s.CpuSeconds; else dispatchers += s.CpuSeconds; + yield return new(workers, new KeyValuePair("role", "worker")); + yield return new(dispatchers, new KeyValuePair("role", "dispatcher")); + } } } diff --git a/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs index 8e83c99b13a..55a20b98c0b 100644 --- a/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs +++ b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs @@ -3,6 +3,7 @@ using System; using System.Collections.Generic; +using System.Diagnostics.Metrics; using System.IO; using System.Threading; using System.Threading.Tasks; @@ -24,11 +25,16 @@ public sealed class DuckDBExecutorLifetime : Disposable, IHostedService { public DuckDBExecutor Executor { get; } + // The CPU metric is created here (eagerly, at node startup) rather than via an activated DI singleton so the + // instrument exists even without a metrics consumer, and so it samples the very executor this lifetime owns. + public DuckDBCpuMetrics CpuMetrics { get; } + public DuckDBExecutorLifetime( TFChunkDbConfig config, IEnumerable setups, int workerCount, int dispatcherCount, + string serviceName, [CanBeNull] ILogger log) { _log = log ?? NullLogger.Instance; @@ -37,6 +43,7 @@ public DuckDBExecutorLifetime( var connectionString = $"Data Source={path};memory_limit={memoryMib}MB"; Executor = new DuckDBExecutor(connectionString, workerCount, dispatcherCount); + CpuMetrics = new DuckDBCpuMetrics(new Meter(DuckDBCpuMetrics.MeterName, "1.0.0"), serviceName, Executor.SampleCpu); _log.LogInformation("DuckDB executor started at {path}: {workers} workers, {dispatchers} dispatchers, memory_limit {memory}MB", path, workerCount, dispatcherCount, memoryMib); diff --git a/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs b/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs index 23251d3d700..725174e24da 100644 --- a/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs +++ b/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs @@ -19,10 +19,14 @@ public static IServiceCollection AddDuckDb(this IServiceCollection services, str sp.GetServices(), workerCount, dispatcherCount, + serviceName, sp.GetService>())); services.AddHostedService(sp => sp.GetRequiredService()); services.AddSingleton(sp => sp.GetRequiredService().Executor); - services.AddSingleton(new DuckDBCpuMetrics(new Meter(DuckDBCpuMetrics.MeterName, "1.0.0"), serviceName)); // Task 5b wires the instrument + // The lifetime is a hosted service, so it's instantiated at node startup; it creates the CPU-metric + // instrument in its constructor (over the executor it owns), so the instrument exists even without a + // metrics consumer attached. + services.AddSingleton(sp => sp.GetRequiredService().CpuMetrics); return services; } } diff --git a/src/KurrentDB.SecondaryIndexing.Tests/Diagnostics/DuckDBCpuMetricsTests.cs b/src/KurrentDB.SecondaryIndexing.Tests/Diagnostics/DuckDBCpuMetricsTests.cs new file mode 100644 index 00000000000..c8bbecbaec3 --- /dev/null +++ b/src/KurrentDB.SecondaryIndexing.Tests/Diagnostics/DuckDBCpuMetricsTests.cs @@ -0,0 +1,33 @@ +// Copyright (c) Kurrent, Inc and/or licensed to Kurrent, Inc under one or more agreements. +// Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). + +using System.Diagnostics.Metrics; +using Kurrent.Quack; +using KurrentDB.Core.DuckDB; + +namespace KurrentDB.SecondaryIndexing.Tests.Diagnostics; + +public class DuckDBCpuMetricsTests { + [Fact] + public void observable_counter_reports_per_role_cpu_sums() { + using var meter = new Meter("test"); + var samples = new List { + new("worker", 1.5), new("worker", 0.5), new("dispatcher", 0.25), + }; + _ = new DuckDBCpuMetrics(meter, "kurrentdb", () => samples); + + List<(double Value, string Role)> observed = []; + using var listener = new MeterListener(); + listener.InstrumentPublished = (i, l) => { + if (i.Meter == meter && i.Name == "kurrentdb.duckdb.cpu.seconds") l.EnableMeasurementEvents(i); + }; + listener.SetMeasurementEventCallback((_, value, tags, _) => { + foreach (var t in tags) if (t.Key == "role") observed.Add((value, (string)t.Value!)); + }); + listener.Start(); + listener.RecordObservableInstruments(); + + Assert.Contains(observed, m => m is { Role: "worker", Value: 2.0 }); + Assert.Contains(observed, m => m is { Role: "dispatcher", Value: 0.25 }); + } +} diff --git a/src/KurrentDB/metricsconfig.json b/src/KurrentDB/metricsconfig.json index fda6ad698d3..3eb04bfaff6 100644 --- a/src/KurrentDB/metricsconfig.json +++ b/src/KurrentDB/metricsconfig.json @@ -9,6 +9,7 @@ "Kurrent.Connectors", "Kurrent.Connectors.Sinks", "KurrentDB.SecondaryIndexes", + "KurrentDB.DuckDB", "KurrentDB.Kontext" ], From 8916e95330f020bb9102fae01017b4e8f443d84c Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 15:21:40 -0400 Subject: [PATCH 08/16] feat: index processors run DuckDB work on executor dispatchers (pinned connections) Processors now bind their BufferedView appender to a caller-owned executor.OpenConnection(); every DuckDB call on that connection (the ctor read, CreateUserIndex, Flush/commit, SetCheckpoint) runs via executor.ExecuteOn so it executes on a measured dispatcher thread. TryIndex row-appends stay on the subscription thread (in-memory buffer, intentionally unmeasured). Commit() becomes ValueTask CommitAsync(CancellationToken) and Checkpoint(...) becomes ValueTask CheckpointAsync(...); the subscriptions await them so appender CreateRow (subscription thread) and Flush (dispatcher, via ExecuteOn) never overlap. Dispose blocks on CommitAsync once before unregister/dispose. UserIndex engine/subscription thread the DuckDBExecutor through instead of the pool; the two db.Rent cleanup sites become executor.Execute ops. Co-Authored-By: Claude Fable 5 --- .../Indexes/Default/DefaultIndexProcessor.cs | 24 +++++++---- .../Indexes/ISecondaryIndexProcessor.cs | 2 +- .../Indexes/User/UserIndexEngine.cs | 5 +-- .../User/UserIndexEngineSubscription.cs | 22 +++++----- .../Indexes/User/UserIndexProcessor.cs | 41 ++++++++++++------- .../Subscriptions/DefaultIndexSubscription.cs | 2 +- .../Subscriptions/UserIndexSubscription.cs | 4 +- 7 files changed, 60 insertions(+), 40 deletions(-) diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs index 5c6ff43cd3a..6a02cc6b2ef 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs @@ -9,7 +9,6 @@ using DuckDB.NET.Data; using Google.Protobuf.WellKnownTypes; using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using Kurrent.Quack.Threading; using KurrentDB.Common.Configuration; using KurrentDB.Core.Bus; @@ -28,6 +27,7 @@ namespace KurrentDB.SecondaryIndexing.Indexes.Default; internal class DefaultIndexProcessor : Disposable, ISecondaryIndexProcessor { + private readonly DuckDBExecutor _executor; private readonly DuckDBAdvancedConnection _connection; private readonly IPublisher _publisher; private readonly ILongHasher _hasher; @@ -41,7 +41,7 @@ public TFPos LastIndexedPosition { } public DefaultIndexProcessor( - DuckDBConnectionPool db, + DuckDBExecutor executor, IPublisher publisher, ILongHasher hasher, [FromKeyedServices(SecondaryIndexingConstants.InjectionKey)] @@ -50,7 +50,8 @@ public DefaultIndexProcessor( MetricsConfiguration? metricsConfiguration = null, TimeProvider? clock = null ) { - _connection = db.Open(); + _executor = executor; + _connection = executor.OpenConnection(); _log = loggerFactory.CreateLogger(); _appender = new(_connection, "idx_all", "log_position", DefaultIndexViewName); var serviceName = metricsConfiguration?.ServiceName ?? "kurrentdb"; @@ -59,7 +60,9 @@ public DefaultIndexProcessor( _publisher = publisher; _hasher = hasher; - var (lastPosition, lastTimestamp) = ReadLastIndexedRecord(); + var (lastPosition, lastTimestamp) = executor + .ExecuteOn(_connection, ReadLastIndexedRecord, CancellationToken.None) + .AsTask().GetAwaiter().GetResult(); // ctor is synchronous; runs once at startup _log.LogInformation("Last known log position: {Position}", lastPosition); LastIndexedPosition = lastPosition; Tracker.InitLastIndexed(lastPosition.CommitPosition, lastTimestamp); @@ -133,8 +136,8 @@ static string GetStreamCategory(string streamName) { public TFPos GetLastPosition() => LastIndexedPosition; - private (TFPos, DateTimeOffset) ReadLastIndexedRecord() { - return _connection.QueryFirstOrDefault().TryGet(out var result) + private static (TFPos, DateTimeOffset) ReadLastIndexedRecord(DuckDBAdvancedConnection connection) { + return connection.QueryFirstOrDefault().TryGet(out var result) ? (new TFPos(result.CommitPosition ?? result.PreparePosition, result.PreparePosition), DateTimeOffset.FromUnixTimeMilliseconds(result.Timestamp)) : (TFPos.Invalid, DateTimeOffset.MinValue); @@ -147,13 +150,16 @@ static string GetStreamCategory(string streamName) { /// /// Commits all in-flight records to the index. /// - public void Commit() { + public async ValueTask CommitAsync(CancellationToken ct) { if (IsDisposingOrDisposed || !Interlocked.FalseToTrue(ref _committing)) return; try { using var duration = Tracker.StartCommitDuration(); - _appender.Flush(); + await _executor.ExecuteOn(_connection, c => { + _appender.Flush(); // appender is bound to _connection; runs on a measured dispatcher thread + return 0; + }, ct); } catch (Exception e) { _log.LogError(e, "Failed to commit records to index at log position {LogPosition}", LastIndexedPosition); throw; @@ -167,7 +173,7 @@ public BufferedView.Snapshot CaptureSnapshot(DuckDBConnection connection) protected override void Dispose(bool disposing) { if (disposing) { - Commit(); + CommitAsync(CancellationToken.None).AsTask().GetAwaiter().GetResult(); _appender.Unregister(_connection); _appender.Dispose(); _connection.Dispose(); diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/ISecondaryIndexProcessor.cs b/src/KurrentDB.SecondaryIndexing/Indexes/ISecondaryIndexProcessor.cs index 0cbb3e77f33..3fc7a36fc01 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/ISecondaryIndexProcessor.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/ISecondaryIndexProcessor.cs @@ -7,7 +7,7 @@ namespace KurrentDB.SecondaryIndexing.Indexes; public interface ISecondaryIndexProcessor : IDisposable { - void Commit(); + ValueTask CommitAsync(CancellationToken ct); bool TryIndex(ResolvedEvent evt); diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngine.cs b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngine.cs index 8957d2a9365..b679db606b8 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngine.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngine.cs @@ -3,7 +3,6 @@ using System.Diagnostics.Metrics; using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using Kurrent.Quack.Threading; using Kurrent.Surge.Schema.Serializers; using KurrentDB.Core; @@ -44,7 +43,7 @@ public UserIndexEngine( ISubscriber subscriber, ISchemaSerializer serializer, SecondaryIndexingPluginOptions options, - DuckDBConnectionPool db, + DuckDBExecutor executor, IReadIndex index, TFChunkDbConfig chunkDbConfig, [FromKeyedServices(SecondaryIndexingConstants.InjectionKey)] @@ -53,7 +52,7 @@ public UserIndexEngine( _client = client; _log = loggerFactory.CreateLogger(); _writerCheckpoint = chunkDbConfig.WriterCheckpoint; - _subscription = new(client, publisher, serializer, options, db, index, meter, GetLastAppendedRecord, loggerFactory, _cts!.Token); + _subscription = new(client, publisher, serializer, options, executor, index, meter, GetLastAppendedRecord, loggerFactory, _cts!.Token); subscriber.Subscribe(this); subscriber.Subscribe(this); diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs index c851e4db14b..6c7696b2ed3 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs @@ -5,7 +5,6 @@ using System.Diagnostics.Metrics; using System.Threading.Channels; using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using Kurrent.Quack.Threading; using Kurrent.Surge.Schema; using Kurrent.Surge.Schema.Serializers; @@ -29,7 +28,7 @@ public partial class UserIndexEngineSubscription( IPublisher publisher, ISchemaSerializer serializer, SecondaryIndexingPluginOptions options, - DuckDBConnectionPool db, + DuckDBExecutor executor, IReadIndex readIndex, Meter meter, Func<(long, DateTime)> getLastAppendedRecord, @@ -106,7 +105,7 @@ await client.Subscriptions.SubscribeToStream( CaughtUp = true; - using (db.Rent(out var connection)) { + await executor.Execute(connection => { // in some situations, a user index may have already been marked as deleted in the management stream but not yet in DuckDB: // 1) if a node was off or disconnected from the quorum for an extended period of time // 2) if a crash occurred mid-deletion @@ -114,7 +113,9 @@ await client.Subscriptions.SubscribeToStream( foreach (var deletedIndex in deletedIndexes) { DeleteUserIndexTable(connection, deletedIndex); } - } + + return 0; + }, _cts.Token); foreach (var (name, state) in userIndexes) { if (state.Started) { @@ -174,7 +175,7 @@ await client.Subscriptions.SubscribeToStream( userIndexes.Remove(x.Name); if (CaughtUp) - DeleteUserIndex(x.Name); + await DeleteUserIndex(x.Name); break; } @@ -221,7 +222,7 @@ createdEvent.Fields.Count is 0 jsFieldSelector: createdEvent.Fields.Count is 0 ? "" : createdEvent.Fields[0].Selector, - db: db, + executor: executor, sql: sql, publisher: publisher, meter: meter, @@ -229,7 +230,7 @@ createdEvent.Fields.Count is 0 loggerFactory: logFactory ); - var reader = new UserIndexReader(sharedPool: db, processor, readIndex); + var reader = new UserIndexReader(executor, processor, readIndex); UserIndexSubscription subscription = new UserIndexSubscription( publisher: publisher, @@ -262,12 +263,13 @@ private async Task StopUserIndex(string indexName) { DropSubscriptions(indexName); } - private void DeleteUserIndex(string indexName) { + private async ValueTask DeleteUserIndex(string indexName) { _log.LogDeletingUserIndex(indexName); - using (db.Rent(out var connection)) { + await executor.Execute(connection => { DeleteUserIndexTable(connection, indexName); - } + return 0; + }, _cts.Token); DropSubscriptions(indexName); } diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexProcessor.cs b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexProcessor.cs index 3cdb4d36dbd..3375790875e 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexProcessor.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexProcessor.cs @@ -9,7 +9,6 @@ using Jint; using Jint.Native.Function; using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using Kurrent.Quack.Threading; using Kurrent.Surge.Schema.Serializers.Json; using KurrentDB.Common.Configuration; @@ -24,7 +23,7 @@ namespace KurrentDB.SecondaryIndexing.Indexes.User; internal abstract class UserIndexProcessor : Disposable, ISecondaryIndexProcessor { - public abstract void Commit(); + public abstract ValueTask CommitAsync(CancellationToken ct); public abstract bool TryIndex(ResolvedEvent evt); public abstract TFPos GetLastPosition(); public abstract SecondaryIndexProgressTracker Tracker { get; } @@ -40,6 +39,7 @@ internal class UserIndexProcessor : UserIndexProcessor private readonly Function? _fieldSelector; private readonly string _queryStreamName; private readonly IPublisher _publisher; + private readonly DuckDBExecutor _executor; private readonly DuckDBAdvancedConnection _connection; private readonly UserIndexSql _sql; private readonly object _skip; @@ -59,7 +59,7 @@ public UserIndexProcessor( string indexName, string jsEventFilter, string jsFieldSelector, - DuckDBConnectionPool db, + DuckDBExecutor executor, UserIndexSql sql, IPublisher publisher, [FromKeyedServices(SecondaryIndexingConstants.InjectionKey)] @@ -86,15 +86,22 @@ public UserIndexProcessor( _filter = JsRecordEvaluator.Compile(_engine, jsEventFilter); _fieldSelector = JsRecordEvaluator.Compile(_engine, jsFieldSelector); - _connection = db.Open(); - _sql.CreateUserIndex(_connection); + _executor = executor; + _connection = executor.OpenConnection(); + // CreateUserIndex is DuckDB DDL — run it on a dispatcher against the pinned connection. + executor.ExecuteOn(_connection, c => { + _sql.CreateUserIndex(c); + return 0; + }, CancellationToken.None).AsTask().GetAwaiter().GetResult(); _appender = new(_connection, _sql.TableName, "log_position", _sql.ViewName); var serviceName = metricsConfiguration?.ServiceName ?? "kurrentdb"; var tracker = new SecondaryIndexProgressTracker(indexName, serviceName, meter, clock ?? TimeProvider.System, loggerFactory.CreateLogger(), getLastAppendedRecord); - var (lastPosition, lastTimestamp) = GetLastKnownRecord(); + var (lastPosition, lastTimestamp) = executor + .ExecuteOn(_connection, GetLastKnownRecord, CancellationToken.None) + .AsTask().GetAwaiter().GetResult(); // ctor is synchronous; runs once at startup _log.LogUserIndexLoadedLastKnownLogPosition(IndexName, lastPosition, lastTimestamp); tracker.InitLastIndexed(lastPosition.CommitPosition, lastTimestamp); @@ -185,11 +192,11 @@ bool CanHandleEvent(ResolvedEvent resolvedEvent, out TField? field) { } } - private (TFPos, DateTimeOffset) GetLastKnownRecord() { + private (TFPos, DateTimeOffset) GetLastKnownRecord(DuckDBAdvancedConnection connection) { (TFPos pos, DateTimeOffset timestamp) result = (TFPos.Invalid, DateTimeOffset.MinValue); var checkpointArgs = new GetCheckpointQueryArgs(IndexName); - var checkpoint = _sql.GetCheckpoint(_connection, checkpointArgs); + var checkpoint = _sql.GetCheckpoint(connection, checkpointArgs); if (checkpoint != null) { var pos = new TFPos(checkpoint.Value.CommitPosition ?? checkpoint.Value.PreparePosition, checkpoint.Value.PreparePosition); var timestamp = DateTimeOffset.FromUnixTimeMilliseconds(checkpoint.Value.Timestamp); @@ -198,7 +205,7 @@ bool CanHandleEvent(ResolvedEvent resolvedEvent, out TField? field) { result = (pos, timestamp); } - var lastIndexed = _sql.GetLastIndexedRecord(_connection); + var lastIndexed = _sql.GetLastIndexedRecord(connection); if (lastIndexed != null) { var pos = new TFPos(lastIndexed.Value.CommitPosition ?? lastIndexed.Value.PreparePosition, lastIndexed.Value.PreparePosition); var timestamp = DateTimeOffset.FromUnixTimeMilliseconds(lastIndexed.Value.Timestamp); @@ -211,7 +218,7 @@ bool CanHandleEvent(ResolvedEvent resolvedEvent, out TField? field) { return result; } - public void Checkpoint(TFPos position, DateTime timestamp) { + public async ValueTask CheckpointAsync(TFPos position, DateTime timestamp, CancellationToken ct) { _log.LogUserIndexIsCheckpointing(IndexName, position, timestamp); var checkpointArgs = new SetCheckpointQueryArgs { @@ -221,19 +228,25 @@ public void Checkpoint(TFPos position, DateTime timestamp) { Created = new DateTimeOffset(timestamp).ToUnixTimeMilliseconds() }; - UserIndexSql.SetCheckpoint(_connection, checkpointArgs); + await _executor.ExecuteOn(_connection, c => { + UserIndexSql.SetCheckpoint(c, checkpointArgs); + return 0; + }, ct); } /// /// Commits all in-flight records to the index. /// - public override void Commit() { + public override async ValueTask CommitAsync(CancellationToken ct) { if (IsDisposed || !Interlocked.FalseToTrue(ref _committing)) return; try { using var duration = Tracker.StartCommitDuration(); - _appender.Flush(); + await _executor.ExecuteOn(_connection, c => { + _appender.Flush(); // appender is bound to _connection; runs on a measured dispatcher thread + return 0; + }, ct); _log.LogUserIndexCommitted(IndexName); } catch (Exception ex) { _log.LogUserIndexFailedToCommit(ex, IndexName); @@ -251,7 +264,7 @@ public void GetUserIndexTableDetails(out string tableName, out string? fieldName protected override void Dispose(bool disposing) { _log.LogStoppingUserIndexProcessor(IndexName); if (disposing) { - Commit(); + CommitAsync(CancellationToken.None).AsTask().GetAwaiter().GetResult(); _appender.Unregister(_connection); _appender.Dispose(); _connection.Dispose(); diff --git a/src/KurrentDB.SecondaryIndexing/Subscriptions/DefaultIndexSubscription.cs b/src/KurrentDB.SecondaryIndexing/Subscriptions/DefaultIndexSubscription.cs index c533341cef0..d6d01adb590 100644 --- a/src/KurrentDB.SecondaryIndexing/Subscriptions/DefaultIndexSubscription.cs +++ b/src/KurrentDB.SecondaryIndexing/Subscriptions/DefaultIndexSubscription.cs @@ -96,7 +96,7 @@ async Task ProcessEvents(CancellationToken token) { indexProcessor.TryIndex(resolvedEvent); if (++indexedCount >= _commitBatchSize) { - indexProcessor.Commit(); + await indexProcessor.CommitAsync(token); indexedCount = 0; } } catch (OperationCanceledException) { diff --git a/src/KurrentDB.SecondaryIndexing/Subscriptions/UserIndexSubscription.cs b/src/KurrentDB.SecondaryIndexing/Subscriptions/UserIndexSubscription.cs index ebbffaa1dcb..155556044b6 100644 --- a/src/KurrentDB.SecondaryIndexing/Subscriptions/UserIndexSubscription.cs +++ b/src/KurrentDB.SecondaryIndexing/Subscriptions/UserIndexSubscription.cs @@ -97,12 +97,12 @@ async Task ProcessEvents(CancellationToken token) { if (processedCount >= _commitBatchSize) { if (indexedCount > 0) { log.LogUserIndexIsCommitting(indexProcessor.IndexName, indexedCount); - indexProcessor.Commit(); + await indexProcessor.CommitAsync(token); } var lastProcessedPosition = resolvedEvent.OriginalPosition!.Value; var lastProcessedTimestamp = resolvedEvent.OriginalEvent.TimeStamp; - indexProcessor.Checkpoint(lastProcessedPosition, lastProcessedTimestamp); + await indexProcessor.CheckpointAsync(lastProcessedPosition, lastProcessedTimestamp, token); indexedCount = 0; processedCount = 0; From 334b5f11325adac29ceec87e14fddb81d3367c40 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 15:21:50 -0400 Subject: [PATCH 09/16] feat: index readers execute on DuckDB dispatchers SecondaryIndexReaderBase takes a DuckDBExecutor and rents a connection per read via executor.Execute; the abstract GetDbRecordsForwards/Backwards now receive an open DuckDBAdvancedConnection (instead of a pool) and keep their CaptureSnapshot(connection) + query bodies minus the Rent wrapper. Removes GetPool and the msg.Pool usage deleted from the messages in the executor-lifetime change. Co-Authored-By: Claude Fable 5 --- .../Indexes/Category/CategoryIndexReader.cs | 47 ++++++++-------- .../Indexes/Default/DefaultIndexReader.cs | 53 +++++++++---------- .../Indexes/EventType/EventTypeIndexReader.cs | 53 +++++++++---------- .../Indexes/SecondaryIndexReaderBase.cs | 29 ++++------ .../Indexes/User/UserIndexReader.cs | 25 ++++----- 5 files changed, 89 insertions(+), 118 deletions(-) diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/Category/CategoryIndexReader.cs b/src/KurrentDB.SecondaryIndexing/Indexes/Category/CategoryIndexReader.cs index 4c163873e59..d54ffe7a0e6 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/Category/CategoryIndexReader.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/Category/CategoryIndexReader.cs @@ -2,7 +2,6 @@ // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using KurrentDB.Core.Data; using KurrentDB.Core.Services.Storage.ReaderIndex; using KurrentDB.SecondaryIndexing.Indexes.Default; @@ -14,55 +13,51 @@ namespace KurrentDB.SecondaryIndexing.Indexes.Category; internal class CategoryIndexReader( - DuckDBConnectionPool sharedPool, + DuckDBExecutor executor, DefaultIndexProcessor processor, IReadIndex index) - : SecondaryIndexReaderBase(sharedPool, index) { + : SecondaryIndexReaderBase(executor, index) { protected override string GetId(string indexName) => CategoryIndex.TryParseCategoryName(indexName, out var categoryName) ? categoryName : string.Empty; - protected override List GetDbRecordsForwards(DuckDBConnectionPool db, + protected override List GetDbRecordsForwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - if (excludeFirst) { - connection.ExecuteQuery(new(id!, startPosition, - maxCount)) - .CopyTo(records); - } else { - connection.ExecuteQuery(new(id!, startPosition, + using (processor.CaptureSnapshot(connection)) { + if (excludeFirst) { + connection.ExecuteQuery(new(id!, startPosition, maxCount)) - .CopyTo(records); - } + .CopyTo(records); + } else { + connection.ExecuteQuery(new(id!, startPosition, + maxCount)) + .CopyTo(records); } } return records; } - protected override List GetDbRecordsBackwards(DuckDBConnectionPool db, + protected override List GetDbRecordsBackwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - if (excludeFirst) { - connection.ExecuteQuery(new(id!, - startPosition, maxCount)) - .CopyTo(records); - } else { - connection.ExecuteQuery(new(id!, - startPosition, maxCount)) - .CopyTo(records); - } + using (processor.CaptureSnapshot(connection)) { + if (excludeFirst) { + connection.ExecuteQuery(new(id!, + startPosition, maxCount)) + .CopyTo(records); + } else { + connection.ExecuteQuery(new(id!, + startPosition, maxCount)) + .CopyTo(records); } } diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexReader.cs b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexReader.cs index 4751c589027..4fa84f88bde 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexReader.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexReader.cs @@ -2,7 +2,6 @@ // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using KurrentDB.Core.Data; using KurrentDB.Core.Services; using KurrentDB.Core.Services.Storage.ReaderIndex; @@ -12,54 +11,50 @@ namespace KurrentDB.SecondaryIndexing.Indexes.Default; internal class DefaultIndexReader( - DuckDBConnectionPool sharedPool, + DuckDBExecutor executor, DefaultIndexProcessor processor, IReadIndex index -) : SecondaryIndexReaderBase(sharedPool, index) { +) : SecondaryIndexReaderBase(executor, index) { protected override string GetId(string indexName) => string.Empty; - protected override List GetDbRecordsForwards(DuckDBConnectionPool db, + protected override List GetDbRecordsForwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - if (excludeFirst) { - connection.ExecuteQuery(new( - startPosition, - maxCount)) - .CopyTo(records); - } else { - connection.ExecuteQuery(new( - startPosition, - maxCount)) - .CopyTo(records); - } + using (processor.CaptureSnapshot(connection)) { + if (excludeFirst) { + connection.ExecuteQuery(new( + startPosition, + maxCount)) + .CopyTo(records); + } else { + connection.ExecuteQuery(new( + startPosition, + maxCount)) + .CopyTo(records); } } return records; } - protected override List GetDbRecordsBackwards(DuckDBConnectionPool db, + protected override List GetDbRecordsBackwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - if (excludeFirst) { - connection.ExecuteQuery( - new(startPosition, maxCount)) - .CopyTo(records); - } else { - connection.ExecuteQuery( - new(startPosition, maxCount)) - .CopyTo(records); - } + using (processor.CaptureSnapshot(connection)) { + if (excludeFirst) { + connection.ExecuteQuery( + new(startPosition, maxCount)) + .CopyTo(records); + } else { + connection.ExecuteQuery( + new(startPosition, maxCount)) + .CopyTo(records); } } diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/EventType/EventTypeIndexReader.cs b/src/KurrentDB.SecondaryIndexing/Indexes/EventType/EventTypeIndexReader.cs index 0dc58df9bda..f2602a63819 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/EventType/EventTypeIndexReader.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/EventType/EventTypeIndexReader.cs @@ -2,7 +2,6 @@ // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using KurrentDB.Core.Data; using KurrentDB.Core.Services.Storage.ReaderIndex; using KurrentDB.SecondaryIndexing.Indexes.Default; @@ -12,55 +11,51 @@ namespace KurrentDB.SecondaryIndexing.Indexes.EventType; internal class EventTypeIndexReader( - DuckDBConnectionPool sharedPool, + DuckDBExecutor executor, DefaultIndexProcessor processor, IReadIndex index) - : SecondaryIndexReaderBase(sharedPool, index) { + : SecondaryIndexReaderBase(executor, index) { protected override string GetId(string streamName) => EventTypeIndex.TryParseEventType(streamName, out var eventTypeName) ? eventTypeName : string.Empty; - protected override List GetDbRecordsForwards(DuckDBConnectionPool db, + protected override List GetDbRecordsForwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - if (excludeFirst) { - connection - .ExecuteQuery(new(id!, startPosition, maxCount)) - .CopyTo(records); - } else { - connection - .ExecuteQuery(new(id!, startPosition, - maxCount)) - .CopyTo(records); - } + using (processor.CaptureSnapshot(connection)) { + if (excludeFirst) { + connection + .ExecuteQuery(new(id!, startPosition, maxCount)) + .CopyTo(records); + } else { + connection + .ExecuteQuery(new(id!, startPosition, + maxCount)) + .CopyTo(records); } } return records; } - protected override List GetDbRecordsBackwards(DuckDBConnectionPool db, + protected override List GetDbRecordsBackwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - if (excludeFirst) { - connection - .ExecuteQuery(new(id!, startPosition, maxCount)) - .CopyTo(records); - } else { - connection - .ExecuteQuery(new(id!, - startPosition, maxCount)) - .CopyTo(records); - } + using (processor.CaptureSnapshot(connection)) { + if (excludeFirst) { + connection + .ExecuteQuery(new(id!, startPosition, maxCount)) + .CopyTo(records); + } else { + connection + .ExecuteQuery(new(id!, + startPosition, maxCount)) + .CopyTo(records); } } diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/SecondaryIndexReaderBase.cs b/src/KurrentDB.SecondaryIndexing/Indexes/SecondaryIndexReaderBase.cs index 0a92762af00..87b09d81451 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/SecondaryIndexReaderBase.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/SecondaryIndexReaderBase.cs @@ -1,7 +1,7 @@ // Copyright (c) Kurrent, Inc and/or licensed to Kurrent, Inc under one or more agreements. // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). -using Kurrent.Quack.ConnectionPool; +using Kurrent.Quack; using KurrentDB.Core.Data; using KurrentDB.Core.Services.Storage; using KurrentDB.Core.Services.Storage.ReaderIndex; @@ -10,12 +10,12 @@ namespace KurrentDB.SecondaryIndexing.Indexes; -public abstract class SecondaryIndexReaderBase(DuckDBConnectionPool sharedPool, IReadIndex index) : ISecondaryIndexReader { +public abstract class SecondaryIndexReaderBase(DuckDBExecutor executor, IReadIndex index) : ISecondaryIndexReader { protected abstract string? GetId(string indexName); - protected abstract List GetDbRecordsForwards(DuckDBConnectionPool pool, string? id, long startPosition, int maxCount, bool excludeFirst); + protected abstract List GetDbRecordsForwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst); - protected abstract List GetDbRecordsBackwards(DuckDBConnectionPool pool, string? id, long startPosition, int maxCount, bool excludeFirst); + protected abstract List GetDbRecordsBackwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst); public ValueTask ReadForwards(ReadIndexEventsForward msg, CancellationToken token) => ReadForwards(msg, index.IndexReader, index.LastIndexedPosition, token); @@ -57,12 +57,9 @@ ReadIndexEventsForwardCompleted NoData(ReadIndexResult result, bool endOfStream, => new(result, ResolvedEvent.EmptyArray, pos, lastIndexedPosition, endOfStream, error); async ValueTask<(long, IReadOnlyList)> GetEventsForwards(long startPosition) { - var indexPrepares = GetDbRecordsForwards( - GetPool(msg.Pool), - id, - startPosition, - msg.MaxCount, - msg.ExcludeStart); + var indexPrepares = await executor.Execute( + connection => GetDbRecordsForwards(connection, id, startPosition, msg.MaxCount, msg.ExcludeStart), + token); var events = await reader.ReadRecords(indexPrepares, true, token); return (indexPrepares.Count, events); @@ -101,18 +98,12 @@ ReadIndexEventsBackwardCompleted NoData(ReadIndexResult result, string? error = => new(result, ResolvedEvent.EmptyArray, pos, lastIndexedPosition, false, error); async ValueTask<(long, IReadOnlyList)> GetEventsBackwards(TFPos startPosition) { - var indexPrepares = GetDbRecordsBackwards(GetPool(msg.Pool), - id, - startPosition.PreparePosition, - msg.MaxCount, - msg.ExcludeStart); + var indexPrepares = await executor.Execute( + connection => GetDbRecordsBackwards(connection, id, startPosition.PreparePosition, msg.MaxCount, msg.ExcludeStart), + token); var events = await reader.ReadRecords(indexPrepares, false, token); return (indexPrepares.Count, events); } } - - private DuckDBConnectionPool GetPool(DuckDBConnectionPool? pool) { - return pool ?? sharedPool; - } } diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexReader.cs b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexReader.cs index 3ec3d9c1e61..2e8f490a3f5 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexReader.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexReader.cs @@ -2,7 +2,6 @@ // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using Kurrent.Quack.Threading; using KurrentDB.Core.Data; using KurrentDB.Core.Services.Storage.ReaderIndex; @@ -10,16 +9,16 @@ namespace KurrentDB.SecondaryIndexing.Indexes.User; -internal abstract class UserIndexReader(DuckDBConnectionPool sharedPool, IReadIndex index) - : SecondaryIndexReaderBase(sharedPool, index) { +internal abstract class UserIndexReader(DuckDBExecutor executor, IReadIndex index) + : SecondaryIndexReaderBase(executor, index) { internal abstract BufferedView.Snapshot CaptureSnapshot(DuckDBAdvancedConnection connection); } internal class UserIndexReader( - DuckDBConnectionPool sharedPool, + DuckDBExecutor executor, UserIndexProcessor processor, IReadIndex index -) : UserIndexReader(sharedPool, index) where TField : IField { +) : UserIndexReader(executor, index) where TField : IField { protected override string? GetId(string indexStream) { // the field is used as the ID. null when there is no field @@ -28,7 +27,7 @@ IReadIndex index return field; } - protected override List GetDbRecordsForwards(DuckDBConnectionPool db, string? id, long startPosition, int maxCount, bool excludeFirst) { + protected override List GetDbRecordsForwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, bool excludeFirst) { if (!TryGetField(id, out var field)) return []; @@ -40,16 +39,14 @@ protected override List GetDbRecordsForwards(DuckDBConnectionP }; var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - processor.Sql.ReadUserIndexForwardsQuery(connection, args, records); - } + using (processor.CaptureSnapshot(connection)) { + processor.Sql.ReadUserIndexForwardsQuery(connection, args, records); } return records; } - protected override List GetDbRecordsBackwards(DuckDBConnectionPool db, + protected override List GetDbRecordsBackwards(DuckDBAdvancedConnection connection, string? id, long startPosition, int maxCount, @@ -65,10 +62,8 @@ protected override List GetDbRecordsBackwards(DuckDBConnection }; var records = new List(maxCount); - using (db.Rent(out var connection)) { - using (processor.CaptureSnapshot(connection)) { - processor.Sql.ReadUserIndexBackwardsQuery(connection, args, records); - } + using (processor.CaptureSnapshot(connection)) { + processor.Sql.ReadUserIndexBackwardsQuery(connection, args, records); } return records; From 17978a10f00f44a5749889c9991b29a1a9d43b40 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 15:22:05 -0400 Subject: [PATCH 10/16] feat: QueryEngine and stats run on DuckDB dispatchers; Quack owns query cancellation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ExecuteAsync moves the whole rent/snapshot/prepare/consume/cleanup pipeline into one executor.Execute op (the dispatcher blocks on the consumer by design; dispatcherCount bounds concurrent streams). QueryEngine sheds its own InterruptQueryOnCancellation registration and its DuckDBException(Interrupt)->OCE mapping — Quack registers the interrupt and maps it internally. GetArrowSchema gets the same Execute wrap (no consume loop) and stays synchronous by blocking on the dispatcher op. PrepareQuery becomes ValueTask> PrepareQueryAsync(ReadOnlyMemory, ...); the Rewriter's two Rent sites collapse into the single rented connection of one Execute op. StatsService methods become async ValueTask<...>, each wrapping snapshot+query in executor.Execute. Callers updated to await: FlightSqlServer (PlainQuery/Schema go genuinely async; ConnectionState's registry call blocks on the async prepare to keep its span+out mechanism intact), UI QueryService, and UiStatsService (drops its Task.Run offload since the dispatcher already runs the work off the render thread). Co-Authored-By: Claude Fable 5 --- .../FlightSql/ConnectionState.PreparedStmt.cs | 6 +- .../FlightSql/FlightSqlServer.PlainQuery.cs | 43 ++----- .../FlightSql/FlightSqlServer.Schema.cs | 29 ++--- .../FlightSql/FlightSqlServer.cs | 2 +- .../Query/IQueryEngine.cs | 5 +- .../Query/QueryEngine.Rewriter.cs | 14 +-- .../Query/QueryEngine.cs | 118 +++++++++--------- .../Stats/StatsService.cs | 96 +++++++------- .../Components/Query/QueryService.cs | 2 +- .../Components/Stats/UiStatsService.cs | 12 +- 10 files changed, 151 insertions(+), 176 deletions(-) diff --git a/src/KurrentDB.SecondaryIndexing/FlightSql/ConnectionState.PreparedStmt.cs b/src/KurrentDB.SecondaryIndexing/FlightSql/ConnectionState.PreparedStmt.cs index 15859ac18ef..e629d215828 100644 --- a/src/KurrentDB.SecondaryIndexing/FlightSql/ConnectionState.PreparedStmt.cs +++ b/src/KurrentDB.SecondaryIndexing/FlightSql/ConnectionState.PreparedStmt.cs @@ -134,8 +134,12 @@ private PreparedStatement CreatePreparedStatement(ReadOnlySpan query, out Schema parameters) { var buffer = Encoding.UTF8.GetBytes(query, allocator: null); try { + // This registry path has a ReadOnlySpan input and an out parameter, so it can't be async; block on + // the engine's dispatcher op (the dispatcher is a dedicated thread — no deadlock). The prepared-statement + // registry mechanism itself is unchanged; only this engine call is updated for the async PrepareQueryAsync. var preparedQuery = engine - .PrepareQuery(buffer.Span, new() { UseDigitalSignature = false }); + .PrepareQueryAsync(buffer.Memory, new() { UseDigitalSignature = false }) + .AsTask().GetAwaiter().GetResult(); return new(preparedQuery, engine.GetArrowSchema(preparedQuery.Span, out parameters)); } finally { buffer.Dispose(); diff --git a/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.PlainQuery.cs b/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.PlainQuery.cs index 82af2a12e0d..fe6f3c52f8b 100644 --- a/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.PlainQuery.cs +++ b/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.PlainQuery.cs @@ -16,44 +16,15 @@ namespace KurrentDB.SecondaryIndexing.FlightSql; partial class FlightSqlServer { - private Task PrepareQueryAsync(CommandStatementQuery query, FlightDescriptor descriptor, CancellationToken token) { - Task task; - if (token.IsCancellationRequested) { - task = Task.FromCanceled(token); - } else { - try { - task = Task.FromResult(PrepareQuery(query, descriptor)); - } catch (Exception e) { - task = Task.FromException(e); - } - } - - return task; - } - - private Task PrepareQueryAsync(ReadOnlySpan query, FlightDescriptor descriptor, CancellationToken token) { - Task task; - if (token.IsCancellationRequested) { - task = Task.FromCanceled(token); - } else { - try { - var preparedQueryBuffer = engine.PrepareQuery(query, new() { UseDigitalSignature = true }); - task = Task.FromResult(GetQueryInfo(preparedQueryBuffer, descriptor, discoverSchema: true)); - } catch (Exception e) { - task = Task.FromException(e); - } - } - - return task; + private async Task PrepareQueryAsync(CommandStatementQuery query, FlightDescriptor descriptor, CancellationToken token) { + using var buffer = Encoding.UTF8.GetBytes(query.Query, allocator: null); + var preparedQuery = await engine.PrepareQueryAsync(buffer.Memory, new() { UseDigitalSignature = true }, token); + return GetQueryInfo(preparedQuery, descriptor, discoverSchema: false); } - private FlightInfo PrepareQuery(CommandStatementQuery query, FlightDescriptor descriptor) { - MemoryOwner preparedQuery; - using (var buffer = Encoding.UTF8.GetBytes(query.Query, allocator: null)) { - preparedQuery = engine.PrepareQuery(buffer.Span, new() { UseDigitalSignature = true }); - } - - return GetQueryInfo(preparedQuery, descriptor, discoverSchema: false); + private async Task PrepareQueryAsync(ReadOnlyMemory query, FlightDescriptor descriptor, CancellationToken token) { + var preparedQueryBuffer = await engine.PrepareQueryAsync(query, new() { UseDigitalSignature = true }, token); + return GetQueryInfo(preparedQueryBuffer, descriptor, discoverSchema: true); } private FlightInfo GetQueryInfo(in MemoryOwner preparedQuery, FlightDescriptor descriptor, bool discoverSchema) diff --git a/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.Schema.cs b/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.Schema.cs index fed1079efe0..eadc92a3338 100644 --- a/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.Schema.cs +++ b/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.Schema.cs @@ -12,27 +12,18 @@ namespace KurrentDB.SecondaryIndexing.FlightSql; partial class FlightSqlServer { - private Task GetQuerySchemaAsync(ReadOnlyMemory query, CancellationToken token) { - Task task; - if (token.IsCancellationRequested) { - task = Task.FromCanceled(token); - } else { - var buffer = default(MemoryOwner); - try { - buffer = Encoding.UTF8.GetBytes(query.Span, allocator: null); - var tmp = engine.PrepareQuery(buffer.Span, new() { UseDigitalSignature = false }); - buffer.Dispose(); - buffer = tmp; + private async Task GetQuerySchemaAsync(ReadOnlyMemory query, CancellationToken token) { + var buffer = default(MemoryOwner); + try { + buffer = Encoding.UTF8.GetBytes(query.Span, allocator: null); + var prepared = await engine.PrepareQueryAsync(buffer.Memory, new() { UseDigitalSignature = false }, token); + buffer.Dispose(); + buffer = prepared; - task = Task.FromResult(engine.GetArrowSchema(buffer.Span)); - } catch (Exception e) { - task = Task.FromException(e); - } finally { - buffer.Dispose(); - } + return engine.GetArrowSchema(buffer.Span); + } finally { + buffer.Dispose(); } - - return task; } private ByteString SerializeSchema(Schema schema) { diff --git a/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.cs b/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.cs index 02aac1a626d..bb2308b3979 100644 --- a/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.cs +++ b/src/KurrentDB.SecondaryIndexing/FlightSql/FlightSqlServer.cs @@ -70,7 +70,7 @@ public override async Task GetFlightInfo(FlightDescriptor request, S // process Arrow Flight command if (request.Command.Span is [not discriminator, ..]) - return await PrepareQueryAsync(request.Command.Span, request, context.CancellationToken); + return await PrepareQueryAsync(request.Command.Memory, request, context.CancellationToken); var state = context.GetHttpContext().Features.Get() ?? throw WrongServerState(); diff --git a/src/KurrentDB.SecondaryIndexing/Query/IQueryEngine.cs b/src/KurrentDB.SecondaryIndexing/Query/IQueryEngine.cs index 63cd4eff999..0a486e3dcb7 100644 --- a/src/KurrentDB.SecondaryIndexing/Query/IQueryEngine.cs +++ b/src/KurrentDB.SecondaryIndexing/Query/IQueryEngine.cs @@ -12,14 +12,15 @@ public interface IQueryEngine { /// /// The query to prepare. /// Query preparation options. + /// The token that can be used to cancel the operation. /// The prepared query. /// The input query has incorrect syntax. - MemoryOwner PrepareQuery(ReadOnlySpan sqlQuery, QueryPreparationOptions options); + ValueTask> PrepareQueryAsync(ReadOnlyMemory sqlQuery, QueryPreparationOptions options, CancellationToken token = default); /// /// Executes the prepared query. /// - /// The prepared query returned by method. + /// The prepared query returned by method. /// The query result consumer. /// The options to adjust the query execution behavior. /// The token that can be used to cancel the operation. diff --git a/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.Rewriter.cs b/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.Rewriter.cs index e8ddca1db65..ee92bc5e8c9 100644 --- a/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.Rewriter.cs +++ b/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.Rewriter.cs @@ -12,13 +12,11 @@ namespace KurrentDB.SecondaryIndexing.Query; partial class QueryEngine { - private MemoryOwner RewriteQuery(ReadOnlySpan queryUtf8, ref PreparedQueryBuilder builder) { - JsonNode? tree; - + // Runs against the connection provided by the enclosing executor.Execute op (see PrepareQueryAsync): both the + // parse and the serialize used to rent separate connections; now they share this one rented connection. + private MemoryOwner RewriteQuery(DuckDBAdvancedConnection connection, ReadOnlySpan queryUtf8, ref PreparedQueryBuilder builder) { // Obtain AST - using (sharedPool.Rent(out var connection)) { - tree = connection.ParseSyntaxTree(queryUtf8); - } + var tree = connection.ParseSyntaxTree(queryUtf8); // Transform AST switch (tree?["error"]?.GetValueKind()) { @@ -36,9 +34,7 @@ private MemoryOwner RewriteQuery(ReadOnlySpan queryUtf8, ref Prepare } // Convert AST back to the query - using (sharedPool.Rent(out var connection)) { - return connection.FromSyntaxTree(tree); - } + return connection.FromSyntaxTree(tree); } private void RewriteNode(JsonNode ast, ref PreparedQueryBuilder builder) { diff --git a/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.cs b/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.cs index c5357584cc3..28d730c80ce 100644 --- a/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.cs +++ b/src/KurrentDB.SecondaryIndexing/Query/QueryEngine.cs @@ -6,11 +6,8 @@ using Apache.Arrow; using DotNext; using DotNext.Buffers; -using DuckDB.NET.Data; -using DuckDB.NET.Native; using Kurrent.Quack; using Kurrent.Quack.Arrow; -using Kurrent.Quack.ConnectionPool; using Kurrent.Quack.Threading; using KurrentDB.SecondaryIndexing.Indexes.Default; using KurrentDB.SecondaryIndexing.Indexes.User; @@ -22,54 +19,59 @@ namespace KurrentDB.SecondaryIndexing.Query; /// /// /// -/// +/// internal sealed partial class QueryEngine(DefaultIndexProcessor defaultIndex, UserIndexEngine userIndex, - DuckDBConnectionPool sharedPool) : IQueryEngine { + DuckDBExecutor executor) : IQueryEngine { // 32 bytes key is aligned with HMAC SHA-3 256 hash length private readonly ReadOnlyMemory _signatureKey = RandomNumberGenerator.GetBytes(32); - public MemoryOwner PrepareQuery(ReadOnlySpan queryUtf8, QueryPreparationOptions options) { - var builder = new PreparedQueryBuilder(); - using var rewrittenQuery = RewriteQuery(queryUtf8, ref builder); - - return builder.Build(rewrittenQuery.Span, options.UseDigitalSignature ? _signatureKey.Span : ReadOnlySpan.Empty); + public ValueTask> PrepareQueryAsync(ReadOnlyMemory queryUtf8, QueryPreparationOptions options, + CancellationToken token = default) { + var useSignature = options.UseDigitalSignature; + // The whole rewrite (parse + AST transform + serialize) runs inside one rented connection on a dispatcher. + // The PreparedQueryBuilder is a ref struct, so it lives entirely within this synchronous lambda — no await crosses it. + return executor.Execute(connection => { + var builder = new PreparedQueryBuilder(); + using var rewrittenQuery = RewriteQuery(connection, queryUtf8.Span, ref builder); + return builder.Build(rewrittenQuery.Span, useSignature ? _signatureKey.Span : ReadOnlySpan.Empty); + }, token); } - public async ValueTask ExecuteAsync(ReadOnlyMemory preparedQuery, + public ValueTask ExecuteAsync(ReadOnlyMemory preparedQuery, TConsumer consumer, QueryExecutionOptions options, CancellationToken token) where TConsumer : IQueryResultConsumer { - var parsedQuery = new PreparedQuery(preparedQuery.Span); - if (options.CheckIntegrity) { - CheckIntegrity(in parsedQuery); - } - - var snapshots = new PoolingBufferWriter { Capacity = parsedQuery.ViewCount + 1 }; // + default index - var rental = sharedPool.Rent(out var connection); - var statement = default(PreparedStatement); - var reader = default(QueryResultReader); - var cancellation = connection.InterruptQueryOnCancellation(token); - try { - CaptureSnapshots(in parsedQuery, connection, snapshots, token); - statement = new(connection, parsedQuery.Query); - consumer.Bind(new QueryBinder(in statement)); - - reader = new(in statement, consumer.UseStreaming); - await consumer.ConsumeAsync(reader, token); - reader.ThrowOnError(); // to handle query interruption from Quack (see catch block) - } catch (DuckDBException e) when (e.ErrorType is DuckDBErrorType.Interrupt) { - throw new OperationCanceledException(token); - } - finally { - await cancellation.DisposeAsync(); - reader?.Dispose(); - statement.Dispose(); - Disposable.Dispose(snapshots.WrittenMemory.Span); // release all captured snapshot - Disposable.Dispose(new ReadOnlySpan(in rental)); - snapshots.Dispose(); - } + var checkIntegrity = options.CheckIntegrity; + // Spec §6: the whole rent/snapshot/prepare/consume/cleanup pipeline runs inside ONE Execute op. The dispatcher + // blocks on the consumer by design (dispatcherCount bounds concurrent streams). Quack owns cancellation: + // it registers InterruptQueryOnCancellation per op and maps an interrupt to OperationCanceledException, so + // the engine no longer registers its own interrupt nor maps DuckDBException(Interrupt). + return new ValueTask(executor.Execute(connection => { + var parsedQuery = new PreparedQuery(preparedQuery.Span); + if (checkIntegrity) + CheckIntegrity(in parsedQuery); + + var snapshots = new PoolingBufferWriter { Capacity = parsedQuery.ViewCount + 1 }; // + default index + var statement = default(PreparedStatement); + var reader = default(QueryResultReader); + try { + CaptureSnapshots(in parsedQuery, connection, snapshots, token); + statement = new(connection, parsedQuery.Query); + consumer.Bind(new QueryBinder(in statement)); + + reader = new(in statement, consumer.UseStreaming); + consumer.ConsumeAsync(reader, token).AsTask().GetAwaiter().GetResult(); // dispatcher blocks: by design + reader.ThrowOnError(); + return 0; + } finally { + reader?.Dispose(); + statement.Dispose(); + Disposable.Dispose(snapshots.WrittenMemory.Span); // release all captured snapshot + snapshots.Dispose(); + } + }, token).AsTask()); } private void CheckIntegrity(ref readonly PreparedQuery parsedQuery) { @@ -104,22 +106,26 @@ public Schema GetArrowSchema(ReadOnlySpan preparedQuery, out Schema parame private TResult GetArrowSchema(ReadOnlySpan preparedQuery) where TReflector : ISchemaReflector, allows ref struct { - var parsedQuery = new PreparedQuery(preparedQuery); - var snapshots = new PoolingBufferWriter { Capacity = parsedQuery.ViewCount + 1 }; // + default index - var rental = sharedPool.Rent(out var connection); - var options = connection.GetArrowOptions(); - var statement = default(PreparedStatement); - try { - CaptureSnapshots(in parsedQuery, connection, snapshots, CancellationToken.None); - statement = new(connection, parsedQuery.Query); - return TReflector.Reflect(statement, options); - } finally { - statement.Dispose(); - options.Dispose(); - Disposable.Dispose(snapshots.WrittenMemory.Span); // release all captured snapshot - Disposable.Dispose(MemoryMarshal.CreateReadOnlySpan(in rental, 1)); - snapshots.Dispose(); - } + // No consume loop here — this is metadata reflection. Copy the span so it can be captured by the lambda, then + // run the whole capture-snapshots/prepare/reflect body on a dispatcher against a rented connection. This is a + // synchronous API, so we block on the dispatcher op (the dispatcher is a dedicated thread — no deadlock). + var queryBytes = preparedQuery.ToArray(); + return executor.Execute(connection => { + var parsedQuery = new PreparedQuery(queryBytes); + var snapshots = new PoolingBufferWriter { Capacity = parsedQuery.ViewCount + 1 }; // + default index + var options = connection.GetArrowOptions(); + var statement = default(PreparedStatement); + try { + CaptureSnapshots(in parsedQuery, connection, snapshots, CancellationToken.None); + statement = new(connection, parsedQuery.Query); + return TReflector.Reflect(statement, options); + } finally { + statement.Dispose(); + options.Dispose(); + Disposable.Dispose(snapshots.WrittenMemory.Span); // release all captured snapshot + snapshots.Dispose(); + } + }, CancellationToken.None).AsTask().GetAwaiter().GetResult(); } [StructLayout(LayoutKind.Auto)] diff --git a/src/KurrentDB.SecondaryIndexing/Stats/StatsService.cs b/src/KurrentDB.SecondaryIndexing/Stats/StatsService.cs index bb427202826..b53b5173a5b 100644 --- a/src/KurrentDB.SecondaryIndexing/Stats/StatsService.cs +++ b/src/KurrentDB.SecondaryIndexing/Stats/StatsService.cs @@ -2,7 +2,6 @@ // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). using Kurrent.Quack; -using Kurrent.Quack.ConnectionPool; using KurrentDB.SecondaryIndexing.Indexes.Default; using KurrentDB.SecondaryIndexing.Storage; using Microsoft.Extensions.DependencyInjection; @@ -11,72 +10,77 @@ namespace KurrentDB.SecondaryIndexing.Stats; public class StatsService(IServiceProvider services) { - private readonly DuckDBConnectionPool _pool = services.GetRequiredService(); + private readonly DuckDBExecutor _executor = services.GetRequiredService(); private readonly DefaultIndexProcessor _defaultIndex = services.GetRequiredService(); - public IEnumerable GetCategories() { - using var connection = _pool.Open(); - using var snapshot = _defaultIndex.CaptureSnapshot(connection); - using var statement = new PreparedStatement(connection, GetAllCategories.CommandText); - foreach (var row in new QueryResult(statement)) { - yield return row; - } + public async ValueTask> GetCategories(CancellationToken ct = default) { + return await _executor.Execute(connection => { + var result = new List(); + using var snapshot = _defaultIndex.CaptureSnapshot(connection); + using var statement = new PreparedStatement(connection, GetAllCategories.CommandText); + foreach (var row in new QueryResult(statement)) { + result.Add(row); + } + + return (IReadOnlyList)result; + }, ct); } // note: very expensive. count(distinct stream) over idx_all filtered by category - public IReadOnlyList GetCategoryStats(string category) { + public async ValueTask> GetCategoryStats(string category, CancellationToken ct = default) { if (string.IsNullOrWhiteSpace(category)) return []; - var result = new List(100); - using var connection = _pool.Open(); - using var snapshot = _defaultIndex.CaptureSnapshot(connection); - connection - .ExecuteQuery(new(category)) - .CopyTo(result); + return await _executor.Execute(connection => { + var result = new List(100); + using var snapshot = _defaultIndex.CaptureSnapshot(connection); + connection + .ExecuteQuery(new(category)) + .CopyTo(result); - return result; + return (IReadOnlyList)result; + }, ct); } // note: expensive. group by event_type over idx_all filtered by category - public IReadOnlyList GetCategoryEventTypes(string category) { + public async ValueTask> GetCategoryEventTypes(string category, CancellationToken ct = default) { if (string.IsNullOrWhiteSpace(category)) return []; - var result = new List(100); - using var connection = _pool.Open(); - using var snapshot = _defaultIndex.CaptureSnapshot(connection); + return await _executor.Execute(connection => { + var result = new List(100); + using var snapshot = _defaultIndex.CaptureSnapshot(connection); + connection + .ExecuteQuery(new(category)) + .CopyTo(result); - connection - .ExecuteQuery(new(category)) - .CopyTo(result); - - return result; + return (IReadOnlyList)result; + }, ct); } // note: very expensive. count(distinct commit_position) grouped by category over idx_all - public IReadOnlyList GetExplicitTransactions() { - var result = new List(100); - using var connection = _pool.Open(); - using var snapshot = _defaultIndex.CaptureSnapshot(connection); - - connection - .ExecuteQuery() - .CopyTo(result); - - return result; + public async ValueTask> GetExplicitTransactions(CancellationToken ct = default) { + return await _executor.Execute(connection => { + var result = new List(100); + using var snapshot = _defaultIndex.CaptureSnapshot(connection); + connection + .ExecuteQuery() + .CopyTo(result); + + return (IReadOnlyList)result; + }, ct); } // note: expensive. DISTINCT ON(category) with ORDER BY event_number DESC over idx_all - public List GetLongestStreams() { - var result = new List(100); - using var connection = _pool.Open(); - using var snapshot = _defaultIndex.CaptureSnapshot(connection); - - connection - .ExecuteQuery() - .CopyTo(result); - - return result; + public async ValueTask> GetLongestStreams(CancellationToken ct = default) { + return await _executor.Execute(connection => { + var result = new List(100); + using var snapshot = _defaultIndex.CaptureSnapshot(connection); + connection + .ExecuteQuery() + .CopyTo(result); + + return result; + }, ct); } } diff --git a/src/KurrentDB/Components/Query/QueryService.cs b/src/KurrentDB/Components/Query/QueryService.cs index c5ca181959b..2d0e3ceaac0 100644 --- a/src/KurrentDB/Components/Query/QueryService.cs +++ b/src/KurrentDB/Components/Query/QueryService.cs @@ -29,7 +29,7 @@ public static async ValueTask ExecuteAdHocUserQuery(this IQueryEng var preparedQuery = default(MemoryOwner); var reader = new JsonReader(); try { - preparedQuery = engine.PrepareQuery(Encoding.UTF8.GetBytes(sql), new() { UseDigitalSignature = false }); + preparedQuery = await engine.PrepareQueryAsync(Encoding.UTF8.GetBytes(sql), new() { UseDigitalSignature = false }, token); await engine.ExecuteAsync(preparedQuery.Memory, reader, new() { CheckIntegrity = false }, token); diff --git a/src/KurrentDB/Components/Stats/UiStatsService.cs b/src/KurrentDB/Components/Stats/UiStatsService.cs index 7f4c5eb135a..df727051393 100644 --- a/src/KurrentDB/Components/Stats/UiStatsService.cs +++ b/src/KurrentDB/Components/Stats/UiStatsService.cs @@ -24,26 +24,28 @@ public sealed class UiStatsService(StatsService inner, IAuthorizationProvider au public async Task> GetCategoriesAsync(ClaimsPrincipal principal, CancellationToken ct = default) { await authorizer.EnsureAccessAsync(principal, ReadAllOperation, ct); - return await Task.Run(() => inner.GetCategories().ToList(), ct); + // The expensive DuckDB work runs on an executor dispatcher thread (StatsService.GetCategories), so the render + // thread isn't blocked — no Task.Run offload needed. + return await inner.GetCategories(ct); } public async Task> GetCategoryStatsAsync(ClaimsPrincipal principal, string category, CancellationToken ct = default) { await authorizer.EnsureAccessAsync(principal, ReadAllOperation, ct); - return await Task.Run(() => inner.GetCategoryStats(category), ct); + return await inner.GetCategoryStats(category, ct); } public async Task> GetCategoryEventTypesAsync(ClaimsPrincipal principal, string category, CancellationToken ct = default) { await authorizer.EnsureAccessAsync(principal, ReadAllOperation, ct); - return await Task.Run(() => inner.GetCategoryEventTypes(category), ct); + return await inner.GetCategoryEventTypes(category, ct); } public async Task> GetExplicitTransactionsAsync(ClaimsPrincipal principal, CancellationToken ct = default) { await authorizer.EnsureAccessAsync(principal, ReadAllOperation, ct); - return await Task.Run(() => inner.GetExplicitTransactions(), ct); + return await inner.GetExplicitTransactions(ct); } public async Task> GetLongestStreamsAsync(ClaimsPrincipal principal, CancellationToken ct = default) { await authorizer.EnsureAccessAsync(principal, ReadAllOperation, ct); - return await Task.Run(() => inner.GetLongestStreams(), ct); + return await inner.GetLongestStreams(ct); } } From 3102d84cf992b3d1dfdcf52c6ea3f26742e6286a Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 15:22:19 -0400 Subject: [PATCH 11/16] test: fixtures build on DuckDB executor DuckDbIntegrationTest constructs a DuckDBExecutor (workerCount 2, dispatcherCount 2) instead of a raw pool, exposes it as `protected readonly DuckDBExecutor Executor`, and runs schema setup + all query helpers via Executor.Execute. Processor/reader tests build from the executor, and their Commit()/query helpers become async (CommitAsync + Execute). Drops the stale `pool: null` arg from the ReadIndexEventsForward/Backward test message construction (the Pool property was removed from those messages). ReadTests awaits PrepareQueryAsync and CommitAsync. Co-Authored-By: Claude Fable 5 --- .../Fixtures/DuckDbIntegrationTest.cs | 13 +- .../Indexes/DefaultIndexProcessorTests.cs | 112 +++++++++--------- .../DefaultIndexReaderTests/IndexTestBase.cs | 8 +- .../IndexedPositionTests.cs | 4 +- .../ReadBackwardsTests.cs | 15 ++- .../ReadForwardsTests.cs | 17 ++- .../IntegrationTests/ReadTests.cs | 9 +- 7 files changed, 89 insertions(+), 89 deletions(-) diff --git a/src/KurrentDB.SecondaryIndexing.Tests/Fixtures/DuckDbIntegrationTest.cs b/src/KurrentDB.SecondaryIndexing.Tests/Fixtures/DuckDbIntegrationTest.cs index 7a70bc56cd2..b1fb2189028 100644 --- a/src/KurrentDB.SecondaryIndexing.Tests/Fixtures/DuckDbIntegrationTest.cs +++ b/src/KurrentDB.SecondaryIndexing.Tests/Fixtures/DuckDbIntegrationTest.cs @@ -2,7 +2,7 @@ // Kurrent, Inc licenses this file to you under the Kurrent License v1 (see LICENSE.md). using System.Security.Claims; -using Kurrent.Quack.ConnectionPool; +using Kurrent.Quack; using KurrentDB.Core.Services.Transport.Enumerators; using KurrentDB.Core.XUnit.Tests; using KurrentDB.SecondaryIndexing.Storage; @@ -10,17 +10,18 @@ namespace KurrentDB.SecondaryIndexing.Tests.Fixtures; public abstract class DuckDbIntegrationTest : DirectoryPerTest { - protected readonly DuckDBConnectionPool DuckDb; + protected readonly DuckDBExecutor Executor; protected DuckDbIntegrationTest() { var dbPath = Fixture.GetFilePathFor($"{GetType().Name}.db"); - DuckDb = new($"Data Source={dbPath};"); + Executor = new($"Data Source={dbPath};", workerCount: 2, dispatcherCount: 2); var schema = new IndexingDbSchema(GetEvents); - using (DuckDb.Rent(out var connection)) { + Executor.Execute(connection => { schema.Execute(connection); - } + return 0; + }, CancellationToken.None).AsTask().GetAwaiter().GetResult(); } private static IEnumerator GetEvents(long[] logPositions, ClaimsPrincipal user) { @@ -33,7 +34,7 @@ private static IEnumerator GetEvents(long[] logPositions, ClaimsPr } public override async ValueTask DisposeAsync() { - DuckDb.Dispose(); + await Executor.DisposeAsync(); await base.DisposeAsync(); } } diff --git a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexProcessorTests.cs b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexProcessorTests.cs index 2d4a5f37561..340e9041f99 100644 --- a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexProcessorTests.cs +++ b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexProcessorTests.cs @@ -25,7 +25,7 @@ public void WhenNoEventsProcessedYet_HasDefaultValues() { } [Fact] - public void CommittedMultipleEventsInMultipleStreams_AreIndexed() { + public async Task CommittedMultipleEventsInMultipleStreams_AreIndexed() { // Given const string cat1 = "first"; const string cat2 = "second"; @@ -59,26 +59,26 @@ public void CommittedMultipleEventsInMultipleStreams_AreIndexed() { _processor.TryIndex(resolvedEvent); } - _processor.Commit(); + await _processor.CommitAsync(CancellationToken.None); // Then // Default Index - AssertLastLogPositionQueryReturns(987); + await AssertLastLogPositionQueryReturns(987); - AssertDefaultIndexQueryReturns([100, 110, 117, 200, 213, 394, 500, 601, 987]); + await AssertDefaultIndexQueryReturns([100, 110, 117, 200, 213, 394, 500, 601, 987]); - AssertCategoryIndexQueryReturns(cat1, [100, 117, 200, 213, 500, 601, 987]); - AssertCategoryIndexQueryReturns(cat2, [110, 394]); + await AssertCategoryIndexQueryReturns(cat1, [100, 117, 200, 213, 500, 601, 987]); + await AssertCategoryIndexQueryReturns(cat2, [110, 394]); - AssertReadEventTypeIndexQueryReturns(cat1Et1, [100, 213, 987]); - AssertReadEventTypeIndexQueryReturns(cat2Et1, [110]); - AssertReadEventTypeIndexQueryReturns(cat1Et2, [117, 500]); - AssertReadEventTypeIndexQueryReturns(cat1Et3, [200, 601]); - AssertReadEventTypeIndexQueryReturns(cat2Et2, [394]); + await AssertReadEventTypeIndexQueryReturns(cat1Et1, [100, 213, 987]); + await AssertReadEventTypeIndexQueryReturns(cat2Et1, [110]); + await AssertReadEventTypeIndexQueryReturns(cat1Et2, [117, 500]); + await AssertReadEventTypeIndexQueryReturns(cat1Et3, [200, 601]); + await AssertReadEventTypeIndexQueryReturns(cat2Et2, [394]); } [Fact] - public void CommittedMultipleEventsToAStringWithoutCategory_AreIndexed() { + public async Task CommittedMultipleEventsToAStringWithoutCategory_AreIndexed() { // Given const string streamName = "hello"; string eventType1 = $"{Guid.NewGuid()}"; @@ -96,108 +96,108 @@ public void CommittedMultipleEventsToAStringWithoutCategory_AreIndexed() { _processor.TryIndex(resolvedEvent); } - _processor.Commit(); + await _processor.CommitAsync(CancellationToken.None); // Then // Default Index - AssertLastLogPositionQueryReturns(200); + await AssertLastLogPositionQueryReturns(200); - AssertDefaultIndexQueryReturns([100, 117, 200]); + await AssertDefaultIndexQueryReturns([100, 117, 200]); // Categories - AssertGetCategoriesQueryReturns(["hello"]); - AssertCategoryIndexQueryReturns(streamName, [100, 117, 200]); + await AssertGetCategoriesQueryReturns(["hello"]); + await AssertCategoryIndexQueryReturns(streamName, [100, 117, 200]); // EventTypes - AssertGetAllEventTypesQueryReturns([eventType1, eventType2, eventType3]); - AssertReadEventTypeIndexQueryReturns(eventType1, [100]); - AssertReadEventTypeIndexQueryReturns(eventType2, [117]); - AssertReadEventTypeIndexQueryReturns(eventType3, [200]); + await AssertGetAllEventTypesQueryReturns([eventType1, eventType2, eventType3]); + await AssertReadEventTypeIndexQueryReturns(eventType1, [100]); + await AssertReadEventTypeIndexQueryReturns(eventType2, [117]); + await AssertReadEventTypeIndexQueryReturns(eventType3, [200]); } - private void AssertDefaultIndexQueryReturns(List expected) { - List records; - using (DuckDb.Rent(out var connection)) { + private async Task AssertDefaultIndexQueryReturns(List expected) { + var records = await Executor.Execute(connection => { using (_processor.CaptureSnapshot(connection)) { - records = connection + return connection .ExecuteQuery(new(-1, int.MaxValue)) .ToList(); } - } + }, CancellationToken.None); Assert.Equal(expected, records.Select(x => x.LogPosition)); } - private void AssertLastLogPositionQueryReturns(long? expectedLogPosition) { - LastPositionResult? actual; - using (DuckDb.Rent(out var connection)) { + private async Task AssertLastLogPositionQueryReturns(long? expectedLogPosition) { + var actual = await Executor.Execute(connection => { using (_processor.CaptureSnapshot(connection)) { - actual = connection.QueryFirstOrDefault().OrNull(); + return connection.QueryFirstOrDefault().OrNull(); } - } + }, CancellationToken.None); Assert.Equal(expectedLogPosition, actual?.PreparePosition); } - private void AssertGetCategoriesQueryReturns(string[] expected) { - var records = new List(); - using (DuckDb.Rent(out var connection)) { + private async Task AssertGetCategoriesQueryReturns(string[] expected) { + var records = await Executor.Execute(connection => { + var result = new List(); using (_processor.CaptureSnapshot(connection)) { - using var result = connection.ExecuteAdHocQuery("select distinct category from idx_all_snapshot order by log_position"u8); - while (result.TryFetch(out var chunk)) { + using var query = connection.ExecuteAdHocQuery("select distinct category from idx_all_snapshot order by log_position"u8); + while (query.TryFetch(out var chunk)) { using (chunk) { while (chunk.TryRead(out var row)) { - records.Add(row.ReadString()); + result.Add(row.ReadString()); } } } } - } + + return result; + }, CancellationToken.None); Assert.Equal(expected, records); } - private void AssertCategoryIndexQueryReturns(string category, List expected) { - List records; - using (DuckDb.Rent(out var connection)) { + private async Task AssertCategoryIndexQueryReturns(string category, List expected) { + var records = await Executor.Execute(connection => { using (_processor.CaptureSnapshot(connection)) { - records = connection + return connection .ExecuteQuery(new(category, 0, 32)) .ToList(); } - } + }, CancellationToken.None); Assert.Equal(expected, records.Select(x => x.LogPosition)); } - private void AssertGetAllEventTypesQueryReturns(string[] expected) { - var records = new List(); - using (DuckDb.Rent(out var connection)) { + private async Task AssertGetAllEventTypesQueryReturns(string[] expected) { + var records = await Executor.Execute(connection => { + var result = new List(); using (_processor.CaptureSnapshot(connection)) { - using var result = connection.ExecuteAdHocQuery("select distinct schema_name from idx_all order by log_position"u8); - while (result.TryFetch(out var chunk)) { + using var query = connection.ExecuteAdHocQuery("select distinct schema_name from idx_all order by log_position"u8); + while (query.TryFetch(out var chunk)) { using (chunk) { while (chunk.TryRead(out var row)) { - records.Add(row.ReadString()); + result.Add(row.ReadString()); } } } } - } + + return result; + }, CancellationToken.None); Assert.Equal(expected, records); } - private void AssertReadEventTypeIndexQueryReturns(string eventType, List expected) { - List records; - using (DuckDb.Rent(out var connection)) { + private async Task AssertReadEventTypeIndexQueryReturns(string eventType, List expected) { + var records = await Executor.Execute(connection => { using (_processor.CaptureSnapshot(connection)) { - records = connection.ExecuteQuery( + return connection.ExecuteQuery( new(eventType, 0, 32) ) .ToList(); } - } + }, CancellationToken.None); Assert.Equal(expected, records.Select(x => x.LogPosition)); } @@ -211,7 +211,7 @@ public DefaultIndexProcessorTests() { var publisher = new FakePublisher(); - _processor = new(DuckDb, publisher, hasher, new("test"), NullLoggerFactory.Instance); + _processor = new(Executor, publisher, hasher, new("test"), NullLoggerFactory.Instance); } public override ValueTask DisposeAsync() { diff --git a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexTestBase.cs b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexTestBase.cs index 52737ff2b5e..87982c49995 100644 --- a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexTestBase.cs +++ b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexTestBase.cs @@ -22,12 +22,12 @@ protected IndexTestBase() { var hasher = new CompositeHasher(new XXHashUnsafe(), new Murmur3AUnsafe()); var publisher = new FakePublisher(); - _processor = new(DuckDb, publisher, hasher, new("test"), NullLoggerFactory.Instance); + _processor = new(Executor, publisher, hasher, new("test"), NullLoggerFactory.Instance); - Sut = new(DuckDb, _processor, _readIndexStub.ReadIndex); + Sut = new(Executor, _processor, _readIndexStub.ReadIndex); } - protected void IndexEvents(ResolvedEvent[] events, bool shouldCommit) { + protected async ValueTask IndexEvents(ResolvedEvent[] events, bool shouldCommit) { _readIndexStub.IndexEvents(events); foreach (var resolvedEvent in events) { @@ -35,6 +35,6 @@ protected void IndexEvents(ResolvedEvent[] events, bool shouldCommit) { } if (shouldCommit) - _processor.Commit(); + await _processor.CommitAsync(CancellationToken.None); } } diff --git a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexedPositionTests.cs b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexedPositionTests.cs index 5620434c64b..b099d4b57d5 100644 --- a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexedPositionTests.cs +++ b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/IndexedPositionTests.cs @@ -9,14 +9,14 @@ public class IndexedPositionTests : IndexTestBase { [Theory] [InlineData(true)] [InlineData(false)] - public void WhenIndexedRecords_ReturnsProcessorLastIndexedPosition(bool shouldCommit) { + public async Task WhenIndexedRecords_ReturnsProcessorLastIndexedPosition(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []), From("test-stream", 1, 200, "TestEvent", []), From("test-stream", 2, 300, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); var streamId = Guid.NewGuid().ToString(); // this can be anything for a Default Index, as it's ignored // When diff --git a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadBackwardsTests.cs b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadBackwardsTests.cs index e0586acc867..4550fc59eb7 100644 --- a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadBackwardsTests.cs +++ b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadBackwardsTests.cs @@ -21,7 +21,7 @@ public async Task WhenNegativeStart_UsesEndOfLog(bool shouldCommit) { From("test-stream", 1, 200, "TestEvent", []), From("test-stream", 2, 300, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadBackwards(startFrom: new TFPos(-1, -1), maxCount: 5); @@ -44,7 +44,7 @@ public async Task WhenNegativeStart_UsesEndOfLog(bool shouldCommit) { public async Task WhenEmptyResultSet_ReturnsSuccessWithNoEvents(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When - request events beyond what exists var result = await ReadBackwards(startFrom: new TFPos(50, 50), maxCount: 5); @@ -67,7 +67,7 @@ public async Task WhenEmptyResultSet_ReturnsSuccessWithNoEvents(bool shouldCommi public async Task WhenEmptyNegativeStart_UsesValidationVersion(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadBackwards(new TFPos(-1, -1), maxCount: 5, validationStreamVersion: 3); @@ -94,7 +94,7 @@ public async Task WhenReadingFromLastMoreThanAvailable_ReturnsIsEndOfStreamTrue( From("test-stream", 1, 200, "TestEvent", []), From("test-stream", 2, 300, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadBackwards(startFrom: new TFPos(300, 300), maxCount: 5); @@ -117,7 +117,7 @@ public async Task WhenReadingFromLastMoreThanAvailable_ReturnsIsEndOfStreamTrue( public async Task WhenValidationVersionMatches_ReturnsNotModified(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadBackwards(validationStreamVersion: 100); @@ -144,7 +144,7 @@ public async Task WhenReadingFromEndMoreThanAvailable_ReturnsIsEndOfStreamTrue(b From("test-stream", 1, 200, "TestEvent", []), From("test-stream", 2, 300, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When - request events that don't exist var result = await ReadBackwards(startFrom: new TFPos(long.MaxValue, long.MaxValue), maxCount: 1000); @@ -173,7 +173,7 @@ public async Task WhenMoreEventsAvailable_ReturnsOnlyRequested(bool shouldCommit From("test-stream", 3, 400, "TestEvent", []), From("test-stream", 4, 500, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadBackwards(startFrom: new TFPos(500, 500), maxCount: 2); @@ -220,7 +220,6 @@ private async Task ReadBackwards( validationStreamVersion, user, replyOnExpired, - pool: null, longPollTimeout, expires, CancellationToken.None diff --git a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadForwardsTests.cs b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadForwardsTests.cs index a58a20bbe7b..78baa97817c 100644 --- a/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadForwardsTests.cs +++ b/src/KurrentDB.SecondaryIndexing.Tests/Indexes/DefaultIndexReaderTests/ReadForwardsTests.cs @@ -17,7 +17,7 @@ public class ReadForwardsTests : IndexTestBase { public async Task WhenValidationVersionMatches_ReturnsNotModified(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadForwards(validationTfLastCommitPosition: 100); @@ -40,7 +40,7 @@ public async Task WhenValidationVersionMatches_ReturnsNotModified(bool shouldCom public async Task WhenMaxCountOverflow_HandlesCorrectly(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var pos = long.MaxValue - 5; @@ -64,7 +64,7 @@ public async Task WhenMaxCountOverflow_HandlesCorrectly(bool shouldCommit) { public async Task WhenRequestBeyondExistingSet_ReturnsSuccessWithNoEvents(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadForwards(startFrom: new TFPos(200, 200), maxCount: 10); @@ -87,7 +87,7 @@ public async Task WhenRequestBeyondExistingSet_ReturnsSuccessWithNoEvents(bool s public async Task WhenRequestBeyondExistingSetWithValidationVersion_UsesValidationVersion(bool shouldCommit) { // Given var events = new[] { From("test-stream", 0, 100, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadForwards(startFrom: new TFPos(100, 100), maxCount: 10, validationTfLastCommitPosition: 7); @@ -114,7 +114,7 @@ public async Task WhenSuccessfullyReadingWithMoreEventsAvailable_ReturnsCorrectR From("test-stream", 1, 200, "TestEvent", []), From("test-stream", 2, 300, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadForwards(maxCount: 2); @@ -140,7 +140,7 @@ public async Task WhenAtEndOfStream_ReturnsCorrectResult(bool shouldCommit) { From("test-stream", 0, 100, "TestEvent", []), From("test-stream", 1, 200, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadForwards(startFrom: new TFPos(200, 200), maxCount: 10); @@ -167,7 +167,7 @@ public async Task WhenEventsExist_SetsNextEventNumberFromLastEvent(bool shouldCo From("test-stream", 1, 200, "TestEvent", []), From("test-stream", 2, 300, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadForwards(maxCount: 3); @@ -194,7 +194,7 @@ public async Task WhenOverTheEnd_ReturnsEmptyResult(bool shouldCommit) { From("test-stream", 1, 200, "TestEvent", []), From("test-stream", 2, 300, "TestEvent", []) }; - IndexEvents(events, shouldCommit); + await IndexEvents(events, shouldCommit); // When var result = await ReadForwards(startFrom: new TFPos(long.MaxValue, long.MaxValue), maxCount: 1000); @@ -241,7 +241,6 @@ private async Task ReadForwards( validationTfLastCommitPosition, user, replyOnExpired, - pool: null, longPollTimeout, expires, CancellationToken.None diff --git a/src/KurrentDB.SecondaryIndexing.Tests/IntegrationTests/ReadTests.cs b/src/KurrentDB.SecondaryIndexing.Tests/IntegrationTests/ReadTests.cs index c3b3f1f08df..5dc381b5b63 100644 --- a/src/KurrentDB.SecondaryIndexing.Tests/IntegrationTests/ReadTests.cs +++ b/src/KurrentDB.SecondaryIndexing.Tests/IntegrationTests/ReadTests.cs @@ -31,9 +31,10 @@ public async Task ReadsAllEventsFromDefaultIndex(bool forwards) { [Fact] public async Task ReadFromDefaultIndexUsingQueryEngine() { var engine = Fixture.NodeServices.GetRequiredService(); - using var preparedSql = engine.PrepareQuery( - "SELECT metadata FROM kdb.records"u8, - new() { UseDigitalSignature = true }); + using var preparedSql = await engine.PrepareQueryAsync( + "SELECT metadata FROM kdb.records"u8.ToArray(), + new() { UseDigitalSignature = true }, + TestContext.Current.CancellationToken); var consumer = new RowCountReader(); await engine.ExecuteAsync( @@ -125,7 +126,7 @@ private async Task ValidateRead(CommitMode mode, string indexName, IEventFilter var processor = Fixture.NodeServices.GetRequiredService(); switch (mode) { case CommitMode.CommitAndClear: - processor.Commit(); + await processor.CommitAsync(TestContext.Current.CancellationToken); break; } From 1e5415137b272a9d830756370497f4130daaaedd Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 15:50:16 -0400 Subject: [PATCH 12/16] fix: bridge SchemaRegistry's DuckDBConnectionPool to the executor's shared DB Task 4 removed the shared DuckDBConnectionPool DI registration, but SchemaRegistry reaches DuckDB through Kurrent.Surge.DuckDB's IDuckDBConnectionProvider, which is constructed from a DuckDBConnectionPool - so every node boot failed to resolve it (14 integration tests). The external projector framework hard-requires that provider, so a full reroute to executor.Execute is not possible. Register a DuckDBConnectionPool built from DuckDBExecutor.CreateSharedConnectionPool() (added in Kurrent.Quack): an independent, DI-owned pool that shares the executor's already-open database (same file => same in-process DuckDB instance and task scheduler). SchemaRegistry's query work therefore still runs on the executor's measured worker threads. Bumps the local Quack feed to local.2 for the new API. SecondaryIndexing.Tests 74/74 green; SchemaRegistry.Tests 113 passed + 1 skipped. Co-Authored-By: Claude Fable 5 --- src/Directory.Packages.props | 4 ++-- src/KurrentDB.Core/DuckDB/InjectionExtensions.cs | 9 +++++++++ 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/src/Directory.Packages.props b/src/Directory.Packages.props index cf508cf375f..9085bd66a97 100644 --- a/src/Directory.Packages.props +++ b/src/Directory.Packages.props @@ -68,8 +68,8 @@ - - + + diff --git a/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs b/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs index 725174e24da..361a44cc573 100644 --- a/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs +++ b/src/KurrentDB.Core/DuckDB/InjectionExtensions.cs @@ -5,6 +5,7 @@ using System.Collections.Generic; using System.Diagnostics.Metrics; using Kurrent.Quack; +using Kurrent.Quack.ConnectionPool; using KurrentDB.Core.TransactionLog.Chunks; using KurrentDB.DuckDB; using Microsoft.Extensions.DependencyInjection; @@ -23,6 +24,14 @@ public static IServiceCollection AddDuckDb(this IServiceCollection services, str sp.GetService>())); services.AddHostedService(sp => sp.GetRequiredService()); services.AddSingleton(sp => sp.GetRequiredService().Executor); + // SchemaRegistry reaches DuckDB through Kurrent.Surge.DuckDB's IDuckDBConnectionProvider, which is + // constructed from a DuckDBConnectionPool. Hand it a pool that SHARES the executor's already-open database + // (same file => same in-process DuckDB instance and task scheduler), so its query work still runs on the + // executor's measured worker threads. This is a bridge for that external framework's hard requirement, not a + // revival of the removed shared-RW / per-connection READ_ONLY pool model. DI owns and disposes this pool; + // its lifetime is independent of the executor's own internal pool. + services.AddSingleton(sp => + sp.GetRequiredService().Executor.CreateSharedConnectionPool()); // The lifetime is a hosted service, so it's instantiated at node startup; it creates the CPU-metric // instrument in its constructor (over the executor it owns), so the instrument exists even without a // metrics consumer attached. From df31346f6ef9fb004987964246d87ef8dbd0ab94 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 16:05:58 -0400 Subject: [PATCH 13/16] docs: correct UiStatsService class comment (dispatcher, not thread-pool offload) Review follow-up: the class summary still described the removed Task.Run thread-pool offload; StatsService now runs its DuckDB work on an executor dispatcher thread. Co-Authored-By: Claude Fable 5 --- src/KurrentDB/Components/Stats/UiStatsService.cs | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/KurrentDB/Components/Stats/UiStatsService.cs b/src/KurrentDB/Components/Stats/UiStatsService.cs index df727051393..120968615b1 100644 --- a/src/KurrentDB/Components/Stats/UiStatsService.cs +++ b/src/KurrentDB/Components/Stats/UiStatsService.cs @@ -17,8 +17,8 @@ namespace KurrentDB.Components.Stats; // directly and authorizes nothing. This wrapper gives Stats the same defense-in-depth every other // UI service has (via the IAuthorizationProvider.EnsureAccessAsync extension): it enforces the // configured policy (read $all — the same Operation the ViewStats page policy and the FlightSql query -// path use) before each query, so the page-level [Authorize] is no longer the only gate. The expensive, -// synchronous DuckDB work is offloaded to the thread pool so the render thread isn't blocked. +// path use) before each query, so the page-level [Authorize] is no longer the only gate. The expensive +// DuckDB work runs on an executor dispatcher thread (inside StatsService), so the render thread isn't blocked. public sealed class UiStatsService(StatsService inner, IAuthorizationProvider authorizer) { static readonly Operation ReadAllOperation = UiOperations.ReadAll; From 8ecf4c33e94aa23980fb984483b3c98ad9702b88 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 16:21:05 -0400 Subject: [PATCH 14/16] fix: address final whole-branch review findings - DuckDBExecutorLifetime: wrap the one-time-setup execution in try/catch that disposes the executor before rethrow, so a setup failure at startup can't leak the executor's worker/dispatcher threads + open DB handle (Important #2). - DefaultIndexProcessor.CommitAsync: guard on IsDisposed, not IsDisposingOrDisposed, so the dispose-time flush actually runs (DotNext sets Disposing before Dispose(bool)). Matches UserIndexProcessor; removes the shutdown re-index window. Verified safe: Dispose() drains the subscription before Dispose(bool), and _committing guards residual concurrency (Minor #4). - DuckDBCpuMetrics: note in the metric description that SchemaRegistry's own-thread DuckDB work is excluded (its parallel portions still run on worker threads) (Minor #5). - docs/server/diagnostics/metrics.md: document the new DuckDB CPU meter with the kurrentdb_proc_cpu cross-reference for computing DuckDB's CPU share (Important #3). SecondaryIndexing.Tests 74/74 green after the changes. Co-Authored-By: Claude Fable 5 --- docs/server/diagnostics/metrics.md | 8 ++++++ src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs | 4 ++- .../DuckDB/DuckDBExecutorLifetime.cs | 27 ++++++++++++------- .../Indexes/Default/DefaultIndexProcessor.cs | 5 +++- 4 files changed, 32 insertions(+), 12 deletions(-) diff --git a/docs/server/diagnostics/metrics.md b/docs/server/diagnostics/metrics.md index 0832055ae98..22484f9bcf0 100644 --- a/docs/server/diagnostics/metrics.md +++ b/docs/server/diagnostics/metrics.md @@ -393,6 +393,14 @@ kurrentdb_proc_contention_count_total 297 1688147136862 kurrentdb_gc_pause_duration_max_seconds{range="16-20 seconds"} 0.0485873 1688147136862 ``` +### DuckDB + +CPU time consumed by the dedicated DuckDB executor that owns every thread DuckDB runs on (used by secondary indexing and the query engine). Because the executor owns those threads, this is the DuckDB share of the process CPU reported by `kurrentdb_proc_cpu`: compare `rate(kurrentdb_duckdb_cpu_seconds_total)` against `kurrentdb_proc_cpu` to see the fraction of KurrentDB's CPU spent inside DuckDB. + +| Time Series | Type | Description | +|:------------|:-----|:------------| +| `kurrentdb_duckdb_cpu_seconds_total{role=}` | [Counter](#common-types) | Cumulative CPU seconds consumed by DuckDB's executor threads, split by `role`: `worker` (DuckDB's parallel task-scheduler threads) and `dispatcher` (threads that run queries, reads, and index commits). Requires per-thread CPU support (Linux/Windows); reads 0 on macOS. Excludes DuckDB work the schema registry runs on its own threads (low volume; its parallel portions still run on worker threads). | + ### Projections Projection metrics track the statistics for projections. diff --git a/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs index 611886dfad1..97b06022298 100644 --- a/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs +++ b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs @@ -18,7 +18,9 @@ public class DuckDBCpuMetrics { public DuckDBCpuMetrics(Meter meter, string serviceName, Func> sampleCpu) { meter.CreateObservableCounter($"{serviceName}.duckdb.cpu.seconds", Observe, - description: "CPU time consumed by DuckDB's executor threads, in seconds"); + description: "CPU time consumed by DuckDB's executor worker and dispatcher threads, in seconds " + + "(excludes DuckDB work SchemaRegistry runs on its own threads via a shared connection pool; " + + "that work's parallel portions still run on the executor's worker threads)."); IEnumerable> Observe() { double workers = 0, dispatchers = 0; diff --git a/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs index 55a20b98c0b..c41264fd3c4 100644 --- a/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs +++ b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs @@ -43,17 +43,24 @@ public DuckDBExecutorLifetime( var connectionString = $"Data Source={path};memory_limit={memoryMib}MB"; Executor = new DuckDBExecutor(connectionString, workerCount, dispatcherCount); - CpuMetrics = new DuckDBCpuMetrics(new Meter(DuckDBCpuMetrics.MeterName, "1.0.0"), serviceName, Executor.SampleCpu); - _log.LogInformation("DuckDB executor started at {path}: {workers} workers, {dispatchers} dispatchers, memory_limit {memory}MB", - path, workerCount, dispatcherCount, memoryMib); + try { + CpuMetrics = new DuckDBCpuMetrics(new Meter(DuckDBCpuMetrics.MeterName, "1.0.0"), serviceName, Executor.SampleCpu); + _log.LogInformation("DuckDB executor started at {path}: {workers} workers, {dispatchers} dispatchers, memory_limit {memory}MB", + path, workerCount, dispatcherCount, memoryMib); - // One-time setups (IndexingDbSchema, SchemaDbSchema) — effects are database-wide, so running - // them once on any executor-owned connection preserves today's semantics exactly. - Executor.Execute(connection => { - foreach (var setup in setups) - setup.Execute(connection); - return 0; - }, CancellationToken.None).AsTask().GetAwaiter().GetResult(); + // One-time setups (IndexingDbSchema, SchemaDbSchema) — effects are database-wide, so running + // them once on any executor-owned connection preserves today's semantics exactly. + Executor.Execute(connection => { + foreach (var setup in setups) + setup.Execute(connection); + return 0; + }, CancellationToken.None).AsTask().GetAwaiter().GetResult(); + } catch { + // A setup failure (e.g. a schema-migration error) throws before this object exists for anyone to + // Dispose(), so the executor's worker/dispatcher threads and open DB handle would leak. Dispose it. + Executor.DisposeAsync().AsTask().GetAwaiter().GetResult(); + throw; + } return; diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs index 6a02cc6b2ef..f0ee852e026 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs @@ -151,7 +151,10 @@ private static (TFPos, DateTimeOffset) ReadLastIndexedRecord(DuckDBAdvancedConne /// Commits all in-flight records to the index. /// public async ValueTask CommitAsync(CancellationToken ct) { - if (IsDisposingOrDisposed || !Interlocked.FalseToTrue(ref _committing)) + // IsDisposed (not IsDisposingOrDisposed): DotNext sets Disposing before Dispose(bool) runs, so guarding on + // IsDisposingOrDisposed would skip the dispose-time flush and drop the last partial batch. Dispose() drains + // the subscription before Dispose(bool), so no concurrent CommitAsync is in flight; _committing covers the rest. + if (IsDisposed || !Interlocked.FalseToTrue(ref _committing)) return; try { From 1404035102f3ad0909d585221a741e17ea6a6341 Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 21:13:33 -0400 Subject: [PATCH 15/16] fix: address KurrentDB Codex review (read-lock lifetime, shutdown flush, meter root, checkpoint bound) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - UserIndexEngineSubscription.ReadForwards/ReadBackwards: make async and await the DuckDB read UNDER the read lock. They now queue an async executor.Execute, so returning the ValueTask while releasing the lock let a concurrent stop/delete take the write lock and dispose the processor before the queued read ran (P2). - DefaultIndexProcessor.CommitAsync: revert the dispose-time flush (guard on IsDisposingOrDisposed, not IsDisposed). Handle(BecomeShuttingDown) disposes the processor WITHOUT draining the subscription, so a dispose-time Flush() could race an in-flight TryIndex/CreateRow. Skipping is safe — the index self-heals from its committed position on restart (P2). - DuckDBCpuMetrics: store the Meter in a field so it (and its ObservableCounter) stays rooted for the process lifetime; it was created inline and only kept alive by listener internals (P3). - DuckDBExecutorLifetime.StopAsync: bound the shutdown checkpoint with a timeout so a dispatcher pool saturated by streaming queries can't hang teardown; a missed checkpoint is not data loss (WAL replays on open). The broader dispatcher-starvation fix (pool split vs sizing) is deferred to the load/soak (P1, per decision). SecondaryIndexing.Tests 74/74. (Directory.Packages.props bumped to the reviewed Quack local.3 placeholder.) Co-Authored-By: Claude Fable 5 --- src/Directory.Packages.props | 4 ++-- src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs | 6 ++++++ .../DuckDB/DuckDBExecutorLifetime.cs | 18 +++++++++++++++-- .../Indexes/Default/DefaultIndexProcessor.cs | 10 ++++++---- .../User/UserIndexEngineSubscription.cs | 20 +++++++++++-------- 5 files changed, 42 insertions(+), 16 deletions(-) diff --git a/src/Directory.Packages.props b/src/Directory.Packages.props index 9085bd66a97..fdf2617cf86 100644 --- a/src/Directory.Packages.props +++ b/src/Directory.Packages.props @@ -68,8 +68,8 @@ - - + + diff --git a/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs index 97b06022298..dcc84fc433a 100644 --- a/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs +++ b/src/KurrentDB.Core/DuckDB/DuckDBCpuMetrics.cs @@ -16,7 +16,13 @@ namespace KurrentDB.Core.DuckDB; public class DuckDBCpuMetrics { public const string MeterName = "KurrentDB.DuckDB"; + // Held so the meter — and therefore its ObservableCounter — stays rooted for the lifetime of this object + // (owned by DuckDBExecutorLifetime, i.e. process lifetime). Without this reference the meter could be + // garbage-collected and the instrument would silently stop reporting. + private readonly Meter _meter; + public DuckDBCpuMetrics(Meter meter, string serviceName, Func> sampleCpu) { + _meter = meter; meter.CreateObservableCounter($"{serviceName}.duckdb.cpu.seconds", Observe, description: "CPU time consumed by DuckDB's executor worker and dispatcher threads, in seconds " + "(excludes DuckDB work SchemaRegistry runs on its own threads via a shared connection pool; " + diff --git a/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs index c41264fd3c4..2534ecb00cf 100644 --- a/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs +++ b/src/KurrentDB.Core/DuckDB/DuckDBExecutorLifetime.cs @@ -73,12 +73,26 @@ string GetTempPath() { public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask; + // Bound on the shutdown checkpoint: it runs on a dispatcher, and if streaming queries have saturated the + // dispatcher pool the checkpoint could otherwise block shutdown indefinitely. A missed checkpoint is not data + // loss — DuckDB replays its WAL on next open — so time out and continue rather than hang teardown. (The broader + // dispatcher-starvation question is tracked for the load/soak: size DispatcherThreads above the expected + // concurrent-streaming-query count, or split streaming onto its own dispatcher pool.) + private static readonly TimeSpan CheckpointTimeout = TimeSpan.FromSeconds(30); + public async Task StopAsync(CancellationToken cancellationToken) { _log.LogDebug("Checkpointing DuckDB"); - await Executor.Execute(connection => { + var checkpoint = Executor.Execute(connection => { connection.Checkpoint(); return 0; - }, cancellationToken); + }, cancellationToken).AsTask(); + try { + await checkpoint.WaitAsync(CheckpointTimeout, cancellationToken); + } catch (TimeoutException) { + _log.LogWarning("DuckDB shutdown checkpoint did not complete within {Timeout}; skipping it (the WAL is " + + "replayed on next open). This can happen if streaming queries have saturated the DuckDB dispatcher pool.", + CheckpointTimeout); + } } protected override void Dispose(bool disposing) { diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs index f0ee852e026..ab5f71f158a 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/Default/DefaultIndexProcessor.cs @@ -151,10 +151,12 @@ private static (TFPos, DateTimeOffset) ReadLastIndexedRecord(DuckDBAdvancedConne /// Commits all in-flight records to the index. /// public async ValueTask CommitAsync(CancellationToken ct) { - // IsDisposed (not IsDisposingOrDisposed): DotNext sets Disposing before Dispose(bool) runs, so guarding on - // IsDisposingOrDisposed would skip the dispose-time flush and drop the last partial batch. Dispose() drains - // the subscription before Dispose(bool), so no concurrent CommitAsync is in flight; _committing covers the rest. - if (IsDisposed || !Interlocked.FalseToTrue(ref _committing)) + // IsDisposingOrDisposed (NOT IsDisposed): the dispose-time flush is deliberately SKIPPED. The shutdown path + // DefaultIndexBuilder.Handle(BecomeShuttingDown) disposes this processor WITHOUT first draining the + // subscription, so a dispose-time Flush() could run concurrently with an in-flight TryIndex/CreateRow on the + // subscription thread — which the appender forbids. Skipping is safe: the default index is derived state that + // resumes from its last committed position on restart, re-indexing any unflushed tail. + if (IsDisposingOrDisposed || !Interlocked.FalseToTrue(ref _committing)) return; try { diff --git a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs index 6c7696b2ed3..504014bf258 100644 --- a/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs +++ b/src/KurrentDB.SecondaryIndexing/Indexes/User/UserIndexEngineSubscription.cs @@ -296,36 +296,40 @@ public TFPos GetLastIndexedPosition(string indexStream) { return data.Subscription.GetLastIndexedPosition(); } - public ValueTask ReadForwards(ClientMessage.ReadIndexEventsForward msg, + public async ValueTask ReadForwards(ClientMessage.ReadIndexEventsForward msg, CancellationToken token) { UserIndexHelpers.ParseQueryStreamName(msg.IndexName, out var indexName, out _); _log.LogUserIndexReceivedReadForwardsRequest(indexName); if (!TryAcquireReadLockForIndex(indexName, out var readLock, out var data)) { - var result = new ClientMessage.ReadIndexEventsForwardCompleted( + return new ClientMessage.ReadIndexEventsForwardCompleted( ReadIndexResult.IndexNotFound, [], new(msg.CommitPosition, msg.PreparePosition), -1, true, $"Index {msg.IndexName} does not exist" ); - return ValueTask.FromResult(result); } + // await UNDER the read lock: the reader now queues an async executor.Execute, so returning the ValueTask + // while releasing the lock (as the non-async form did) would let a concurrent stop/delete take the write lock + // and dispose the processor before the queued read captures its snapshot. Holding the lock across the await + // keeps the processor alive for the read. using (readLock) - return data.Reader.ReadForwards(msg, token); + return await data.Reader.ReadForwards(msg, token); } - public ValueTask ReadBackwards(ClientMessage.ReadIndexEventsBackward msg, + public async ValueTask ReadBackwards(ClientMessage.ReadIndexEventsBackward msg, CancellationToken token) { UserIndexHelpers.ParseQueryStreamName(msg.IndexName, out var indexName, out _); _log.LogUserIndexReceivedReadBackwardsRequest(indexName); if (!TryAcquireReadLockForIndex(indexName, out var readLock, out var data)) { - var result = new ClientMessage.ReadIndexEventsBackwardCompleted( + return new ClientMessage.ReadIndexEventsBackwardCompleted( ReadIndexResult.IndexNotFound, [], new(msg.CommitPosition, msg.PreparePosition), -1, true, $"Index {msg.IndexName} does not exist" ); - return ValueTask.FromResult(result); } + // await UNDER the read lock (see ReadForwards): the async executor read must not outlive the lock, or a + // concurrent stop/delete could dispose the processor before the queued read captures its snapshot. using (readLock) - return data.Reader.ReadBackwards(msg, token); + return await data.Reader.ReadBackwards(msg, token); } public bool TryGetUserIndexTableDetails(string indexName, out string tableName, out string? fieldName) { From 2ff395964b63d46822bb39294d43a1393d61b8bc Mon Sep 17 00:00:00 2001 From: realtonyyoung <6655045+realtonyyoung@users.noreply.github.com> Date: Thu, 2 Jul 2026 21:49:51 -0400 Subject: [PATCH 16/16] docs(spec): mark PR #5642 closed/superseded (review follow-up) Co-Authored-By: Claude Fable 5 --- .../specs/2026-06-26-duckdb-cpu-attribution-design.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md b/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md index 27952cfd2a4..f301e42fc32 100644 --- a/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md +++ b/docs/superpowers/specs/2026-06-26-duckdb-cpu-attribution-design.md @@ -9,7 +9,7 @@ We want operators to be able to answer: **"What fraction of this node's CPU is DuckDB vs the rest of KurrentDB?"** Process- and system-level CPU are already exported (`kurrentdb_proc_cpu`, `kurrentdb_sys_cpu`); the missing piece is the DuckDB share so the two can be compared. -PR #5642 (open, not merged) proposed a first attempt: `DuckDBCpuMetrics.Measure(activity)` returns a `ref struct` scope that reads the calling thread's CPU (`clock_gettime(CLOCK_THREAD_CPUTIME_ID)` on Linux/macOS, `GetThreadTimes` on Windows) at construction and again at `Dispose`, recording the delta into a counter. It is wrapped around the synchronous DuckDB sections of commit, checkpoint, index reads, and query setup. +PR #5642 (now closed, superseded by this design) proposed a first attempt: `DuckDBCpuMetrics.Measure(activity)` returns a `ref struct` scope that reads the calling thread's CPU (`clock_gettime(CLOCK_THREAD_CPUTIME_ID)` on Linux/macOS, `GetThreadTimes` on Windows) at construction and again at `Dispose`, recording the delta into a counter. It is wrapped around the synchronous DuckDB sections of commit, checkpoint, index reads, and query setup. Code review identified three **fundamental** flaws, all rooted in one assumption — *"measure the calling thread across a synchronous span"*: @@ -128,7 +128,7 @@ Most of the blast radius is converting a few `void`/synchronous DuckDB methods t - **Kurrent.Quack (first):** add the `DuckDBExecutor` (worker + dispatcher pools), the native task-scheduler interop (`duckdb_create_task_state` / `execute_tasks_state` / `finish_execution` / `destroy_task_state`), the cross-thread per-OS CPU read, and the per-thread CPU enumeration. Ship as a new Quack version. - **KurrentDB (second):** consume the new Quack version; migrate the DuckDB call sites (§6) to the executor; register the `kurrentdb.duckdb.cpu.seconds` observable counter and add the `KurrentDB.DuckDB` meter to `metricsconfig.json`; document the metric in `docs/server/diagnostics/metrics.md`. -- **PR #5642:** **remove** the caller-side CPU measurement (`DuckDBCpuMetrics`, `ThreadCpuTime`, the scope instrumentation and its tests). Per the review, we will not merge a metric whose headline value is wrong for parallel work. The `KurrentDB.DuckDB` meter name and the docs scaffolding may be retained as the landing point for this design. (If #5642 has other unrelated value it can keep it; otherwise it closes in favor of this work.) +- **PR #5642:** **remove** the caller-side CPU measurement (`DuckDBCpuMetrics`, `ThreadCpuTime`, the scope instrumentation and its tests). Per the review, we will not merge a metric whose headline value is wrong for parallel work. The `KurrentDB.DuckDB` meter name and the docs scaffolding are retained as the landing point for this design (re-created fresh on this PR). #5642 has been closed in favor of this work. ## 11. Risks & open questions