Sync by Swiftyos · Pull Request #11 · Swiftyos/AgentProbe

Swiftyos · 2026-05-05T16:48:30Z

No description provided.

…Phase 0 Turns the approved AgentProbe server design (docs/design-docs/agent-probe-server.md) into a binding product contract before any runtime code lands, per the Phase 0 exec plan (docs/exec-plans/active/agent-probe-server-phase-0-contract-2026-04.md). - platform.md: add "Server control plane" scenario group with 9 scenarios covering default boot, exposure safety, read-only HTTP/UI history, live SSE, run control, cancellation, presets, comparisons, and Docker SQLite persistence. - current-state.md: mirror all 9 new scenarios as unchecked (planned) and refresh "Last validated against" to 2026-04-17. - e2e-checklist.md: add a planned test-owner row per scenario covering tests/e2e/server-e2e.test.ts, tests/integration/server/, tests/unit/server/, and dashboard component tests. scripts/check-behaviour-docs.ts reports zero drift across all 24 scenarios. No runtime files modified; PR targets dev. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs(product-specs): Server control plane contract (SYM-20 Phase 0)

Summary: - Add start-server config, auth, routing, static dashboard, SQLite read routes, suite discovery, report rendering, and SSE replay support. - Add a dual-mode dashboard that preserves /api/state live polling and adds read-only server views for overview, runs, scenarios, suites, and settings. - Add unit, integration, and e2e coverage for server config, auth, streams, HTTP read paths, token protection, and CLI lifecycle. Rationale: - Phase 1 needs a stable read-only HTTP surface before later write paths add run control, presets, cancellation, and other orchestration. - Binding safety and token auth are enforced at boot and request edges so non-loopback exposure cannot accidentally start unauthenticated. Tests: - bun run docs:validate - bun run test tests/unit/server - bun run test tests/integration/server - bun run test:e2e - bun run dashboard:build - bun run typecheck - bun run lint - manual start-server smoke with /healthz, /readyz, /api/scenarios, and /api/runs Co-authored-by: Codex <codex@openai.com>

Add read-only AgentProbe server

Summary: - Add authenticated server write routes for starting and cancelling runs, preset CRUD, preset launch history, and frozen preset snapshots. - Extend SQLite run history to schema v4 with preset tables, server-run metadata, cancellation timestamps, and WAL-enabled server connections. - Teach runSuite to accept prepared file/id scenario selections and a cooperative AbortSignal while preserving existing CLI and dashboard mode. - Add dashboard start/preset/cancel views plus Docker and Compose packaging with a token-protected SQLite volume deployment path. Rationale: - Phase 2 needs an operator workflow that can configure a run, save it as a reusable preset, launch it from the server UI, and observe/cancel progress. - Scenario references now use file plus id so duplicate ids across scenario files remain deterministic, and write paths validate boundaries at data-root and bearer-auth edges. Tests: - bun run docs:validate - bun run test tests/unit/server - bun run test tests/integration/server - bun run test:e2e - bun run dashboard:build - bun run typecheck - bun run fast-feedback - COMPOSE_PROGRESS=plain AGENTPROBE_SERVER_TOKEN=... OPEN_ROUTER_API_KEY=... docker compose -p agentprobe-sym22 up --build --detach - curl -fsS http://127.0.0.1:7878/healthz - curl -fsS -X POST http://127.0.0.1:7878/api/runs ... dry-run payload - OPEN_ROUTER_API_KEY=... docker compose -p agentprobe-sym22-missing config exited 1 for missing AGENTPROBE_SERVER_TOKEN Co-authored-by: Codex <codex@openai.com>

AgentProbe server run control and presets

…SYM-23)

Phase 3 of the AgentProbe server: ship historical-run comparison as an API and dashboard workspace, and introduce the persistence abstraction needed to back the server with Postgres behind `AGENTPROBE_DB_URL`.

- Persistence contract: new PersistenceRepository interface, URL parser / redactor, migration dispatcher (per-backend versioned) and a backend factory that returns SqliteRepository or PostgresRepository. SQLite free-function exports stay as compat wrappers; all new consumers go through the interface.
- Postgres backend: full DDL with jsonb columns and indexes for server filters + comparison lookups, migration runner, boot-time schema-version check that refuses to start when behind, preset/run/comparison reads, and preset CRUD. Run recorder (writes) is deferred to SYM-25.
- CLI `agentprobe db:migrate`: accepts --db or AGENTPROBE_DB_URL, prints backend / current / target / applied, fails clearly on unsupported schemes.
- Server config + /readyz + new /api/session expose the backend kind and a redacted db_url. Postgres URLs are now accepted via --db and env.
- Comparison controller: loads 2–10 runs, chooses alignment (preset snapshot → preset id → scenario id → file::id), emits runs, scenarios with per-run status/score/reason, delta_score, status_change, present_in, and summary buckets. GET /api/comparisons rejects out-of-range counts with the common error envelope and sets cache-control: no-store.
- Dashboard /compare workspace: ad-hoc multi-run picker, sticky summary bar, per-run columns, missing-scenario rows, "only changes" toggle, ?run_ids=…&only=changes deep links, and a "Compare last two runs" CTA on the preset detail view.
- Docker Compose example documents Postgres as an opt-in service; SQLite on a named volume remains the default.
- Playbook expansion: Postgres setup/migration/rollback/backup, comparison semantics, connection errors, and duplicate-scenario-id behaviour.
- Tests: unit coverage for url redaction + parse, migration runner (fresh, idempotent, upgrade-in-place, unsupported schemes), factory dispatch, and the comparison controller (alignment, delta/status_change, duplicate ids, range enforcement, 404 routing). Integration coverage starts a real server against seeded SQLite runs and exercises the /api/comparisons payload.

Follow-up SYM-25 tracks the buffered async Postgres recorder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

AgentProbe Server Phase 3: comparison API, /compare dashboard, Postgres scaffold (SYM-23)

## Intent Fix SYM-33 by making the Postgres run-recording limitation visible in the repository type system and by failing fast when the write-enabled server is configured with Postgres. ## Behavior changes - `PostgresRepository` no longer exposes `createRecorder()`; only `RecordingRepository` implementations can create run recorders. - `agentprobe start-server` rejects Postgres URLs while run write routes such as `POST /api/runs` are enabled, with a clear SQLite guidance error before schema probing or first write traffic. - Server docs now state the current Postgres posture: migrations, preset CRUD, and historical reads are supported; run recording remains SQLite-only. ## Validation - [x] `./scripts/fast-feedback.sh` passed - [x] Behavior docs updated (if behavior changed) - [x] `bun test --timeout 20000 tests/e2e/start-server.e2e.test.ts` passed - [x] `bun test --timeout 20000 tests/e2e/cli.e2e.test.ts` passed - [x] `git diff --check` passed ## Screenshots / video N/A for CLI-only changes.

## Intent Fix the inline dashboard run-detail renderer so malformed scenario ordinals cannot break out of the scenario link href or body text. This is defense-in-depth for run data consumed from `/api/runs/:id`. ## Behavior changes No intended user-visible behavior changes for valid run data. The inline dashboard now escapes malformed ordinal/count/score values before inserting them into generated HTML. ## Validation - [x] `./scripts/fast-feedback.sh` passed - [x] `bun test tests/unit/server/inline-dashboard.test.ts` passed - [x] `bun test --timeout 15000` passed - [x] Behavior docs updated (if behavior changed; N/A for this security hardening)

## Intent

Fix the Postgres […] 2N+1 query pattern by batching related scenario-selection and latest-run reads by […].

## Behavior changes

No user-visible behavior changes. Preset listing keeps the same ordering and returned fields while reducing Postgres round trips from 2N+1 to a constant query count.

## Validation

- [x] […] bun test v1.3.12 (700fc117)
- [x] […] bun test v1.3.12 (700fc117)
- [x] […] error TS5025: Unknown compiler option '--noEmit\'. Did you mean 'noEmit'?
- [x] […]
- [x] […]
- [x] Behavior docs updated (if behavior changed): no behavior docs needed for this non-user-visible persistence optimization

## Screenshots / video

N/A for persistence-only changes.

## Intent

Fix SYM-31 by adding explicit CORS handling for the AgentProbe control-plane API. The server now answers `/api/*` OPTIONS preflights centrally, attaches CORS response headers for allowed origins, and fails closed when operators expose the server externally without an explicit origin allow-list.

## Behavior changes

- `/api/*` OPTIONS preflights return `204` for allow-listed origins with allow-methods, allow-headers, allow-credentials, and max-age headers.
- `/api/*` OPTIONS preflights return `403` for unlisted origins.
- Non-preflight `/api/*` responses echo `Access-Control-Allow-Origin` for allowed origins.
- `AGENTPROBE_SERVER_CORS_ORIGINS` configures comma-separated exact `http://` or `https://` origins.
- `--unsafe-expose` / `AGENTPROBE_SERVER_UNSAFE_EXPOSE=true` now requires `AGENTPROBE_SERVER_CORS_ORIGINS` in addition to an auth token.

## Validation

- [x] `./scripts/fast-feedback.sh` passed on merge head `3774c52`
- [x] Behavior docs updated (if behavior changed)
- `bun test tests/unit/server/config.test.ts tests/integration/server/read-only.test.ts`
- `bun test tests/unit/server tests/integration/server`
- `bun run docs:validate`
- `bun test --timeout 15000` (pre-merge validation on original CORS head)

## Screenshots / video

N/A for CLI/server changes.
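The preflight rules described above can be sketched as a pair of pure functions. This is a minimal illustration, not the code from the PR: the function names, the header values (allowed methods, allowed headers, max-age), and the plain-object return shape are all assumptions made for the example.

```typescript
// Hypothetical helper: parse a comma-separated list of exact origins,
// accepting only http:// or https:// entries with no path or trailing slash.
export function parseCorsOrigins(raw: string | undefined): Set<string> {
  const origins = new Set<string>();
  for (const entry of (raw ?? "").split(",")) {
    const candidate = entry.trim();
    if (candidate === "") continue;
    const url = new URL(candidate); // throws on malformed input
    const isHttp = url.protocol === "http:" || url.protocol === "https:";
    if (!isHttp || url.origin !== candidate) {
      throw new Error(`invalid CORS origin: ${candidate}`);
    }
    origins.add(candidate);
  }
  return origins;
}

export interface PreflightResult {
  status: number;
  headers: Record<string, string>;
}

// Returns null when the request is not an /api/* preflight, so the caller
// can continue with normal routing.
export function answerPreflight(
  method: string,
  pathname: string,
  origin: string | null,
  allowed: ReadonlySet<string>,
): PreflightResult | null {
  if (method !== "OPTIONS" || !pathname.startsWith("/api/")) return null;
  if (origin === null || !allowed.has(origin)) {
    return { status: 403, headers: {} }; // fail closed for unlisted origins
  }
  return {
    status: 204,
    headers: {
      "access-control-allow-origin": origin,
      // The values below are illustrative assumptions.
      "access-control-allow-methods": "GET, POST, PUT, PATCH, DELETE, OPTIONS",
      "access-control-allow-headers": "authorization, content-type",
      "access-control-allow-credentials": "true",
      "access-control-max-age": "600",
      vary: "origin",
    },
  };
}
```

Keeping the decision in a pure function makes the 204/403 split easy to unit test without starting a server.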

Add an `agentprobe` Docker Compose healthcheck so Compose can distinguish process start from server readiness. The probe calls `/readyz` from inside the container, and the server playbook documents the command plus failure debugging steps. Validation: - docker compose config - docker compose up --build -d agentprobe; waited for healthy; curl /readyz; docker compose down -v - bun run docs:validate - bun test tests/integration/server/read-only.test.ts tests/unit/server/auth.test.ts tests/unit/server/config.test.ts - git diff --check - bun run fast-feedback

## Intent

Fix SYM-28 by removing the length-mismatch branch from bearer-token comparison. `constantTimeEquals` now pads both UTF-8 byte arrays to a fixed width, performs one timing-safe comparison, and only then gates the return value on byte-length equality and compare-size bounds.

## Behavior changes

No user-visible behavior changes. Valid configured bearer tokens are still accepted, invalid tokens are still rejected, and API auth coverage is unchanged. The internal comparison path no longer has a distinct length-mismatch branch.

## Validation

- [x] `./scripts/fast-feedback.sh` passed
- [x] Behavior docs updated (if behavior changed)
  - Not applicable; this is an internal security hardening with unchanged public behavior.
- [x] `bun test tests/unit/server/auth.test.ts`
- [x] `git diff --check`
- [x] `bun test --timeout 20000 tests/e2e/start-server.e2e.test.ts`
- [x] `bun test --timeout 20000 tests/e2e/cli.e2e.test.ts --test-name-pattern "run records|openclaw commands"`

## Screenshots / video

N/A for CLI-only changes.
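A minimal sketch of the padded comparison described above, assuming a fixed compare width of 512 bytes (the real width is not stated in the PR) and using `timingSafeEqual` from `node:crypto`. The body is an illustration of the technique, not the repository's implementation.

```typescript
import { timingSafeEqual } from "node:crypto";

// Assumed fixed compare width; the actual bound used by the server is not given.
const MAX_COMPARE_BYTES = 512;

export function constantTimeEquals(a: string, b: string): boolean {
  const encoder = new TextEncoder();
  const aBytes = encoder.encode(a);
  const bBytes = encoder.encode(b);

  // Pad (or truncate) both inputs to the same fixed width so the
  // timing-safe comparison always processes the same number of bytes.
  const paddedA = new Uint8Array(MAX_COMPARE_BYTES);
  const paddedB = new Uint8Array(MAX_COMPARE_BYTES);
  paddedA.set(aBytes.subarray(0, MAX_COMPARE_BYTES));
  paddedB.set(bBytes.subarray(0, MAX_COMPARE_BYTES));

  // One comparison, performed regardless of the input lengths.
  const sameBytes = timingSafeEqual(paddedA, paddedB);

  // Only after comparing, gate on length equality and the size bound.
  const sameLength = aBytes.length === bBytes.length;
  const withinBounds =
    aBytes.length <= MAX_COMPARE_BYTES && bBytes.length <= MAX_COMPARE_BYTES;

  return sameBytes && sameLength && withinBounds;
}
```

The length check still matters: without it, `"abc"` and `"abc\0"` would pad to identical buffers and compare equal.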

## Intent

Slim the Docker runtime image for SYM-30 by replacing the runtime-stage full-tree copy with a production dependency install and explicit runtime asset copies. This removes tests, docs, scripts, agent metadata, and dev dependencies from the final image while preserving the Bun TypeScript server entrypoint, runtime data, and built dashboard bundle.

## Behavior changes

No CLI or API behavior changes. The published Docker image contents are narrower:

- Runtime stage now runs `bun install --production --frozen-lockfile`.
- Final image copies only `src`, `data`, and `dashboard/dist` from the build stage.
- `.dockerignore` excludes docs, scripts, tests, and agent metadata from the build context.

Image evidence:

- Before: `126398737` bytes; `/app` was `191M`.
- After: `71911094` bytes; `/app` is `15M`.
- Reduction: `54487643` bytes, about 43.1%.
- Final image inspection confirms `tests`, `dashboard/src`, `docs`, `scripts`, `.git`, `.agents`, `.claude`, `node_modules/@biomejs`, `node_modules/typescript`, `node_modules/bun-types`, and `node_modules/@types` are absent.

## Validation

- [ ] `./scripts/fast-feedback.sh` passed
  - Not used as the final authority because its `bun run test` step hits existing 5s e2e timeout failures in this workspace; filed SYM-40.
- [x] Behavior docs updated (if behavior changed)
  - No product behavior docs changed; Docker packaging only.

Validation run:

- `docker build -t agentprobe:sym-30-before .` on the original `origin/dev` Dockerfile for reproduction.
- `docker run --rm --entrypoint sh agentprobe:sym-30-before -c '...'` confirmed broad runtime tree and dev deps present.
- `docker build -t agentprobe:sym-30-after .`
- `docker run --rm --entrypoint sh agentprobe:sym-30-after -c '...'` confirmed excluded paths/dev deps absent and runtime paths present.
- `bun run docs:validate` passed.
- `bun run test` failed only on existing 5s e2e timeouts; see SYM-40.
- `bun run test:e2e` reproduced the same 5s e2e timeout failures; see SYM-40.
- `bun test --timeout 30000 tests/e2e` passed: 20 pass, 0 fail.
- `bun run test tests/integration/server` passed: 8 pass, 0 fail.
- `bun run lint` passed.
- `bun run typecheck` passed.
- `bun run dashboard:build` passed.
- `AGENTPROBE_SERVER_TOKEN=sym30-test-token OPEN_ROUTER_API_KEY=dummy-key docker compose -p sym30 up --build -d` passed.
- Compose smoke: `/healthz` returned 200, authenticated `/api/presets` returned 200, authenticated dry-run `/api/runs` completed with `passed: true` and `exitCode: 0`.
- `docker compose -p sym30 down -v` cleaned up the stack.

## Screenshots / video

N/A for CLI-only changes.

## Intent Fix SYM-27 by making database URL redaction handle passwords that contain reserved characters such as `@`, `:`, `/`, and `%`, then apply redaction consistently to operator-visible output. ## Behavior changes Database URL passwords are now redacted using URL parsing for all schemes with userinfo credentials. Config errors, SQLite unsupported-URL errors, health/readiness output, migration output, and the start-server banner avoid exposing raw configured passwords. The branch has also been synced with current `dev`; conflict resolution preserved the `dev` Postgres write-mode restriction, CORS/readiness updates, and Postgres batching tests while keeping the SYM-27 redaction behavior. ## Validation - [ ] `./scripts/fast-feedback.sh` passed locally - Latest landing run passed repo validation, lint, and typecheck, then hit known local 5s e2e timeout cases tracked separately as SYM-41: `agentprobe start-server > boots without OPEN_ROUTER_API_KEY and shuts down on SIGTERM` and `bun e2e baseline for the typescript cli > run records the suite in sqlite and report renders both explicit and discovered outputs`. - [x] GitHub CI passed for commit `987af8fb5`. - [x] Behavior docs updated (if behavior changed) - [x] `bun test tests/unit/persistence/url.test.ts tests/unit/server/config.test.ts` - [x] `bun run lint` via fast-feedback - [x] `bun run docs:validate` via fast-feedback - [x] `bunx tsc --noEmit && bun run --cwd dashboard typecheck` via fast-feedback ## Screenshots / video N/A for CLI-only changes.
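As a rough illustration of parser-based redaction, the sketch below uses the standard WHATWG `URL` class rather than string splitting. The function name, the `***` placeholder, and the fallback string for unparseable input are assumptions for the example; the repository's redactor may behave differently, in particular for passwords containing an unencoded `/`, which the `URL` parser does not accept.

```typescript
const REDACTED = "***"; // assumed placeholder

// Hypothetical redactor: replace the password component of any URL that
// carries userinfo credentials, and never echo input that fails to parse.
export function redactDbUrl(raw: string): string {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    // Failing closed avoids leaking a password embedded in a malformed URL.
    return "[unparseable database URL]";
  }
  if (parsed.password === "") return raw; // nothing to hide
  parsed.password = REDACTED;
  return parsed.toString();
}
```

Because the parser decides where the userinfo ends, a password with a percent-encoded `@` or `/` is replaced as a whole instead of being cut at the first reserved character.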

## Summary - `bun run test`, `bun run test:coverage`, and `bun run test:e2e` now pass `--timeout 30000` so the CLI-spawning e2e cases don't trip Bun's 5s default once `bunfig.toml` enables coverage instrumentation. - Documented the choice and intent in `docs/HARNESS.md`. ## Test plan - [x] `bun run test` (88 pass) - [x] `bun run test:e2e` (19 pass) - [x] `bun run docs:validate` Linear: https://linear.app/autogpt/issue/SYM-40 🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Intent Fix SYM-34 by making server-managed run executor failures observable outside SSE subscribers. The controller now writes a structured `run_executor` stderr line, persists the failure on the run record when a run ID exists, and still publishes the terminal `run_error` stream event. ## Behavior changes Failed `agentprobe start-server` runs that reach the executor catch path now remain visible through `/api/runs/:runId` after the SSE session closes, with `finalError` populated for the historical run detail. Operators also get a JSON stderr log line with the run ID, error type, message, and stack. ## Validation - [x] `./scripts/fast-feedback.sh` passed - [x] Behavior docs updated (if behavior changed) - [x] `bun test tests/integration/server/write-control.test.ts` passed - [x] `bun run typecheck` passed - [x] `bun run lint` passed ## Screenshots / video N/A for CLI/server behavior changes.

Summary: - Add Happy DOM CompareView unit tests for the only-changes filter, picker apply state, empty aligned rows, and null versus zero score formatting. - Extend comparison integration and controller tests for three-run file/id alignment, empty comparison rows, malformed run IDs, duplicate run IDs, and structured bad_request responses. - Reject malformed or duplicate compare run IDs before repository lookup, and document the run UUID validation contract. Rationale: - SYM-37 called out compare coverage gaps across both the dashboard UI and /api/comparisons endpoint behavior. - Duplicate run IDs previously deduped silently, which hid bad input and made the endpoint validation contract weaker than the ticket requires. Tests: - bun test tests/unit/dashboard/compare-view.test.tsx - bun test tests/integration/server/comparisons.test.ts - bun test tests/unit/server/comparison.test.ts - bun run docs:validate - bun run lint - bun run typecheck - bun run fast-feedback Co-authored-by: Symphony Agent <swifty@symphony.ai> Co-authored-by: Codex <codex@openai.com>
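The validation contract described above (2 to 10 run IDs, each a well-formed UUID, no duplicates, all checked before any repository lookup) could look roughly like this. The function name, the comma separator, the case-insensitive duplicate check, and the result shape are assumptions for illustration.

```typescript
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

export type ParseResult =
  | { ok: true; runIds: string[] }
  | { ok: false; code: "bad_request"; message: string };

// Hypothetical parser for a comma-separated run_ids query value.
export function parseCompareRunIds(raw: string): ParseResult {
  const ids = raw
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);

  if (ids.length < 2 || ids.length > 10) {
    return { ok: false, code: "bad_request", message: "expected 2 to 10 run ids" };
  }

  const seen = new Set<string>();
  for (const id of ids) {
    if (!UUID_PATTERN.test(id)) {
      return { ok: false, code: "bad_request", message: `malformed run id: ${id}` };
    }
    const key = id.toLowerCase();
    if (seen.has(key)) {
      // Reject instead of silently deduplicating, so bad input stays visible.
      return { ok: false, code: "bad_request", message: `duplicate run id: ${id}` };
    }
    seen.add(key);
  }
  return { ok: true, runIds: ids };
}
```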

## Intent Remove server-layer imports of concrete SQLite persistence helpers by routing controllers and routes through typed persistence repository interfaces. This addresses SYM-35 and keeps backend-specific initialization inside persistence implementations. ## Behavior changes No behavior changes. The server still uses SQLite for write-enabled start-server mode, while read/preset repository methods remain available through the typed SQLite and Postgres backends. ## Validation - [x] `./scripts/fast-feedback.sh` passed - [x] Behavior docs updated (if behavior changed): no behavior changes Additional evidence: - `rtk rg -n "providers/persistence/(sqlite|postgres)-|sqlite-run-history|sqlite-connection|postgres-backend|SqliteRunRecorder" src/runtime/server` returns no matches. - `rtk bun run typecheck` passed. - `rtk bun run test` passed: 139 tests before latest dev merge; fast feedback passed after the final `origin/dev` merge with 152 tests. - GitHub CI passed on head `73a110c` after resolving the merge conflict against `dev`. ## Screenshots / video N/A for CLI-only changes.

Summary: - Add the repo-local update-harness skill and Claude skill symlink. - Add a Bun-owned ci command and route GitHub CI through it. - Tighten generated-doc freshness and workspace inventory generation. Rationale: - Keeps the harness upgrade workflow available in-repo for future agents. - Gives local and hosted CI one shared command instead of duplicated YAML. - Makes generated inventory depend on tracked files and catches real drift. Tests: - bun install --frozen-lockfile - bun run ci Co-authored-by: Codex <codex@openai.com>

Summary: - Make run recording async and add a Postgres recorder for full run lifecycle writes. - Add Postgres storage for encrypted settings and endpoint overrides, including schema version 3 migrations. - Reuse a long-lived Bun SQL client per repository and close it during server shutdown. - Allow start-server to boot with postgres URLs and document the required migration and encryption-key workflow. - Add env-gated Postgres recorder, secret, and migration tests. Rationale: - Production server deploys need durable networked persistence while preserving SQLite as the local default. - Postgres writes are async and scenario ids come from BIGSERIAL, so the recorder contract now reflects the backend reality. - Secrets remain encrypted app-side; Postgres deployments require an explicit AGENTPROBE_ENCRYPTION_KEY instead of a local sidecar file. Tests: - bun run lint - bunx tsc --noEmit - bun run test - bun run docs:validate - bun run ci Co-authored-by: Codex <codex@openai.com>

Summary: - Add the new persistence docs, Postgres recorder, and Postgres tests to the generated workspace inventory. Rationale: - The files were added to the previous commit, but the inventory was generated before they were tracked, causing fast-feedback and CI to fail the generated-doc freshness check. Tests: - ./scripts/fast-feedback.sh Co-authored-by: Codex <codex@openai.com>

…31) ## Summary - `assertProcessCompletes` now throws a descriptive error when the per-test watchdog fires, instead of silently returning the SIGTERM exit code (143). This prevents a timed-out child from masking the real failure with a downstream assertion mismatch. - Adds a 2s SIGKILL escalation after SIGTERM so cleanup is deterministic even if the CLI ignores SIGTERM. ## Why SYM-39 third acceptance criterion: test process cleanup must be deterministic and a timed-out child must not mask the real failure cause. SYM-40 already bumped the per-test timeout to 30s so the primary timeout flakiness is gone; this change makes the remaining timeout path self-explanatory. ## Test plan - [x] bun run fast-feedback (88 pass, 0 fail) Linear: https://linear.app/autogpt/issue/SYM-39

* Added deployment workflow and helm charts * updated workspace docs * fix lint issues

Move Perf/PerfTracker into src/shared/observability with an AsyncLocalStorage so persistence and route layers can record spans without plumbing the tracker through every call site. Wire withPerf into the response-budget middleware so the breakdown logged on budget breaches now names the actual culprit (per-table queries in loadScenarioRecords, repo.getRun, JSON serialization) instead of showing 'unaccounted'. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
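The idea of carrying a tracker through `AsyncLocalStorage` instead of threading it through every call can be sketched as follows. `PerfTracker` and `withPerf` are named in the commit message, but their shapes here are guesses, and the `span` helper is hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

export interface Span {
  name: string;
  ms: number;
}

// Assumed minimal tracker: collects named timings for one request.
export class PerfTracker {
  readonly spans: Span[] = [];

  record(name: string, ms: number): void {
    this.spans.push({ name, ms });
  }
}

const storage = new AsyncLocalStorage<PerfTracker>();

// Run fn with the tracker available to everything it calls.
export function withPerf<T>(tracker: PerfTracker, fn: () => T): T {
  return storage.run(tracker, fn);
}

// Hypothetical helper: time fn and record it on the ambient tracker, if any.
export function span<T>(name: string, fn: () => T): T {
  const tracker = storage.getStore();
  const start = performance.now();
  try {
    return fn();
  } finally {
    tracker?.record(name, performance.now() - start);
  }
}
```

A persistence function can then call `span("repo.getRun", ...)` without knowing whether a request-level tracker exists; outside `withPerf` the timing is simply dropped.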

The run-detail polling endpoint loaded all scenario children (turns, target events, tool calls, checkpoints, judge dimension scores) even though the dashboard overview only renders per-scenario summary fields. The per-scenario route loaded the same full payload then kept one ordinal. Add GetRunOptions { summary?, ordinal? } and have the routes request only what each page needs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
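A small in-memory sketch of how `GetRunOptions { summary?, ordinal? }` might narrow what is loaded. The record fields and the `selectScenarios` helper are assumptions for illustration; the real reader works against the database rather than an array.

```typescript
// Assumed record shape: summary fields plus one heavy child collection.
export interface ScenarioRecord {
  ordinal: number;
  scenarioId: string;
  status: string;
  score: number | null;
  turns: string[];
}

export type ScenarioSummary = Omit<ScenarioRecord, "turns">;

export interface GetRunOptions {
  summary?: boolean; // drop child collections such as turns
  ordinal?: number; // keep only the scenario with this ordinal
}

export function selectScenarios(
  all: readonly ScenarioRecord[],
  options: GetRunOptions = {},
): Array<ScenarioRecord | ScenarioSummary> {
  const picked =
    options.ordinal === undefined
      ? [...all]
      : all.filter((scenario) => scenario.ordinal === options.ordinal);
  if (!options.summary) return picked;
  // Strip the heavy children so the overview only carries summary fields.
  return picked.map(({ turns: _turns, ...summary }) => summary);
}
```

The overview route would pass `{ summary: true }` and the per-scenario route `{ ordinal }`, so neither pays for data it does not render.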

The HTTP run-detail and per-scenario routes always strip scenarioSnapshot/personaSnapshot/rubricSnapshot/expectations/tags before responding, but the postgres reader was loading those wide JSONB columns anyway via select *. Add SCENARIO_RUN_HTTP_COLUMNS and use it whenever getRun is invoked with summary or a specific ordinal — internal callers that need the snapshots still receive them. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c6aa0226dd


chatgpt-codex-connector · 2026-05-05T16:55:50Z

+    pathname.startsWith("/runs") ||
+    pathname.startsWith("/suites") ||
+    pathname.startsWith("/presets") ||
+    pathname === "/start" ||
+    pathname === "/compare"


Add /endpoints to dashboard SPA fallback paths

The static-route allowlist for SPA fallback omits "/endpoints", even though the dashboard router and nav include that page (dashboard/src/App.tsx routes to /endpoints). When serving dashboardDist, a direct browser load or refresh on /endpoints will return 404 instead of index.html, breaking that view outside in-app navigation.
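One way to apply the suggestion is to add the missing path to the predicate that decides whether a request falls back to `index.html`. The function name is hypothetical, only the conditions visible in the quoted fragment are reproduced, and treating `/endpoints` as an exact match is an assumption.

```typescript
// Hypothetical predicate wrapping the quoted conditions, with /endpoints added.
export function isDashboardRoute(pathname: string): boolean {
  return (
    pathname.startsWith("/runs") ||
    pathname.startsWith("/suites") ||
    pathname.startsWith("/presets") ||
    pathname === "/start" ||
    pathname === "/compare" ||
    pathname === "/endpoints" // previously missing: a direct load returned 404
  );
}
```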


chatgpt-codex-connector · 2026-05-05T16:55:50Z

+        const snapshot = context.streamHub.publish({
+          runId,
+          kind: "snapshot",
+          payload: snapshotPayloadForRun(historicalRun),
+        });


Stop mutating the stream buffer when sending snapshots

When there is no replay buffer but a historical run exists, this code calls streamHub.publish to build a one-off snapshot response. publish appends into the per-run ring and creates a run buffer if missing, so every SSE read of an old run adds retained events/state in memory rather than just serializing a transient payload. This can steadily grow StreamHub state for viewed historical runs because those buffers are never cleared in the normal request path.
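The distinction the review draws can be shown with a toy hub: `publish` retains events per run, while a separate builder produces a one-off snapshot event with no retention. This `StreamHub` is a simplified stand-in written for the example, and `transientSnapshot` is a hypothetical helper, not an API from the repository.

```typescript
export interface StreamEvent {
  runId: string;
  kind: string;
  payload: unknown;
}

// Toy stand-in for the real hub: keeps a bounded list of events per run.
export class StreamHub {
  private readonly buffers = new Map<string, StreamEvent[]>();
  private readonly capacity = 100;

  publish(event: StreamEvent): StreamEvent {
    const ring = this.buffers.get(event.runId) ?? [];
    ring.push(event);
    if (ring.length > this.capacity) ring.shift();
    this.buffers.set(event.runId, ring); // creates a buffer if missing
    return event;
  }

  bufferedRunCount(): number {
    return this.buffers.size;
  }
}

// Build a snapshot event for a historical run without touching hub state.
export function transientSnapshot(runId: string, payload: unknown): StreamEvent {
  return { runId, kind: "snapshot", payload };
}
```

With this split, an SSE read of an old run serializes the transient event and leaves the hub's retained state unchanged.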


* update rubric

* add human scoring feature with rubric correlation tracking

Adds an end-to-end "Score" surface for human review of completed runs: a new persisted human_dimension_scores table mirroring judge_dimension_scores, HTTP routes that drain an unscored backlog one chat at a time, and a React dashboard view with rubric/objective/tool-call sidebars and Pearson-correlation pills against the LLM judge scores.

Replaces the legacy inline dashboard with the built React bundle as the only frontend, and adds a one-shot seed-test-scores script for retargeting old data onto the new product rubric.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix postgres-backend test mock for sql.unsafe column fragments

The PostgresRepository.listPresets path uses `sql.unsafe(RUN_SUMMARY_COLUMNS)` inline inside a tagged template to interpolate the column list. JS evaluates that call eagerly before the tagged template runs, so the mock's `sql.unsafe` was being invoked with just the column list and throwing because the text did not match any "from <table>" branch. Make `sql.unsafe` return an inert empty result for fragment-style calls instead of throwing; the parent template still records the real query string so the existing query-count assertions hold.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refresh generated docs (quality score + workspace inventory)

Re-run of `docs:quality` and `docs:workspace` after the test fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
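For reference, the Pearson correlation between paired human and judge scores can be computed as below. This is the textbook formula, not code taken from the dashboard; the function name and the choice to return `null` for undefined cases (fewer than two pairs, or zero variance on either side) are assumptions.

```typescript
// Pearson correlation coefficient for two equally long score series.
export function pearson(
  xs: readonly number[],
  ys: readonly number[],
): number | null {
  if (xs.length !== ys.length || xs.length < 2) return null;

  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;

  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  xs.forEach((x, i) => {
    const dx = x - meanX;
    const dy = (ys[i] as number) - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  });

  // Correlation is undefined when either series is constant.
  if (varianceX === 0 || varianceY === 0) return null;
  return covariance / Math.sqrt(varianceX * varianceY);
}
```

A value near 1 means the human reviewer and the LLM judge rank a rubric dimension the same way; a value near 0 means their scores are unrelated.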

Swiftyos and others added 30 commits April 17, 2026 12:18

added exec plans for server

b68b8db

Merge pull request #14 from Significant-Gravitas/symphony/SYM-20

c063242

docs(product-specs): Server control plane contract (SYM-20 Phase 0)

Merge pull request #15 from Significant-Gravitas/symphony/SYM-21

2cd96bd

Add read-only AgentProbe server

Merge pull request #16 from Significant-Gravitas/symphony/SYM-22

3839ed0

AgentProbe server run control and presets

Merge pull request #17 from Significant-Gravitas/symphony/SYM-23

e3fcc5f

AgentProbe Server Phase 3: comparison API, /compare dashboard, Postgres scaffold (SYM-23)

fix(dashboard): stabilize run event stream

5f4a0a0

seed presets

ffc3daf

Merge branch 'main' into dev

393c328

refersh docs

9700122

update ui

355bfbe

update dashbaord for comparisons

94469f5

Swiftyos and others added 26 commits April 30, 2026 09:36

update Dockerfile

e456b6f

Added deployment workflow and helm charts (#36)

4246624

* Added deployment workflow and helm charts * updated workspace docs * fix lint issues

fix persona failures

83c9670

remove helm complexity

daadf24

deploy to dev env only

e0df338

updated to keep things secret

dbb726c

update workflow to prevent attempt to pass secrets

964b671

fix lint and workspace inventory

fa29562

updated manifest

4cea26a

fix cloudsql

1cd661a

remove requirement for barer token

9c6a463

add jwt secret override

324fac6

auto docs update

20e63ff

remove janky llm security

c333e1b

fix missing scenario files

0d688ff

add pre-release preset

063f5c9

updating to full react app

3202390

update chat components

8a69513

adding loading skeletons

aad69b8

refactor

c9c8840

add logging for slow requests

c4d9953

fix crash issue

0bafd60

chatgpt-codex-connector Bot reviewed May 5, 2026

Swiftyos wants to merge 57 commits into Swiftyos:main from Significant-Gravitas:main
