feat: dataset-specific prompt addendum (curator notes) #155

Merged
hweej merged 8 commits into main from feat/dataset-prompt-addendum
May 21, 2026

hweej commented May 21, 2026

Summary

  • Adds nullable prompt_addendum column to Dataset; admin API accepts + returns it via DatasetCreate / DatasetUpdate / DatasetAdminResponse.
  • Threads the field through DatasetContext, build_dataset_context, and _build_dataset_context_cached (forwarded from the cached DB row).
  • build_system_prompt renders the value as a fenced curator-notes block between the description and the shape line; omits cleanly when None / empty / whitespace-only.

Test plan

  • New tests in test_admin_routes.py (create / update / list / null-default)
  • New test in test_chat_session.py (kwarg forwarded into build_dataset_context)
  • New tests in test_prompt_system.py (present / none / empty-and-whitespace / paragraph isolation)
  • Full agent suite 259 passed; full API suite 281 passed locally
  • CI green

Closes #109.

hweej added 8 commits May 21, 2026 10:09
Operator-curated free-prose note attached to each Dataset row. Threaded
into the chat agent's system prompt in subsequent tasks so its answers
are grounded in dataset-specific biological context.

Migration is purely additive (nullable, default NULL); existing rows
are unaffected and continue to behave identically.

- sa.String -> sa.Text: prose addendum is open-ended free-form text;
  Text is the semantically correct type (PostgreSQL distinguishes,
  SQLite is storage-equivalent).
- Drop unused 'import sqlmodel' from migration top (cargo-culted
  from template, never referenced).
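The additive nature of the migration can be illustrated with a minimal sketch. It uses the standard-library sqlite3 module rather than the project's actual migration tooling, and the dataset table layout and the slug value are assumptions made only for illustration:

```python
import sqlite3

# Hypothetical, simplified table; the real Dataset model has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, slug TEXT NOT NULL)")
conn.execute("INSERT INTO dataset (slug) VALUES ('existing-dataset')")

# Purely additive change: a nullable TEXT column with no non-null default.
conn.execute("ALTER TABLE dataset ADD COLUMN prompt_addendum TEXT DEFAULT NULL")

# Rows that existed before the migration read back NULL for the new column.
rows = conn.execute("SELECT slug, prompt_addendum FROM dataset").fetchall()
print(rows)  # [('existing-dataset', None)]
```

Because the column is nullable and no backfill is performed, rows written before the change need no data migration.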

DatasetCreate / DatasetUpdate / DatasetAdminResponse gain a new optional
prompt_addendum field. POST stores it, PUT updates it (preserving other
fields via the existing exclude_unset behavior), GET surfaces it. No
frontend UI changes — operators edit via the admin API for now.
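The partial-update behavior described above can be sketched in plain Python. This is an illustration, not the route's code: apply_partial_update and the name field are hypothetical, and sent_fields stands in for the kind of dict that Pydantic's exclude_unset option produces (only the keys the client actually sent):

```python
from typing import Any


def apply_partial_update(row: dict[str, Any], sent_fields: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of row with only the explicitly sent fields overwritten."""
    updated = dict(row)
    updated.update(sent_fields)
    return updated


# Hypothetical stored row; "name" is an illustrative field.
row = {"name": "Atlas", "chat_enabled": True, "prompt_addendum": None}

# A PUT that sends only prompt_addendum leaves the other fields untouched.
updated = apply_partial_update(row, {"prompt_addendum": "Cells were sorted before sequencing."})

# Sending an explicit null clears the note, which differs from omitting the key.
cleared = apply_partial_update(updated, {"prompt_addendum": None})
unchanged = apply_partial_update(updated, {})
```

The distinction between an omitted key and an explicit null is what lets an operator clear a note without resending the whole dataset.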

- Reorder prompt_addendum above chat_enabled in DatasetCreate /
  DatasetUpdate / DatasetAdminResponse and the three response-
  construction call sites, matching the column order in db/models.py.
- Add test_create_dataset_prompt_addendum_defaults_null asserting
  omitted prompt_addendum produces a null in the response.
- Add baseline 'created prompt_addendum is None' assertion in
  test_update_dataset_prompt_addendum so the test fails closed if
  the create path ever starts defaulting to non-null.

DatasetContext gains an optional prompt_addendum field. build_dataset_context
accepts it as a keyword arg; _build_dataset_context_cached (from #101)
forwards it from the Dataset row. The cache key (slug, updated_at)
already invalidates when admin PUT bumps updated_at, so changes to
prompt_addendum take effect on the next request.

Default is None for backward compatibility — datasets without curator
notes produce a context byte-identical to before.
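A rough sketch of why a (slug, updated_at) cache key is enough to pick up addendum edits. The function signature, the in-memory _ROWS store, and get_context are assumptions for illustration; only the key shape and the field name come from the description above:

```python
from dataclasses import dataclass
from datetime import datetime
from functools import lru_cache
from typing import Optional


@dataclass(frozen=True)
class DatasetContext:
    slug: str
    prompt_addendum: Optional[str] = None  # None keeps the old behavior


# Hypothetical stand-in for the Dataset table.
_ROWS = {"atlas": {"updated_at": datetime(2026, 5, 21, 10, 0), "prompt_addendum": None}}
build_calls = 0


@lru_cache(maxsize=128)
def _build_dataset_context_cached(slug: str, updated_at: datetime) -> DatasetContext:
    """Cache entry is keyed on (slug, updated_at); a new timestamp is a miss."""
    global build_calls
    build_calls += 1
    return DatasetContext(slug=slug, prompt_addendum=_ROWS[slug]["prompt_addendum"])


def get_context(slug: str) -> DatasetContext:
    return _build_dataset_context_cached(slug, _ROWS[slug]["updated_at"])


first = get_context("atlas")
second = get_context("atlas")  # same key, served from the cache

# Simulate an admin PUT: the note changes and updated_at is bumped.
_ROWS["atlas"] = {
    "updated_at": datetime(2026, 5, 21, 11, 0),
    "prompt_addendum": "Samples are from adult tissue.",
}
third = get_context("atlas")  # new key, context is rebuilt
```

The design relies on every write path bumping updated_at; a write that skipped the bump would leave a stale context in the cache.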

When DatasetContext.prompt_addendum is non-empty, build_system_prompt
inserts a fenced === Curator notes === block between the description
and shape lines so the agent treats operator-supplied context as
authoritative. Whitespace-only and None values produce no output.
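The rendering rule can be sketched as follows. This is a simplified stand-in, not the real build_system_prompt: the function name, its flat string parameters, the closing marker line, and the sample strings are all assumptions; only the === Curator notes === header, the placement between description and shape line, and the omit-when-blank behavior come from the description above:

```python
from typing import Optional


def build_system_prompt_sketch(
    description: str,
    shape_line: str,
    prompt_addendum: Optional[str] = None,
) -> str:
    """Join prompt paragraphs, adding a curator-notes block only when non-blank."""
    paragraphs = [description]
    notes = (prompt_addendum or "").strip()
    if notes:
        # The closing marker is an assumption; the source only names the header.
        paragraphs.append(f"=== Curator notes ===\n{notes}\n=== End curator notes ===")
    paragraphs.append(shape_line)
    # Blank lines keep the notes isolated as their own paragraph.
    return "\n\n".join(paragraphs)


description = "A single-cell atlas."
shape_line = "Shape: 1000 cells x 200 genes."

without_notes = build_system_prompt_sketch(description, shape_line)
with_notes = build_system_prompt_sketch(description, shape_line, "  Donors are adults.  ")
```

Treating None, empty, and whitespace-only values identically means a dataset without notes yields the same prompt as before the feature existed.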
hweej self-assigned this May 21, 2026
hweej added the enhancement (New feature or request) label May 21, 2026
hweej changed the title from "Dataset-specific prompt addendum (curator notes)" to "feat: dataset-specific prompt addendum (curator notes)" May 21, 2026
hweej merged commit 978c33c into main May 21, 2026
3 of 4 checks passed
hweej deleted the feat/dataset-prompt-addendum branch May 21, 2026 14:45

Development

Successfully merging this pull request may close these issues.

Dataset-specific prompts: per-dataset system prompt addendum + suggestion chips
