
feat(tools): describe_var_column + var_columns in dataset schema #161

Merged: hweej merged 1 commit into main from feat/describe-var-column on May 22, 2026
@hweej commented May 22, 2026

Summary

Adds symmetric var-side metadata access for the chat agent. Fixes the gene-symbol blind spot: previously the agent could see obs columns but had no way to discover which var column holds gene symbols.

  • `ZarrAccess.var_column(name)` parallel to `obs_column(name)`, wired through AnnDataStore + api zarr_adapter + test fake
  • `describe_var_column` tool mirrors `describe_obs_column` exactly (categorical → top 50 values + counts; numeric → min/max/mean/median/quartiles/stddev)
  • `get_dataset_schema` payload now includes `var_columns` so the agent can discover gene-metadata column names upfront
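The summary shape described above (categorical: top 50 values with counts; numeric: min/max/mean/median/quartiles/stddev) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `summarize_column`, the result keys, the numeric-detection rule, and the choice of population standard deviation are all assumptions.

```python
from collections import Counter
from statistics import mean, median, pstdev, quantiles
from typing import Any, Sequence

TOP_N = 50  # the PR description says categorical columns report the top 50 values


def summarize_column(values: Sequence[Any]) -> dict:
    """Hypothetical helper: summarize one obs/var column.

    Numeric columns get descriptive statistics; everything else is
    treated as categorical and reduced to the most frequent values.
    """
    present = [v for v in values if v is not None]  # ignore missing entries
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        if len(present) >= 2:
            q1, _, q3 = quantiles(present, n=4, method="inclusive")
        else:
            q1 = q3 = present[0]  # a single value has no spread
        return {
            "kind": "numeric",
            "min": min(present),
            "max": max(present),
            "mean": mean(present),
            "median": median(present),
            "q1": q1,
            "q3": q3,
            "stddev": pstdev(present),  # population stddev (assumption)
        }
    top = Counter(present).most_common(TOP_N)
    return {
        "kind": "categorical",
        "top_values": [{"value": v, "count": c} for v, c in top],
    }
```

Because the same helper would serve both axes, a tool like `describe_var_column` could reduce to reading the column via `var_column(name)` and passing the values through it.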

A future cleanup is to combine `describe_obs_column` + `describe_var_column` into one `describe_column(axis, name)` tool; left as follow-up.
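One possible shape for that follow-up, sketched under assumptions: only `obs_column(name)` and `var_column(name)` come from this PR. The `FakeAccess` class, the `summarize` callback, and the returned keys are hypothetical and exist only to keep the example self-contained.

```python
from typing import Any, Callable, Dict, Literal, Sequence

Axis = Literal["obs", "var"]


class FakeAccess:
    """Hypothetical in-memory stand-in exposing obs_column / var_column."""

    def __init__(self, obs: Dict[str, Sequence[Any]], var: Dict[str, Sequence[Any]]):
        self._obs = obs
        self._var = var

    def obs_column(self, name: str) -> Sequence[Any]:
        return self._obs[name]

    def var_column(self, name: str) -> Sequence[Any]:
        return self._var[name]


def describe_column(
    access: Any,
    axis: Axis,
    name: str,
    summarize: Callable[[Sequence[Any]], dict],
) -> dict:
    """Dispatch on axis, then apply one shared summarizer to the values."""
    readers = {"obs": access.obs_column, "var": access.var_column}
    if axis not in readers:
        raise ValueError(f"axis must be 'obs' or 'var', got {axis!r}")
    values = readers[axis](name)
    return {"axis": axis, "name": name, **summarize(values)}
```

Keeping the axis as an explicit argument would let the two existing tools become thin wrappers during a transition, so the agent's tool catalog would not have to change in one step.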

Test plan

  • zarr-access: 64 passed
  • api: 281 passed
  • cell-explorer-agent: 259 passed (+7 new: 3 describe_var_column tests, 1 schema assertion, 1 catalog set update, 2 from var_column plumbing)
  • CI green
  • After merge: rebuild API container; agent should report var columns via get_dataset_schema and be able to pick the gene-symbol column via set_gene_label_column
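The post-merge check in the last item could be exercised with something like the sketch below. It assumes the schema payload is a mapping whose `var_columns` entry is a list of column-name strings; the real payload shape is not shown in this PR. `pick_gene_label_column` and the preference list are hypothetical, and the chosen name would then be handed to `set_gene_label_column`.

```python
from typing import Any, Mapping, Optional, Sequence

# Hypothetical preference order; only "gene_symbol" is named in this PR.
PREFERRED_NAMES: Sequence[str] = ("gene_symbol", "gene_symbols", "feature_name", "symbol")


def pick_gene_label_column(
    schema: Mapping[str, Any],
    preferred: Sequence[str] = PREFERRED_NAMES,
) -> Optional[str]:
    """Return the first var column matching a preferred name, else None."""
    if "var_columns" not in schema:
        # An API container built before this change would omit the key.
        raise KeyError("schema payload has no 'var_columns' entry")
    by_lower = {name.lower(): name for name in schema["var_columns"]}
    for candidate in preferred:
        if candidate in by_lower:
            return by_lower[candidate]  # keep the dataset's original casing
    return None
```

Returning `None` rather than guessing leaves room for the agent to fall back to `describe_var_column` on each remaining column.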

Adds symmetric var-side metadata access for the chat agent.

- ZarrAccess gets var_column(name) parallel to obs_column(name), wired
  through AnnDataStore, the api zarr_adapter, and the test fake
- describe_var_column tool mirrors describe_obs_column: categorical
  -> top 50 values + counts; numeric -> min/max/mean/median/quartiles
- get_dataset_schema payload includes var_columns so the agent can
  discover gene-metadata columns (gene_symbol, feature_id, etc.) up
  front instead of guessing

Fixes the gene-symbol blind spot: previously the agent could see obs
columns but had no way to learn which var column holds gene symbols.
@hweej self-assigned this May 22, 2026
@hweej added the enhancement label May 22, 2026
@hweej merged commit f8089c3 into main May 22, 2026
3 checks passed
@hweej deleted the feat/describe-var-column branch May 22, 2026 12:25