
feat(zarr-adapter): recognize 'gene' as a gene-symbol column candidate #162

Merged
hweej merged 1 commit into main from feat/gene-symbol-column-candidate-gene
May 22, 2026


@hweej hweej commented May 22, 2026

Summary

Adds `gene` (the literal column name) to `GENE_SYMBOL_COLUMNS` so that datasets like `egfr_all_cells.zarr`, which store symbols in a column named `gene` rather than the common `feature_name`/`gene_symbol` conventions, auto-resolve gene symbols correctly. Without this, the chat agent's tools return raw Ensembl IDs and the agent has to guess symbols from its training data.

`gene` is the most ambiguous candidate (the name could in theory mean other things), so it sits at the end of the priority list. A dataset that has both `feature_name` and `gene` still picks `feature_name`.
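
For readers unfamiliar with the adapter, the priority rule can be sketched roughly as follows. This is an illustration only, not the adapter's code: the tuple below lists just the three names mentioned in this PR (the real `GENE_SYMBOL_COLUMNS` may hold more, in a different order apart from `gene` being last), and `resolve_gene_symbol_column` is a hypothetical helper name.

```python
from typing import Iterable, Optional

# Assumed contents: only the names mentioned in this PR, with 'gene' last.
GENE_SYMBOL_COLUMNS = ("feature_name", "gene_symbol", "gene")


def resolve_gene_symbol_column(columns: Iterable[str]) -> Optional[str]:
    """Return the first candidate present in `columns`, or None (hypothetical helper)."""
    available = set(columns)
    # Walk candidates in priority order so canonical names win over 'gene'.
    for candidate in GENE_SYMBOL_COLUMNS:
        if candidate in available:
            return candidate
    return None
```

Because the loop iterates over the candidate list rather than over the dataset's columns, the result depends only on the priority order, not on the order in which the dataset happens to store its columns.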

Per-dataset overrides for truly unusual schemas are tracked separately.

Test plan

  • 17 zarr_adapter tests pass locally, +2 new (gene resolution; feature_name-wins-over-gene priority)
  • CI green
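
The two new cases could look something like the sketch below. It is not the actual test code: `CANDIDATES` and `pick_column` are hypothetical stand-ins for the adapter's candidate list and resolver, and the column names other than `feature_name` and `gene` are made up for the example.

```python
# Hypothetical stand-in for the adapter's candidate list; order is assumed.
CANDIDATES = ("feature_name", "gene_symbol", "gene")


def pick_column(columns):
    """Hypothetical resolver: first candidate found among the dataset's columns."""
    return next((c for c in CANDIDATES if c in columns), None)


def test_gene_column_is_resolved():
    # A dataset that only has the literal 'gene' column still resolves.
    assert pick_column(["ensembl_id", "gene"]) == "gene"


def test_feature_name_wins_over_gene():
    # With both columns present, the canonical name takes priority.
    assert pick_column(["gene", "feature_name"]) == "feature_name"
```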

egfr_all_cells.zarr stores symbols in a column literally named 'gene' rather
than the common 'feature_name' / 'gene_symbol' conventions. Adds 'gene' at
the END of the priority list so datasets with a canonical column still win.

Per-dataset overrides for unusual schemas are tracked in a separate issue.
@hweej hweej self-assigned this May 22, 2026
@hweej hweej added the enhancement New feature or request label May 22, 2026
@hweej hweej merged commit a5962a7 into main May 22, 2026
3 checks passed
@hweej hweej deleted the feat/gene-symbol-column-candidate-gene branch May 22, 2026 14:05