Cellina

Cellina is a dual-encoder variational autoencoder for predicting how a cell's gene expression changes under altered spatial contexts — a class of queries we call tissue graph counterfactuals.

In tissues, a cell's transcriptional state is shaped by its local neighborhood: the composition of nearby cells and the signals they emit. Existing perturbation methods typically treat cells as independent and apply perturbations uniformly. Cellina addresses this gap by explicitly separating a cell's intrinsic state (z, encoding cell identity) from its spatial context (s, encoding microenvironmental influence), then uses s as a conditioning input to render counterfactual predictions under two types of intervention:

Edge perturbation — rewire a cell's neighborhood (replace neighbors with those from a different domain)
Node perturbation — modify the expression of existing neighbors (e.g. pathway activation or knockout)

Getting started

Follow the worked tutorials for end-to-end examples on colorectal cancer tissue: Cellina and Cellina-GAT (or run them locally from docs/tutorial.ipynb and docs/tutorial_gat.ipynb).

How it works

Generative model. Cellina is a VAE with two latent variables. An MLP encoder $\text{Enc}_z$ maps raw counts to $z \sim q(z \mid x)$; a spatial encoder maps the cell's neighborhood to $s \sim q(s \mid \mathcal{N}(v))$. A shared decoder reconstructs counts from the concatenation $[z; s]$ under a Negative Binomial likelihood. Both latents have standard normal priors.
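As a rough illustration of the pieces named above, the NumPy sketch below draws both latents with the reparameterization trick, concatenates them into the decoder input, and scores counts under a Negative Binomial. The function names, the mean/inverse-dispersion parameterization of the likelihood, and the toy shapes are assumptions made for this example; they are not taken from the Cellina code base.

```python
import math
import numpy as np

_lgamma = np.vectorize(math.lgamma)  # element-wise log-gamma without SciPy

def sample_gaussian(mu, logvar, rng):
    """Reparameterized draw from N(mu, exp(logvar))."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def nb_log_prob(x, mean, theta):
    """Negative Binomial log-pmf with mean `mean` and inverse dispersion `theta`
    (assumed parameterization)."""
    return (
        _lgamma(x + theta) - _lgamma(theta) - _lgamma(x + 1.0)
        + theta * np.log(theta / (theta + mean))
        + x * np.log(mean / (theta + mean))
    )

rng = np.random.default_rng(0)
n_cells, d_z, d_s = 4, 3, 2  # toy sizes, chosen arbitrarily
z = sample_gaussian(np.zeros((n_cells, d_z)), np.zeros((n_cells, d_z)), rng)
s = sample_gaussian(np.zeros((n_cells, d_s)), np.zeros((n_cells, d_s)), rng)
decoder_input = np.concatenate([z, s], axis=-1)  # [z; s], shape (4, 5)
```

In a full model the decoder would map `decoder_input` to the Negative Binomial mean for each gene, and the two KL terms (one per latent) would enter the ELBO alongside the reconstruction term.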

Supervised disentanglement. Optimizing the ELBO alone does not prevent $z$ from absorbing spatially driven variation. Cellina adds auxiliary objectives:

A cell-type classifier on $z$ anchors it to transcriptional identity.
An adversarial discriminator is trained to predict spatial domain from $z$; the encoder is then trained to fool it, routing microenvironmental variation to $s$ by elimination.
A graph-supervised contrastive loss on $s$ (CellinaGCN only, optional) acts as a biologically grounded inductive bias that promotes similarity within local neighbourhoods. It is enabled by setting link_prediction_weight > 0.

Training alternates between a discriminator step (encoder frozen) and a VAE step (discriminator frozen), following a standard adversarial schedule.
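The two alternating steps can be sketched as a pair of loss functions. This is a minimal sketch under stated assumptions: the adversarial term is written as a sign-flipped cross-entropy, which is one common way to train an encoder to fool a discriminator (the source does not say which formulation Cellina uses), and `classifier_weight` and `adversarial_weight` are hypothetical parameter names.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def discriminator_step_loss(domain_logits, domain_labels):
    """Discriminator step (encoder frozen): predict spatial domain from z."""
    return cross_entropy(domain_logits, domain_labels)

def vae_step_loss(neg_elbo, celltype_logits, celltype_labels,
                  domain_logits, domain_labels,
                  classifier_weight=1.0, adversarial_weight=1.0):
    """VAE step (discriminator frozen): reconstruct, classify cell type from z,
    and lower the loss when the discriminator fails to recover the domain."""
    cls = cross_entropy(celltype_logits, celltype_labels)
    adv = cross_entropy(domain_logits, domain_labels)
    # Assumed formulation: subtracting the discriminator's cross-entropy
    # pushes the encoder to remove domain information from z.
    return neg_elbo + classifier_weight * cls - adversarial_weight * adv
```

A training loop would minimize `discriminator_step_loss` with respect to the discriminator parameters only, then minimize `vae_step_loss` with respect to the encoder, decoder, and classifier parameters only, and repeat.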

Two variants differ in how the spatial encoder is implemented:

| Code class | Paper name | Spatial encoder |
| --- | --- | --- |
| `Cellina` | Cellina | Degree-normalized weighted pseudobulk aggregation of neighbor expression → MLP |
| `CellinaGCN` | Cellina-GAT | Multi-layer GATv2 on the local subgraph; self-loops excluded so $v$'s own expression is captured by $z$ alone; modified contrastive loss on $s$ |

The two variants perform on par. Cellina decouples neighborhood construction from training and scales similarly to non-spatial baselines; CellinaGCN learns attention over each subgraph at additional cost per step.
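To make the pseudobulk aggregation concrete, here is a small NumPy sketch that can be run once before training. It assumes "degree-normalized" means dividing each row of a weighted adjacency matrix by that cell's weighted degree and that the cell itself is excluded from its own neighborhood; the function name and both choices are assumptions, not a description of the package's implementation.

```python
import numpy as np

def pseudobulk_neighbors(X, W):
    """Average neighbor expression with degree-normalized weights: D^{-1} W X.

    X: (n_cells, n_genes) expression matrix.
    W: (n_cells, n_cells) non-negative edge weights (assumed input format).
    """
    W = W.astype(float).copy()
    np.fill_diagonal(W, 0.0)             # assumption: a cell is not its own neighbor
    degree = W.sum(axis=1, keepdims=True)
    degree[degree == 0.0] = 1.0          # isolated cells get an all-zero context
    return (W / degree) @ X

X = np.array([[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]])
W = np.array([[0, 1, 3], [1, 0, 0], [0, 0, 0]])
context = pseudobulk_neighbors(X, W)     # one context vector per cell, fed to the MLP
```

Because the result depends only on the graph and the expression matrix, it can be cached, which is consistent with neighborhood construction being decoupled from training.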

Tissue graph counterfactuals

Cellina supports two post-training interventions on the spatial graph $\mathcal{G}$:

Edge perturbation replaces a cell's spatial neighbourhood with donors sampled from a target tissue domain, while keeping the cell's own expression fixed:

$$\mathcal{N}(v) := \mathcal{N}'$$
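A minimal sketch of this rewiring, assuming neighborhoods are stored as a dictionary from cell index to an array of neighbor indices, donors are sampled uniformly without replacement, and the neighborhood size is kept fixed. The function name `edge_perturb` and these choices are illustrative assumptions rather than the package API.

```python
import numpy as np

def edge_perturb(neighbors, domains, v, target_domain, rng):
    """Return a copy of `neighbors` in which cell v's neighbor list is replaced
    by donors sampled from `target_domain`. Expression is left untouched."""
    k = len(neighbors[v])                         # keep the neighborhood size
    candidates = np.flatnonzero(domains == target_domain)
    candidates = candidates[candidates != v]      # v cannot be its own donor
    if len(candidates) < k:
        raise ValueError("not enough donor cells in the target domain")
    donors = rng.choice(candidates, size=k, replace=False)
    perturbed = dict(neighbors)                   # shallow copy; other cells unchanged
    perturbed[v] = donors
    return perturbed

domains = np.array(["tumor", "tumor", "stroma", "stroma", "stroma"])
neighbors = {0: np.array([1, 2]), 1: np.array([0]), 2: np.array([3, 4]),
             3: np.array([2]), 4: np.array([2])}
cf_neighbors = edge_perturb(neighbors, domains, v=0, target_domain="stroma",
                            rng=np.random.default_rng(0))
```

The counterfactual prediction would then re-encode $s$ from the new neighborhood while reusing the cell's original $z$.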

Node perturbation modifies the feature vectors of $v$'s neighbours while preserving graph topology. For a target gene set $\mathcal{S}$ and a gene-specific transformation $T_g$:

$$x_{u,g}^{\mathrm{cf}} = \begin{cases} T_g(x_{u,g}) & g \in \mathcal{S} \\ x_{u,g} & g \notin \mathcal{S} \end{cases}$$

$T_g$ can encode any intervention (additive shift, knockout, overexpression, or learned counterfactual values).
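The case split above translates directly into an indexed update. The sketch below is an assumption-laden illustration: `node_perturb`, `knockout`, and `overexpress` are hypothetical names, and the transformation receives a (neighbors × genes) block so that it can act gene by gene through broadcasting.

```python
import numpy as np

def node_perturb(X, neighbor_idx, gene_idx, transform):
    """Apply `transform` to the genes in `gene_idx` for the cells in
    `neighbor_idx`; every other entry of X is copied unchanged."""
    X_cf = X.astype(float).copy()
    block = np.ix_(neighbor_idx, gene_idx)   # rows: neighbors u, columns: genes in S
    X_cf[block] = transform(X_cf[block])
    return X_cf

def knockout(x):
    """Set the selected genes to zero."""
    return np.zeros_like(x)

def overexpress(x):
    """Hypothetical two-fold increase of the selected genes."""
    return 2.0 * x

X = np.arange(12, dtype=float).reshape(4, 3) + 1.0   # 4 cells, 3 genes
X_ko = node_perturb(X, neighbor_idx=[1, 2], gene_idx=[0], transform=knockout)
X_oe = node_perturb(X, neighbor_idx=[1, 2], gene_idx=[0, 2], transform=overexpress)
```

A gene-specific additive shift fits the same interface, for example `lambda x: x + np.array([1.0, 0.5])` for a two-gene set. The graph topology is untouched; only the neighbor features passed to the spatial encoder change.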

See the Cellina and Cellina-GAT tutorials for full worked examples.

Release notes

See the changelog.

Installation

Cellina ships two conda environments: environment.yml for the full GPU/CUDA setup, and env_minimal.yml for a lightweight CPU-only install. Create one with `conda env create -f environment.yml` (or `-f env_minimal.yml`), activate it, then follow the tutorials above.

Citation

Citation coming soon.

Built on scvi-tools.

Contact

If you find a bug, please report it on the issue tracker.
