perf(images): materialize rasterized image once to avoid per-channel re-warp#708

Merged
timtreis merged 1 commit into main from perf/render-images-compute-once on Jun 10, 2026

@timtreis

Member

What

render_images re-materializes the lazy rasterized image once per channel, on every pass, so the affine warp runs N_passes × N_channels times per render. This PR materializes it once, right after rasterization.

Root cause

spatialdata.rasterize returns a lazy dask array (~5 ms). _render_images then reads it per channel in several passes — the NaN check, the per-channel compositing (img.sel(c=ch).values), and the draw — each of which recomputes the entire dask graph (the order=0 affine warp + concatenate3). So an 8-channel render pays for the warp roughly passes × 8 times.

Measured (synthetic 8ch float32, chunked like the real pipeline):

| step | time |
| --- | --- |
| _rasterize_if_necessary (lazy) | 5 ms |
| full pl.show (8ch 6000²) | 10 245 ms |
| per-channel .sel(c=ch).values | 2544 ms |
| single .compute() of the same array | 320 ms (8.0× = channel count) |
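
The multiplication described above can be illustrated with a toy model. This is a sketch only, not dask or spatialdata-plot code: `LazyImage` is a hypothetical stand-in whose per-channel read re-runs a counted "warp", mimicking how reading from a lazy array re-executes its graph. The pass count of 3 is an assumption based on the three reads named above (NaN check, compositing, draw).

```python
class LazyImage:
    """Hypothetical stand-in for a lazy, dask-backed image."""

    def __init__(self, n_channels):
        self.n_channels = n_channels
        self.warp_calls = 0

    def _warp(self):
        # Stand-in for the expensive affine warp over ALL channels.
        self.warp_calls += 1
        return [[float(c)] * 4 for c in range(self.n_channels)]

    def channel_values(self, ch):
        # Mimics img.sel(c=ch).values on a lazy array: the whole graph
        # is recomputed just to read a single channel.
        return self._warp()[ch]


def render(img, n_passes=3):
    """Read every channel once per pass and report how often the warp ran."""
    for _ in range(n_passes):
        for ch in range(img.n_channels):
            img.channel_values(ch)
    return img.warp_calls


lazy = LazyImage(n_channels=8)
warps = render(lazy)
print(warps)  # 24 == 3 passes x 8 channels
```

In this model the warp count scales with passes × channels, which matches the shape of the measurement: the per-channel read costs about 8× a single compute for an 8-channel image.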

Fix

Materialize the rasterized array once before returning:

image = rasterize(...)
if hasattr(image.data, "compute"):
    image = image.copy(data=image.data.compute())
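
To make the behaviour of the `hasattr` guard concrete, here is a minimal, library-free sketch. `FakeLazyData`, `FakeImage`, and `materialize_once` are hypothetical names introduced for illustration; they only mirror the shape of the snippet above (an object with `.data` and `.copy(data=...)`) and are not the real xarray or dask classes.

```python
class FakeLazyData:
    """Hypothetical lazy payload: exposes .compute(), like a dask array."""

    def __init__(self, rows):
        self._rows = rows
        self.compute_calls = 0

    def compute(self):
        self.compute_calls += 1
        return [list(row) for row in self._rows]


class FakeImage:
    """Hypothetical container with .data, .attrs, and .copy(data=...)."""

    def __init__(self, data, attrs=None):
        self.data = data
        self.attrs = dict(attrs or {})

    def copy(self, data):
        # Swap the payload, keep the metadata.
        return FakeImage(data, self.attrs)


def materialize_once(image):
    # Same guard as in the fix: only lazy payloads have .compute().
    if hasattr(image.data, "compute"):
        image = image.copy(data=image.data.compute())
    return image


lazy_payload = FakeLazyData([[1.0, 2.0], [3.0, 4.0]])
lazy_image = FakeImage(lazy_payload, attrs={"transform": "identity"})
eager = materialize_once(lazy_image)

# Repeated reads now hit the in-memory data; compute ran exactly once.
for _ in range(5):
    _ = eager.data[0]

# An already-eager image has no .compute() and passes through untouched.
already_eager = FakeImage([[9.0]])
same = materialize_once(already_eager)
```

The guard makes the helper a no-op for inputs that are already in memory, so it can be applied unconditionally at the end of the rasterization step.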

Results

End-to-end: 10 457 ms → 508 ms (20.6×) at 8ch 6000²; 15.4× at 4000². Single-channel is neutral. The datashader path already .compute()s, so it is unaffected.

Why this is safe

  • Pixel-identical output (np.array_equal == True), with transform, coords, and channel labels preserved, so no visual baselines change. That is the correctness proof: existing image/label tests stay green unchanged.
  • Memory-bounded: the output of _rasterize_if_necessary is always display-sized (rasterized to ~target, or returned only when ≤ target+100 per axis), so the eager compute is ~15 MB at 700²×8.
  • Covers images and labels (both call _rasterize_if_necessary).
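
The memory figure in the list above can be checked with a short calculation. This sketch assumes float32 pixels (4 bytes), as in the synthetic benchmark; `eager_size_bytes` is a hypothetical helper, and the 6000² case is included only for comparison with the benchmark's input size, since the rasterized output itself is display-sized.

```python
def eager_size_bytes(height, width, n_channels, itemsize=4):
    """Bytes needed for a dense (c, y, x) array; itemsize=4 assumes float32."""
    return height * width * n_channels * itemsize


display_sized = eager_size_bytes(700, 700, 8)
full_res = eager_size_bytes(6000, 6000, 8)

print(display_sized / 1e6)  # 15.68 (MB), consistent with "~15 MB at 700²×8"
print(full_res / 1e9)       # 1.152 (GB) for a 6000² input, for comparison
```

The gap between the two numbers is why the display-size bound matters: eager compute is cheap in memory only because it happens after rasterization.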

Closes #707.

…re-warp

`spatialdata.rasterize` returns a lazy dask array; `_render_images` then reads it once
per channel (NaN check, compositing, draw), so the affine warp re-runs N_passes x N_channels
times per render. Materializing the rasterized result once collapses that to a single warp.

Pixel-identical output (no baseline change); the rasterized array is always display-sized so
the eager compute is memory-bounded. Measured ~15-20x on multi-channel images
(8ch 6000^2: 10.4s -> 0.5s); neutral for single-channel. Covers images and labels.

Closes #707
@timtreis timtreis merged commit cb91f41 into main Jun 10, 2026
7 of 8 checks passed
@timtreis timtreis deleted the perf/render-images-compute-once branch June 10, 2026 11:58
@codecov-commenter

Codecov Report

❌ Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.36%. Comparing base (2c8803a) to head (46464e7).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/spatialdata_plot/pl/utils.py | 50.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #708      +/-   ##
==========================================
- Coverage   76.40%   76.36%   -0.04%     
==========================================
  Files          14       14              
  Lines        4314     4316       +2     
  Branches     1003     1004       +1     
==========================================
  Hits         3296     3296              
  Misses        663      663              
- Partials      355      357       +2     

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/spatialdata_plot/pl/utils.py | 68.96% <50.00%> (-0.07%) ⬇️ |

Development

Successfully merging this pull request may close these issues.

Performance: render_images re-materializes the lazy rasterized image once per channel (up to ~20x slower for multi-channel)
