perf(autoreload): skip stdlib/site-packages on per-cell check #9629
Every cell run with auto_reload enabled was stat-ing every entry in sys.modules (often 1000+), adding 16-80ms of overhead per cell. Add an opt-in skip_non_user_modules flag on ModuleReloader.check that caches stdlib/site-packages module names in a persistent skip set. AutoreloadManager.cell_scope opts in; the background ModuleWatcher keeps the default full scan so edits inside installed packages remain detectable at watcher latency. Fixes #9628
3 issues found across 3 files
Architecture diagram
sequenceDiagram
participant UI as User/Client
participant AM as AutoreloadManager
participant MR as ModuleReloader
participant MW as ModuleWatcher (Background)
participant sysmod as sys.modules dict
participant FS as Filesystem (os.stat)
Note over AM,FS: Per-cell execution path (hot path)
UI->>AM: Execute cell (lazy/auto reload)
AM->>AM: snapshot = set(sys.modules)
AM->>MR: check(modules=sys.modules, reload=True, skip_non_user_modules=True)
Note over MR: Skip cache populated lazily
MR->>MR: _non_user_roots from sysconfig (stdlib, purelib, platlib, base_prefix)
loop For each module in sys.modules
alt Module name in _skip set
MR->>MR: continue (skip entirely)
else Module not classified yet
MR->>MR: _is_user_module(module)
alt __file__ starts with non_user_root
MR->>MR: skip.add(modname), continue
else User module (editable install / source tree)
MR->>FS: os.stat(module.__file__)
FS-->>MR: mtime
MR->>MR: Compare with cached mtime
end
end
end
alt Stale modules found
MR->>MR: Reload stale modules
MR-->>AM: Set of modified modules
else No stale modules
MR-->>AM: Empty set (fast path)
end
AM->>AM: Execute cell yield
AM->>AM: new_modules = sys.modules - snapshot
AM->>MR: check(new_modules, reload=False, skip_non_user_modules=True)
Note over AM: Cell execution complete
Note over MW,FS: Background watcher path (1s loop)
loop Every ~1 second
MW->>MR: check(modules=sys.modules, reload=False)
Note over MR: Default behavior - scans ALL modules
loop For each module
alt User module (not in site-packages)
MR->>FS: os.stat(n), compare
else Stdlib / site-packages
MR->>FS: os.stat(n), compare
end
end
alt Modified modules detected
MR-->>MW: Set of updated module names
MW->>MW: Trigger reload callback (if auto_reload=autorun)
end
end
Note over AM,FS: New: skip_non_user_modules flag
Note over AM: User code changes detected immediately
Note over MW: Site-package changes detected at watcher latency
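The per-cell loop in the diagram can be sketched in a few lines. This is a minimal illustration, not marimo's implementation: `SketchReloader` and its injected `is_user_module` callback are hypothetical names, and the sketch only reports stale modules rather than reloading them.

```python
import os
from types import ModuleType
from typing import Callable


class SketchReloader:
    """Hypothetical stand-in for ModuleReloader (illustration only)."""

    def __init__(self, is_user_module: Callable[[ModuleType], bool]) -> None:
        self._is_user_module = is_user_module
        self._mtimes: dict[str, float] = {}  # last seen mtime per module name
        self._skip: set[str] = set()  # names already classified as non-user

    def check(
        self,
        modules: dict[str, ModuleType],
        skip_non_user_modules: bool = False,
    ) -> set[str]:
        stale: set[str] = set()
        for name, module in list(modules.items()):
            if skip_non_user_modules:
                if name in self._skip:
                    continue  # classified earlier: no stat call at all
                if not self._is_user_module(module):
                    self._skip.add(name)
                    continue
            path = getattr(module, "__file__", None)
            if not path:
                continue
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue
            previous = self._mtimes.get(name)
            self._mtimes[name] = mtime
            if previous is not None and mtime > previous:
                stale.add(name)
        return stale
```

Called with the default `skip_non_user_modules=False`, the same method stats every module, which matches the watcher path in the diagram.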
Pull request overview
This PR improves autoreload performance by avoiding per-cell os.stat scans over the full sys.modules set when runtime.auto_reload is enabled, addressing the cell execution latency regression reported in #9628.
Changes:
- Added `skip_non_user_modules` option to `ModuleReloader.check()` and a persistent skip cache for stdlib/site-packages modules.
- Updated `AutoreloadManager.cell_scope()` to use the skip behavior on the hot per-cell path.
- Added targeted tests for user vs non-user module classification and skip-cache behavior.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `marimo/_runtime/reload/autoreload.py` | Introduces non-user root detection, user-module classification, and a persistent skip cache used by `ModuleReloader.check()`. |
| `marimo/_runtime/reload/manager.py` | Opts the per-cell autoreload path into skipping non-user modules to reduce per-cell overhead. |
| `tests/_runtime/reload/test_autoreload.py` | Adds regression tests for skip-cache population and behavior differences between watcher vs hot path. |
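For readers unfamiliar with the classification step, here is a rough sketch of how a module can be sorted into user vs non-user code using `sysconfig` install paths. The function names are hypothetical, the root list is a simplification (the PR's diagram also lists `base_prefix`, omitted here), and treating modules without `__file__` as non-user is an assumption of this sketch, not something confirmed by the PR.

```python
import os
import sysconfig
from types import ModuleType


def _normalize(path: str) -> str:
    # Resolve symlinks and case so prefix comparison is reliable.
    return os.path.normcase(os.path.realpath(path))


def non_user_roots() -> tuple[str, ...]:
    """Install directories treated as 'not user code' in this sketch."""
    candidates = (
        sysconfig.get_path(key)
        for key in ("stdlib", "platstdlib", "purelib", "platlib")
    )
    # Trailing separator avoids matching sibling dirs like ".../lib-foo".
    roots = {_normalize(p) + os.sep for p in candidates if p}
    return tuple(sorted(roots))


def is_user_module(module: ModuleType, roots: tuple[str, ...]) -> bool:
    path = getattr(module, "__file__", None)
    if not path:
        # Assumption: nothing on disk to watch, so not user code.
        return False
    return not _normalize(path).startswith(roots)
```

A module imported from a project directory or an editable install lives outside these roots, so it is classified as user code, which is the behavior the PR description relies on.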
akshayka
left a comment
Overall LGTM, code style comments
1 issue found across 3 files (changes from recent commits).
dmadisetti
left a comment
I think the continue logic should be tied to skip_non_user_modules
    source tree, so they are correctly classified as user code.
    """
    f = safe_getattr(module, "__file__", None)
    if not f:
false positive on c libraries? Unsure, but I think so. Maybe that's fine
thanks @dmadisetti, i had that but removed from the comments. will add back
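As background for the `__file__` question in this thread: built-in modules such as `sys` are compiled into the interpreter and carry no `__file__`, while C extensions shipped as shared libraries usually do have one. How the PR treats modules that hit the `if not f:` branch is not visible in the excerpt above; the hypothetical helper below only lists which loaded modules would reach it.

```python
import sys


def modules_without_file() -> list[str]:
    """Names of loaded modules that have no usable __file__ attribute."""
    return sorted(
        name
        for name, module in list(sys.modules.items())
        if not getattr(module, "__file__", None)
    )


if __name__ == "__main__":
    # Typically includes built-ins like "sys" and "builtins".
    print(modules_without_file())
```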
1 issue found across 2 files (changes from recent commits).
1 issue found across 2 files (changes from recent commits).
🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.23.7-dev71
This pull request was authored by a coding agent.
Fixes #9628.
With `auto_reload` set to `lazy` or `autorun`, every cell run was calling `ModuleReloader.check(sys.modules, reload=True)`, which iterates all of `sys.modules` and does `os.stat` on each entry. With ~1000 modules in scope (typical), that adds 16–80ms per cell; compounded across the dozen cells re-running on a UI interaction, it becomes a >1s lag.

This change adds an opt-in `skip_non_user_modules=True` flag on `ModuleReloader.check`. When set, stdlib and site-packages module names are recorded in a persistent skip set (classified by `sysconfig` prefixes) and short-circuited on subsequent calls. `AutoreloadManager.cell_scope` (the hot per-cell path) opts in. The background `ModuleWatcher` keeps the default behavior and continues to scan every module on its 1s loop, so edits inside an installed package are still detected, just at watcher latency rather than cell-entry latency. Editable installs (`pip install -e .`, `uv add --editable`) have `__file__` outside site-packages, so they are correctly classified as user code and reload with no latency change.

Benchmark
Driving `ModuleReloader.check()` directly, 200 iterations post-warmup. Issue-shaped workload: ~2.5k modules (heavy stdlib + numpy/pandas/etc.) + 5 user files in a tmp dir.

~4 ms saved per cell run, 5.4× median speedup.
Scale curve (median µs, varying user-module count):
The win narrows as user-code grows, by design: the optimization only filters out non-user-code.
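To get a feel for the raw cost being avoided, the following sketch times a bare `os.stat` pass over the files behind `sys.modules`. It is not the benchmark harness used for the numbers above (that one drives `ModuleReloader.check()` directly); the helper names are hypothetical and results will vary with filesystem, cache state, and how many modules are loaded.

```python
import os
import statistics
import sys
import time
from typing import Iterable


def loaded_module_files() -> list[str]:
    """Paths of all currently imported modules that have a __file__."""
    files = []
    for module in list(sys.modules.values()):
        path = getattr(module, "__file__", None)
        if path:
            files.append(path)
    return files


def median_scan_us(paths: Iterable[str], iterations: int = 200) -> float:
    """Median wall time, in microseconds, to stat every path once."""
    paths = list(paths)
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        for path in paths:
            try:
                os.stat(path)
            except OSError:
                pass  # file vanished or path is synthetic; ignore
        samples.append((time.perf_counter() - start) * 1e6)
    return statistics.median(samples)


if __name__ == "__main__":
    files = loaded_module_files()
    print(f"{len(files)} module files, median {median_scan_us(files):.0f} µs per scan")
```

Running it before and after importing heavy packages gives a rough sense of how the per-cell scan scales with the size of `sys.modules`.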