Prevent stale queue snapshots from regressing workflow completion state #9043
Merged
lstein merged 28 commits into invoke-ai:main on Apr 23, 2026
lstein (Collaborator) approved these changes on Apr 23, 2026 and left a comment:
@JPPhoto The last commit fixed the "stuck" issue I'd found in fast-running workflows, and I've approved this PR.
I did run a code review with Claude Code and it had a few very minor suggestions. They seem reasonable, but I'll leave it up to you whether you wish to implement them before the merge. I'll check in again tomorrow to see what you prefer.
Blocking issues: None.
Non-blocking suggestions:
- session_queue_common.py:222: status_sequence Pydantic default is None but DB default is 0; a one-line comment explaining the fallback-for-pre-migration intent would help future readers.
- state.ts:152: _pruneSeenItemOrdering intentionally drops tracking for vanished items; a comment noting this is the eviction trade-off (vs. unbounded memory growth) would clarify intent.
- migration_30.py has no dedicated test like test_migration_27_creates_users_table. The generic idempotency test covers the framework but not the backfill path. Low risk, but a focused test would strengthen guarantees.
- test_session_queue_status_sequence.py:76: existing cancel test uses never-dequeued items (seq goes to 1). A complementary dequeue-then-cancel case (verifying 1 → 2) would prove the counter continues rather than resets. The lifecycle test already exercises this path, so it's a mild gap.
- _shouldAcceptQueueItem equal-sequence tiebreaker falls through to rank comparison; worth a comment calling out that terminal-vs-terminal at same sequence will accept the later arrival (edge case that shouldn't occur but is handled sensibly).
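For readers who have not opened state.ts, the tiebreaker and pruning behaviour mentioned in the suggestions can be pictured roughly as follows. This is a minimal sketch, not the PR's code: `OrderingTracker`, `STATUS_RANK`, the rank values, and the method names are all hypothetical, and the real `_shouldAcceptQueueItem` and `_pruneSeenItemOrdering` may differ in detail.

```typescript
// Hypothetical sketch; names and rank values are assumptions, not the code in state.ts.
type Status = 'pending' | 'in_progress' | 'completed' | 'failed' | 'canceled';

// Assumed ranking: later lifecycle stages rank higher, and all terminal states share the top rank.
const STATUS_RANK: Record<Status, number> = {
  pending: 0,
  in_progress: 1,
  completed: 2,
  failed: 2,
  canceled: 2,
};

interface SeenOrdering {
  sequence: number | undefined;
  rank: number;
}

class OrderingTracker {
  private seen = new Map<number, SeenOrdering>();

  // Decide whether an incoming status for an item should replace what was seen before.
  shouldAccept(itemId: number, status: Status, sequence?: number): boolean {
    const prev = this.seen.get(itemId);
    const rank = STATUS_RANK[status];
    let accept: boolean;
    if (prev === undefined) {
      accept = true; // first sighting of this item
    } else if (sequence !== undefined && prev.sequence !== undefined && sequence !== prev.sequence) {
      accept = sequence > prev.sequence; // differing sequences decide directly
    } else {
      // Equal or missing sequence: fall through to rank comparison.
      // Using >= means terminal-vs-terminal at the same sequence accepts the later arrival.
      accept = rank >= prev.rank;
    }
    if (accept) {
      this.seen.set(itemId, { sequence: sequence ?? prev?.sequence, rank });
    }
    return accept;
  }

  // Drop tracking for items no longer present, trading history for bounded memory.
  prune(liveIds: number[]): void {
    const live = new Set(liveIds);
    const vanished: number[] = [];
    this.seen.forEach((_value, id) => {
      if (!live.has(id)) {
        vanished.push(id);
      }
    });
    vanished.forEach((id) => this.seen.delete(id));
  }
}
```

After `prune`, a vanished item that reappears is treated as a first sighting, which is the eviction trade-off the review asks to document.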
Positive notes:
- COALESCE(status_sequence, 0) + 1 is atomic and correctly handles both NULL and 0 in one expression.
- Dual-track ordering (_seenItemStatusSequences + _seenItemStatusRanks fallback) cleanly handles missing-sequence events.
- ?? undefined (not || undefined) correctly preserves status_sequence = 0.
- Event handler records ordering before early-returns, so skipped events still advance the counter.
- Schema.ts asymmetry (event field required, queue-item field optional) matches backend Pydantic correctly and appears properly auto-generated.
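Two of the notes above hinge on how a sequence value of 0 is treated. The sketch below uses hypothetical helper names to show why `??` keeps a 0 that `||` would discard, and models the `COALESCE(status_sequence, 0) + 1` expression in TypeScript. The model only mirrors the arithmetic; the atomicity noted in the review comes from doing the increment inside a single SQL statement, which this sketch does not reproduce.

```typescript
// Hypothetical helpers for illustration only.

// Nullish coalescing: only null and undefined are replaced, so 0 survives.
function keepWithNullish(seq: number | null | undefined): number | undefined {
  return seq ?? undefined;
}

// Logical OR: every falsy value is replaced, so a valid sequence of 0 is lost.
function keepWithOr(seq: number | null | undefined): number | undefined {
  return seq || undefined;
}

// Arithmetic model of COALESCE(status_sequence, 0) + 1:
// a NULL column and a 0 column both advance to 1.
function nextSequence(current: number | null): number {
  return (current ?? 0) + 1;
}

console.log(keepWithNullish(0)); // 0
console.log(keepWithOr(0)); // undefined
console.log(nextSequence(null)); // 1
console.log(nextSequence(0)); // 1
console.log(nextSequence(1)); // 2
```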
Collaborator
Author
@lstein Glad that fixed it. Those comments are now in the code and I've added the test case in |
Summary
Fix a race where fast workflow runs could stay visible as working after the queue had already drained.
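To make the race concrete: a socket event can mark an item completed before a queue snapshot that was requested earlier lands in the cache, and the late snapshot then overwrites the newer state. A minimal sketch of guarding against that with a per-item counter follows. The `QueueItem` shape is trimmed down and `mergeSnapshot` is a hypothetical helper, not a function from this PR.

```typescript
// Hypothetical, trimmed-down shape; the real queue item type has many more fields.
interface QueueItem {
  item_id: number;
  status: string;
  status_sequence?: number;
}

// Merge a fetched snapshot into the cache, skipping entries older than what is cached.
function mergeSnapshot(cache: Map<number, QueueItem>, snapshot: QueueItem[]): void {
  for (const incoming of snapshot) {
    const cachedSeq = cache.get(incoming.item_id)?.status_sequence;
    const incomingSeq = incoming.status_sequence;
    // Only treat the snapshot entry as stale when both sides carry a sequence to compare.
    const isStale =
      cachedSeq !== undefined && incomingSeq !== undefined && incomingSeq < cachedSeq;
    if (!isStale) {
      cache.set(incoming.item_id, incoming);
    }
  }
}

const cache = new Map<number, QueueItem>();
// A socket event already reported completion at sequence 2.
cache.set(7, { item_id: 7, status: 'completed', status_sequence: 2 });
// A snapshot requested earlier arrives late and still reports in_progress at sequence 1.
mergeSnapshot(cache, [{ item_id: 7, status: 'in_progress', status_sequence: 1 }]);
console.log(cache.get(7)?.status); // completed
```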
This PR adds a per-item status_sequence to queue items and queue_item_status_changed events, persists it in invokeai/app/services/shared/sqlite_migrator/migrations/migration_28.py, and uses it in invokeai/frontend/web/src/features/controlLayers/components/StagingArea/state.ts to ignore stale queue snapshots that arrive after newer status updates. It also updates invokeai/frontend/web/src/services/events/setEventListeners.tsx so queue caches carry the new sequence immediately from socket events.
Related Issues / Discussions
QA Instructions
Back up your database or run with an in-memory database; this PR involves a DB migration.
Build the frontend as normal.
Run backend tests:
pytest tests/test_session_queue.py tests/app/services/session_queue/test_session_queue_clear.py tests/app/services/session_queue/test_session_queue_status_sequence.py tests/app/routers/test_session_queue_sanitization.py tests/test_sqlite_migrator.py
Run the staging area frontend tests:
cd invokeai/frontend/web && ./node_modules/.bin/vitest run src/features/controlLayers/components/StagingArea/state.test.ts
Try reproducing manually by running a very fast workflow (with one Integer Primitive node, for example) and confirming the staging area does not remain stuck in a pending or working state after completion.
Merge Plan
This touches queue event payloads and adds a DB migration in invokeai/app/services/shared/sqlite_migrator/migrations/migration_28.py, so it should be rebased cleanly before merge.
Checklist
What's New copy (if doing a release after this PR)