fix(ui): stabilize workflow node execution state updates by JPPhoto · Pull Request #9029 · invoke-ai/InvokeAI

JPPhoto · 2026-04-08T01:20:45Z

Summary

Fixes workflow node execution state updates in the frontend event layer.

This change fixes nodes getting stuck in IN_PROGRESS or showing duplicate outputs when socket events arrive out of order or are repeated. The fix moves the event-ordering logic into shared helpers and uses a listener-local completed-invocation key set so late invocation_started / invocation_progress events cannot overwrite a completed node state.
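
As a rough illustration of the guard described above (not the PR's actual code), the sketch below keeps a listener-local set of completed invocation keys and ignores any event that arrives for a key already in that set. The event shape, the `invocation_id` field, the key format, and the helper names `toInvocationKey` and `createInvocationListener` are assumptions made for this example.

```typescript
// Simplified, hypothetical event shape; real socket payloads carry more fields.
type InvocationEvent = {
  type: 'invocation_started' | 'invocation_progress' | 'invocation_complete';
  item_id: number;
  invocation_id: string; // assumed field name, for illustration only
};

type NodeStatus = 'IN_PROGRESS' | 'COMPLETED';

// Assumed key format: one key per (queue item, invocation) pair.
const toInvocationKey = (e: InvocationEvent): string => `${e.item_id}:${e.invocation_id}`;

const createInvocationListener = () => {
  // Listener-local: the set lives as long as this listener does.
  const completedInvocationKeys = new Set<string>();
  const statuses = new Map<string, NodeStatus>();
  const outputCounts = new Map<string, number>();

  const handle = (e: InvocationEvent): void => {
    const key = toInvocationKey(e);
    if (completedInvocationKeys.has(key)) {
      // Late or repeated event for a node that already completed: ignore it.
      return;
    }
    if (e.type === 'invocation_complete') {
      completedInvocationKeys.add(key);
      statuses.set(key, 'COMPLETED');
      outputCounts.set(key, (outputCounts.get(key) ?? 0) + 1);
      return;
    }
    // invocation_started and invocation_progress both mark the node as running.
    statuses.set(key, 'IN_PROGRESS');
  };

  return { handle, statuses, outputCounts };
};
```

With this shape, a repeated `invocation_complete` records only one output, and a late `invocation_progress` cannot move the node back to in progress.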

Related Issues / Discussions

QA Instructions

On main, run a workflow in the Workflow Editor and examine the Outputs pane for a node that executes. You should see two outputs even when the node is executed once.
After pulling and building (or running in dev mode), open the Workflow Editor and run a workflow with visible node progress.
- Confirm nodes transition from pending to in progress to completed.
- Confirm completed nodes do not revert back to in progress during or after the run.
- Confirm the Outputs pane does not show duplicate outputs for a single node execution.
Execute pnpm vitest run to run regression tests.

Merge Plan

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
❗Changes to a redux slice have a corresponding migration
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

lstein

It's a funny coincidence that I noticed the doubled output just a few days ago on my own, and was puzzled by it, not knowing whether it was a feature or a bug.

In any case, I'm having difficulty testing this PR. I set up the very simple integer operation workflow shown below. When I run it, the nodes go into permanent pending state and continue showing pending even after cancelling. On the other hand, when using an image generation workflow that previously doubled the output, the doubled output is gone. However, the nodes at the very beginning and end of the workflow get stuck in PENDING. Is there something I'm doing wrong?

(Yes, I rebuilt the front end)

JPPhoto · 2026-04-21T01:51:30Z

I think there are multiple issues here and this PR addresses double results while #9043 addresses status issues. Can you locally merge #9043 into your checkout and see if everything is better with both applied?

lstein · 2026-04-21T15:29:41Z

Will do. I'm traveling for a bunch of business meetings this week so it may be a couple days before I get back to this, but I'm anxious to get it pushed through.

lstein · 2026-04-21T23:48:10Z

I've created a branch that contains both #9029 and #9043. However, the problem of stuck workflows persists.

When I create the simplest workflow of them all, a single Add Integers node, and press the invoke button, about 90% of the time it gets stuck in pending state. If I create a slightly more complex workflow, such as feeding the Add Integers output into Integer Range of Size, the workflow completes about 80% of the time and gets stuck in PENDING the rest of the time.

This suggests to me that there is still a race condition of some sort. Let me know if I'm testing this the wrong way.

lstein · 2026-04-21T23:53:33Z

Oh, wait, I merged #9042, not #9043. Trying that as well.

No, the problem persists. This is with all three PRs (#9029, #9042 and #9043) merged into a local branch. Also, I note that the single-node Add Integers workflow sometimes appears to run to completion but produces no output.

JPPhoto · 2026-04-22T01:13:30Z

I think I have the issue isolated. The node's initial execution state has not been put into $nodeExecutionStates before the first socket event for that node arrives. For very fast workflows, invocation_started, invocation_progress, or even invocation_complete can win that race, and the handler was previously dropping the event because there was no existing execution-state entry to update. Try this PR again now and see if that resolves the problem.
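
A minimal sketch of the race described above, assuming a plain `Map` as a stand-in for `$nodeExecutionStates`: the first helper mirrors the old behavior of dropping an event when no entry exists yet, and the second creates a default entry on demand so that an early event is not lost. The state shape and the names `updateIfPresent`, `upsertFromEvent`, and `initialExecState` are hypothetical, not taken from the PR.

```typescript
type ExecStatus = 'PENDING' | 'IN_PROGRESS' | 'COMPLETED' | 'FAILED';
type ExecState = { nodeId: string; status: ExecStatus; progress: number | null };
type ExecPatch = Partial<Omit<ExecState, 'nodeId'>>;

// Assumed default for a node that has not reported anything yet.
const initialExecState = (nodeId: string): ExecState => ({
  nodeId,
  status: 'PENDING',
  progress: null,
});

// Old behavior as described: with no existing entry, the event is silently dropped.
const updateIfPresent = (states: Map<string, ExecState>, nodeId: string, patch: ExecPatch): void => {
  const existing = states.get(nodeId);
  if (!existing) {
    return;
  }
  states.set(nodeId, { ...existing, ...patch });
};

// Sketch of the fix: initialize the entry from the event if it is missing.
const upsertFromEvent = (states: Map<string, ExecState>, nodeId: string, patch: ExecPatch): void => {
  const existing = states.get(nodeId) ?? initialExecState(nodeId);
  states.set(nodeId, { ...existing, ...patch });
};
```

For a very fast node such as Add Integers, `invocation_complete` can be the first event seen; with the upsert variant it still lands in the store.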

lstein

I tested it out with several workflows that were previously returning doubled output, and this PR fixed the issue. I performed a Claude code review and it flagged a couple of minor bugs that I thought were worth your attention:

completedInvocationKeys grows without bound (likely memory leak) — setEventListeners.tsx:73

The replaced cache was LRUCache<number, boolean>({ max: 100 }). The new Set has no bound and lives for the full lifetime of setEventListeners (the whole socket session). Every completed invocation in the session adds a string that's never removed. In a long-running session with many runs, this accumulates forever.

Suggested fixes (any one):

- Use LRUCache<string, boolean>({ max: <e.g.> 1000 }) to match the prior pattern.
- Clear keys in the queue_item_status_changed handler on the completed | failed | canceled branch, since you already know all invocations belonging to item_id are done. A Map<number, Set> keyed by item_id makes this cleanup O(1).
- At minimum, clear the Set alongside the existing $nodeExecutionStates reset on status === 'in_progress' (defensible if you believe the concern is only within-run ordering).
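
The second suggestion could look roughly like the sketch below, which groups completed invocation keys by queue item so that a whole item can be released in one `Map.delete` call. The class name `CompletedInvocationTracker` and its methods are hypothetical, and the string key type inside each set is an assumption.

```typescript
// Hypothetical helper: completed invocation ids grouped by queue item id.
class CompletedInvocationTracker {
  private readonly keysByItem = new Map<number, Set<string>>();

  markCompleted(itemId: number, invocationId: string): void {
    let keys = this.keysByItem.get(itemId);
    if (!keys) {
      keys = new Set<string>();
      this.keysByItem.set(itemId, keys);
    }
    keys.add(invocationId);
  }

  isCompleted(itemId: number, invocationId: string): boolean {
    return this.keysByItem.get(itemId)?.has(invocationId) ?? false;
  }

  // Intended to be called when the queue item reaches completed, failed, or canceled.
  releaseItem(itemId: number): void {
    this.keysByItem.delete(itemId);
  }

  get trackedItemCount(): number {
    return this.keysByItem.size;
  }
}
```

One trade-off of this shape: once an item is released, late events for it are no longer deduplicated by the tracker, so some other guard would have to cover events that arrive after the queue item has finished.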

Asymmetric handling between invocation_complete and the other three invocation events — setEventListeners.tsx:111, 128, 161, 178

invocation_started, invocation_progress, and invocation_error still early-return on finishedQueueItemIds.has(data.item_id). invocation_complete no longer does. This is intentional but subtle: if queue_item_status_changed(failed) arrives before a late invocation_error, the error event is now silently dropped and the node may be left stuck in IN_PROGRESS. Since this PR's whole theme is hardening against out-of-order events, consider also removing the finishedQueueItemIds early-return from invocation_error so the error helper can still populate the node's terminal state.
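
To make the ordering problem concrete, here is a small sketch under assumed types. The flag `skipWhenQueueItemFinished` exists only in this example and toggles between the early-return behavior and the suggested behavior; the context shape, the `invocation_id` field, and the key format are likewise assumptions rather than the code in setEventListeners.tsx.

```typescript
type NodeRunStatus = 'IN_PROGRESS' | 'COMPLETED' | 'FAILED';

// Hypothetical, simplified payload for invocation_error.
type InvocationErrorEvent = { item_id: number; invocation_id: string };

type ListenerContext = {
  finishedQueueItemIds: Set<number>;
  completedInvocationKeys: Set<string>;
  statuses: Map<string, NodeRunStatus>;
};

const handleInvocationError = (
  ctx: ListenerContext,
  e: InvocationErrorEvent,
  opts: { skipWhenQueueItemFinished: boolean }
): void => {
  if (opts.skipWhenQueueItemFinished && ctx.finishedQueueItemIds.has(e.item_id)) {
    // Early return: a late error for a finished queue item is dropped.
    return;
  }
  const key = `${e.item_id}:${e.invocation_id}`;
  if (ctx.completedInvocationKeys.has(key)) {
    // A node that already completed keeps its terminal state.
    return;
  }
  ctx.statuses.set(key, 'FAILED');
};
```

If the queue item is marked finished before the error event arrives, the early-return variant leaves the node in IN_PROGRESS, while the variant without it records FAILED.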

Finally, there is a cosmetic issue: the test file for nodeExecutionState.ts is named nodeExecutionStateHelpers.test.ts, but the pattern elsewhere would be to name it nodeExecutionState.test.ts. I assume that at one point the code file was named ...Helpers.ts and was later renamed.

JPPhoto · 2026-04-25T00:22:13Z

@lstein Thanks, I think it's hardened now and also won't grow without bounds. The test has been renamed to match the repo convention.

JPPhoto requested review from Pfannkuchensack, blessedcoolant, dunkeroni and lstein as code owners April 8, 2026 01:20

JPPhoto added the frontend PRs that change frontend files label Apr 8, 2026

JPPhoto force-pushed the workflow-node-execution-event-ordering branch 9 times, most recently from 23fae17 to 894df8a on April 14, 2026 00:39

lstein self-assigned this Apr 14, 2026

lstein added the v6.13.x label Apr 14, 2026

lstein added this to Invoke - Community Roadmap Apr 14, 2026

lstein moved this to 6.13.x Theme: MODELS in Invoke - Community Roadmap Apr 14, 2026

JPPhoto force-pushed the workflow-node-execution-event-ordering branch 12 times, most recently from 057032b to 780663a on April 20, 2026 21:02

JPPhoto force-pushed the workflow-node-execution-event-ordering branch 2 times, most recently from 36894df to f2068cf on April 21, 2026 00:40

lstein reviewed Apr 21, 2026

JPPhoto force-pushed the workflow-node-execution-event-ordering branch from e0b1e7d to 4afe259 on April 21, 2026 02:00

JPPhoto force-pushed the workflow-node-execution-event-ordering branch from 4afe259 to e65a67f on April 22, 2026 00:56

JPPhoto force-pushed the workflow-node-execution-event-ordering branch from 0af9f1f to 46cc494 on April 22, 2026 01:32

JPPhoto added 3 commits April 23, 2026 10:58

fix(ui): stabilize workflow node execution state updates

94d4262

fix(ui): initialize workflow node execution state from events

bbf56e0

fix: make sure invocation_error arriving before execution-state initialization is processed correctly

814a95b

JPPhoto force-pushed the workflow-node-execution-event-ordering branch from 46cc494 to 814a95b on April 23, 2026 15:58

lstein requested changes Apr 24, 2026

fix(ui): harden workflow invocation event dedupe

bae30f7

JPPhoto requested a review from lstein April 25, 2026 00:21

JPPhoto wants to merge 4 commits into invoke-ai:main from JPPhoto:workflow-node-execution-event-ordering

2 participants