feat: [DSM-103] Actual priority queue for long executions, take 2 #10034

Merged
alin-at-dfinity merged 4 commits into master from alin/DSM-103-long-execution-priority-queue
Apr 28, 2026

@alin-at-dfinity
Contributor

Instead of a binary prioritized / opportunistic flag, explicitly record and prioritize long executions based on the number of slices executed, the accumulated priority (AP), and the round in which the long execution started. This ensures that we don't starve low-priority canisters (which may happen with bounded AP and just the right distribution across execution cores).

Also switch from persisting `SubnetSchedule` spread across individual canister states to persisting it as part of the subnet's `SystemMetadata`.

 * Stop populating `SystemMetadata::subnet_schedule` on "subnet B" during a split for the purpose of serializing it into individual `canister.pbuf` files; it is now written to `system_metadata.pbuf`.
 * Stop writing `CanisterPriority` into every `canister.pbuf`, since we no longer read it from there.
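
For illustration only, the ordering described above could be sketched roughly as follows. This is not the code from this PR: `LongExecutionKey`, its fields, and `schedule_order` are hypothetical names, and the comparison direction (more slices executed first, then higher AP, then earlier start round, with the canister ID as a final tie-breaker) is an assumption made for the example.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical key for one in-progress long execution (not the PR's actual type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LongExecutionKey {
    canister_id: u64,
    slices_executed: u32,
    accumulated_priority: i64,
    start_round: u64,
}

impl Ord for LongExecutionKey {
    fn cmp(&self, other: &Self) -> Ordering {
        // `BinaryHeap` is a max-heap, so "greater" means "scheduled first".
        // Assumed direction: more slices executed, then higher AP.
        self.slices_executed
            .cmp(&other.slices_executed)
            .then(self.accumulated_priority.cmp(&other.accumulated_priority))
            // Assumed: an earlier start round wins, so this comparison is reversed.
            .then(other.start_round.cmp(&self.start_round))
            // Lower canister ID wins ties, keeping the order total and deterministic.
            .then(other.canister_id.cmp(&self.canister_id))
    }
}

impl PartialOrd for LongExecutionKey {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Returns canister IDs in the order they would be popped from the queue.
fn schedule_order(keys: &[LongExecutionKey]) -> Vec<u64> {
    let mut heap: BinaryHeap<LongExecutionKey> = keys.iter().copied().collect();
    let mut order = Vec::with_capacity(heap.len());
    while let Some(key) = heap.pop() {
        order.push(key.canister_id);
    }
    order
}
```

Under these assumptions, a long execution with more completed slices is popped ahead of one with a higher AP but fewer slices. That is one way an explicit multi-field key can avoid the starvation a single binary flag allows.
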
@alin-at-dfinity alin-at-dfinity requested a review from a team as a code owner April 27, 2026 15:08
@github-actions github-actions Bot added the feat label Apr 27, 2026
@alin-at-dfinity
Contributor Author

The first version of this change broke `//rs/tests/consensus:subnet_splitting_test_colocate`, so it was rolled back (#10030).

I'm resubmitting it with the fix, plus necessary adjustments to other tests.

@alin-at-dfinity alin-at-dfinity added the CI_ALL_BAZEL_TARGETS (Runs all bazel targets) label Apr 27, 2026
@alin-at-dfinity alin-at-dfinity removed the CI_ALL_BAZEL_TARGETS (Runs all bazel targets) label Apr 28, 2026
@alin-at-dfinity alin-at-dfinity added this pull request to the merge queue Apr 28, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 28, 2026
@alin-at-dfinity alin-at-dfinity added this pull request to the merge queue Apr 28, 2026
Merged via the queue into master with commit 7095a2a Apr 28, 2026
37 checks passed
@alin-at-dfinity alin-at-dfinity deleted the alin/DSM-103-long-execution-priority-queue branch April 28, 2026 11:36