feat: [DSM-103] Improve scheduler fairness#9985

Open
alin-at-dfinity wants to merge 16 commits into master from alin/DSM-103-scheduler-proptest

@alin-at-dfinity alin-at-dfinity commented Apr 22, 2026

This is a collection of scheduler improvements backported from the active canister scheduler dev branch, with the goal of having the scheduler efficiency proptest pass (with reduced success thresholds). Applying the proptest and all other RoundSchedule tests onto master also hugely reduces the size of that dev branch change.

  • Charge heap delta rate limited canisters, so that they actually skip execution rounds instead of being merely delayed while accumulating priority.
  • Charge immediately for the first round of every long execution (which was scheduled as a new execution).
  • Apply an exponential decay to AP outside the [-2000, 500] range (-2000 because of the maximum of 20 DTS rounds; the 500 is somewhat arbitrary, but seems to work well). Due to the interaction between long and short executions, runaway priorities are inevitable. The decay provides a soft bound for runaway AP, while still preserving relative priorities to some extent. It also makes priority resets unnecessary (to be removed later).
  • Fully distribute all positive free compute, even if it means exceeding 100 priority per canister (by distributing it equally to all canisters).
  • Allow long executions to use all scheduler cores when there are no new executions.
  • Fully segregate long and new executions across cores, to prevent inversion of priority when lower priority long executions get a slice executed.
  • Track long executions across iterations, not just the ones from the start of the round.
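
To illustrate the soft bound on AP described above, here is a minimal sketch, not the code from this PR. The `[-2000, 500]` range comes from the description; the function name `decay_accumulated_priority`, the integer representation of AP, and the 9/10 per-round decay factor are assumptions made purely for illustration.

```rust
/// Lower and upper AP bounds, taken from the PR description.
const AP_MIN: i64 = -2000;
const AP_MAX: i64 = 500;

/// Assumed decay factor (9/10 of the excess survives each round).
/// The PR does not state the actual factor; this is a placeholder.
const DECAY_NUM: i64 = 9;
const DECAY_DEN: i64 = 10;

/// Hypothetical helper: shrinks only the part of `ap` that lies outside
/// `[AP_MIN, AP_MAX]`, leaving in-range values untouched. Applied once per
/// round, the excess decays exponentially, so AP is softly bounded while
/// the relative order of out-of-range canisters is preserved.
fn decay_accumulated_priority(ap: i64) -> i64 {
    if ap > AP_MAX {
        AP_MAX + (ap - AP_MAX) * DECAY_NUM / DECAY_DEN
    } else if ap < AP_MIN {
        AP_MIN + (ap - AP_MIN) * DECAY_NUM / DECAY_DEN
    } else {
        ap
    }
}
```

Because the decay is applied to the excess rather than clamping to the bound, two canisters that are both above 500 keep their relative order, which matches the "soft bound" wording above.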

@github-actions github-actions Bot added the feat label Apr 22, 2026
Instead of a binary prioritized / opportunistic flag, explicitly (record and) prioritize long executions based on number of slices executed, AP and round when the long execution started. This ensures that we don't starve low priority canisters (which may happen with bounded AP and just the right distribution across execution cores).
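
As a rough sketch of what ordering long executions by these three criteria could look like (not the PR's implementation): the struct `LongExecution`, its field names, and the function `prioritize` are hypothetical. The commit message does not state the direction of each comparison, so the sketch assumes more executed rounds first, then higher AP, then earlier start round, with the canister ID as a final tie-breaker.

```rust
use std::cmp::Reverse;

/// Hypothetical record of an in-progress long execution.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LongExecution {
    canister_id: u64,
    /// Rounds in which this execution made progress.
    executed_rounds: u32,
    /// Accumulated priority (AP) of the canister.
    accumulated_priority: i64,
    /// Round in which the long execution started.
    start_round: u64,
}

/// Sorts long executions so the highest-priority one comes first.
/// Assumed ordering (not confirmed by the PR): more executed rounds,
/// then higher AP, then earlier start round, then lower canister ID.
fn prioritize(mut executions: Vec<LongExecution>) -> Vec<LongExecution> {
    executions.sort_by_key(|e| {
        (
            Reverse(e.executed_rounds),
            Reverse(e.accumulated_priority),
            e.start_round,
            e.canister_id,
        )
    });
    executions
}
```

Using a total order over several fields, rather than a binary flag, means every long execution has a well-defined position in the queue, which is what prevents a low priority canister from being skipped indefinitely.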

Also switch from persisting SubnetSchedule spread across individual canister states to persisting it as part of the subnet's SystemMetadata.
These are some scheduler improvements backported from the active canister scheduler dev branch, with the goal of having the scheduler efficiency proptest pass (with reduced success thresholds).
 * Fully segregate long and new executions across cores, to prevent inversion of priority when lower priority long executions get a slice executed.
 * Track long executions across iterations, not just the ones from the start of the round.
 * Charge heap delta rate limited canisters, so that they are actually skipped instead of just delayed while accumulating priority.
 * Charge immediately for the first round of every long execution (scheduled as a new execution).
 * Apply an exponential decay to AP outside the `[-2000, 500]` range (`-2000` because of 20 max DTS rounds, the `500` is somewhat arbitrary, but seems to work well). Due to the interaction between long and short executions, runaway priorities are inevitable. This provides a soft bound for runaway AP, while still preserving relative priorities to some extent. It also makes priority resets unnecessary (to be removed separately).
 * Fully distribute all positive free compute, even if it means exceeding 100 priority per canister (by distributing it equally to all canisters).
 * Allow long executions to use all scheduler cores if there are no new executions.
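
The "fully distribute all positive free compute" item can be sketched as follows. This is a simplified illustration under assumptions, not the scheduler's code: the function name `distribute_free_compute`, the integer AP representation, and the way the remainder is handed out are all hypothetical.

```rust
/// Hypothetical helper: spreads `free_compute` equally over all canisters'
/// accumulated priorities. No per-canister cap is applied, so a canister may
/// receive more than 100 in a single round. Any remainder from the integer
/// division goes to the first few canisters, one unit each, so that the
/// full amount is distributed.
fn distribute_free_compute(priorities: &mut [i64], free_compute: i64) {
    // Only positive free compute is distributed.
    if priorities.is_empty() || free_compute <= 0 {
        return;
    }
    let n = priorities.len() as i64;
    let share = free_compute / n;
    let remainder = (free_compute % n) as usize;
    for (i, ap) in priorities.iter_mut().enumerate() {
        let extra = if i < remainder { 1 } else { 0 };
        *ap += share + extra;
    }
}
```

For example, distributing 400 over three canisters gives each a share of 133 (plus 1 for the first canister), which exceeds 100 per canister but leaves no free compute undistributed.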
@alin-at-dfinity alin-at-dfinity force-pushed the alin/DSM-103-scheduler-proptest branch from b86e911 to ac9fba3 Compare April 24, 2026 15:55
@alin-at-dfinity alin-at-dfinity changed the base branch from master to alin/DSM-103-long-execution-priority-queue April 24, 2026 15:55
@alin-at-dfinity alin-at-dfinity changed the title feat: [DSM-103] Scheduler efficiency proptest feat: [DSM-103] Improve scheduler fairness Apr 24, 2026
@alin-at-dfinity alin-at-dfinity marked this pull request as ready for review April 24, 2026 16:01
@alin-at-dfinity alin-at-dfinity requested a review from a team as a code owner April 24, 2026 16:01
Base automatically changed from alin/DSM-103-long-execution-priority-queue to master April 27, 2026 09:53
… bail out before the latter if there are no active canisters.
…ction. And only do so once per round.

* Tweak the per_canister_cap calculation so we always end up with sum(AP) >= 0.
* Simplify long execution core calculation.
* Improve some comments.
In field names and comments, replace "priority credit" and "executed slices" with "executed rounds". "Priority credit" was the old mechanism, now replaced. And "executed slices" is a misnomer: what we're actually counting is rounds during which a long execution made progress, not the number of slices executed (multiple slices might be executed in any given round, but we charge for rounds, not slices).
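
The distinction between slices and rounds can be shown with a small sketch. The type `LongExecutionProgress` and its fields are hypothetical names chosen for illustration and are not taken from the PR; the only behavior borrowed from the commit message is that a round counts once, no matter how many slices ran in it.

```rust
/// Hypothetical progress tracker for one long execution.
#[derive(Debug, Default)]
struct LongExecutionProgress {
    /// Number of rounds in which at least one slice was executed.
    executed_rounds: u32,
    /// Last round that was already counted, if any.
    last_counted_round: Option<u64>,
}

impl LongExecutionProgress {
    /// Records that a slice ran in `round`. Returns `true` if this is the
    /// first slice of that round (the round is counted and would be
    /// charged), and `false` for any further slices in the same round.
    fn record_slice(&mut self, round: u64) -> bool {
        if self.last_counted_round == Some(round) {
            return false;
        }
        self.last_counted_round = Some(round);
        self.executed_rounds += 1;
        true
    }
}
```

With this shape, two slices in round 7 followed by one slice in round 8 yield `executed_rounds == 2`, not 3.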
…harging for in-progress executions; and CanisterRoundState ordering.