fix(cie): apply SCHED_DEADLINE immediately using SCHED_FLAG_RESET_ON_FORK #48
Merged
atsushi421 merged 4 commits into main on Apr 17, 2026
…FORK Previously, SCHED_DEADLINE configurations were delayed because threads with this policy could not call fork(2)/clone(2), causing EAGAIN errors in nodes that spawn child threads at startup (e.g., EKF Localizer). This commit sets SCHED_FLAG_RESET_ON_FORK in sched_attr.sched_flags, which allows SCHED_DEADLINE threads to call fork/clone (children reset to SCHED_OTHER). This eliminates the need for the interactive "Apply sched deadline?" prompt and delayed application logic. Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
Update README to explain that the "Press enter to exit and remove cgroups..." prompt only appears when SCHED_DEADLINE threads with CPU affinity (cgroup-based) are configured. Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
Pull request overview
This PR removes the delayed/interactive application of SCHED_DEADLINE by setting SCHED_FLAG_RESET_ON_FORK, allowing fork(2)/clone(2) to succeed from SCHED_DEADLINE threads, and applies all scheduling policies immediately when thread info is received.
Changes:
- Apply `SCHED_DEADLINE` immediately by setting `SCHED_FLAG_RESET_ON_FORK` in the `sched_setattr()` path.
- Remove delayed `SCHED_DEADLINE` buffering/application APIs and simplify the main workflow.
- Update user-facing logs and README to reflect the simplified runtime flow.
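As a rough illustration of the first change, the sketch below fills a deadline attribute with the reset-on-fork flag and hands it to the raw `sched_setattr` syscall. It is not the PR's code: `DeadlineAttr`, `make_deadline_attr`, and `set_deadline` are hypothetical names, the struct is a local mirror of the kernel's first-version `sched_attr` layout, and going through `syscall(2)` is an assumption for toolchains whose libc offers no wrapper.

```cpp
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>

// Local mirror of the kernel's struct sched_attr (first-version layout).
// Hypothetical name, chosen to avoid clashing with a libc-provided definition.
struct DeadlineAttr
{
  uint32_t size;
  uint32_t sched_policy;
  uint64_t sched_flags;
  int32_t sched_nice;
  uint32_t sched_priority;
  uint64_t sched_runtime;   // nanoseconds
  uint64_t sched_deadline;  // nanoseconds
  uint64_t sched_period;    // nanoseconds
};
static_assert(sizeof(DeadlineAttr) == 48, "unexpected sched_attr layout");

constexpr uint32_t kSchedDeadline = 6;            // SCHED_DEADLINE
constexpr uint64_t kSchedFlagResetOnFork = 0x01;  // SCHED_FLAG_RESET_ON_FORK

// Build an attribute that requests SCHED_DEADLINE with reset-on-fork, so
// children created by the target thread fall back to SCHED_OTHER.
DeadlineAttr make_deadline_attr(uint64_t runtime_ns, uint64_t deadline_ns, uint64_t period_ns)
{
  // The kernel expects runtime <= deadline <= period (a zero period means
  // "same as deadline").
  assert(runtime_ns <= deadline_ns);
  assert(period_ns == 0 || deadline_ns <= period_ns);

  DeadlineAttr attr{};
  attr.size = sizeof(attr);
  attr.sched_policy = kSchedDeadline;
  attr.sched_flags = kSchedFlagResetOnFork;
  attr.sched_runtime = runtime_ns;
  attr.sched_deadline = deadline_ns;
  attr.sched_period = period_ns;
  return attr;
}

// Apply the attribute to thread `tid`. Returns 0 on success, -1 with errno set.
long set_deadline(pid_t tid, const DeadlineAttr & attr)
{
  return syscall(SYS_sched_setattr, tid, &attr, 0U);
}
```

Calling `set_deadline` typically needs `CAP_SYS_NICE` (or root), and the kernel's admission control can still reject the request, so the return value should always be checked.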
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| cie_thread_configurator/src/thread_configurator_node_main.cpp | Removes the interactive “Apply sched deadline?” step; only blocks for cgroup cleanup. |
| cie_thread_configurator/src/thread_configurator_node.cpp | Sets SCHED_FLAG_RESET_ON_FORK for SCHED_DEADLINE; removes delayed application list; adds warnings on syscall failure. |
| cie_thread_configurator/include/cie_thread_configurator/thread_configurator_node.hpp | Removes delayed-apply APIs/state; adds has_cgroup() const. |
| README.md | Removes delayed SCHED_DEADLINE workflow documentation and updates runtime instructions. |
Comments suppressed due to low confidence (1)
cie_thread_configurator/src/thread_configurator_node.cpp:283
`sched_setattr()` has historically required `sched_flags` to be 0 on some kernels (and the manpage still documents restrictions), so setting `SCHED_FLAG_RESET_ON_FORK` may cause `sched_setattr` to fail with `EINVAL` on those systems. Since this is intended as a compatibility improvement, consider a fallback path: try with `SCHED_FLAG_RESET_ON_FORK`, and if the syscall fails specifically due to unsupported flags, retry with `sched_flags = 0` and emit a clear warning that RESET_ON_FORK is not supported (and that fork/clone from SCHED_DEADLINE threads may fail).
    struct sched_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    // SCHED_FLAG_RESET_ON_FORK lets the target thread still call
    // fork(2)/clone(2) after being placed under SCHED_DEADLINE; without it,
    // clone(2) returns EAGAIN. Children reset to SCHED_OTHER; each
    // callback-group thread that needs its own SCHED_DEADLINE gets it via its
    // own CallbackGroupInfo message.
    attr.sched_flags = SCHED_FLAG_RESET_ON_FORK;
    attr.sched_nice = 0;
    attr.sched_priority = 0;
    attr.sched_policy = SCHED_DEADLINE;
    attr.sched_runtime = config.runtime;
    attr.sched_period = config.period;
    attr.sched_deadline = config.deadline;
    if (sched_setattr(config.thread_id, &attr, 0) == -1) {
      RCLCPP_ERROR(
        this->get_logger(), "Failed to configure policy (id=%s, tid=%ld): %s",
        config.thread_str.c_str(), config.thread_id, strerror(errno));
      return false;
    }
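The fallback path suggested in the suppressed comment could look roughly like the sketch below. It is only an illustration under stated assumptions, not code from this PR: `DeadlineAttr`, `SetAttrFn`, `ApplyResult`, and `apply_deadline_with_fallback` are hypothetical names, and the syscall is injected as a callable so the retry logic can be exercised without privileges. Note that `EINVAL` can also indicate invalid runtime/deadline/period values, so retrying on it is a heuristic.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <functional>

// Local mirror of the kernel's struct sched_attr (hypothetical name).
struct DeadlineAttr
{
  uint32_t size;
  uint32_t sched_policy;
  uint64_t sched_flags;
  int32_t sched_nice;
  uint32_t sched_priority;
  uint64_t sched_runtime;
  uint64_t sched_deadline;
  uint64_t sched_period;
};

constexpr uint64_t kSchedFlagResetOnFork = 0x01;  // SCHED_FLAG_RESET_ON_FORK

// Callable that performs the actual syscall: returns 0 on success, or -1 with
// errno set, mirroring sched_setattr(2).
using SetAttrFn = std::function<int(DeadlineAttr &)>;

enum class ApplyResult { AppliedWithReset, AppliedWithoutReset, Failed };

ApplyResult apply_deadline_with_fallback(DeadlineAttr attr, const SetAttrFn & set_attr)
{
  assert(set_attr);  // a syscall callable must be provided

  // First attempt: request reset-on-fork so fork(2)/clone(2) keep working.
  attr.sched_flags = kSchedFlagResetOnFork;
  if (set_attr(attr) == 0) {
    return ApplyResult::AppliedWithReset;
  }
  if (errno != EINVAL) {
    return ApplyResult::Failed;  // e.g. EPERM or EBUSY: retrying will not help
  }

  // Second attempt: plain SCHED_DEADLINE, with a clear warning.
  std::fprintf(
    stderr,
    "warning: SCHED_FLAG_RESET_ON_FORK rejected; retrying with sched_flags = 0 "
    "(fork/clone from this thread may fail)\n");
  attr.sched_flags = 0;
  return set_attr(attr) == 0 ? ApplyResult::AppliedWithoutReset : ApplyResult::Failed;
}
```

In real use the callable would wrap the `sched_setattr` call for a specific thread id; returning a three-state result lets the caller log whether reset-on-fork is actually in effect.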
- Mark config as applied even on syscall failure so the node can terminate instead of spinning indefinitely.
- Fix misleading cgroup cleanup prompt wording to match the actual has_cgroup() condition.
- Update README to accurately describe the configurator's exit behavior (exits after all configs applied, not after target app finishes).

Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
Restore the early return in callback_group_callback and non_ros_thread_callback when issue_syscalls fails. Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
Description
Previously, `SCHED_DEADLINE` configurations were applied via a delayed, interactive workflow: the configurator waited for a user prompt ("Apply sched deadline?") before applying them. This was a workaround because `SCHED_DEADLINE` threads could not call `fork(2)`/`clone(2)` without triggering `EAGAIN`, which caused issues with nodes that spawn child threads at startup (e.g., EKF Localizer).

This PR eliminates the delayed application by setting `SCHED_FLAG_RESET_ON_FORK` in `sched_attr.sched_flags` for `SCHED_DEADLINE` threads. With this flag, children of `SCHED_DEADLINE` threads reset to `SCHED_OTHER`, allowing `fork`/`clone` to succeed. All scheduling policies, including `SCHED_DEADLINE`, are now applied immediately upon receiving callback group or non-ROS thread info.

Key changes:
- Set `SCHED_FLAG_RESET_ON_FORK` for `SCHED_DEADLINE` threads in `issue_syscalls()`
- Remove the `deadline_configs_` vector and delayed application logic
- Remove `exist_deadline_config()` and `apply_deadline_configs()` methods; add `has_cgroup() const`
- Add `RCLCPP_WARN` logs when syscall application fails for a thread

Related links
port from autowarefoundation/agnocast#1248
How was this PR tested?
Notes for reviewers