fix(cie): apply SCHED_DEADLINE immediately using SCHED_FLAG_RESET_ON_FORK #48

Merged
atsushi421 merged 4 commits into main from fix/apply-deadline-immediately on Apr 17, 2026

Conversation

atsushi421 (Collaborator) commented Apr 17, 2026

Description

Previously, SCHED_DEADLINE configurations were applied via a delayed, interactive workflow: the configurator waited for a user prompt ("Apply sched deadline?") before applying them. This was a workaround because SCHED_DEADLINE threads could not call fork(2)/clone(2) without triggering EAGAIN, which caused issues with nodes that spawn child threads at startup (e.g., EKF Localizer).

This PR eliminates the delayed application by setting SCHED_FLAG_RESET_ON_FORK in sched_attr.sched_flags for SCHED_DEADLINE threads. With this flag, children of SCHED_DEADLINE threads reset to SCHED_OTHER, allowing fork/clone to succeed. All scheduling policies, including SCHED_DEADLINE, are now applied immediately upon receiving callback group or non-ROS thread info.

Key changes:

  • Set SCHED_FLAG_RESET_ON_FORK for SCHED_DEADLINE threads in issue_syscalls()
  • Remove the deadline_configs_ vector and delayed application logic
  • Remove exist_deadline_config() and apply_deadline_configs() methods; add has_cgroup() const
  • Remove the interactive "Apply sched deadline?" prompt from main; show success message immediately
  • Add RCLCPP_WARN logs when syscall application fails for a thread
  • Update README to reflect the simplified workflow (no more delayed SCHED_DEADLINE step)
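
The first key change can be sketched in isolation as follows. This is a minimal illustration, not the code from `issue_syscalls()`: the `DeadlineAttr` struct and the `make_deadline_attr` and `apply_deadline` helpers are hypothetical names, and the struct is assumed to mirror the kernel's original 48-byte `struct sched_attr` layout. The raw syscall is used because older C libraries ship no `sched_setattr()` wrapper.

```cpp
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>

// Hypothetical mirror of the kernel's original 48-byte sched_attr layout.
// It uses its own name so it cannot clash with a C library that already
// provides struct sched_attr.
struct DeadlineAttr {
  std::uint32_t size;
  std::uint32_t sched_policy;
  std::uint64_t sched_flags;
  std::int32_t sched_nice;
  std::uint32_t sched_priority;
  std::uint64_t sched_runtime;   // nanoseconds
  std::uint64_t sched_deadline;  // nanoseconds
  std::uint64_t sched_period;    // nanoseconds
};
static_assert(sizeof(DeadlineAttr) == 48, "unexpected sched_attr layout");

constexpr std::uint32_t kSchedDeadline = 6;            // SCHED_DEADLINE
constexpr std::uint64_t kSchedFlagResetOnFork = 0x01;  // SCHED_FLAG_RESET_ON_FORK

// Builds the attribute block with reset-on-fork set, so that children created
// by the target thread fall back to the default policy.
DeadlineAttr make_deadline_attr(std::uint64_t runtime_ns,
                                std::uint64_t deadline_ns,
                                std::uint64_t period_ns) {
  DeadlineAttr attr;
  std::memset(&attr, 0, sizeof(attr));
  attr.size = static_cast<std::uint32_t>(sizeof(attr));
  attr.sched_policy = kSchedDeadline;
  attr.sched_flags = kSchedFlagResetOnFork;
  attr.sched_runtime = runtime_ns;
  attr.sched_deadline = deadline_ns;
  attr.sched_period = period_ns;
  return attr;
}

// Issues the raw sched_setattr syscall for thread `tid` (0 = calling thread).
// Returns 0 on success, or -1 with errno set on failure.
int apply_deadline(pid_t tid, const DeadlineAttr& attr) {
  return static_cast<int>(syscall(SYS_sched_setattr, tid, &attr, 0u));
}
```

A caller would pass values satisfying runtime <= deadline <= period, for example `apply_deadline(tid, make_deadline_attr(1000000, 5000000, 10000000))`. The call normally requires elevated privileges (such as `CAP_SYS_NICE`), so the return value should always be checked.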

Related links

Ported from autowarefoundation/agnocast#1248

How was this PR tested?

Notes for reviewers

…FORK

Previously, SCHED_DEADLINE configurations were delayed because threads
with this policy could not call fork(2)/clone(2), causing EAGAIN errors
in nodes that spawn child threads at startup (e.g., EKF Localizer).

This commit sets SCHED_FLAG_RESET_ON_FORK in sched_attr.sched_flags,
which allows SCHED_DEADLINE threads to call fork/clone (children reset
to SCHED_OTHER). This eliminates the need for the interactive
"Apply sched deadline?" prompt and delayed application logic.

Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
Update README to explain that the "Press enter to exit and remove
cgroups..." prompt only appears when SCHED_DEADLINE threads with
CPU affinity (cgroup-based) are configured.

Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
@atsushi421 atsushi421 marked this pull request as ready for review April 17, 2026 23:12
@atsushi421 atsushi421 requested a review from Copilot April 17, 2026 23:13
Copilot AI (Contributor) left a comment

Pull request overview

This PR removes the delayed/interactive application of SCHED_DEADLINE by setting SCHED_FLAG_RESET_ON_FORK, allowing fork(2)/clone(2) to succeed from SCHED_DEADLINE threads, and applies all scheduling policies immediately when thread info is received.

Changes:

  • Apply SCHED_DEADLINE immediately by setting SCHED_FLAG_RESET_ON_FORK in the sched_setattr() path.
  • Remove delayed SCHED_DEADLINE buffering/application APIs and simplify the main workflow.
  • Update user-facing logs and README to reflect the simplified runtime flow.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `cie_thread_configurator/src/thread_configurator_node_main.cpp` | Removes the interactive “Apply sched deadline?” step; only blocks for cgroup cleanup. |
| `cie_thread_configurator/src/thread_configurator_node.cpp` | Sets SCHED_FLAG_RESET_ON_FORK for SCHED_DEADLINE; removes delayed application list; adds warnings on syscall failure. |
| `cie_thread_configurator/include/cie_thread_configurator/thread_configurator_node.hpp` | Removes delayed-apply APIs/state; adds has_cgroup() const. |
| `README.md` | Removes delayed SCHED_DEADLINE workflow documentation and updates runtime instructions. |

Comments suppressed due to low confidence (1)

cie_thread_configurator/src/thread_configurator_node.cpp:283

  • sched_setattr() has historically required sched_flags to be 0 on some kernels (and the manpage still documents restrictions), so setting SCHED_FLAG_RESET_ON_FORK may cause sched_setattr to fail with EINVAL on those systems. Since this is intended as a compatibility improvement, consider a fallback path: try with SCHED_FLAG_RESET_ON_FORK, and if the syscall fails specifically due to unsupported flags, retry with sched_flags = 0 and emit a clear warning that RESET_ON_FORK is not supported (and that fork/clone from SCHED_DEADLINE threads may fail).
    struct sched_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    // SCHED_FLAG_RESET_ON_FORK lets the target thread still call
    // fork(2)/clone(2) after being placed under SCHED_DEADLINE; without it,
    // clone(2) returns EAGAIN. Children reset to SCHED_OTHER; each
    // callback-group thread that needs its own SCHED_DEADLINE gets it via its
    // own CallbackGroupInfo message.
    attr.sched_flags = SCHED_FLAG_RESET_ON_FORK;
    attr.sched_nice = 0;
    attr.sched_priority = 0;

    attr.sched_policy = SCHED_DEADLINE;
    attr.sched_runtime = config.runtime;
    attr.sched_period = config.period;
    attr.sched_deadline = config.deadline;

    if (sched_setattr(config.thread_id, &attr, 0) == -1) {
      RCLCPP_ERROR(
          this->get_logger(), "Failed to configure policy (id=%s, tid=%ld): %s",
          config.thread_str.c_str(), config.thread_id, strerror(errno));
      return false;
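
The fallback path suggested in this suppressed comment could look roughly like the sketch below. It is only an illustration of the reviewer's idea, not code from this PR: `DeadlineAttr`, `ApplyResult`, `SetAttrFn`, and `apply_with_fallback` are hypothetical names, and the syscall is injected as a callable so the retry logic can be exercised without scheduling privileges.

```cpp
#include <cerrno>
#include <cstdint>
#include <functional>

constexpr std::uint64_t kSchedFlagResetOnFork = 0x01;  // SCHED_FLAG_RESET_ON_FORK

// Hypothetical stand-in for the kernel's original sched_attr layout.
struct DeadlineAttr {
  std::uint32_t size;
  std::uint32_t sched_policy;
  std::uint64_t sched_flags;
  std::int32_t sched_nice;
  std::uint32_t sched_priority;
  std::uint64_t sched_runtime;
  std::uint64_t sched_deadline;
  std::uint64_t sched_period;
};

enum class ApplyResult { kAppliedWithReset, kAppliedWithoutReset, kFailed };

// Stands in for the real sched_setattr call: returns 0 on success,
// or -1 with errno set on failure.
using SetAttrFn = std::function<int(const DeadlineAttr&)>;

// Tries reset-on-fork first; if the kernel answers EINVAL, retries once with
// sched_flags = 0. Any other error is reported as a plain failure.
ApplyResult apply_with_fallback(DeadlineAttr attr, const SetAttrFn& set_attr) {
  attr.sched_flags = kSchedFlagResetOnFork;
  if (set_attr(attr) == 0) {
    return ApplyResult::kAppliedWithReset;
  }
  if (errno != EINVAL) {
    return ApplyResult::kFailed;
  }
  // EINVAL can also mean invalid runtime/deadline/period values; the retry
  // without the flag distinguishes the two cases.
  attr.sched_flags = 0;
  if (set_attr(attr) == 0) {
    return ApplyResult::kAppliedWithoutReset;
  }
  return ApplyResult::kFailed;
}
```

On `kAppliedWithoutReset`, a caller would be expected to log a warning that fork/clone from the SCHED_DEADLINE thread may fail, as the comment recommends.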

Comment thread cie_thread_configurator/src/thread_configurator_node.cpp
Comment thread cie_thread_configurator/src/thread_configurator_node_main.cpp Outdated
Comment thread README.md Outdated
Comment thread cie_thread_configurator/src/thread_configurator_node.cpp
- Mark config as applied even on syscall failure so the node can
  terminate instead of spinning indefinitely.
- Fix misleading cgroup cleanup prompt wording to match the actual
  has_cgroup() condition.
- Update README to accurately describe the configurator's exit behavior
  (exits after all configs applied, not after target app finishes).

Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
Restore the early return in callback_group_callback and
non_ros_thread_callback when issue_syscalls fails.

Signed-off-by: atsushi421 <atsushi.yano.2@tier4.jp>
@atsushi421 atsushi421 enabled auto-merge April 17, 2026 23:26
@atsushi421 atsushi421 merged commit 0d942a6 into main Apr 17, 2026
3 checks passed
@atsushi421 atsushi421 deleted the fix/apply-deadline-immediately branch April 17, 2026 23:31