Skip to content

Fix spurious "Terminal needs approval" from Hermes pre_tool_call (#4985)#5010

Merged
austinywang merged 5 commits into
mainfrom
issue-4985-hermes-pre-tool-call-notif
May 30, 2026
Merged

Fix spurious "Terminal needs approval" from Hermes pre_tool_call (#4985)#5010
austinywang merged 5 commits into
mainfrom
issue-4985-hermes-pre-tool-call-notif

Conversation

@austinywang
Copy link
Copy Markdown
Contributor

@austinywang austinywang commented May 30, 2026

Problem

With the Hermes hook integration installed (cmux hooks hermes-agent install --yes), users occasionally see a "Terminal needs approval" desktop notification while the Hermes TUI shows no pending approval. Hermes' log shows shell hook timed out after 120s (event=pre_tool_call ...), confirming pre_tool_call is the trigger.

Root cause

CMUXCLI.classifyFeedEvent mapped Hermes pre_tool_call (a tool starting — no approval pending) to a blocking PermissionRequest whenever the tool was side-effecting (e.g. terminal, Bash). That made FeedCoordinator post the "needs approval" banner and block the hook for 120s. Hermes reserves a distinct pre_approval_request event for real approvals, so pre_tool_call should never be treated as one.

This is a class of bug, not a Hermes one-off. The heuristic "pre-tool event + side-effecting tool ⇒ approval" is correct only for agents whose only signal is the pre-tool event (gemini, copilot, factory, …). Agents that expose a dedicated approval event — Claude (PermissionRequest), Codex (PermissionRequest), Hermes (pre_approval_request) — must treat their pre-tool event as telemetry, or it double-counts.

Fix

Replaced the raw event-name switch ladders in classifyFeedEvent with an explicit, typed registry keyed on (source, event) → FeedEventSemantic:

  • Notification eligibility and the blocking wait are derived from the resolved semantic, never from string matching.
  • The camp distinction is encoded explicitly: .toolStart (dedicated-approval agents — always telemetry) vs .toolStartMaybeApproval (escalate side-effecting tools). Hermes pre_tool_call is now .toolStart.
  • Unknown / future event names are safe by default — they resolve to .unknown → non-actionable telemetry that never notifies.

pre_approval_request keeps firing the genuine approval notification through the dedicated notification hook subcommand path (unchanged); on the feed path it stays a non-blocking Notification to avoid a duplicate banner. All other agents' mappings are behavior-preserving for real agent payloads.

Tests

Two-commit red/green structure:

  1. FeedEventClassificationTests (Swift Testing) added first — fails on current code.
  2. The registry fix — turns it green.

Coverage: Hermes pre_tool_call/terminal/Bash are non-actionable; pre_approval_request/lifecycle events non-blocking; unknown future events safe; Claude PreToolUse does not escalate while PermissionRequest/ExitPlanMode/AskUserQuestion do; generic agents still escalate side-effecting PreToolUse; Codex PreToolUse is telemetry.

Fixes #4985

🤖 Generated with Claude Code


View with Codesmith Autofix with Codesmith
Need help on this PR? Tag @codesmith with what you need. Autofix is disabled.


Note

Medium Risk
Changes when Feed blocks and notifies across all agents; behavior is intended to be preserving except the Hermes fix, but mis-registry entries could suppress real approvals or miss escalations for generic agents.

Overview
Fixes spurious "Terminal needs approval" notifications and 120s hook blocks when Hermes emits pre_tool_call (tool start, not an approval) by replacing inline classifyFeedEvent switch logic in the CLI with shared FeedEventClassifier.

Classification now uses per-agent (source, event) → FeedEventSemantic tables, then maps semantics to wire hook_event_name and isActionable. Agents with a dedicated approval event (Claude/Codex PermissionRequest, Hermes pre_approval_request) treat pre-tool events as telemetry only (.toolStart); generic agents still escalate side-effecting pre-tool tools via .toolStartMaybeApproval. Unknown events default to non-actionable telemetry.

Adds FeedEventClassificationTests and compiles the classifier into both cmux-cli and cmuxTests so behavior is regression-tested without subprocess CLI runs.

Reviewed by Cursor Bugbot for commit 6d59f63. Bugbot is set up for automated code reviews on this repo. Configure here.


Summary by cubic

Stops false "Terminal needs approval" notifications from Hermes by treating pre_tool_call as telemetry, not an approval. Prevents the 120s hook block and aligns event handling across agents. Fixes #4985.

  • Bug Fixes

    • Replaced string-matching with a typed (source, event) → semantic registry; notifications and blocking now derive from semantics.
    • Mapped Hermes pre_tool_call to tool-start telemetry; kept pre_approval_request as a non‑blocking feed notification (real approval still via the notification path).
    • Defaulted unknown events to non‑actionable telemetry; beforeShellExecution (Codex) and other lifecycle hooks stay telemetry; tests cover Hermes, Claude, Codex, and generic agents.
  • Refactors

    • Extracted FeedEventClassifier into CLI/FeedEventClassifier.swift and use it from cmux.swift; compiled into both cmux-cli and tests so unit tests run in CI.
    • Scoped FeedEventSemantic to private and added a shared dedicatedApprovalEvent helper; expanded tests to cover dedicated-approval tool routing on generic pre‑tool paths.

Written for commit 6d59f63. Summary will update on new commits.

Review in cubic

Summary by CodeRabbit

  • Refactor

    • Reworked feed-event classification into a semantic-driven pipeline with per-agent mappings and centralized handling of notification names and actionability, making unknown events non-blocking telemetry by default and ensuring permission requests only appear when appropriate.
  • Tests

    • Added comprehensive tests verifying agent-specific mappings, side-effect vs read-only tool actionability, approval routing, and safe defaults for unknown events.

Review Change Stack

austinywang and others added 2 commits May 29, 2026 22:06
cmux's feed-event classifier maps Hermes `pre_tool_call` (a tool *starting*,
no approval pending) to a blocking PermissionRequest whenever the tool is
side-effecting (e.g. `terminal`). That fires a spurious "Terminal needs
approval" notification while the Hermes TUI shows nothing to approve. Hermes
reserves `pre_approval_request` for real approvals.

This commit adds the regression test only (no fix) so CI proves the test
catches the bug. classifyFeedEvent is made internal so it can be exercised
directly.

Refs #4985

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…_call

Replace the raw event-name pattern matching in classifyFeedEvent with an
explicit, typed registry keyed on (source, event) -> FeedEventSemantic.
Notification eligibility and the blocking Feed wait are derived from the
resolved semantic, never from string matching, so a tool-*starting*
lifecycle event can no longer be mistaken for an approval request.

The class of bug: the heuristic "pre-tool event + side-effecting tool =>
approval" is correct only for agents whose *only* signal is the pre-tool
event (gemini, copilot, …). Agents with a dedicated approval event (Claude
PermissionRequest, Codex PermissionRequest, Hermes pre_approval_request)
must treat their pre-tool event as telemetry, or it double-counts and fires
a spurious "needs approval" notification. Hermes `pre_tool_call` (tool
starting, no approval) was being escalated to PermissionRequest for
side-effecting tools like `terminal`, producing the false-positive banner
in #4985.

The registry encodes this distinction explicitly (toolStart vs
toolStartMaybeApproval) and makes unknown/future event names safe by
default: they resolve to non-actionable telemetry that never notifies.

Fixes #4985

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented May 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
cmux Ready Ready Preview, Comment May 30, 2026 7:09am
cmux-staging Building Building Preview, Comment May 30, 2026 7:09am

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

📝 Walkthrough

Walkthrough

Refactors feed-event classification into: (1) resolve a typed FeedEventSemantic for (source,event) via per-source registries with generic fallback, then (2) map that semantic plus toolName to the wire hook_event_name and isActionable decision. Adds tests and Xcode project registration.

Changes

Feed Event Classification Refactoring

Layer / File(s) Summary
Semantic model and lookup
CLI/FeedEventClassifier.swift
Adds file header, FeedEventClassifier.classify(source:event:toolName:), FeedEventSemantic, and feedEventSemantic(source:event:) with per-source registry + genericFeedEventSemantics fallback.
Wire mapping and registries
CLI/FeedEventClassifier.swift
Adds dedicatedApprovalEvent(for:), wireMapping(for:toolName:), agent-specific feedEventSemanticRegistry entries (claude, codex, hermes-agent), and sideEffectingTools used to determine actionable PermissionRequest escalation.
Classifier integration and cleanup
CLI/cmux.swift
Replaces the previous nested classifyFeedEvent implementation with a call to FeedEventClassifier.classify(...) and removes the old monolithic switch.
Test coverage and project setup
cmuxTests/FeedEventClassificationTests.swift, cmux.xcodeproj/project.pbxproj
Adds unit tests validating Hermes/Claude/Codex/generic-agent mappings (including Hermes pre_tool_call telemetry-only, pre_approval_request approval signal, and safe defaults) and registers source/test files in the Xcode project.

Sequence Diagram

sequenceDiagram
  participant Caller as Feed hook (cmux hooks feed)
  participant Classifier as FeedEventClassifier.classify
  participant Lookup as feedEventSemanticRegistry
  participant Mapper as wireMapping
  participant Output as Notification system

  Caller->>Classifier: (source, event, toolName)
  Classifier->>Lookup: resolve FeedEventSemantic
  Lookup-->>Classifier: FeedEventSemantic
  Classifier->>Mapper: semantic, toolName
  Mapper-->>Output: hook_event_name, isActionable
Loading

🎯 4 (Complex) | ⏱️ ~45 minutes

🐇 A rabbit codes with careful paws,
Mapping hooks to safe, well-typed laws.
Hermes whispers pre_tool_call—just a peep,
PermissionRequest wakes only when approvals leap.
Thump, thump — tests ensure no noisy beep.

🚥 Pre-merge checks | ✅ 18
✅ Passed checks (18 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main bug fix: preventing spurious 'Terminal needs approval' notifications triggered by Hermes pre_tool_call events.
Linked Issues check ✅ Passed The PR fully addresses issue #4985 by replacing event-name switches with a typed (source,event)→FeedEventSemantic registry, mapping Hermes pre_tool_call to non-actionable telemetry, keeping pre_approval_request as non-blocking, and defaulting unknown events to safe telemetry.
Out of Scope Changes check ✅ Passed All changes directly support the fix for issue #4985: refactoring classifyFeedEvent into a shared FeedEventClassifier, adding comprehensive tests covering Hermes/Claude/Codex/generic agents, and updating build configuration to wire tests into the target.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.
Cmux Swift Actor Isolation ✅ Passed FeedEventClassifier is a pure utility struct with only static immutable properties and methods. No mutable state, reference types, or MainActor issues. Test file appropriately excluded.
Cmux Swift Blocking Runtime ✅ Passed PR introduces pure deterministic event classification logic with no blocking runtime primitives (semaphores, sleeps, locks, polling, etc.) in production or test code.
Cmux No Hacky Sleeps ✅ Passed PR modifies only Swift files and Xcode project config, which are out of scope. Rule covers TypeScript, JavaScript, shell, and non-Swift build/runtime scripts; Swift timing is covered separately.
Cmux Algorithmic Complexity ✅ Passed FeedEventClassifier uses O(1) dictionary/Set lookups on small static collections (~36-50 items) for per-event classification, with no nested scans, filter/map/sort, or unbounded iteration.
Cmux Swift Concurrency ✅ Passed New files contain only pure synchronous static methods with Foundation imports—no DispatchQueue, Task, async/await, Combine, or completion handlers introduced.
Cmux Swift @Concurrent ✅ Passed PR adds only synchronous code. No async/await, @concurrent, or actor isolation violations detected in new or modified files.
Cmux Swift File And Package Boundaries ✅ Passed FeedEventClassifier (244 lines) has single responsibility, Foundation-only imports, proper CLI placement, and is testable independently without app lifecycle.
Cmux Swift Logging ✅ Passed New Swift production code (FeedEventClassifier.swift) contains no logging calls (print, debugPrint, dump, NSLog, Logger). Test file appropriately avoids logging. Complies with swift-logging.md rules.
Cmux User-Facing Error Privacy ✅ Passed No user-facing error/alert messages added. FeedEventClassifier.swift contains only internal wire protocol constants and agent/tool mappings for Feed RPC routing, not user display.
Cmux Full Internationalization ✅ Passed No new user-facing text added. FeedEventClassifier.swift contains only config/protocol identifiers and developer comments. Test and project files are allowed.
Cmux Swiftui State Layout ✅ Passed PR introduces only pure Swift logic: FeedEventClassifier struct, unit tests, and CLI refactoring. No SwiftUI Views, @State, @Published, @Observable, or state management patterns found.
Cmux Architecture Rethink ✅ Passed Pure correctness fix with single static owner (FeedEventClassifier), explicit named invariants in registry, no timing/mutable/observer patterns, single call site. Meets allowed criteria.
Cmux Swift Auxiliary Window Close Shortcuts ✅ Passed PR adds event classification logic only (FeedEventClassifier.swift, tests). No NSWindow, NSPanel, NSWindowController, SwiftUI Window, or WindowGroup code found.
Description check ✅ Passed The PR description comprehensively documents the problem, root cause, fix, and testing approach with clear technical detail.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue-4985-hermes-pre-tool-call-notif

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CLI/cmux.swift`:
- Around line 30239-30258: The switch branches for .approvalRequest and
.toolStartMaybeApproval duplicate the same special-case checks for
"ExitPlanMode" and "AskUserQuestion"; extract a small helper (e.g., a static
method like resolveSpecialToolName(toolName: String) -> (String, Bool) or a
private func specialToolResponse(toolName: String) -> (String, Bool)) and call
it from both cases, leaving the original branch logic to only handle the
default/path-specific behavior and the Self.sideEffectingTools check in
.toolStartMaybeApproval; update both case arms to delegate to this helper to
remove duplication.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e7316230-3339-48fd-b6ca-df2457ee9b16

📥 Commits

Reviewing files that changed from the base of the PR and between 239600b and 1538585.

📒 Files selected for processing (3)
  • CLI/cmux.swift
  • cmux.xcodeproj/project.pbxproj
  • cmuxTests/FeedEventClassificationTests.swift

Comment thread CLI/cmux.swift Outdated
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 30, 2026

Greptile Summary

Extracts classifyFeedEvent from cmux.swift into a standalone FeedEventClassifier struct driven by an explicit (source, event) → FeedEventSemantic registry. The core bug — Hermes pre_tool_call escalating to a blocking approval for side-effecting tools — is fixed by mapping it to .toolStart (always telemetry), while agents without a dedicated approval event continue to use .toolStartMaybeApproval so side-effecting PreToolUse events still escalate.

  • Bug fix: Hermes pre_tool_call is now permanently classified as telemetry; only the dedicated pre_approval_request path reaches the approval notification, eliminating the spurious "Terminal needs approval" banner and 120 s hook timeout.
  • Refactor: The raw string-matching switch ladders in classifyFeedEvent are replaced with a per-agent semantic registry; unknown/future event names default to non-actionable telemetry.
  • Tests: FeedEventClassificationTests (Swift Testing) covers Hermes, Claude, Codex, and generic agents, plus the safe-default invariant for unknown sources/events.

Confidence Score: 5/5

Safe to merge. The registry-driven refactor is behavior-preserving for all existing agents, and the Hermes fix is surgically correct.

All three registered agents (Claude, Codex, Hermes) have explicit, tested registry entries. The old switch ladders and the new registry produce identical results for every event combination that real agents emit. Unknown/future events now safely default to non-actionable telemetry rather than potentially mis-notifying. Test coverage is thorough and directly encodes the invariants that caused the original bug.

No files require special attention. The one subtle difference — dedicatedApprovalEvent now applies universally on the .approvalRequest path, not just for Claude — only matters for a generic agent that names a tool "ExitPlanMode" or "AskUserQuestion", which has no known real-world occurrence.

Important Files Changed

Filename Overview
CLI/FeedEventClassifier.swift New file extracting feed-event classification into a clean semantic registry; single responsibility, well-documented, all observable behaviors preserved or intentionally changed.
CLI/cmux.swift Call-site swap from Self.classifyFeedEvent(…) to FeedEventClassifier.classify(…); no logic changes, removes ~139 lines from the already large file.
cmuxTests/FeedEventClassificationTests.swift Comprehensive regression tests for the classification logic; covers the reported bug, dedicated-approval agents (Claude, Codex, Hermes), generic agents, and the safe-default invariant for unknown events.
cmux.xcodeproj/project.pbxproj Correctly registers FeedEventClassifier.swift in the CLI source group and wires it into both the cmux-cli target and the cmuxTests target via separate build-file references.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["FeedEventClassifier.classify(source, event, toolName)"] --> B["feedEventSemantic(source, event)"]
    B --> C{source in registry?}
    C -- yes --> D["agentTable[event] ?? .unknown"]
    C -- no --> E["genericFeedEventSemantics[event] ?? .unknown"]
    D --> F["FeedEventSemantic"]
    E --> F
    F --> G{semantic}
    G -- ".approvalRequest" --> H["dedicatedApprovalEvent(toolName)\nor PermissionRequest, true"]
    G -- ".toolStartMaybeApproval" --> I{sideEffecting tool?}
    I -- yes --> J["PermissionRequest, true"]
    I -- no --> K["PreToolUse, false"]
    G -- ".toolStart" --> K
    G -- ".toolEnd" --> L["PostToolUse, false"]
    G -- ".statusNotification" --> M["Notification, false"]
    G -- ".unknown" --> K
Loading

Reviews (4): Last reviewed commit: "Make Hermes feed-event regression test c..." | Re-trigger Greptile

Comment thread CLI/cmux.swift Outdated
DRY the ExitPlanMode / AskUserQuestion tool-name routing shared by the
.approvalRequest and .toolStartMaybeApproval wire mappings into a single
dedicatedApprovalEvent(for:) helper, and scope FeedEventSemantic to private.
Add coverage that the dedicated-approval tool names route correctly on the
generic pre-tool path and that Codex's beforeShellExecution stays telemetry.

Refs #4985

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…w feedback

The FeedEventClassificationTests regression test never compiled on CI.
`CMUXCLI.classifyFeedEvent` lives in the `cmux_cli` executable module, but the
test (hosted by `cmux.app`) imported `cmux_DEV` and could neither see nor link
the CLI tool's symbols. The `tests` job was failing at the cmuxTests *build*
step (`cannot find 'CMUXCLI' in scope`) before any test ran — so the PR's
two-commit red/green never actually validated the fix.

Extract the pure classification logic into `CLI/FeedEventClassifier.swift`
(`struct FeedEventClassifier`, single source of truth), compiled into both the
`cmux-cli` and `cmuxTests` targets. The decision is now a genuine compiled unit
test: 10 tests run and pass. `@testable import cmux_cli` is not an option — the
app-hosted test bundle cannot link the CLI executable's symbols (the same
reason the existing CLI test uses a subprocess).

Behavior is unchanged for production payloads. Also addresses review feedback
folded into the extracted file:
- `FeedEventSemantic` is now `private` (Greptile P2).
- `ExitPlanMode`/`AskUserQuestion` routing deduplicated via a shared
  `dedicatedApprovalEvent(for:)` helper (CodeRabbit).
- Added coverage for Codex `beforeShellExecution` (telemetry) and generic-agent
  dedicated-approval-tool routing through the `.toolStartMaybeApproval` branch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@austinywang
Copy link
Copy Markdown
Contributor Author

Fixed: the regression test never actually compiled on CI

While iterating on this PR I found the tests job was red — not from an assertion failure, but because cmuxTests failed to build:

FeedEventClassificationTests.swift:24: error: cannot find 'CMUXCLI' in scope

Root cause: CMUXCLI.classifyFeedEvent is compiled into the cmux_cli executable module (the CLI tool), but the test is hosted by cmux.app and imported cmux_DEV. The app-hosted test bundle can neither see nor link the CLI executable's symbols (demangling confirms the symbol is cmux_cli.CMUXCLI.classifyFeedEvent). @testable import cmux_cli won't work either — that's exactly why the only existing CLI test (CMUXCLIErrorOutputRegressionTests) drives the CLI as a subprocess. So the two-commit red/green never validated anything: the test failed to compile on both commits, before any assertion ran.

Fix (latest commit): extracted the pure classification logic into CLI/FeedEventClassifier.swift (struct FeedEventClassifier, single source of truth), compiled into both the cmux-cli and cmuxTests targets. The decision is now a genuine compiled unit test — 10 tests run and pass locally via xcodebuild -scheme cmux-unit (previously: Executed 0 tests, build failure). Behavior is unchanged for production payloads; the review-feedback refactor (dedicatedApprovalEvent(for:), private FeedEventSemantic) moved into that file. Verified the test genuinely catches the bug: classifying hermes-agent/pre_tool_call/terminal returns ("PreToolUse", false) — flipping the registry entry back to .toolStartMaybeApproval makes terminal actionable and fails the assertion.

🤖 Generated with Claude Code

@austinywang austinywang merged commit c45e431 into main May 30, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hermes pre_tool_call feed may surface as “Terminal needs approval” without pending approval

1 participant