feat: Client-tool execution + permission system#3370

Open
BinaryMuse wants to merge 50 commits into main from mkt/tool-perms

@BinaryMuse BinaryMuse commented Mar 31, 2026

Summary

Adds client-side tool execution to Atuin AI, starting with atuin_history. The server can request tool calls, which are executed locally with a permission system, and results are sent back to continue the conversation.

Architecture

  • Modular extraction: Decomposed the monolithic inline.rs into stream.rs (SSE framing), dispatch.rs (event handling), state.rs (domain state), tools/ (tool types + execution), and permissions/ (rule resolution)
  • ToolTracker: Single source of truth for tool execution lifecycle — replaces the previous split between pending_tool_calls and ConversationEvent preview fields. Tracks each tool from permission check through completion, including cached previews for shell command output
  • Permission system: File-based permission rules with tree-sitter shell command parsing for scoped matching (e.g. git * allows all git subcommands). Supports auto-allow, deny, and interactive ask with an inline selection UI
  • Shell command execution: VT100 terminal emulator for live output preview in a Viewport widget, with interrupt support (Ctrl+C)

Ships with

  • atuin_history tool active by default — lets the AI search your command history
  • Other tools (read_file, execute_shell_command, str_replace, file_create, file_insert) are defined but gated behind capability flags
  • Permission denied flows surface errors inline on the tool call rather than as separate system messages
  • Markdown list rendering fix (items no longer break across lines)
  • Integrates main's new Hub auth flow and ai.opening settings (send_cwd, send_last_command)
  • New ai.capabilities settings to control the capabilities advertised to the server
  • If ai.enabled is None, the app prompts the user to enable, disable, or cancel. Disabling releases the ? keybind.

…oundary

This change extracts four new abstractions and sets a clear architectural
rule that will guide subsequent refactoring:

**Mutation boundary (Phase 1a):** AppState no longer owns an
mpsc::Sender. State is passive data; the event loop (or stream task) is
the only thing that emits events. handle_client_tool_call is now a pure
mutation — the CheckToolCallPermission event is sent by the caller.
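
A minimal sketch of the mutation boundary described above. Only `handle_client_tool_call` and `CheckToolCallPermission` are names from this PR; the `State` struct, its `pending` field, the event payload, and `on_tool_call` are hypothetical stand-ins, not the actual code.

```rust
use std::sync::mpsc;

/// Hypothetical event type; only the variant name comes from the PR text.
#[derive(Debug, PartialEq)]
enum Event {
    CheckToolCallPermission { tool_call_id: String },
}

/// Passive state: plain data, no channel sender stored inside.
#[derive(Default)]
struct State {
    pending: Vec<String>,
}

impl State {
    /// Pure mutation: records the tool call and emits nothing.
    fn handle_client_tool_call(&mut self, tool_call_id: &str) {
        self.pending.push(tool_call_id.to_string());
    }
}

/// The caller (event loop or stream task) owns the sender and emits the event.
fn on_tool_call(state: &mut State, tx: &mpsc::Sender<Event>, id: &str) {
    state.handle_client_tool_call(id);
    // Send errors are ignored in this sketch; a real loop would handle a closed channel.
    let _ = tx.send(Event::CheckToolCallPermission {
        tool_call_id: id.to_string(),
    });
}
```

Keeping the sender out of the state type means state methods can be unit-tested without constructing a channel.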

**AppContext (Phase 1b):** Session-scoped API configuration (endpoint,
token, send_cwd) is now a single Clone struct instead of three
individually-cloned bindings in the event loop.

**ClientContext + ChatRequest (Phase 1c):** Machine identity (OS, shell,
distro) is computed once per session via ClientContext::detect() instead
of on every SSE request. ChatRequest wraps the per-turn message/session
payload, replacing the inline request body construction in
create_chat_stream.

**ToolDescriptor (Phase 1d):** Centralizes tool metadata — canonical
names, display verbs, progressive/past verbs, and client/server
classification — into static descriptors. Replaces four separate
name-to-text match sites and fixes a bug where is_client was checked
against 'file_read' (wrong) instead of 'read_file' (correct).
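
For illustration, static descriptors along these lines would give one lookup site for names and verbs. The field names, the verb strings, and the `lookup` helper are assumptions made for this sketch; only the tool names `read_file` and `atuin_history` come from the PR.

```rust
/// Hypothetical descriptor shape; field names are assumptions.
#[derive(Debug)]
struct ToolDescriptor {
    canonical_names: &'static [&'static str],
    display_verb: &'static str,
    progressive_verb: &'static str,
    past_verb: &'static str,
    is_client: bool,
}

static READ: ToolDescriptor = ToolDescriptor {
    canonical_names: &["read_file"],
    display_verb: "Read",
    progressive_verb: "Reading",
    past_verb: "Read",
    is_client: true,
};

static HISTORY: ToolDescriptor = ToolDescriptor {
    canonical_names: &["atuin_history"],
    display_verb: "Search history",
    progressive_verb: "Searching",
    past_verb: "Searched",
    is_client: true,
};

static ALL: &[&ToolDescriptor] = &[&READ, &HISTORY];

/// Single lookup site: a misspelled name such as "file_read" finds nothing
/// instead of silently disagreeing with another match arm.
fn lookup(name: &str) -> Option<&'static ToolDescriptor> {
    ALL.iter()
        .copied()
        .find(|d| d.canonical_names.contains(&name))
}
```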

Also fixes several clippy warnings in modified code.
…rames

**State decomposition (Phase 2a):** AppState is replaced by three types
with clear ownership:

- Conversation: owns the event log and session_id. All pure query
  and event-manipulation methods (events_to_messages, current_command,
  has_any_command, etc.) move here.
- Interaction: owns ephemeral UI state (mode, is_input_blank,
  confirmation_pending, streaming_status, error).
- Session: the top-level type containing conversation, interaction,
  pending_tool_calls, exit_action, and stream_abort. Cross-cutting
  lifecycle methods (start_streaming, cancel_streaming, add_tool_call,
  etc.) stay here.

**Dispatch extraction (Phase 2b):** The 400-line match event in the
spawn_blocking loop is now a 5-line dispatch call. All 12 handlers are
named functions in a new tui/dispatch.rs module. inline.rs shrinks
from ~640 lines to ~240 lines.

**Stream launch centralization (Phase 2c):** The duplicated 8-line
stream launch ritual (present in ContinueAfterTools, SubmitInput, and
Retry) is replaced by a single launch_stream function that takes a
setup callback for pre-work. Each handler collapses to one line.

**Stream frame split (Phase 2d):** ChatStreamEvent is replaced by
StreamFrame with explicit Content (TextChunk, ToolCall, ToolResult)
and Control (Done, Error, StatusChanged) variants. run_chat_stream
now dispatches on frame type, with apply_content_frame and
apply_control_frame as separate functions.
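
A rough sketch of the Content/Control split. The variant names follow the description above; the payload fields, the `Transcript` accumulator, and `apply_frame` are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ContentFrame {
    TextChunk(String),
    ToolCall { id: String, name: String },
    ToolResult { id: String, output: String },
}

#[derive(Debug, Clone, PartialEq)]
enum ControlFrame {
    Done,
    Error(String),
    StatusChanged(String),
}

#[derive(Debug, Clone, PartialEq)]
enum StreamFrame {
    Content(ContentFrame),
    Control(ControlFrame),
}

/// Hypothetical accumulator standing in for the real conversation state.
#[derive(Debug, Default)]
struct Transcript {
    text: String,
    tool_calls: Vec<String>,
    tool_results: Vec<(String, String)>,
    status: Option<String>,
    error: Option<String>,
    done: bool,
}

/// Content frames only ever append conversation data.
fn apply_content_frame(t: &mut Transcript, frame: ContentFrame) {
    match frame {
        ContentFrame::TextChunk(chunk) => t.text.push_str(&chunk),
        ContentFrame::ToolCall { name, .. } => t.tool_calls.push(name),
        ContentFrame::ToolResult { id, output } => t.tool_results.push((id, output)),
    }
}

/// Control frames only ever change stream lifecycle state.
fn apply_control_frame(t: &mut Transcript, frame: ControlFrame) {
    match frame {
        ControlFrame::Done => t.done = true,
        ControlFrame::Error(message) => t.error = Some(message),
        ControlFrame::StatusChanged(status) => t.status = Some(status),
    }
}

/// Dispatch on frame type, mirroring what run_chat_stream is described as doing.
fn apply_frame(t: &mut Transcript, frame: StreamFrame) {
    match frame {
        StreamFrame::Content(c) => apply_content_frame(t, c),
        StreamFrame::Control(c) => apply_control_frame(t, c),
    }
}
```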

Extracts three abstractions that untangle the ~180-line
on_check_tool_permission handler:

**PermissionResolver (permissions/resolver.rs):** Composes the
PermissionWalker + PermissionChecker into a single new/check API.
The handler no longer imports Walker, Checker, or Request directly.

**ToolOutcome + ClientToolCall::execute (tools/mod.rs):** Each tool
variant owns its execution logic. ReadToolCall::execute handles both
directory listing and file reading (previously only directory listing
worked — file reading was a no-op). Returns ToolOutcome::Success or
ToolOutcome::Error, replacing the inline match arms.

**PendingToolCall state transitions (tools/mod.rs):** mark_asking(),
mark_executing(), mark_denied() methods formalize the state machine
instead of direct enum variant assignment.

**Session::complete_tool_call (tui/state.rs):** Combines
add_tool_result + pending_tool_calls.retain into one method,
replacing the repeated cleanup pattern in the handler.

The handler drops from ~180 lines to ~60 lines.
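
As a sketch of the last two items: the method names `mark_asking`, `mark_executing`, `mark_denied`, and `complete_tool_call` come from this commit, while the state variants, the struct fields, and the result storage are assumptions.

```rust
use std::collections::VecDeque;

/// Hypothetical state set; the PR only names the transition methods.
#[derive(Debug, Clone, PartialEq)]
enum ToolCallState {
    CheckingPermission,
    Asking,
    Executing,
    Denied,
}

#[derive(Debug)]
struct PendingToolCall {
    id: String,
    state: ToolCallState,
}

impl PendingToolCall {
    fn new(id: &str) -> Self {
        Self { id: id.to_string(), state: ToolCallState::CheckingPermission }
    }

    // Transitions go through named methods rather than direct assignment.
    fn mark_asking(&mut self) {
        self.state = ToolCallState::Asking;
    }

    fn mark_executing(&mut self) {
        self.state = ToolCallState::Executing;
    }

    fn mark_denied(&mut self) {
        self.state = ToolCallState::Denied;
    }
}

#[derive(Debug, Default)]
struct Session {
    pending_tool_calls: VecDeque<PendingToolCall>,
    tool_results: Vec<(String, String)>,
}

impl Session {
    /// Records the result and drops the pending entry in one step.
    fn complete_tool_call(&mut self, id: &str, output: &str) {
        self.tool_results.push((id.to_string(), output.to_string()));
        self.pending_tool_calls.retain(|call| call.id != id);
    }
}
```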

Add the history database to AppContext and plumb it through dispatch
so client tools can run async database queries. AtuinHistory::execute
uses atuin-client's Database::search with fuzzy matching, the first
filter mode from the tool call, and configurable limit.

Also fix two pre-existing bugs that prevented client tools from
working end-to-end: AtuinHistory::matches_rule had a todo!() panic
that crashed the permission check task, and on_select_permission
Allow was discarding the tool call instead of executing it.

Format results with local timezone timestamp and human-readable
duration (e.g. 3s, 1m23s, 120ms) alongside command, cwd, and exit
code.
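
One way the duration formatting could look, matching the three example shapes above. The function name and the thresholds are assumptions, not the PR's implementation.

```rust
use std::time::Duration;

/// Formats a duration as e.g. "120ms", "3s", or "1m23s".
/// Sub-second values show milliseconds; seconds are truncated, not rounded.
/// Durations of an hour or more are still shown in minutes in this sketch.
fn format_duration(d: Duration) -> String {
    let ms = d.as_millis();
    if ms < 1_000 {
        format!("{ms}ms")
    } else if ms < 60_000 {
        format!("{}s", ms / 1_000)
    } else {
        let secs = ms / 1_000;
        format!("{}m{}s", secs / 60, secs % 60)
    }
}
```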

Parse shell commands with tree-sitter-bash (POSIX family) and
tree-sitter-fish to extract all subcommands from compound
expressions (&&, ||, pipes, subshells, $(...), etc). Wire into
ShellToolCall::matches_rule for scope matching.

Scope matching supports three wildcard styles:
- `ls *` (space before *): word-boundary match
- `ls*` (no space): prefix/glob match
- `git * amend` (middle *): matches zero+ words between segments
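
The three wildcard styles can be approximated with a small word-based matcher. This is a simplified sketch that splits on whitespace instead of using the tree-sitter parse, and it assumes a trailing ` *` also matches the bare command; the function names are hypothetical.

```rust
/// Returns true if `command` falls within the scope `pattern`.
fn scope_matches(pattern: &str, command: &str) -> bool {
    let pat: Vec<&str> = pattern.split_whitespace().collect();
    let cmd: Vec<&str> = command.split_whitespace().collect();
    match_words(&pat, &cmd)
}

fn match_words(pat: &[&str], cmd: &[&str]) -> bool {
    // Pattern exhausted: match only if the command is exhausted too.
    let Some((&p, rest)) = pat.split_first() else {
        return cmd.is_empty();
    };
    // Standalone `*` (space before it): consume zero or more whole words.
    if p == "*" {
        return (0..=cmd.len()).any(|n| match_words(rest, &cmd[n..]));
    }
    let Some((&w, tail)) = cmd.split_first() else {
        return false;
    };
    match p.strip_suffix('*') {
        // Glued `*` (no space): prefix match on this word; as the final
        // pattern word it also accepts any remaining arguments.
        Some(prefix) => w.starts_with(prefix) && (rest.is_empty() || match_words(rest, tail)),
        // Plain word: exact match required.
        None => w == p && match_words(rest, tail),
    }
}
```

Under these assumptions `ls *` accepts `ls -la` but rejects `lsof`, `ls*` accepts both, and `git * amend` accepts `git amend` as well as `git commit amend`.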

Also fixes: variable assignments excluded from command text,
fallback parser double-split bug, and shell field parsed from
tool call input for correct parser selection.

Add 77 adversarial tests covering nested substitutions, variable
assignments, control flow, redirections, subshells, background jobs,
real-world commands, fish-specific syntax, and scope matching edge
cases.

Fix: remove `concatenation` from BASH_LEAVES so that subcommands
inside argument concatenations (e.g. `make -j$(nproc)`) are properly
extracted.

Known limitations verified by tests:
- `find -exec` body is opaque to tree-sitter (not parsed as commands)
- `[` in test conditions is not extracted as a command
- `eval`/`exec`/`source` argument bodies are not recursively parsed

Shell tool calls now execute locally with a streaming VT100 preview in
the TUI. The full stdout and stderr are captured separately and sent to
the LLM as structured results with exit code and duration.

Key changes:
- ToolOutcome::Structured variant with separated stdout/stderr/exit code/duration
- execute_shell_command_streaming uses vt100 crate for ANSI/progress bar handling
- Shared execute_tool dispatch eliminates duplicated shell execution paths
- begin_tool_call/finish_tool_call for ToolCall persistence in chat output
- Ctrl+C interrupts running commands instead of exiting the app
- Viewport component in eye_declare for fixed-height tail rendering
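
To show the shape of a structured result with separate stdout, stderr, exit code, and duration, here is a non-streaming sketch using only the standard library. It assumes a POSIX `sh` is available and does not use the vt100 crate or the TUI preview; `ShellResult` and `run_shell` are hypothetical names.

```rust
use std::io;
use std::process::Command;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the data carried by ToolOutcome::Structured.
#[derive(Debug)]
struct ShellResult {
    stdout: String,
    stderr: String,
    /// None when the process was terminated by a signal.
    exit_code: Option<i32>,
    duration: Duration,
}

/// Runs `command` through `sh -c`, capturing stdout and stderr separately.
fn run_shell(command: &str) -> io::Result<ShellResult> {
    let started = Instant::now();
    let output = Command::new("sh").arg("-c").arg(command).output()?;
    Ok(ShellResult {
        stdout: String::from_utf8_lossy(&output.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&output.stderr).into_owned(),
        exit_code: output.status.code(),
        duration: started.elapsed(),
    })
}
```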

Replace the split state model (pending_tool_calls VecDeque + preview
field on ConversationEvent::ToolCall) with a unified ToolTracker that
owns each tool call through its full lifecycle, including after
completion. This eliminates the preview copy-back dance, the two-place
preview lookup, and the TurnBuilder second-pass update_previews step.

Key changes:
- New ToolTracker/TrackedTool/ToolPhase types replace PendingToolCall/ToolCallState
- ConversationEvent::ToolCall drops its preview field (now purely API-facing)
- shell_abort_tx moves from Session to TrackedTool.abort_tx (per-tool, not per-session)
- TurnBuilder takes &ToolTracker reference, looks up previews inline
- Fix spinner not animating during shell preview (work around eye_declare
  interval reset by computing frame from system clock)
- Fix word wrapping in shell output preview (use truncation instead of
  word-boundary wrapping for VT100 content)
- Use multi-thread tokio runtime for AI commands
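
The spinner workaround can be sketched as deriving the frame from wall-clock time, so that a reset of the render interval does not restart the animation. The frame glyphs, the 80 ms frame length, and the function names are assumptions.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

const FRAMES: [char; 10] = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'];
const FRAME_MS: u128 = 80;

/// Maps a point in time (as a duration since the epoch) to a frame index.
/// The result depends only on the clock, not on how many ticks have fired.
fn frame_index_at(since_epoch: Duration) -> usize {
    ((since_epoch.as_millis() / FRAME_MS) % FRAMES.len() as u128) as usize
}

/// Frame to draw right now; falls back to the first frame if the clock
/// reports a time before the epoch.
fn current_frame() -> char {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default();
    FRAMES[frame_index_at(now)]
}
```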
# Conflicts:
#	crates/atuin-ai/src/commands/inline.rs
@BinaryMuse BinaryMuse changed the title from "feat: Permission system for client tool calls" to "feat: Client-tool execution + permission system" on Apr 9, 2026
@BinaryMuse

@greptileai Please review this draft

greptile-apps bot commented Apr 9, 2026

Greptile Summary

Adds client-side tool execution to Atuin AI with a file-based permission system, decomposing the previous monolithic inline.rs into focused modules (stream.rs, dispatch.rs, state.rs, tools/, permissions/). The atuin_history tool is active by default; shell/file tools are gated behind capability flags. Previously flagged issues (capability gating, AlwaysAllow stubs, todo!() in permission matching) appear resolved in this revision.

Confidence Score: 5/5

Safe to merge; remaining findings are non-blocking P2 style/consistency issues on non-default code paths.

All remaining comments are P2: the WRITE descriptor/dispatch mismatch only manifests when write capability is explicitly enabled (not default), and filter_modes truncation is a minor schema clarity issue. Core paths (atuin_history, permission flows, capability gating) look correct.

crates/atuin-ai/src/tools/descriptor.rs and crates/atuin-ai/src/tools/mod.rs (WRITE tool dispatch gap)

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/atuin-ai/src/tools/mod.rs | Core tool execution: ReadToolCall, WriteToolCall, ShellToolCall, AtuinHistoryToolCall. filter_modes Vec truncated to first element; otherwise well-structured with proper permission hooks. |
| crates/atuin-ai/src/tools/descriptor.rs | WRITE descriptor lists str_replace/file_create/file_insert as canonical names but try_from only handles create_file, a mismatch when write capability is enabled. |
| crates/atuin-ai/src/tui/dispatch.rs | Event dispatch for all TUI actions. AlwaysAllowInDir/AlwaysAllow now write rules and execute tools correctly. Permission flows look sound. |
| crates/atuin-ai/src/stream.rs | SSE framing and capability enforcement. Capability gating now rejects un-advertised tools at the receive layer with a tool error, closing the previously flagged gap. |
| crates/atuin-ai/src/permissions/check.rs | Permission check logic: ask → deny → allow priority per file, deepest-first. Defaults to Ask when no rule matches. Clean. |
| crates/atuin-ai/src/permissions/writer.rs | Rule persistence using toml_edit. Correctly handles inline tables, de-duplication, and creates parent dirs. Well-tested. |
| crates/atuin-ai/src/tui/state.rs | Domain state: Conversation, Interaction, Session. events_to_messages serialization logic is correct; streaming lifecycle transitions look sound. |
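
The check order noted for permissions/check.rs (ask, then deny, then allow within each file, deepest file first, defaulting to Ask) could be sketched as follows. The `RuleFile` layout and the exact-name matching are simplifying assumptions; the real rules use scoped matching.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Decision {
    Ask,
    Deny,
    Allow,
}

/// Hypothetical in-memory view of one permission file.
struct RuleFile {
    ask: Vec<String>,
    deny: Vec<String>,
    allow: Vec<String>,
}

/// Exact-name match only; a stand-in for the real scope matching.
fn hit(rules: &[String], tool: &str) -> bool {
    rules.iter().any(|rule| rule == tool)
}

/// `files` must be ordered deepest directory first. The first file with any
/// matching rule decides, checking ask before deny before allow.
fn check(files: &[RuleFile], tool: &str) -> Decision {
    for file in files {
        if hit(&file.ask, tool) {
            return Decision::Ask;
        }
        if hit(&file.deny, tool) {
            return Decision::Deny;
        }
        if hit(&file.allow, tool) {
            return Decision::Allow;
        }
    }
    // No rule matched anywhere: fall back to asking the user.
    Decision::Ask
}
```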

Reviews (5): Last reviewed commit: "and I type for a living"

@atuinsh atuinsh deleted a comment from greptile-apps bot Apr 10, 2026
socket-security bot commented Apr 10, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Updated | cargo/eye_declare@0.3.0 ⏵ 0.4.0 | 100 | 100 | 93 | 100 | 100 |

@BinaryMuse BinaryMuse marked this pull request as ready for review April 10, 2026 13:49
@BinaryMuse BinaryMuse requested a review from ellie April 10, 2026 13:49