feat: Client-tool execution + permission system#3370
feat: Client-tool execution + permission system#3370BinaryMuse wants to merge 50 commits intomainfrom
Conversation
…oundary This change extracts four new abstractions and sets a clear architectural rule that will guide subsequent refactoring: **Mutation boundary (Phase 1a):** AppState no longer owns an mpsc::Sender. State is passive data; the event loop (or stream task) is the only thing that emits events. handle_client_tool_call is now a pure mutation — the CheckToolCallPermission event is sent by the caller. **AppContext (Phase 1b):** Session-scoped API configuration (endpoint, token, send_cwd) is now a single Clone struct instead of three individually-cloned bindings in the event loop. **ClientContext + ChatRequest (Phase 1c):** Machine identity (OS, shell, distro) is computed once per session via ClientContext::detect() instead of on every SSE request. ChatRequest wraps the per-turn message/session payload, replacing the inline request body construction in create_chat_stream. **ToolDescriptor (Phase 1d):** Centralizes tool metadata — canonical names, display verbs, progressive/past verbs, and client/server classification — into static descriptors. Replaces four separate name-to-text match sites and fixes a bug where is_client was checked against 'file_read' (wrong) instead of 'read_file' (correct). Also fixes several clippy warnings in modified code.
…rames **State decomposition (Phase 2a):** AppState is replaced by three types with clear ownership: - Conversation: owns the event log and session_id. All pure query and event-manipulation methods (events_to_messages, current_command, has_any_command, etc.) move here. - Interaction: owns ephemeral UI state (mode, is_input_blank, confirmation_pending, streaming_status, error). - Session: the top-level type containing conversation, interaction, pending_tool_calls, exit_action, and stream_abort. Cross-cutting lifecycle methods (start_streaming, cancel_streaming, add_tool_call, etc.) stay here. **Dispatch extraction (Phase 2b):** The 400-line match event in the spawn_blocking loop is now a 5-line dispatch call. All 12 handlers are named functions in a new tui/dispatch.rs module. inline.rs shrinks from ~640 lines to ~240 lines. **Stream launch centralization (Phase 2c):** The duplicated 8-line stream launch ritual (present in ContinueAfterTools, SubmitInput, and Retry) is replaced by a single launch_stream function that takes a setup callback for pre-work. Each handler collapses to one line. **Stream frame split (Phase 2d):** ChatStreamEvent is replaced by StreamFrame with explicit Content (TextChunk, ToolCall, ToolResult) and Control (Done, Error, StatusChanged) variants. run_chat_stream now dispatches on frame type, with apply_content_frame and apply_control_frame as separate functions.
Extracts three abstractions that untangle the ~180-line on_check_tool_permission handler: **PermissionResolver (permissions/resolver.rs):** Composes the PermissionWalker + PermissionChecker into a single new/check API. The handler no longer imports Walker, Checker, or Request directly. **ToolOutcome + ClientToolCall::execute (tools/mod.rs):** Each tool variant owns its execution logic. ReadToolCall::execute handles both directory listing and file reading (previously only directory listing worked — file reading was a no-op). Returns ToolOutcome::Success or ToolOutcome::Error, replacing the inline match arms. **PendingToolCall state transitions (tools/mod.rs):** mark_asking(), mark_executing(), mark_denied() methods formalize the state machine instead of direct enum variant assignment. **Session::complete_tool_call (tui/state.rs):** Combines add_tool_result + pending_tool_calls.retain into one method, replacing the repeated cleanup pattern in the handler. The handler drops from ~180 lines to ~60 lines.
Add the history database to AppContext and plumb it through dispatch so client tools can run async database queries. AtuinHistory::execute uses atuin-client's Database::search with fuzzy matching, the first filter mode from the tool call, and configurable limit. Also fix two pre-existing bugs that prevented client tools from working end-to-end: AtuinHistory::matches_rule had a todo!() panic that crashed the permission check task, and on_select_permission Allow was discarding the tool call instead of executing it.
Format results with local timezone timestamp and human-readable duration (e.g. 3s, 1m23s, 120ms) alongside command, cwd, and exit code.
Parse shell commands with tree-sitter-bash (POSIX family) and tree-sitter-fish to extract all subcommands from compound expressions (&&, ||, pipes, subshells, $(...), etc). Wire into ShellToolCall::matches_rule for scope matching. Scope matching supports three wildcard styles: - `ls *` (space before *): word-boundary match - `ls*` (no space): prefix/glob match - `git * amend` (middle *): matches zero+ words between segments Also fixes: variable assignments excluded from command text, fallback parser double-split bug, and shell field parsed from tool call input for correct parser selection.
Add 77 adversarial tests covering nested substitutions, variable assignments, control flow, redirections, subshells, background jobs, real-world commands, fish-specific syntax, and scope matching edge cases. Fix: remove `concatenation` from BASH_LEAVES so that subcommands inside argument concatenations (e.g. `make -j$(nproc)`) are properly extracted. Known limitations verified by tests: - `find -exec` body is opaque to tree-sitter (not parsed as commands) - `[` in test conditions is not extracted as a command - `eval`/`exec`/`source` argument bodies are not recursively parsed
Shell tool calls now execute locally with a streaming VT100 preview in the TUI. The full stdout and stderr are captured separately and sent to the LLM as structured results with exit code and duration. Key changes: - ToolOutcome::Structured variant with separated stdout/stderr/exit code/duration - execute_shell_command_streaming uses vt100 crate for ANSI/progress bar handling - Shared execute_tool dispatch eliminates duplicated shell execution paths - begin_tool_call/finish_tool_call for ToolCall persistence in chat output - Ctrl+C interrupts running commands instead of exiting the app - Viewport component in eye_declare for fixed-height tail rendering
Replace the split state model (pending_tool_calls VecDeque + preview field on ConversationEvent::ToolCall) with a unified ToolTracker that owns each tool call through its full lifecycle, including after completion. This eliminates the preview copy-back dance, the two-place preview lookup, and the TurnBuilder second-pass update_previews step. Key changes: - New ToolTracker/TrackedTool/ToolPhase types replace PendingToolCall/ToolCallState - ConversationEvent::ToolCall drops its preview field (now purely API-facing) - shell_abort_tx moves from Session to TrackedTool.abort_tx (per-tool, not per-session) - TurnBuilder takes &ToolTracker reference, looks up previews inline - Fix spinner not animating during shell preview (work around eye_declare interval reset by computing frame from system clock) - Fix word wrapping in shell output preview (use truncation instead of word-boundary wrapping for VT100 content) - Use multi-thread tokio runtime for AI commands
# Conflicts: # crates/atuin-ai/src/commands/inline.rs
|
@greptileai Please review this draft |
Greptile SummaryAdds client-side tool execution to Atuin AI with a file-based permission system, decomposing the previous monolithic Confidence Score: 5/5Safe to merge; remaining findings are non-blocking P2 style/consistency issues on non-default code paths. All remaining comments are P2: the WRITE descriptor/dispatch mismatch only manifests when write capability is explicitly enabled (not default), and filter_modes truncation is a minor schema clarity issue. Core paths (atuin_history, permission flows, capability gating) look correct. crates/atuin-ai/src/tools/descriptor.rs and crates/atuin-ai/src/tools/mod.rs (WRITE tool dispatch gap) Important Files Changed
Reviews (5): Last reviewed commit: "and I type for a living" | Re-trigger Greptile |
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
Summary
Adds client-side tool execution to Atuin AI, starting with
atuin_history. The server can request tool calls, which are executed locally with a permission system, and results are sent back to continue the conversation.Architecture
inline.rsintostream.rs(SSE framing),dispatch.rs(event handling),state.rs(domain state),tools/(tool types + execution), andpermissions/(rule resolution)pending_tool_callsandConversationEventpreview fields. Tracks each tool from permission check through completion, including cached previews for shell command outputgit *allows all git subcommands). Supports auto-allow, deny, and interactive ask with an inline selection UIViewportwidget, with interrupt support (Ctrl+C)Ships with
atuin_historytool active by default — lets the AI search your command historyai.openingsettings (send_cwd, send_last_command)ai.capabilitiessettings to control the capabilities advertised to the serverai.enabledisNone, app will prompt to enable, disable, or cancel. Disabling releases?keybind.