agent-cacher sits between AI coding agents and slow CLI commands: it records every call to a SQLite store and lets the agent decide when to reuse a recent result instead of re-running.
AI coding agents (Claude Code, Codex, Aider, etc.) call shell commands constantly. Some of those commands are slow:
| Command | Cold cost | Reasonable reuse window |
|---|---|---|
| `system_profiler SPFontsDataType` | 10–30s | hours/days |
| `osascript -e 'tell app "Calendar"...'` | 1–3s | minutes |
| `gh api repos/...` | 200–800ms | seconds/minutes |
| `find / -name ...` | unbounded | depends |
A human runs each of these once. An AI agent re-runs them every conversation, every session, every restart. The cumulative tax shows up as latency in every reply.
agent-cacher records every shell-command invocation (the wrapper still runs the real command — always) and exposes the history through both a CLI and an MCP server. The agent can look up "did I already run system_profiler SPFontsDataType recently? what was the SHA256? when?" and decide for itself whether to re-run or fetch the cached output.
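That decision step can be as small as comparing the recorded timestamp against a tolerance the agent picks per command. A sketch under assumptions: the timestamp here is faked inline, where a real agent would parse it out of `cacher lookup ... --json` (whose exact schema is not shown in this README):

```shell
# Sketch: decide whether a recorded call is fresh enough to reuse.
# Pretend the agent already extracted a Unix timestamp from `cacher lookup --json`.
last_run=$(( $(date +%s) - 2*3600 ))   # fake: recorded 2 hours ago
age=$(( $(date +%s) - last_run ))
tolerance=$(( 6*3600 ))                # fonts barely change; 6h is plenty for fonts
if [ "$age" -le "$tolerance" ]; then
  decision="reuse"                     # fetch the cached bytes instead of re-running
else
  decision="re-run"                    # too old for this command; run it again
fi
echo "$decision"                       # prints: reuse
```

The tolerance is the agent's call per command, which is the whole point: nothing in agent-cacher enforces it.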
- Explicit lookup, not transparent intercept — the wrapper always runs the real command and records the result. The agent decides whether to query the cache and reuse a previous output via the CLI or MCP tools (`cache.lookup`, `cache.fetch`). Mode A (PATH shadowing → return cache directly) was rejected because stale-cache hits become un-traceable bugs in the agent's reasoning chain.
- Honest about staleness — every recorded call carries a timestamp, duration, exit code, and a SHA256 of the captured stdout. The agent reads those and judges freshness like a human reading a `last modified` column.
- Append-only history — calls are never updated or deleted. Output bytes are deduplicated by SHA256, so the same `system_profiler` output recorded 100 times consumes the same disk as recording it once.
- Freshness lives in the agent, not the system — no TTL config, no FSEvents, no AI policy daemon. The LLM agent looking at the timestamp + diff already knows that fonts don't change in 3 hours and `gh api` does in 30 seconds.
- Measure before optimizing — `analyzers/analyze_calls.py` mines Claude Code transcripts to find hot/slow commands worth wrapping.
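The "wrappers are loggers" idea can be sketched in a few lines of shell. This is an illustrative stand-in, not the real `wrapper-template.sh`: the wrapped binary and the hand-off to `cacher record-internal` are faked, but the shape (always run, then record) is the point:

```shell
#!/usr/bin/env bash
# Illustrative wrapper sketch (not the actual wrapper-template.sh):
# always run the underlying command, then note what a recorder would capture.
real_cmd=(uname -s)                    # stand-in for a wrapped binary like `gh api`
start=$(date +%s)
out=$("${real_cmd[@]}")                # the real command always runs
status=$?
duration=$(( $(date +%s) - start ))
sha=$(printf '%s' "$out" | shasum -a 256 | awk '{print $1}')
# A real wrapper would hand these fields to `cacher record-internal`;
# here we just print the record to stderr and pass the output through untouched.
echo "recorded: exit=$status duration=${duration}s sha256=$sha" >&2
printf '%s\n' "$out"
```

Because the wrapper never short-circuits, a broken recorder degrades to "slow but correct", never to "fast but stale".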
Earlier drafts of this repo described a transparent-intercept design (Mode A): the wrapper would shadow the slow command on PATH and, on a cache hit, return the saved output without running the underlying command. We pivoted to Mode B (explicit lookup) because:
- Mode A's worst failure is a silent stale hit — the agent gets old data, reasons on it, and the bug is invisible because the cache is invisible.
- LLM agents already think about freshness when given the data ("3 hours ago, output 4.2 MB, sha256 abc…"). Pushing TTLs back to the LLM beats pre-baking them.
- Wrappers stay simple: they're loggers, not policy engines.
The `proposal.md` and `design.md` for this pivot live under `openspec/changes/explicit-cache-lookup/`.
```
agent-cacher/
├── Package.swift
├── Sources/
│   ├── AgentCacherCore/       # SQLite schema, store, query, fingerprint, hash
│   ├── AgentCacherCLI/        # `cacher` binary — record-internal, lookup, fetch, recent, diff
│   ├── AgentCacherMCP/        # `cacher-mcp` stdio MCP server
│   └── AgentCacherWrappers/   # wrapper-template.sh + sample wrappers
├── Tests/AgentCacherCoreTests/  # XCTest suite (schema, store, query, CLI e2e, MCP parity, wrapper)
├── analyzers/                 # Python scripts that mine Claude Code transcripts
└── openspec/                  # Spec change proposals + specs (source of truth)
```
```shell
# 1. Build
swift build -c release

# 2. Install the CLI
ln -sf "$PWD/.build/release/cacher" ~/bin/cacher

# 3. Install a wrapper. Sample: gh api wrapper.
cp Sources/AgentCacherWrappers/wrapper-template.sh ~/bin/
cp Sources/AgentCacherWrappers/gh-api-wrapper.sh ~/bin/gh   # shadows the real `gh`
chmod +x ~/bin/gh ~/bin/wrapper-template.sh

# 4. Use the wrapped command — the wrapper still really runs `gh api`,
#    just records it on the way out.
gh api repos/anthropics/claude-code
gh api repos/anthropics/claude-code   # second call still goes to GitHub

# 5. The agent (or you) can now see the history:
cacher recent --binary gh --json
cacher lookup "gh api" --limit 5 --json
cacher fetch 1    # raw bytes of the first recorded call
cacher diff 1 2   # unified diff between two calls' outputs

# 6. From an MCP-aware agent, register cacher-mcp as an MCP server.
#    The agent gets tools: cache.lookup, cache.fetch, cache.recent, cache.diff.
```

The default DB lives at `~/Library/Application Support/agent-cacher/cache.db` on macOS; override with `--db-path` (CLI) or the `AGENT_CACHER_DB_PATH` env var.
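An MCP client drives those tools with JSON-RPC `tools/call` requests over stdio. A hypothetical request is shown below; the argument names (`query`, `limit`) are guesses that mirror the CLI flags, not the server's published schema:

```shell
# Hypothetical MCP tools/call request for cache.lookup.
# Argument names mirror the CLI (`cacher lookup "gh api" --limit 5`) and are
# illustrative only; the real schema comes from cacher-mcp's tool listing.
req='{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"cache.lookup","arguments":{"query":"gh api","limit":5}}}'
printf '%s\n' "$req"   # an MCP client would write this line to cacher-mcp's stdin
```

In practice the agent framework constructs these requests itself; you only register the `cacher-mcp` binary as a stdio server.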
Core (schema, store, query, fingerprint, output-hash, WAL concurrency) plus the CLI, MCP server, and wrapper template are implemented and tested. The Python analyzer remains the offline transcript-mining tool. First-class wrappers for `system_profiler`, `osascript`, etc. land in subsequent changes.
MIT