
PsychQuant/agent-cacher


agent-cacher

Sits between AI coding agents and slow CLI commands — it records every call to a SQLite store and lets the agent decide when to reuse a recent result instead of re-running.

The problem

AI coding agents (Claude Code, Codex, Aider, etc.) call shell commands constantly. Some of those commands are slow:

| Command | Cold cost | Reasonable reuse window |
| --- | --- | --- |
| `system_profiler SPFontsDataType` | 10–30 s | hours/days |
| `osascript -e 'tell app "Calendar"...'` | 1–3 s | minutes |
| `gh api repos/...` | 200–800 ms | seconds/minutes |
| `find / -name ...` | unbounded | depends |

A human runs each of these once. An AI agent re-runs them every conversation, every session, every restart. The cumulative tax shows up as latency in every reply.

agent-cacher records every shell-command invocation (the wrapper still runs the real command — always) and exposes the history through both a CLI and an MCP server. The agent can look up "did I already run system_profiler SPFontsDataType recently? what was the SHA256? when?" and decide for itself whether to re-run or fetch the cached output.
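To make the lookup flow concrete, here is a sketch of how an agent (or any consumer) might interpret one history record. The field names (`call_id`, `recorded_at`, `stdout_sha256`, …) and the 6-hour threshold are illustrative assumptions, not the CLI's actual JSON schema:

```python
import json
from datetime import datetime, timezone

# A record shaped like what a lookup might return.
# Field names are illustrative, not the CLI's actual schema.
record = json.loads("""
{
  "call_id": 42,
  "command": "system_profiler SPFontsDataType",
  "recorded_at": "2024-05-01T09:00:00Z",
  "duration_ms": 18400,
  "exit_code": 0,
  "stdout_sha256": "abc123..."
}
""")

recorded = datetime.fromisoformat(record["recorded_at"].replace("Z", "+00:00"))
age = datetime.now(timezone.utc) - recorded

# The agent weighs age against the command's volatility: installed
# fonts are stable for hours, so a recent successful hit is reusable.
if record["exit_code"] == 0 and age.total_seconds() < 6 * 3600:
    decision = "fetch cached output"
else:
    decision = "re-run the command"
```

The key point is that the reuse decision lives entirely in the consumer; the store only reports what happened and when.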

Design principles

  1. Explicit lookup, not transparent intercept — the wrapper always runs the real command and records the result. The agent decides whether to query the cache and reuse a previous output via the CLI or MCP tools (cache.lookup, cache.fetch). Mode A (PATH shadowing → return cache directly) was rejected because stale-cache hits become untraceable bugs in the agent's reasoning chain.
  2. Honest about staleness — every recorded call carries a timestamp, duration, exit code, and a SHA256 of the captured stdout. The agent reads those and judges freshness like a human reading a last modified column.
  3. Append-only history — calls are never updated or deleted. Output bytes are deduplicated by SHA256, so the same system_profiler output recorded 100 times consumes the same disk as recording it once.
  4. Freshness lives in the agent, not the system — no TTL config, no FSEvents, no AI policy daemon. The LLM agent looking at the timestamp + diff already knows that fonts don't change in 3 hours and gh api does in 30 seconds.
  5. Measure before optimizing — analyzers/analyze_calls.py mines Claude Code transcripts to find hot/slow commands worth wrapping.
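One way to picture principles 2 and 3 together is a two-table store: an append-only call log plus a content-addressed output table. This Python/sqlite3 sketch is an assumption about the shape of the idea, not the actual schema in AgentCacherCore:

```python
import hashlib
import sqlite3

# Sketch of an append-only store with output dedup by SHA256.
# The real AgentCacherCore schema may differ; this shows the idea.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE outputs (sha256 TEXT PRIMARY KEY, bytes BLOB NOT NULL);
CREATE TABLE calls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    command TEXT NOT NULL,
    recorded_at TEXT NOT NULL,
    duration_ms INTEGER NOT NULL,
    exit_code INTEGER NOT NULL,
    stdout_sha256 TEXT NOT NULL REFERENCES outputs(sha256)
);
""")

def record_call(command, stdout, duration_ms, exit_code, recorded_at):
    digest = hashlib.sha256(stdout).hexdigest()
    # INSERT OR IGNORE: identical output bytes are stored exactly once.
    db.execute("INSERT OR IGNORE INTO outputs VALUES (?, ?)", (digest, stdout))
    db.execute(
        "INSERT INTO calls (command, recorded_at, duration_ms, exit_code, stdout_sha256) "
        "VALUES (?, ?, ?, ?, ?)",
        (command, recorded_at, duration_ms, exit_code, digest),
    )

# The same system_profiler output recorded twice...
out = b"Fonts:\n  Arial.ttf"
record_call("system_profiler SPFontsDataType", out, 18400, 0, "2024-05-01T09:00:00Z")
record_call("system_profiler SPFontsDataType", out, 17900, 0, "2024-05-01T12:00:00Z")

# ...yields two history rows but a single stored blob.
n_calls = db.execute("SELECT COUNT(*) FROM calls").fetchone()[0]
n_blobs = db.execute("SELECT COUNT(*) FROM outputs").fetchone()[0]
```

Because `calls` only ever grows and `outputs` is keyed by hash, recording is idempotent on disk usage for repeated identical output.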

Design evolution: Mode A → Mode B

Earlier drafts of this repo described a transparent-intercept design (Mode A): the wrapper would replace slow-command on PATH, and on cache hit return the saved output without running the underlying command. We pivoted to Mode B (explicit lookup) because:

  • Mode A's worst failure is a silent stale hit — the agent gets old data, reasons on it, and the bug is invisible because the cache is invisible.
  • LLM agents already think about freshness when given the data ("3 hours ago, output 4.2 MB, sha256 abc…"). Pushing TTLs back to the LLM beats pre-baking them.
  • Wrappers stay simple: they're loggers, not policy engines.
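The "3 hours ago, output 4.2 MB, sha256 abc…" summary the agent reads could be rendered like this. The helper below is hypothetical — the repo exposes raw fields, and any formatting like this would live in the consumer:

```python
from datetime import datetime, timezone

def freshness_line(command, recorded_at, size_bytes, sha256, now=None):
    """Render cache metadata the way an agent would read it.

    Hypothetical helper: formats raw record fields into a
    human-readable freshness summary.
    """
    now = now or datetime.now(timezone.utc)
    hours = (now - recorded_at).total_seconds() / 3600
    mb = size_bytes / 1_000_000
    return f"{command}: {hours:.1f} h ago, output {mb:.1f} MB, sha256 {sha256[:8]}…"

line = freshness_line(
    "system_profiler SPFontsDataType",
    datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc),
    4_200_000,
    "abc123def456",
    now=datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
)
# → "system_profiler SPFontsDataType: 3.0 h ago, output 4.2 MB, sha256 abc123de…"
```

Given a line like this, no TTL configuration is needed: the model's world knowledge ("fonts rarely change") does the policy work.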

The proposal.md and design.md for this pivot live under openspec/changes/explicit-cache-lookup/.

Structure

agent-cacher/
├── Package.swift
├── Sources/
│   ├── AgentCacherCore/          # SQLite schema, store, query, fingerprint, hash
│   ├── AgentCacherCLI/           # `cacher` binary — record-internal, lookup, fetch, recent, diff
│   ├── AgentCacherMCP/           # `cacher-mcp` stdio MCP server
│   └── AgentCacherWrappers/      # wrapper-template.sh + sample wrappers
├── Tests/AgentCacherCoreTests/   # XCTest suite (schema, store, query, CLI e2e, MCP parity, wrapper)
├── analyzers/                    # Python scripts that mine Claude Code transcripts
└── openspec/                     # OpenSpec change proposals + specs (source of truth)

Quickstart

# 1. Build
swift build -c release

# 2. Install the CLI
ln -sf "$PWD/.build/release/cacher" ~/bin/cacher

# 3. Install a wrapper. Sample: gh api wrapper.
cp Sources/AgentCacherWrappers/wrapper-template.sh ~/bin/
cp Sources/AgentCacherWrappers/gh-api-wrapper.sh ~/bin/gh   # shadows the real `gh`
chmod +x ~/bin/gh ~/bin/wrapper-template.sh

# 4. Use the wrapped command — the wrapper still really runs `gh api`,
#    just records it on the way out.
gh api repos/anthropics/claude-code
gh api repos/anthropics/claude-code   # second call still goes to GitHub

# 5. The agent (or you) can now see the history:
cacher recent --binary gh --json
cacher lookup "gh api" --limit 5 --json
cacher fetch 1                       # raw bytes of the first recorded call
cacher diff 1 2                      # unified diff between two calls' outputs

# 6. From an MCP-aware agent, register cacher-mcp as an MCP server.
#    The agent gets tools: cache.lookup, cache.fetch, cache.recent, cache.diff.

The default DB lives at ~/Library/Application Support/agent-cacher/cache.db on macOS; override with --db-path (CLI) or AGENT_CACHER_DB_PATH env var.

Status

Core (schema, store, query, fingerprint, output-hash, WAL concurrency) plus CLI, MCP server, and wrapper template are implemented and tested. The Python analyzer remains the offline transcript-mining tool. First-class wrappers for system_profiler, osascript, etc. land in subsequent changes.

License

MIT
