Eliminate token burn with the coolest MCP on the net.
Burn fewer tokens. Ship cooler agents.
A standalone Model Context Protocol (MCP) server that gives any MCP-compatible coding agent — Claude Code, Cursor, OpenAI Codex CLI, Gemini CLI, OpenCode, Grok CLI — a sandboxed runtime, an FTS5 knowledge base, and a multi-messenger delivery channel. Built from scratch on the MCP spec. Zero outbound dependencies beyond the four pinned ones in package.json. MIT-licensed, audit-readable end-to-end.
When an agent needs to analyse a directory, a JSON dump, or 47 source files, the temptation is to Read every file and let the model figure it out from raw text. That's how 750 KB of cached context disappears in a single afternoon: every turn re-pays the read cost.
Don't pull data into the model. Push code at the data and pull back the answer.
"Across these 47 TypeScript files, find every `await` that's missing a `try/catch`."
| Approach | Bytes consumed | Tokens (rough) |
|---|---|---|
| `Read` × 47 (`/src/**/*.ts`) | ~700 KB raw text in context | ~175,000 |
| `ctx_execute` (one shell+grep call, prints summary) | ~3.6 KB summary | ~900 |
The 195× reduction isn't theoretical — it's what a real morning-brief pipeline measures every day. The agent's job is to write a script, not to memorise the repo.
ctx_execute runs that script in a sandboxed subprocess (11 supported runtimes), captures stdout, optionally filters with an intent keyword, indexes the full output in FTS5 (so the agent can search it later without re-reading), and returns only the compact summary to the context window.
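As a rough illustration of "push code at the data", the sketch below is the kind of script an agent might hand to `ctx_execute` for the example question. It is a heuristic under stated assumptions: `scan_source`, `summarise`, and `compact_report` are hypothetical names, braces inside strings or comments are not handled, and only `*.ts` files are walked.

```python
import json
import re
from pathlib import Path

TOKEN = re.compile(r"\btry\b|\bawait\b|[{}]")

def scan_source(text):
    """Return line numbers of `await` found outside any `try { ... }` block.

    Heuristic only: braces in strings or comments will confuse it.
    """
    hits = []
    depth = 0
    try_depths = []      # brace depth at which each open try block started
    pending_try = False  # saw `try`, waiting for its opening brace
    for lineno, line in enumerate(text.splitlines(), 1):
        for token in TOKEN.findall(line):
            if token == "try":
                pending_try = True
            elif token == "{":
                depth += 1
                if pending_try:
                    try_depths.append(depth)
                    pending_try = False
            elif token == "}":
                if try_depths and try_depths[-1] == depth:
                    try_depths.pop()
                depth -= 1
            elif not try_depths and lineno not in hits:
                hits.append(lineno)  # token is `await`, outside every try block
    return hits

def summarise(root):
    """Walk *.ts files under root and return a {relative_path: [lines]} map."""
    report = {}
    for path in sorted(Path(root).rglob("*.ts")):
        lines = scan_source(path.read_text(encoding="utf-8", errors="replace"))
        if lines:
            report[str(path.relative_to(root))] = lines
    return report

def compact_report(root):
    """Serialise the summary without whitespace; this is all the agent reads."""
    return json.dumps(summarise(root), separators=(",", ":"))
```

Printing `compact_report("src")` puts a few hundred bytes in the context window, while the source files themselves never leave disk.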
- Pretzel Porter adapter: Context Cooler now targets self-hosted LLMs. Pretzel Porter is a Claude Code-style terminal agent that runs entirely on a local or privately-hosted Ollama model. Pick `pretzel-porter` at install time; the adapter writes into `~/.pretzel-porter/agent.config.local.json`.
- Grok CLI native adapter: first-class support for xAI Grok CLI / Build TUI. `install.py --platform=grok` (or interactive) now writes a native `[mcp_servers.context-cooler]` table to `~/.grok/config.toml` (stdlib TOML, no extra deps). Works alongside the existing `.claude.json` compatibility layer; enables project `.grok/config.toml` too.
- Installs on any machine. The installer defaults to your home directory (great for Grok CLI / Cursor / Claude Code standalone users), auto-creates the data dir, and runs only the universal steps (build + register MCP + dbs + timestamp). Override with `$CONTEXT_COOLER_HOME` or `--data-dir`.
- `--data-dir` flag sets where the SQLite databases live.
- Platform adapters: one-shot installers for Claude Code, Cursor, OpenAI Codex CLI, Gemini CLI, OpenCode, Pretzel Porter, and Grok CLI. Pick one or all of them at install time. See "Platform adapters".
- Exit classification: `ctx_execute` now returns a structured `status`: `success | runtime_error | timeout | sandbox_violation | language_unavailable`. Agents can branch on the failure mode instead of parsing stderr.
- Local update reminder: `ctx_doctor` reads `~/.context-cooler/last-upgrade.txt` (purely local, no network call) and surfaces a "last upgraded N days ago" warning when it's older than 30 days.
- Polished installer: `install.py` now walks you through platform selection and install path interactively (stdlib `input()`, no new dependencies). Non-TTY runs default to all platforms.
- Backwards compatible: every v4.5 tool keeps the same name, schema, and on-success response shape. The new fields (`status`, `exit_code`, `duration_ms`) are additive.
First-time install (macOS/Linux):
git clone https://github.com/Blackfrost-AI/context-cooler.git
cd context-cooler
python3 install.py
Update to the latest version:
cd context-cooler
python3 install.py --update
First-time install (Windows):
git clone https://github.com/Blackfrost-AI/context-cooler.git
cd context-cooler
python install.py
Update to the latest version:
cd context-cooler
python install.py --update
Windows notes: iMessage delivery is macOS-only. Telegram, Slack, and Discord work on all platforms. For full shell sandboxing support, install WSL (`wsl --install`) and run the installer from inside WSL.
python3 install.py # Interactive — asks which agents to register
python3 install.py --platform=claude-code # Register one platform, skip prompt
python3 install.py --platform=all # Register every supported agent
python3 install.py --non-interactive # Use defaults, no prompts (for CI)
python3 install.py --dry-run # Preview changes without writing
python3 install.py --verify # Check installation status
python3 install.py --uninstall # Show uninstall notes
python3 install.py --update # git pull + rebuild + re-register
python3 install.py --accept-disclaimer # Skip disclaimer prompt (CI/scripts)
python3 install.py --data-dir /custom/path  # Custom data directory
Every install runs these four universal steps:
- Builds the MCP server (`npm install` + `npx tsc`).
- Registers `context-cooler` with each selected platform adapter (Claude Code, Cursor, Codex, Gemini, OpenCode, Pretzel Porter, Grok CLI). Each adapter writes atomically (tmp file + rename) to that platform's MCP config file.
- Initialises SQLite databases (`stats.db` + `sessions.db`) under the data directory (default: your home directory, override with `$CONTEXT_COOLER_HOME` or `--data-dir`). The directory is auto-created.
- Records the install timestamp in `<data-dir>/context/last-upgrade.txt` so `ctx_doctor` can remind you to upgrade later.
- Node.js 18+ (for the MCP server)
- Python 3.8+ (for the installer and helper scripts — stdlib only, no pip dependencies)
- SQLite (bundled with Python and Node.js via better-sqlite3)
Each adapter writes a single MCP-server entry (stdio, command node, args [abs-path-to-dist/server.js]) into the configuration file the host actually reads. Atomic write: tmp file + rename. Dry-run prints the path it would write to, then exits without touching disk.
| Platform | Config file written | Adapter |
|---|---|---|
| Claude Code | `~/.claude.json` (`mcpServers` map) | `src/adapters/claude-code.ts` |
| Cursor | `~/.cursor/mcp.json` (`mcpServers` map) | `src/adapters/cursor.ts` |
| OpenAI Codex CLI | `~/.codex/mcp_servers.json` (`mcpServers` map) | `src/adapters/codex.ts` |
| Gemini CLI | `~/.gemini/settings.json` (`mcpServers` map) | `src/adapters/gemini.ts` |
| OpenCode | `~/.config/opencode/opencode.json` (`mcp` map) | `src/adapters/opencode.ts` |
| Pretzel Porter | `~/.pretzel-porter/agent.config.local.json` (`mcpServers` map) | `src/adapters/pretzel-porter.ts` |
| Grok CLI | `~/.grok/config.toml` (`[mcp_servers]` table) | `src/adapters/grok.ts` |
Each adapter is under 80 lines and only depends on Node stdlib. They are also reachable from the command line for scripted installs:
node dist/adapters/index.js list
# {"adapters":["claude-code","cursor","codex","gemini","opencode","pretzel-porter","grok"]}
node dist/adapters/index.js install \
--server="$(pwd)/dist/server.js" \
--platform=cursor \
--dry-run
# {"platform":"cursor","configPath":"/Users/you/.cursor/mcp.json","ok":true,"detail":"would register context-cooler -> ..."}
`install.py` calls this CLI under the hood, one platform at a time.
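The "tmp file + rename" write that each adapter performs can be sketched as follows. The real adapters are TypeScript; this Python rendering is only an illustration, `register_server` is a hypothetical name, and it covers the JSON `mcpServers` layout only (not the OpenCode `mcp` map or the Grok TOML table).

```python
import json
import os
import tempfile

def register_server(config_path, server_js, name="context-cooler"):
    """Add or update one stdio MCP entry, writing atomically (tmp file + rename)."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path, "r", encoding="utf-8") as fh:
            config = json.load(fh)

    # Merge into the existing map so other registered servers are preserved.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {
        "type": "stdio",
        "command": "node",
        "args": [os.path.abspath(server_js)],
        "env": {},
    }

    directory = os.path.dirname(os.path.abspath(config_path))
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(config, fh, indent=2)
            fh.write("\n")
        os.replace(tmp_path, config_path)  # readers see the old or the new file, never half
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return config
```

Writing to a sibling temp file first means a crash mid-write leaves the host's existing config untouched.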
Context Cooler is a single MCP server that any MCP-compatible agent auto-discovers. When the agent needs to run code, search data, or deliver messages, it calls our tools directly; there's nothing to skip or bypass.
┌──────────────────────────────────────────────────────────────────┐
│ ANY MCP-Compatible AI Agent │
│ Claude Code / Cursor / Codex / Gemini CLI / OpenCode / Pretzel Porter / Grok CLI / Custom │
└───────────────────────────┬──────────────────────────────────────┘
│
MCP Protocol (stdio)
│
┌───────────────────────────▼──────────────────────────────────────┐
│ context-cooler (Node.js MCP Server) │
│ │
│ 10 Tools: Core Libraries: │
│ • ctx_execute (sandbox) • sandbox.ts (11 languages) │
│ • ctx_execute_file (file inject) • exit-classify.ts (status) │
│ • ctx_batch (multi-cmd) • filter.ts (intent scoring)│
│ • ctx_search (FTS5 query) • db.ts (SQLite + FTS5) │
│ • ctx_index (store data) • chunker.ts (markdown/JSON) │
│ • ctx_fetch_index (HTTP→index) • redact.ts (secret strip) │
│ • ctx_session (P1-P4 state) • env.ts (config loader) │
│ • ctx_stats (aggregation) │
│ • ctx_deliver (4 backends) Adapters (v4.6): │
│ • ctx_doctor (health check) • claude-code / cursor / │
│ codex / gemini / opencode / pretzel-porter / grok │
│ Databases: │
│ • stats.db (runs + fts_index) │
│ • sessions.db (events + snapshots) │
└──────────────────────────────────────────────────────────────────┘
│
Compact output (100-500 B)
instead of raw dump (3-50 KB)
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ Agent Context Window │
│ 70-98% smaller than raw API responses │
└──────────────────────────────────────────────────────────────────┘
After install, every selected platform's config file ends up with an entry like this (Claude Code shown):
{
"mcpServers": {
"context-cooler": {
"type": "stdio",
"command": "node",
"args": ["/path/to/context-cooler/dist/server.js"],
"env": {}
}
}
}
For Grok CLI, the native entry (written by the new `grok` adapter) looks like:
[mcp_servers.context-cooler]
command = "node"
args = ["/path/to/context-cooler/dist/server.js"]
env = {}
enabled = true
Any MCP client (Claude Code, Cursor, Codex CLI, Gemini CLI, OpenCode, Pretzel Porter, Grok CLI) auto-discovers the 10 tools and calls them natively.
`ctx_execute` runs code in 11 languages with intent-driven output filtering. Full output is indexed in FTS5; only the filtered summary enters the context window.
Supported languages: JavaScript, TypeScript, Python, Shell, Ruby, PHP, Perl, Go, Rust, R, Elixir.
ctx_execute(language="python", code="...", intent="check balance")
→ 120 B summary instead of 3 KB raw dump
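v4.6 structured result. Every call returns a JSON object with `success`, `status` (`success | runtime_error | timeout | sandbox_violation | language_unavailable`), `exit_code`, `duration_ms`, `summary` (or `output` for non-JSON stdout), `raw_bytes`, `summary_bytes`, `bytes_saved`, `savings_pct`, `indexed`, and `stderr` (present only when stderr is non-empty).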
Status semantics:
| Status | When |
|---|---|
| `success` | Process exited 0. |
| `runtime_error` | Non-zero exit, no other classifier matched. |
| `timeout` | Killed because `args.timeout` elapsed. |
| `sandbox_violation` | Non-zero exit + stderr matched a kernel/sandbox block pattern (`operation not permitted`, `seccomp`, `EPERM`, `sandbox-exec ... deny`). |
| `language_unavailable` | The runtime executable wasn't on `PATH` (`spawn ENOENT`, `command not found`). |
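The status table can be read as a small decision function. The sketch below is one possible ordering of the checks, not the contents of `exit-classify.ts`; `classify_exit` is a hypothetical name and the precedence (timeout first, then exit 0, then stderr patterns) is an assumption.

```python
import re

# Patterns taken from the status table; matching is case-insensitive here by assumption.
SANDBOX_PATTERNS = re.compile(
    r"operation not permitted|seccomp|EPERM|sandbox-exec.*deny", re.IGNORECASE
)
MISSING_RUNTIME = re.compile(r"spawn\b.*\bENOENT|command not found", re.IGNORECASE)

def classify_exit(exit_code, stderr="", timed_out=False):
    """Map a finished (or killed) subprocess to one of the five status labels."""
    if timed_out:
        return "timeout"
    if exit_code == 0:
        return "success"
    if MISSING_RUNTIME.search(stderr):
        return "language_unavailable"
    if SANDBOX_PATTERNS.search(stderr):
        return "sandbox_violation"
    return "runtime_error"  # non-zero exit and no other classifier matched
```

An agent can then branch on the label, for example retrying in another language on `language_unavailable` rather than parsing stderr itself.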
`ctx_execute_file` works like `ctx_execute` but injects a file's content into the execution environment as a variable (`FILE_CONTENT`).
`ctx_batch` runs multiple commands and/or search queries in a single MCP call. Commands execute sequentially, each with its own intent filter.
ctx_batch(commands=[
{"language": "python", "code": "...", "intent": "summary"},
{"language": "shell", "code": "...", "intent": "top 5"}
], queries=["previous error rates"])
`ctx_search` searches previously indexed data using SQLite FTS5 with BM25 ranking. It supports phrase matching, boolean operators, and prefix queries.
ctx_search(queries=["deployment errors", "position changes"])
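To make the three query styles concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes the local SQLite build has FTS5 enabled; the `docs` table, its columns, and the helper names are illustrative and are not the schema in `db.ts`.

```python
import sqlite3

def build_index(rows):
    """Create an in-memory FTS5 table and load (source, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(source, body)")
    conn.executemany("INSERT INTO docs (source, body) VALUES (?, ?)", rows)
    return conn

def search(conn, query, limit=5):
    """Run an FTS5 MATCH query, best BM25 score first (lower bm25() is better)."""
    sql = (
        "SELECT source, body FROM docs "
        "WHERE docs MATCH ? ORDER BY bm25(docs) LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()

conn = build_index([
    ("deploy.log", "deployment errors spiked after the release"),
    ("notes.md", "position changes for the week"),
    ("misc.txt", "nothing relevant here"),
])

all_terms = search(conn, "deployment errors")   # implicit AND of both terms
phrase = search(conn, '"position changes"')     # exact phrase
prefix = search(conn, "deploy*")                # prefix query
either = search(conn, "errors OR position")     # boolean operator
```

The query string is passed as a bound parameter, so user-supplied search text never becomes part of the SQL itself.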
`ctx_index` indexes content (text, JSON, or file paths) into FTS5 with automatic chunking. Markdown is chunked by headings, JSON by key paths, and plain text by 50-line blocks. Limits: 4,096 bytes per chunk, 100 KB per entry, and 10 K rows maximum with auto-pruning.
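A minimal sketch of two of the chunking rules (markdown by heading, plain text by 50-line block) with the 4,096-byte cap applied afterwards. The function names are hypothetical, JSON key-path chunking is left out, and splitting oversized chunks at a character boundary is an assumption about how the cap is enforced.

```python
import re

MAX_CHUNK_BYTES = 4096
LINES_PER_BLOCK = 50
HEADING = re.compile(r"^#{1,6}\s")

def cap_bytes(chunk, limit=MAX_CHUNK_BYTES):
    """Split a chunk into pieces of at most `limit` UTF-8 bytes, never mid-character."""
    pieces, current, size = [], [], 0
    for ch in chunk:
        n = len(ch.encode("utf-8"))
        if size + n > limit and current:
            pieces.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        pieces.append("".join(current))
    return pieces

def chunk_text(text):
    """Plain text: fixed blocks of 50 lines, then the byte cap."""
    lines = text.splitlines()
    blocks = [
        "\n".join(lines[i:i + LINES_PER_BLOCK])
        for i in range(0, len(lines), LINES_PER_BLOCK)
    ]
    return [piece for block in blocks for piece in cap_bytes(block)]

def chunk_markdown(text):
    """Markdown: start a new chunk at every heading line, then the byte cap."""
    sections, current = [], []
    for line in text.splitlines():
        if HEADING.match(line) and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return [piece for section in sections for piece in cap_bytes(section)]
```

Chunking by heading keeps each search hit tied to one section, so a later `ctx_search` returns a focused passage rather than a whole document.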
`ctx_fetch_index` fetches a URL, converts HTML to markdown (via Turndown), and indexes the content. It follows redirects and enforces a 1 MB cap.
ctx_fetch_index(url="https://docs.example.com/api", source="API docs")
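Because the fetcher follows redirects, the destination has to be validated on every hop, not just for the first URL. The sketch below shows one way to express the http/https-only rule and the loopback, RFC1918, and link-local blocking described in this README's security notes. It is an illustration in Python using only the standard library; `is_url_allowed` and the injectable `resolver` parameter are hypothetical.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

def resolve_host(host):
    """Resolve a hostname to the list of IP address strings it maps to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]

def is_blocked_address(addr):
    """True for loopback, private (RFC1918), and link-local addresses."""
    ip = ipaddress.ip_address(addr)
    return ip.is_loopback or ip.is_private or ip.is_link_local

def is_url_allowed(url, resolver=resolve_host):
    """Allow only http/https URLs whose host resolves to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except OSError:  # includes socket.gaierror for unresolvable hosts
        return False
    # Reject if ANY resolved address is internal, not just the first one.
    return bool(addresses) and not any(is_blocked_address(a) for a in addresses)
```

A fetch loop would call `is_url_allowed` on the initial URL and again on each `Location` header before following it.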
`ctx_session` logs events with P1-P4 priority, takes snapshots before compaction, and restores state afterwards. Snapshots fit within a strict 2 KB budget, split 40% P1 / 30% P2 / 20% P3 / 10% P4.
ctx_session(action="log", event_type="decision", priority="high", data={...})
ctx_session(action="snapshot") # Before compaction
ctx_session(action="restore") # After compaction
ctx_session(action="stats") # Event counts and sizes
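The 40/30/20/10 split works out to 819, 614, 409, and 204 bytes of a 2,048-byte snapshot. A minimal sketch of filling such a budget follows; the event shape (`priority` as an integer 1 to 4 plus a `data` string), the newest-first selection, and the function names are all assumptions rather than the behaviour of `session.ts`.

```python
import json

SHARES = {1: 0.40, 2: 0.30, 3: 0.20, 4: 0.10}

def priority_budgets(total=2048):
    """Bytes available to each priority level, rounded down."""
    return {priority: int(total * share) for priority, share in SHARES.items()}

def event_size(event):
    """Serialized size of one event in UTF-8 bytes."""
    return len(json.dumps(event).encode("utf-8"))

def build_snapshot(events, total=2048):
    """Keep the newest events of each priority that fit that priority's share.

    `events` is assumed to be ordered oldest to newest.
    """
    kept = []
    for priority, remaining in sorted(priority_budgets(total).items()):
        same_priority = [e for e in events if e["priority"] == priority]
        for event in reversed(same_priority):
            size = event_size(event)
            if size > remaining:
                break
            kept.append(event)
            remaining -= size
    return kept
```

Because each level has its own share, a flood of P4 events can never crowd out P1 decisions.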
`ctx_stats` aggregates stats across both `stats.db` and `sessions.db`: total runs, bytes saved, compression ratios, and session event counts.
`ctx_deliver` sends messages via iMessage (macOS), Telegram, Slack, or Discord. It auto-detects the available backend from environment variables.
| Backend | Requirement | Platform |
|---|---|---|
| iMessage | `imsg` CLI | macOS only |
| Telegram | `TELEGRAM_BOT_TOKEN` + `TELEGRAM_CHAT_ID` | All |
| Slack | `SLACK_WEBHOOK_URL` | All |
| Discord | `DISCORD_WEBHOOK_URL` | All |
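Backend detection from the requirements in the table can be sketched as a pure function over the environment. The function names and the order in which backends are listed are assumptions; the README does not say which backend wins when several are configured.

```python
import os
import platform
import shutil

def available_backends(env, is_macos=False, has_imsg_cli=False):
    """Return the delivery backends whose requirements are satisfied."""
    backends = []
    if is_macos and has_imsg_cli:
        backends.append("imessage")
    # Telegram needs both the bot token and a chat ID.
    if env.get("TELEGRAM_BOT_TOKEN") and env.get("TELEGRAM_CHAT_ID"):
        backends.append("telegram")
    if env.get("SLACK_WEBHOOK_URL"):
        backends.append("slack")
    if env.get("DISCORD_WEBHOOK_URL"):
        backends.append("discord")
    return backends

def detect_from_host():
    """Apply the same rules to the current machine and process environment."""
    return available_backends(
        os.environ,
        is_macos=platform.system() == "Darwin",
        has_imsg_cli=shutil.which("imsg") is not None,
    )
```

Keeping the rules in a function that takes the environment as an argument makes them easy to check without touching real credentials.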
`ctx_doctor` checks the data directory, databases, FTS5 tables, skills directory, 5 language runtimes, mcporter availability, all 4 delivery backends, and (v4.6) the local upgrade reminder. It returns a pass/warn/fail report.
The upgrade reminder reads ~/.context-cooler/last-upgrade.txt (an ISO 8601 timestamp written by install.py), compares it to today, and surfaces a warn if it's older than 30 days. No network call — this is purely a local file comparison.
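The comparison itself is a few lines. This sketch assumes the file holds a single ISO 8601 timestamp, treats a timestamp without an offset as UTC, and uses a hypothetical `upgrade_status` helper; "older than 30 days" is read as strictly more than 30 whole days.

```python
from datetime import datetime, timezone

def upgrade_status(timestamp_text, now=None, max_age_days=30):
    """Return ("pass" | "warn", age_in_days) for an ISO 8601 install timestamp."""
    text = timestamp_text.strip()
    if text.endswith("Z"):
        # Older Python versions of fromisoformat() do not accept a trailing "Z".
        text = text[:-1] + "+00:00"
    stamp = datetime.fromisoformat(text)
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    now = now or datetime.now(timezone.utc)
    age_days = (now - stamp).days
    return ("warn" if age_days > max_age_days else "pass"), age_days
```

Nothing here opens a socket: the only inputs are the file's text and the local clock.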
Every code execution goes through ctx_execute. The full output is captured, filtered by intent, and indexed. Only the compact summary (100-500 bytes) enters the context window.
Agent → ctx_execute → sandbox (11 langs) → intent filter → 120 B summary
↓
FTS5 index (full data preserved)
Skills return minimal output by default. --verbose is required for full data. ctx_execute auto-injects --verbose so it gets the full data to filter, but only returns the compact result.
# Default: 3 fields per item (~80 bytes)
{"s": "AAPL", "qty": "100", "pnl": "1500.00"}
# Verbose (only ctx_execute sees this): 12+ fields (~350 bytes)
{"symbol": "AAPL", "qty": "100", "side": "long", "market_value": "18500", ...}
Pass an intent string and Context Cooler extracts only matching fields using fast keyword scoring:
- `"check balance"` → returns equity, buying_power, cash (3 fields out of 40+)
- `"find losing"` → returns only positions with negative P&L
- `"top 5 movers"` → returns top 5 items sorted by change
Smart wrapper dict handling: {"count": 8, "positions": [...]} → automatically unwraps and recurses.
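A toy version of keyword scoring with wrapper unwrapping might look like the sketch below. It is not the logic in `filter.ts`: the `SYNONYMS` table is invented for the example (the real vocabulary is not shown in this README), row-level intents such as "find losing" or "top 5" are not covered, and falling back to the full payload when nothing matches is an assumption.

```python
import re

# Hypothetical synonym table, for illustration only.
SYNONYMS = {"balance": {"equity", "cash", "buying", "power"}}

def intent_terms(intent):
    """Lower-cased intent words plus any synonyms they expand to."""
    words = set(re.findall(r"[a-z0-9]+", intent.lower()))
    for word in list(words):
        words |= SYNONYMS.get(word, set())
    return words

def score_key(key, terms):
    """Number of tokens in a field name (split on non-alphanumerics) that hit the intent."""
    return len(set(re.findall(r"[a-z0-9]+", key.lower())) & terms)

def filter_fields(payload, intent):
    """Keep only fields whose names score above zero for the intent."""
    terms = intent_terms(intent)
    if isinstance(payload, list):
        return [filter_fields(item, intent) for item in payload]
    if not isinstance(payload, dict):
        return payload
    # Wrapper dict such as {"count": 8, "positions": [...]}: unwrap and recurse.
    record_lists = [
        v for v in payload.values()
        if isinstance(v, list) and v and all(isinstance(i, dict) for i in v)
    ]
    if len(record_lists) == 1:
        return filter_fields(record_lists[0], intent)
    kept = {k: v for k, v in payload.items() if score_key(k, terms) > 0}
    return kept or payload
```

Scoring field names rather than values keeps the filter cheap: it never needs to understand the data, only which keys the intent mentions.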
P1-P4 priority events survive conversation compaction via 2 KB snapshots stored in SQLite. Critical decisions and alerts are always preserved; informational queries are dropped first.
For deterministic tasks, bypass the LLM entirely. Schedule pipelines via launchd/cron:
launchd → python3 pipeline.py → ctx_execute → ctx_deliver → iMessage/Telegram
No agent. No model. No tokens.
Measured on a live agent instance running 8 positions, 20-symbol movers, daily briefs:
| Call | Raw Output | After Filtering | Savings |
|---|---|---|---|
| `account` | 357 B | 95 B | 73.4% |
| `positions` (8 holdings) | 2,739 B | 822 B | 70.0% |
| `movers` (20 symbols) | 2,284 B | 367 B | 83.9% |
| Pipeline total | 5,380 B | 1,284 B | 76.1% |
Measured-in-real-tokens correction (2026 audit). The percentages above are byte ratios measured against the tool's own forced `--verbose` output and converted at an asserted "1.5 tokens/byte". A token audit (counted with a real sub-word tokenizer) found the real rate is ~0.30 tokens/byte for JSON, so the byte-based headlines (and the "750K→200K tok/day", "195×") are roughly 5× overstated when restated in tokens. Counted honestly as `(compact_baseline − returned) / compact_baseline`, the win depends heavily on how you call the tool: passing `fields` (or an `intent`) on a large output saves ~85–99% real tokens; calling it with no filter on small outputs is roughly break-even-to-negative because of a fixed per-call metadata overhead. Rule of thumb: use `fields`/`intent`, and reach for `ctx_execute` when the raw output is large and you only need part of it, not for small results you'd happily read directly.
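The arithmetic behind both the table and the correction is simple enough to check by hand. The sketch below reproduces the byte-based percentages from the benchmark figures and restates a byte count at the audited ~0.30 tokens/byte. The helper names are hypothetical, and the final negative case uses made-up sizes purely to show how a fixed overhead flips the sign.

```python
def savings_pct(baseline, returned):
    """(baseline - returned) / baseline as a percentage, one decimal place."""
    return round(100.0 * (baseline - returned) / baseline, 1)

def bytes_to_tokens(n_bytes, tokens_per_byte=0.30):
    """Rough token estimate for JSON at the audited rate."""
    return round(n_bytes * tokens_per_byte)

# (raw bytes, bytes after filtering) from the benchmark table.
ROWS = {"account": (357, 95), "positions": (2739, 822), "movers": (2284, 367)}

per_call = {name: savings_pct(raw, kept) for name, (raw, kept) in ROWS.items()}
total_raw = sum(raw for raw, _ in ROWS.values())
total_kept = sum(kept for _, kept in ROWS.values())
pipeline = savings_pct(total_raw, total_kept)

# Hypothetical small result: 100 B read directly vs 130 B once metadata is added.
small_call = savings_pct(100, 130)
```

Note that the table's percentages use the verbose output as the baseline; the audit's formula uses the compact output an agent would otherwise have read, which is why the honest figure can be lower or even negative.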
Zero-token morning brief pipeline: launchd triggers Python directly — no LLM tokens consumed.
WITHOUT Context Cooler:
agent calls skill → 3 KB raw JSON floods context → 40 wasted fields
agent calls skill → 5 KB raw JSON floods context → 50 irrelevant records
agent calls skill → 20 KB raw JSON floods context → 200 search results
Session compacts → all working state lost → 20 KB cold restart
Daily token burn: ~750,000 tokens
WITH Context Cooler v4.6:
agent calls ctx_execute → 120 B summary enters context → full data indexed
agent calls ctx_execute → 300 B filtered enters context → only matching records
agent calls ctx_batch → 500 B combined → one MCP call, not three
Session compacts → 2 KB snapshot preserved → instant resume
Daily token burn: ~200,000 tokens (73% reduction)
This tool executes code and registers itself into your AI agents' configs. Read this honestly. A 2026 security + performance audit (REDFORGE) found and fixed several serious issues; the items below describe the actual behavior of the TypeScript MCP server (`src/`, `dist/`) after those fixes.
- Code execution is OPT-IN and NOT sandboxed. `ctx_execute*` runs your code in a plain subprocess with no filesystem jail, no network namespace, no seccomp/capability drop. It refuses to run unless you set `CTX_ALLOW_EXEC=1`. There is no containment when enabled, so only enable it on a host you control. A real OS sandbox (bubblewrap/seccomp/landlock, `sandbox-exec`, a container, or a WASM isolate) is on the roadmap; until then, treat enabling execution as granting the tool your full shell privileges. (`sandbox_violation` is a best-effort post-hoc stderr label, not a preventative control.)
- List-arg subprocess (no shell-string concat). `ctx_deliver` uses `execFileSync` array args; skill mode uses list-arg `executeArgv`; file content is passed via the environment, not interpolated into shell source.
- Environment allowlist. Only a minimal set of vars (plus any you opt in via `CTX_EXEC_ENV_ALLOW`) is forwarded to executed code; your other secrets/tokens are withheld. Returned output is also redaction-scanned.
- Secret redaction. Multi-shape detector (API keys, Bearer, Stripe/Alpaca, AWS access+secret keys incl. INI/env `key=value`, GitHub/Slack/Google tokens, JWTs, PEM private-key blocks) applied to both the index and the returned payload. Still regex-based and not a guarantee; don't rely on it for high-stakes secrets.
- Filesystem read confinement. `ctx_execute_file`/`ctx_index` only read inside `<data>/workspace` + the current project dir (extend with `CTX_FS_ALLOW`), not your whole home directory.
- SSRF protections. `ctx_fetch_index` allows only http/https, blocks loopback/RFC1918/link-local (incl. `169.254.169.254`), and re-validates the destination on every redirect hop.
- Input validation at the MCP boundary. Every tool's args are `zod`-parsed with size/range bounds (code ≤10 MB, timeout 100 ms–10 min, batch ≤100 cmds, queries ≤50, search limit 1–50).
- Installer defaults to deny. Non-interactive runs register no platforms unless you pass an explicit `--platform`; `--platform=all` requires confirmation (`--yes`); `--update` (which pulls + runs remote code) requires `--allow-remote-code`; `--uninstall` actually removes the entries it added.
- Parameterized SQL (no SQL injection), E.164 phone validation, atomic config writes, index caps (100 KB/entry, 10 K rows), snapshot budget clamp (256–65536): all confirmed.
- Dependencies are pinned exact (`npm ci` recommended). NOTE: the MCP SDK transitively pulls a full Express/Hono/cors/ajv/jose HTTP+OAuth stack (~130 packages) that this stdio server does not use. Treat it as attack surface and advisory-scan it in CI. `better-sqlite3` runs a native build on install.
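Three of the controls above (the opt-in gate, the environment allowlist, and list-form arguments) compose naturally. The sketch below illustrates them in Python; the server itself is TypeScript, so this is not its code. `BASE_ALLOW`, the comma-separated format of `CTX_EXEC_ENV_ALLOW`, and the function names are assumptions. As the README says, this provides no containment: the child still runs with your user's privileges.

```python
import subprocess
import sys

BASE_ALLOW = ("PATH", "HOME", "LANG")  # assumed minimal set, for illustration

def build_child_env(parent_env):
    """Forward only allowlisted variables; everything else is withheld."""
    extra = [
        name.strip()
        for name in parent_env.get("CTX_EXEC_ENV_ALLOW", "").split(",")
        if name.strip()
    ]
    allowed = set(BASE_ALLOW) | set(extra)
    return {key: value for key, value in parent_env.items() if key in allowed}

def run_python(code, parent_env, timeout_s=30):
    """Run a Python snippet in a subprocess, but only if execution was opted in."""
    if parent_env.get("CTX_ALLOW_EXEC") != "1":
        raise PermissionError("code execution is disabled; set CTX_ALLOW_EXEC=1 to opt in")
    # List-form argv: the code is a single argument, never spliced into a shell string.
    return subprocess.run(
        [sys.executable, "-c", code],
        env=build_child_env(parent_env),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```

The allowlist limits what a script can read from the environment, but it does nothing about files or the network, which is why the opt-in gate comes first.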
| Variable | Default | Description |
|---|---|---|
| `CONTEXT_COOLER_HOME` | your home directory | Root data directory (dbs live under `<dir>/context/`, optional `.env`) |
| `CTX_SNAPSHOT_BUDGET` | `2048` | Max bytes for session snapshots (256-65536) |
| `CTX_FTS_ENABLED` | `1` | Set to `0` to disable FTS5 indexing |
| `TELEGRAM_BOT_TOKEN` | (none) | Telegram bot token for `ctx_deliver` |
| `TELEGRAM_CHAT_ID` | (none) | Default Telegram chat ID |
| `SLACK_WEBHOOK_URL` | (none) | Slack incoming webhook URL |
| `DISCORD_WEBHOOK_URL` | (none) | Discord webhook URL |
context-cooler/
├── src/
│ ├── server.ts # MCP server entry point (stdio transport)
│ ├── tools/
│ │ ├── execute.ts # ctx_execute — sandboxed execution + status
│ │ ├── execute-file.ts # ctx_execute_file — file-aware execution
│ │ ├── batch.ts # ctx_batch — multi-command pipeline
│ │ ├── search.ts # ctx_search — FTS5 knowledge query
│ │ ├── index.ts # ctx_index — content indexing
│ │ ├── fetch-index.ts # ctx_fetch_index — HTTP fetch + index
│ │ ├── session.ts # ctx_session — P1-P4 session continuity
│ │ ├── stats.ts # ctx_stats — usage aggregation
│ │ ├── deliver.ts # ctx_deliver — multi-messenger delivery
│ │ └── doctor.ts # ctx_doctor — health check + upgrade reminder
│ ├── lib/
│ │ ├── sandbox.ts # Subprocess runner (11 languages)
│ │ ├── exit-classify.ts# v4.6 — status classifier
│ │ ├── filter.ts # Intent-driven keyword scoring
│ │ ├── db.ts # SQLite + FTS5 connection management
│ │ ├── chunker.ts # Markdown/JSON/text chunking
│ │ ├── redact.ts # Secret redaction patterns
│ │ └── env.ts # Environment and config loader
│ └── adapters/ # v5.2 — platform installers (≤80 lines each)
│ ├── claude-code.ts
│ ├── cursor.ts
│ ├── codex.ts
│ ├── gemini.ts
│ ├── opencode.ts
│ ├── pretzel-porter.ts # Pretzel Porter (self-hosted LLM)
│ ├── grok.ts # Grok CLI ~/.grok/config.toml (TOML)
│ ├── types.ts
│ ├── util.ts # atomic write, JSON + TOML read/splice helpers
│ └── index.ts # CLI entry point + registry
├── install.py # Cross-platform installer (interactive in v4.6)
├── package.json # Node.js dependencies
├── tsconfig.json # TypeScript configuration
├── skill.json # MCP server manifest
└── docs/
└── ARCHITECTURE.md # Detailed architecture documentation
MIT
Example `ctx_execute` result (v4.6 structured fields):
{
  "success": true,
  "status": "success",      // success | runtime_error | timeout | sandbox_violation | language_unavailable
  "exit_code": 0,
  "duration_ms": 47,
  "summary": { ... },       // or "output": "..." for non-JSON stdout
  "raw_bytes": 3127,
  "summary_bytes": 96,
  "bytes_saved": 3031,
  "savings_pct": 96.9,
  "indexed": true,
  "stderr": "..."           // present only when stderr is non-empty
}