epublate

A rich TUI that translates ePub story books with an LLM — preserving formatting and keeping a per-project lore bible so that characters, places, and events stay consistent across the entire book.

Status: v1 release-ready (0.1.0). Milestones M0 through M6 have all landed: project skeleton + CI, ePub round-trip, single-segment translation, glossary v1 with cascade, Project Dashboard + batch + cost meter / budget cap, helper-LLM extractor (book intake + pre-pass), and the M6 polish pass (high-contrast theme, global cheat sheet on ? / F1, Settings screen on s, snapshot baselines, opt-in epubcheck integration with a --strict toggle, packaging polish, and a tag-triggered release pipeline). See docs/PRD.md for the full plan, docs/USAGE.md for the curator walkthrough, and CHANGELOG.md for what shipped when.

Why

Translating long story books with off-the-shelf LLM tools fails in three ways: format gets flattened, proper-noun translations drift ("Élise" → "Elise" → "Eliza" across chapters), and each call has no memory of prior decisions. epublate fixes all three:

Format preservation. ePub structure, inline tags, footnotes, and assets round-trip exactly. The model never sees raw HTML.
Consistency. A per-project glossary tracks characters, places, events, items, and recurring phrases. Locked entries are enforced by a mechanical validator — drift is impossible by construction.
Memory. Every segment, decision, and LLM call is persisted in a single SQLite file. Quit and resume mid-book with zero rework.
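
One common way to keep raw HTML away from a model is to swap inline tags for opaque placeholders before translation and restore them afterwards. The sketch below illustrates only that general idea; mask_tags, unmask_tags, and the [[n]] placeholder syntax are hypothetical and are not taken from epublate's code.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")


def mask_tags(html: str) -> tuple[str, list[str]]:
    """Replace each tag with a numbered placeholder (hypothetical helper)."""
    tags: list[str] = []

    def _swap(match: re.Match) -> str:
        tags.append(match.group(0))
        return f"[[{len(tags) - 1}]]"

    return TAG_RE.sub(_swap, html), tags


def unmask_tags(text: str, tags: list[str]) -> str:
    """Restore the original tags; fail loudly if a placeholder was lost."""
    for index, tag in enumerate(tags):
        placeholder = f"[[{index}]]"
        if text.count(placeholder) != 1:
            raise ValueError(f"placeholder {placeholder} missing or duplicated")
        text = text.replace(placeholder, tag)
    return text


masked, tags = mask_tags('She said <em>no</em> to <a href="#n1">him</a>.')
# The model would only ever see `masked`, never the tags themselves.
pretend_model_output = "Ela disse [[0]]não[[1]] a [[2]]ele[[3]]."
restored = unmask_tags(pretend_model_output, tags)
```

Checking that every placeholder comes back exactly once is what turns a silently flattened format into a detectable error.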

Features (planned for v1)

ePub 2 and ePub 3 round-trip with structural preservation.
Per-project lore bible with three-tier status (proposed / confirmed / locked) and cascade re-translation on changes.
Tone presets so the LLM gets the audience and register right from the start. Literary fiction is the default; other presets cover classic literature, historical fiction, children's picture books, middle grade, YA, fairytale / folklore, genre fiction, noir / hard-boiled crime, horror / gothic, cozy romance, explicit adult, humor / comedy, memoir / biography, poetry / verse, religious / spiritual, technical manuals, academic prose, and journalism. The curator picks one in the New Project modal or with epublate new --style-profile, and can swap or edit it later from Settings (PRD F-STYLE-1/2). The helper-LLM intake also suggests a preset based on its read of the source (F-STYLE-3), and the New Project modal auto-detects a tone before you hit Create by sniffing a few blocks of the picked ePub through the helper LLM (F-STYLE-4; toggle in Settings with A, env override EPUBLATE_AUTO_TONE_SNIFF).
Works with any OpenAI-compatible chat-completions endpoint (OpenAI, Azure, OpenRouter, Together, Ollama, vLLM, llama.cpp).
Rich Textual TUI: side-by-side reader, glossary editor, batch mode with curator inbox, cost meter and budget caps.
Resumable: SQLite WAL mode, atomic exports, append-only event log.
Optional embeddings for style retrieval and fuzzy entity matching.
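
To make the three-tier glossary status concrete, here is a minimal sketch of what a mechanical check on locked entries could look like. GlossaryEntry and find_locked_violations are hypothetical names, and the plain substring match is an assumption made for brevity; a real validator would presumably also need to handle case, inflection, and word boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    source: str  # term as it appears in the source language
    target: str  # approved translation
    status: str  # "proposed", "confirmed", or "locked"


def find_locked_violations(source_text, translated_text, glossary):
    """Return locked entries whose source term occurs but whose target is absent."""
    violations = []
    for entry in glossary:
        if entry.status != "locked":
            continue  # only locked entries are enforced in this sketch
        if entry.source in source_text and entry.target not in translated_text:
            violations.append(entry)
    return violations


glossary = [
    GlossaryEntry("Élise", "Élise", "locked"),
    GlossaryEntry("the Hollow", "o Vale", "proposed"),
]
bad = find_locked_violations("Élise left the Hollow.", "Eliza deixou a Cova.", glossary)
ok = find_locked_violations("Élise left the Hollow.", "Élise deixou a Cova.", glossary)
```

Here `bad` contains the Élise entry because the translation drifted to "Eliza", while `ok` is empty: the proposed entry is not enforced.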

PDF support is a non-goal in v1, but the format-handling layer is designed to make it pluggable later.

A tour, in pictures

Screenshots: Project Dashboard, Reader, Glossary curator, Curator Inbox, Settings, and Help / cheat sheet (press `?`).

See docs/USAGE.md for the full curator walkthrough. The screenshots above are regenerated from the live TUI by scripts/capture_screenshots.py (uv run python scripts/capture_screenshots.py); both SVGs and PNGs land in docs/screenshots/.

Quickstart (development)

This project uses uv for everything — environment, dependencies, tests, builds. You only need uv on your PATH; it will install Python for you.

git clone https://github.com/<you>/epublate.git
cd epublate

uv python install                 # installs the right Python (.python-version)
uv sync --all-extras --dev        # creates the venv + installs deps

uv run pytest                     # tests (no network, no LLM keys needed)
uv run ruff check .               # lint
uv run mypy src/epublate          # types

Run from Docker (no local Python needed)

Each push to main and every release publishes a multi-arch image to GitHub Container Registry. The image bundles a JRE so the optional epubcheck validator works out of the box.

# Pull the latest image
docker pull ghcr.io/madpin/epublate:latest

# CLI commands work the same as the local install. Mount a host
# directory at /data so projects, recents, and the UI config persist
# across runs.
docker run --rm -it \
    -v "$PWD/epublate-data:/data" \
    ghcr.io/madpin/epublate:latest --mock-llm new docs/Sample.epub \
    --source-lang en --target-lang pt --out /data/projects/sample

# The Textual TUI needs a real, interactive terminal, so pass -it (--interactive --tty):
docker run --rm -it \
    -v "$PWD/epublate-data:/data" \
    ghcr.io/madpin/epublate:latest --mock-llm

Build the image yourself with docker build -t epublate:dev . from the repo root. The Dockerfile is a two-stage build powered by uv that resolves dependencies against uv.lock for reproducible images.

Try it on the sample book

The repo ships a real ePub at docs/Sample.epub so you can take the M2 single-segment translation flow for a spin without hunting for a book.

The TUI is the home — start here

# Launch the TUI. The Projects screen lists recently-opened projects
# (stored at ~/.config/epublate/recents.json) and lets you create or
# open projects without leaving the terminal.
uv run epublate --mock-llm

# In-TUI keys on the Projects screen:
#   n      → new project (modal: source ePub, target lang, out dir)
#   o      → open project by path
#   enter  → open the highlighted recent project
#   delete → drop the highlighted entry from recents (files untouched)
#   r      → refresh / prune missing entries
#   T      → cycle theme (dark / light / high-contrast)
#   ? / F1 → context-aware cheat sheet
#   q      → quit

The Dashboard, Reader, Glossary, and Inbox are all reachable from the new-project / open-project flow above; pressing q on any inner screen pops back to the Projects landing page.

Or skip the TUI for scripting / CI

# 1) Bootstrap a project from the sample (no LLM keys required).
#    `--out` must point at a fresh / empty directory; the command refuses
#    to overwrite an existing one. `rm -rf` first if you want to re-run.
rm -rf /tmp/epublate-sample
uv run epublate --mock-llm new docs/Sample.epub \
    --source-lang en --target-lang pt --out /tmp/epublate-sample

# 2) Open the Project Dashboard for a specific project.
#    Bindings: o = Reader, g = Glossary, i = Inbox, b = Batch,
#              B = Set/Clear budget cap, e = Intake (M5), s = Settings,
#              r = Refresh, q = Back. Global: T = cycle theme
#              (dark / light / high-contrast), ? or F1 = cheat sheet.
#    Reader: t = translate, a = accept, e = edit, r = retry,
#            j/k = next/prev segment, J/K = next/prev chapter, q = back.
uv run epublate --mock-llm open /tmp/epublate-sample

# 3) Headless: translate every pending segment via the worker pool.
#    Failures land in the Inbox; the run pauses if the budget cap is hit.
uv run epublate --mock-llm batch /tmp/epublate-sample \
    --concurrency 2 --budget 1.00

# 4) Triage: list flagged segments, proposed glossary entries, and alerts.
uv run epublate inbox /tmp/epublate-sample

# 5) Inspect spend / token / cache-hit stats for the project.
uv run epublate stats /tmp/epublate-sample --json

# 6) Manage the budget cap from the CLI (or via the Dashboard's `B` key).
uv run epublate budget set /tmp/epublate-sample 5.00
uv run epublate budget show /tmp/epublate-sample
uv run epublate budget clear /tmp/epublate-sample

# 7) Export the (possibly partial) translated ePub.
uv run epublate export /tmp/epublate-sample --out /tmp/epublate-sample.epub

# 7b) Validate the exported ePub against epubcheck (PRD F-IO-6, M6).
#     --epubcheck = warn-only summary; --strict = fail with exit 3 on
#     any error. Both require the optional [epubcheck] extra and Java.
uv run epublate export /tmp/epublate-sample \
    --out /tmp/epublate-sample.epub --strict
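
Step 3 above says the run pauses once the budget cap is hit. A minimal sketch of that control flow follows, assuming the cost of each call is known as soon as it returns. BudgetMeter and run_batch are hypothetical names for illustration, not epublate's API, and the real worker pool runs concurrently rather than in a simple loop.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetMeter:
    cap_usd: Optional[float] = None  # None means no cap is set
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd

    @property
    def exhausted(self) -> bool:
        return self.cap_usd is not None and self.spent_usd >= self.cap_usd


def run_batch(segments, translate, meter):
    """Translate until finished or the cap is reached; return (done, remaining)."""
    done = []
    for index, segment in enumerate(segments):
        if meter.exhausted:
            # Pause: leave the rest pending so a later run can resume.
            return done, segments[index:]
        text, cost = translate(segment)
        meter.record(cost)
        done.append(text)
    return done, []


def fake_translate(segment):
    """Stand-in for an LLM call: upper-cases the text and charges $0.50."""
    return segment.upper(), 0.5


meter = BudgetMeter(cap_usd=1.0)
done, remaining = run_batch(["one", "two", "three"], fake_translate, meter)
```

With a $1.00 cap and $0.50 per call, two segments are translated and the third stays pending.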

Swap --mock-llm for a real OpenAI-compatible endpoint by exporting EPUBLATE_LLM_BASE_URL, EPUBLATE_LLM_API_KEY, and EPUBLATE_LLM_MODEL (any combination of OpenAI, Azure, OpenRouter, Together, Ollama, vLLM, or llama.cpp works — see PRD §6.1). EPUBLATE_LLM_HELPER_MODEL (optional) selects a cheaper model for the M5 helper-LLM extractor; if unset, the helper falls back to EPUBLATE_LLM_MODEL (PRD F-LLM-2 — same endpoint, cheap model):

export EPUBLATE_LLM_BASE_URL=https://api.openai.com/v1
export EPUBLATE_LLM_API_KEY=sk-...
export EPUBLATE_LLM_MODEL=gpt-5-mini
export EPUBLATE_LLM_HELPER_MODEL=gpt-5-mini  # optional, defaults to $EPUBLATE_LLM_MODEL

uv run epublate open /tmp/epublate-sample
uv run epublate batch /tmp/epublate-sample --concurrency 2 --budget 5.00
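
The fallback described above (helper model defaults to the main model) can be sketched as a small resolver. The function name and the returned dictionary shape are hypothetical; treating all three main variables as required is an assumption, and the /chat/completions path is the usual convention for OpenAI-compatible servers rather than something this README states.

```python
import os
from typing import Mapping


def resolve_llm_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read the EPUBLATE_LLM_* variables and apply the helper-model fallback."""
    required = ("EPUBLATE_LLM_BASE_URL", "EPUBLATE_LLM_API_KEY", "EPUBLATE_LLM_MODEL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))

    base_url = env["EPUBLATE_LLM_BASE_URL"].rstrip("/")
    model = env["EPUBLATE_LLM_MODEL"]
    return {
        "base_url": base_url,
        "api_key": env["EPUBLATE_LLM_API_KEY"],
        "model": model,
        # Helper model is optional and falls back to the main model.
        "helper_model": env.get("EPUBLATE_LLM_HELPER_MODEL") or model,
        # Assumed endpoint path for an OpenAI-compatible server.
        "chat_url": base_url + "/chat/completions",
    }


config = resolve_llm_config({
    "EPUBLATE_LLM_BASE_URL": "https://api.openai.com/v1/",
    "EPUBLATE_LLM_API_KEY": "sk-example",
    "EPUBLATE_LLM_MODEL": "gpt-5-mini",
})
```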

For local dev you can drop the same variables into a .env file instead of exporting them every shell session — see .env.example for a template:

cp .env.example .env   # then fill in EPUBLATE_LLM_API_KEY etc.
uv run epublate open /tmp/epublate-sample

The CLI auto-loads ./.env first, then <project_dir>/.env for project-scoped subcommands. Real shell variables always win, pytest skips loading entirely so the test suite stays hermetic, and EPUBLATE_DISABLE_DOTENV=1 short-circuits loading for sealed CI jobs. .env and .envrc are already in .gitignore, so secrets won't be committed by accident.
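
The loading rules above can be summarised in a few lines. This is a sketch under stated assumptions: merge_env_layers is a hypothetical helper, the .env files are represented as already-parsed dictionaries, and it assumes that a value from ./.env is not overridden by <project_dir>/.env (the text only specifies the load order and that shell variables win).

```python
def merge_env_layers(shell_env, cwd_dotenv, project_dotenv=None, *, under_pytest=False):
    """Merge environment layers without letting .env files override the shell."""
    merged = dict(shell_env)

    # pytest runs and EPUBLATE_DISABLE_DOTENV=1 skip .env loading entirely.
    if under_pytest or shell_env.get("EPUBLATE_DISABLE_DOTENV") == "1":
        return merged

    # ./.env is applied first, then <project_dir>/.env. setdefault() means a
    # key that is already set is never replaced (assumed no-override rule).
    for layer in (cwd_dotenv, project_dotenv or {}):
        for key, value in layer.items():
            merged.setdefault(key, value)
    return merged


merged = merge_env_layers(
    {"EPUBLATE_LLM_MODEL": "from-shell"},
    {"EPUBLATE_LLM_MODEL": "from-cwd", "EPUBLATE_LLM_API_KEY": "cwd-key"},
    {"EPUBLATE_LLM_API_KEY": "project-key", "EPUBLATE_LLM_BASE_URL": "http://localhost:8000/v1"},
)
```

In this example the model comes from the shell, the API key from ./.env, and the base URL from the project file.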

# M5: helper-LLM book intake on a fresh project (opt-in, costs tokens).
uv run epublate new docs/Sample.epub \
    --source-lang en --target-lang pt \
    --out /tmp/epublate-sample --intake

# M5: re-run intake on an existing project at any time.
uv run epublate intake /tmp/epublate-sample --max-segments 30

# M5: opt-in pre-pass on batch — surfaces fresh proper nouns before
# each chapter so the translator's prompt sees them immediately.
uv run epublate batch /tmp/epublate-sample --extract

End users (after release) will install with:

uv tool install epublate                    # persistent install
uv tool install "epublate[epubcheck]"       # also wire post-export ePub validation
uvx epublate path/to/book.epub              # ephemeral run

The [epubcheck] extra installs the epubcheck Python wrapper (which bundles the upstream Java JAR). It also requires a JRE on your system. Without it, epublate export --strict still works — it just prints a one-line "epubcheck skipped" summary instead of running validation.

Project layout (planned)

epublate/
  pyproject.toml         # PEP 621 metadata + deps (single source of truth)
  uv.lock                # committed for reproducible installs
  .python-version        # pinned Python; uv installs it on demand
  src/epublate/
    app/                 # Textual UI
    core/                # project, pipeline, segmentation, validators, cache
    formats/             # base.py + epub.py (pdf.py later)
    llm/                 # OpenAI-compatible client + prompts
    glossary/            # the lore bible
    embeddings/          # optional
    db/                  # schema + migrations
  tests/                 # pytest suite + fixtures
  docs/PRD.md            # the spec
  AGENTS.md              # cross-tool agent guidance
  CLAUDE.md              # → AGENTS.md (symlink)
  .cursor/rules/         # Cursor rules

Documentation

docs/PRD.md — product requirements, architecture, schema, prompting strategy, roadmap, open questions.
AGENTS.md — invariants, conventions, and rules for AI agents (Cursor, Claude Code, Codex, Aider, …) working on the codebase.
.cursor/rules/ — focused, scoped Cursor rules.

Contributing

Read AGENTS.md. The hard invariants there are non-negotiable (format preservation, glossary consistency, resumability, local-first, OpenAI-compatible only, uv-only).
Bootstrap with uv sync --all-extras --dev.

Before opening a PR:

uv run pytest
uv run ruff check .
uv run ruff format --check .
uv run mypy src/epublate

New requirements get a stable PRD ID (F-…, NFR-…) and are referenced from the PR description.

License

MIT.
