Ouroforge is a local-first, evidence-native prototype for game-authoring loops. It turns a declared goal into a local run, captures evidence from the runtime, records what happened, and proposes the next change without giving agents or browser surfaces trusted write authority.
The name combines Ouroboros (the serpent that feeds on its own tail) and Forge. The loop is intentionally inspectable:
Seed → Run → Evidence → Evaluation → Journal → Mutation → (back to Seed)
Status: pre-release private MVP with public-readiness and public-alpha launch-governance evidence recorded for a future manual visibility review. It runs one reproducible local demo today. Ouroforge is not a Godot replacement and makes no compatibility promises — treat it as an inspectable prototype, not a public launch or support commitment.
A Seed describes intent and acceptance criteria. Ouroforge runs that intent locally, captures runtime evidence, renders a deterministic verdict, journals the result, and records mutation proposals when the run falls short. The long-term ambition is agentic game authoring where AI can suggest changes, but evidence and review decide which changes become real.
The current MVP is useful as a reproducible local demo and contract suite for the loop. It is not a hosted service, production editor, release pipeline, or broad engine replacement.
Current checked-in behavior includes:
- Seed validation and local run execution for seeds/platformer.yaml.
- Local project validation and minimal 2D project scaffolding.
- Generated run evidence under runs/ with ledger, journal, evaluator, mutation, comparison, and dashboard read models.
- A minimal browser runtime/probe path driven through local Chrome/Chromium.
- Read-only static dashboard and authoring cockpit surfaces over exported JSON.
- Fixture-backed contracts for scene, asset, tilemap, source-preview, sandbox, review, and public-readiness documentation boundaries.
- Source Mutation Preview v1 is complete as inert preview/review/sandbox evidence only; source patch apply to the trusted maintainer worktree remains unimplemented and forbidden until a separate later governance gate authorizes it.
- 3D Capability Gate v1 is complete as bounded local 3D evidence: scene graph, camera/projection, mesh/material refs, render smoke, collision/trigger, animation, probe/evaluator compatibility, deterministic demo/regression fixtures, normalized dashboard read models, and escaped read-only Studio inspection. It is not production 3D readiness, broad 3D compatibility, native export, plugin runtime, hosted/cloud behavior, or a Godot replacement claim.
Generated run, dashboard, screenshot, sandbox, and local tool artifacts are local state and stay untracked unless a future issue explicitly scopes a deterministic fixture.
Install Rust + Cargo, Node.js, Python 3, and Chrome/Chromium at a standard path
(or set OUROFORGE_CHROME=/path/to/chrome). No Playwright, database, cloud
service, account system, or hosted runtime is required.
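A quick preflight check for these prerequisites could look like the sketch below. It is not part of the repository: the Chrome binary names are assumptions, the PATH lookup is only a stand-in for however Ouroforge resolves a "standard path", and only the OUROFORGE_CHROME variable comes from the text above.

```python
import os
import shutil

# Tools named in the prerequisites above.
REQUIRED_TOOLS = ("cargo", "node", "python3")
# Assumed binary names for Chrome/Chromium; adjust for your platform.
CHROME_CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")


def find_chrome(env=None, which=shutil.which):
    """Return a Chrome path: OUROFORGE_CHROME first, then a PATH lookup."""
    env = os.environ if env is None else env
    override = env.get("OUROFORGE_CHROME")
    if override:
        return override
    for name in CHROME_CANDIDATES:
        found = which(name)
        if found:
            return found
    return None


def missing_prerequisites(env=None, which=shutil.which):
    """List the prerequisites that could not be located."""
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    if find_chrome(env, which) is None:
        missing.append("chrome (or set OUROFORGE_CHROME)")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("ok" if not problems else "missing: " + ", ".join(problems))
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.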
cargo fmt --check
cargo test
cargo run -p ouroforge-cli -- seed validate seeds/platformer.yaml
cargo run -p ouroforge-cli -- project validate examples/project-workspace-fixtures/valid
cargo run -p ouroforge-cli -- project init .omx/tmp/project-scaffold-smoke --template minimal-2d
cargo run -p ouroforge-cli -- run .omx/tmp/project-scaffold-smoke/seeds/platformer.yaml \
--project .omx/tmp/project-scaffold-smoke --scenario-pack smoke --workers 1
rm -rf .omx/tmp/project-scaffold-smoke
cargo run -p ouroforge-cli -- run seeds/platformer.yaml --workers 4
The run command prints a run directory such as runs/run-.... Project-bound runs
add optional project context to run.json, ledger, journal, and dashboard export;
legacy runs without --project stay compatible. Generated run artifacts are
intentionally git-ignored.
cargo run -p ouroforge-cli -- evidence list runs/<run-id>
cargo run -p ouroforge-cli -- journal show runs/<run-id>
cargo run -p ouroforge-cli -- mutation list runs/<run-id>
cargo run -p ouroforge-cli -- compare runs/<run-id> runs/<run-id>
cargo run -p ouroforge-cli -- dashboard export \
--runs-root runs --output examples/evidence-dashboard/dashboard-data.json
python3 -m http.server 8000 --bind 127.0.0.1 --directory .
- Evidence dashboard: http://127.0.0.1:8000/examples/evidence-dashboard/
- Authoring cockpit: http://127.0.0.1:8000/examples/authoring-cockpit/
- Runtime demo: http://127.0.0.1:8000/examples/game-runtime/
The current quickstart command audit is recorded in
docs/fresh-clone-onboarding-command-audit-v1.md.
For an isolated fresh-clone-style smoke, run
scripts/fresh-clone-smoke.sh as documented in
docs/fresh-clone-smoke-v1.md. Troubleshooting
and cleanup guidance lives in
docs/fresh-clone-troubleshooting-cleanup-v1.md.
These notes clarify expected generated state and cleanup boundaries without
changing repository visibility, release status, or trusted-write authority.
Ouroforge's loop is built around evidence over assertion:
- Seed — declare intent and acceptance criteria.
- Run — execute a local runtime/demo path and collect generated artifacts.
- Evidence — capture bounded runtime, browser, project, scenario, and probe outputs as inspectable files.
- Evaluation — produce a deterministic verdict from the evidence.
- Journal — summarize what actually happened with evidence references.
- Mutation proposal — record proposed next changes as reviewable data, not trusted source writes.
- Repeat — a later reviewed change can become the next seed/run cycle.
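The steps above can be pictured with a toy evaluator. This is a minimal sketch under stated assumptions, not Ouroforge's implementation (which lives in the Rust core): the `Seed`, `Evidence`, and `CycleResult` shapes, the "minimum value per metric" criteria format, and the sample metric names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Seed:
    goal: str
    # Hypothetical criteria shape: metric name -> minimum required value.
    acceptance: dict


@dataclass(frozen=True)
class Evidence:
    # Hypothetical observations captured from a run: metric name -> value.
    observations: dict


@dataclass
class CycleResult:
    verdict: str
    journal: list = field(default_factory=list)
    proposals: list = field(default_factory=list)


def evaluate(seed, evidence):
    """Deterministic: the same seed and evidence always give the same result."""
    result = CycleResult(verdict="pass")
    for name in sorted(seed.acceptance):  # sorted for a stable journal order
        required = seed.acceptance[name]
        observed = evidence.observations.get(name)
        met = observed is not None and observed >= required
        result.journal.append(
            f"{name}: observed={observed} required={required} met={met}"
        )
        if not met:
            result.verdict = "fail"
            # A proposal is inert data for review, never an applied change.
            result.proposals.append({"criterion": name, "status": "proposed"})
    return result


if __name__ == "__main__":
    seed = Seed("reach the exit", {"coins_collected": 3, "exit_reached": 1})
    evidence = Evidence({"coins_collected": 2, "exit_reached": 1})
    print(evaluate(seed, evidence).verdict)  # prints "fail"
```

The point of the sketch is the shape of the data flow: the verdict and journal are derived only from recorded evidence, and a failed criterion yields a proposal record rather than a write.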
The Rust core and local filesystem own trusted state. Agents, browser workers, and Chrome DevTools Protocol observations are evidence inputs only.
- seeds/platformer.yaml: the MVP seed used by the local demo.
- examples/game-runtime/ and examples/runtime-probe/: minimal local runtime and probe pages.
- examples/evidence-dashboard/: read-only evidence dashboard over exported dashboard JSON.
- examples/authoring-cockpit/: read-only authoring cockpit over generated evidence and proposal data.
- examples/*-v1, examples/*-v2, and examples/*-regression: milestone fixtures, scenario packs, and evidence smokes.
Ouroforge's current safety boundary is conservative:
- Trusted authority: Rust CLI/core code and the local filesystem.
- Evidence only: agents, browser workers, and CDP observations can inform proposals but cannot apply them.
- Read-only browser surfaces: dashboard and cockpit pages render exported JSON and copyable commands; they do not write files, run commands, or accept source mutations.
- No command bridge: browser/UI surfaces do not invoke local commands or local server command bridges.
- No source apply authority: source-preview, sandbox, stale-target, rollback, and review artifacts are evidence/governance boundaries unless a later explicit issue authorizes trusted apply.
- Generated-state isolation: runs/, target/, dashboard exports, .omx/, .omc/, .openchrome/, .claude/, and sandbox outputs remain local ignored state.
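One way to read the boundary above is as a small capability table. The sketch below is illustrative only: the actor labels and function names are hypothetical, and the real enforcement is in the Rust CLI/core, whose API this README does not show.

```python
from enum import Enum


class Actor(Enum):
    # Hypothetical labels for the roles named in the list above.
    RUST_CORE = "rust-core"
    AGENT = "agent"
    BROWSER_WORKER = "browser-worker"
    CDP_OBSERVATION = "cdp-observation"
    BROWSER_SURFACE = "dashboard-or-cockpit"


# Only the Rust CLI/core is a trusted authority over local state.
TRUSTED_WRITERS = frozenset({Actor.RUST_CORE})
# Evidence-only actors may inform proposals but never apply them.
EVIDENCE_ONLY = frozenset(
    {Actor.AGENT, Actor.BROWSER_WORKER, Actor.CDP_OBSERVATION}
)


def may_write_trusted_state(actor):
    return actor in TRUSTED_WRITERS


def is_evidence_only(actor):
    return actor in EVIDENCE_ONLY


def may_apply_source_patch(actor):
    # No actor has source-apply authority until a later explicit issue
    # authorizes trusted apply, so this is False for everyone today.
    return False
```

Read-only browser surfaces appear in neither set: they render exported JSON and contribute neither writes nor evidence.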
Security and trust-boundary references:
- SECURITY.md
- docs/evidence-fidelity-trust-boundary-v1.md
- docs/public-alpha-security-trust-boundary-v1.md
- docs/public-alpha-disclosure-and-sandbox-limitations-v1.md
- docs/artifact-write-policy-v1.md
Ouroforge does not currently provide:
- hosted/cloud execution, accounts, authentication, authorization, or multi-tenant behavior;
- production readiness, support/security SLA, compatibility stability, or secure sandboxing for arbitrary untrusted content;
- native export, packaging, signing, publishing, deployment, or release automation;
- plugin runtime, marketplace, visual scripting, or third-party code-loading ecosystem;
- browser trusted writes, local command bridges, auto-apply, auto-merge, or reviewer bypass;
- source patch apply to the trusted maintainer worktree.
Public release still requires fresh evidence gates in
docs/public-readiness-audit.md,
docs/public-launch-checklist.md, and the
manual visibility-decision process. The launch-governance and communication-pack
docs are preparation artifacts, not a visibility toggle or publication event.
The roadmap and per-milestone completion records — with each milestone's
evidence chain and non-goals — live in docs/roadmap.md.
Cross-cutting boundaries are in
Non-goals and maturity boundaries; they are
not repeated per milestone here. Earlier completed milestones — including
Safe Source Mutation Apply, the GDD-to-Playable Prototype v1 prototype lane,
the Plugin / Extension System v1 lane, the Full Studio Editor lane, the
Godot-Plus Demo lane, and the Autonomous QA / Playtest Swarm v1 lane — keep
their full evidence chain and per-issue records in
docs/roadmap.md; only the current Era's snapshot is
summarised below.
Current state. Era E (Milestones 20–25) is recorded as complete: Loop Coverage Metric v1 (M20), Second Game Class and Loop Generalization v1 (M21), Trust Gradient v1 (M22, a GO design gate for bounded auto-apply), Multi-Iteration Evolve Campaigns v1 (M23), Game Complexity Ladder v1 (M24), and End-to-End Provenance Bundle and Audit Surface v1 (M25). Against #1's north-star (loop coverage × game complexity × trust), the descriptive, evidence-cited posture is: loop coverage across two game classes (collect-and-exit and the Signal Gate Platformer), one Game Complexity Ladder rung satisfied (collect-and-exit), and a GO Trust Gradient for bounded, reversible, audited, default-off auto-apply. These are descriptive metrics, not maturity, production, or Godot-replacement claims.
Earlier foundations remain recorded. The GDD-to-Playable Prototype v1
milestone is complete (including Autonomous QA / Playtest Swarm v1), while the
#1 and #23 anchors are deliberately kept open as ongoing north-star tracks. The
full per-era completion history and evidence chains live in
docs/roadmap.md and the matching docs/*.md contracts.
Next. Era E Milestone 26: Era E Refresh and Layer-3 Re-evaluation. Engine
growth stays demand-driven and rung-justified. The Layer-3 go/defer decision
(native export, plugin runtime, hosted/cloud, and distributed orchestration /
Elixir) is DEFER per
docs/layer3-reevaluation-v1.md, reaffirming
the ADR #92 NO-GO; Rust-first / local-first is preserved absent a GO.
- Contribution workflow and review expectations: CONTRIBUTING.md
- Security policy and vulnerability reporting: SECURITY.md
- License: LICENSE
Before opening a PR, run:
cargo fmt --check
cargo test
cargo clippy --all-targets --all-features -- -D warnings
node --check examples/evidence-dashboard/dashboard.js && node examples/evidence-dashboard/dashboard.test.cjs
node --check examples/authoring-cockpit/cockpit.js && node examples/authoring-cockpit/cockpit.test.cjs
Per-milestone evidence steps live in the matching docs/*.md files. Keep
generated/local runtime state untracked.
Use docs/README.md as the expanded documentation index. The
README keeps only the most common starting points so public-alpha readers do not
have to scan every milestone contract first.
| Reader question | Start here |
|---|---|
| How does the loop work in detail? | docs/architecture.md |
| What is complete and what is next? | docs/roadmap.md |
| What is the trust boundary? | docs/README.md#safetytrust-boundaries |
| Where are milestone references grouped? | docs/README.md |
| What wording is forbidden or risky? | docs/public-wording-guardrail-v1.md, docs/public-wording-audit-process-v1.md |
| Where is the final docs IA audit? | docs/docs-link-wording-audit-pa1.5.3.md |
- crates/ouroforge-core: trusted core models and evidence APIs for seeds, runs, ledgers, browser smoke, scenarios, evaluator, journal, mutation proposals, project/scene contracts, source-preview boundaries, and dashboard read models.
- crates/ouroforge-cli: CLI entrypoints for seed, run, evidence, journal, mutation, dashboard, scene, project, source-preview, and related commands.
- seeds/: MVP seed examples.
- examples/: runtime demos, read-only UIs, fixtures, scenario packs, and regression examples.
- docs/: architecture, roadmap, trust-boundary/evidence contracts, milestone notes, public-readiness audits, and governance handoff docs.
Do not commit generated or local runtime/tool state: runs/, target/,
examples/evidence-dashboard/dashboard-data.json, dashboard-data/, sandbox/,
.claude/, .openchrome/, .omc/, .omx/. See
docs/artifact-write-policy-v1.md for the
trusted-write categories and generated-output/source-like collision policy.
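As a local guard against committing such state, a contributor could screen candidate file paths with something like the sketch below. The path list is copied from the paragraph above; the matching rule (any parent directory with one of these names) is an assumption, and the repository's own ignore rules and docs/artifact-write-policy-v1.md remain authoritative.

```python
from pathlib import PurePosixPath

# Directory names listed above as generated or local tool state.
GENERATED_DIRS = frozenset(
    {"runs", "target", "dashboard-data", "sandbox",
     ".claude", ".openchrome", ".omc", ".omx"}
)
# Individual generated files listed above.
GENERATED_FILES = frozenset(
    {"examples/evidence-dashboard/dashboard-data.json"}
)


def is_generated_state(path):
    """True if a repo-relative file path falls under generated/local state."""
    posix = PurePosixPath(path)
    if str(posix) in GENERATED_FILES:
        return True
    # Assumption: a listed name matches as any parent directory component.
    return any(part in GENERATED_DIRS for part in posix.parts[:-1])


def offending_paths(paths):
    """Return the subset of file paths that should stay untracked."""
    return [p for p in paths if is_generated_state(p)]
```

The input could come from `git diff --cached --name-only`, which lists staged file paths relative to the repository root.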