135 changes: 135 additions & 0 deletions packages/kilo-docs/pages/kiloclaw/control-ui/recover-with-kilo.md
@@ -0,0 +1,135 @@
---
title: "Recover with Kilo"
description: "How to use Recover with Kilo to fix KiloClaw instances"
---

# Recover with Kilo

## What is Recover with Kilo?

Recover with Kilo is a self-repair capability built into KiloClaw. When your instance has configuration issues, broken integrations, or other operational problems, you can launch a Kilo CLI Run — an AI-powered repair agent that diagnoses and fixes the issue directly on your KiloClaw machine, without needing to contact support.

Think of it as an automated sysadmin that runs inside your instance. You describe the problem in plain English, and the agent investigates, identifies the root cause, and applies fixes.

## When to Use It

Use Recover with Kilo when your KiloClaw instance is behaving unexpectedly, for example:

- A channel (Slack, Discord, Telegram) has stopped working
- Gateway connections are failing or timing out
- A model provider returns errors or is misconfigured
- The OpenClaw config is missing or corrupted
- Environment variables are not propagating correctly
- An integration (GitHub, Linear, etc.) has setup problems
- The instance is running but something isn't working as expected

## Prerequisites

| Requirement | Details |
|---|---|
| KiloClaw instance | Must be provisioned and have been deployed at least once (controller must include the CLI run routes) |
| Instance status | Must be running — the Fly Machine must be started |
| Feature flag | `KILOCLAW_KILO_CLI=true` must be set on the instance (enabled by default on new deploys) |
| API key | `KILO_API_KEY` must be configured on the instance (set during provisioning) |

If you see the error "Instance needs redeploy to support recovery", your instance was provisioned before this feature existed. You'll need to redeploy the instance to get the latest controller that supports CLI runs.
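
The prerequisites above can be read as a preflight checklist. The following is a minimal sketch, not the actual KiloClaw implementation: the `InstanceState` type, its field names, and the blocker messages other than the quoted redeploy error are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class InstanceState:
    """Hypothetical snapshot of an instance; field names are illustrative."""
    status: str                # e.g. "running" or "stopped"
    has_cli_run_routes: bool   # controller is new enough to expose CLI run routes
    env: dict[str, str]        # environment variables set on the instance


def recovery_blockers(state: InstanceState) -> list[str]:
    """Return the reasons recovery cannot start; an empty list means ready."""
    blockers = []
    if not state.has_cli_run_routes:
        blockers.append("Instance needs redeploy to support recovery")
    if state.status != "running":
        blockers.append("Instance must be running")
    if state.env.get("KILOCLAW_KILO_CLI") != "true":
        blockers.append("KILOCLAW_KILO_CLI=true is not set")
    if not state.env.get("KILO_API_KEY"):
        blockers.append("KILO_API_KEY is not configured")
    return blockers
```

An empty result means all four rows of the table are satisfied; otherwise each entry names one row to fix.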

## How It Works

When you trigger a CLI run, the system executes this flow:

1. You describe the problem
2. Web app calls tRPC router
3. KiloClaw Worker -> Instance DO
4. Fly Machine controller endpoint `POST /_kilo/cli-run/start`
5. Controller spawns: `kilo run --auto "<system prompt + your description>"`
6. Agent diagnoses + fixes the issue on the machine
7. Process exits -> status updated in DB
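
Steps 5 and 7 of the flow above can be sketched as two small helpers. This is an illustration under stated assumptions: the function names are hypothetical, the way the system prompt and description are joined is a guess, and mapping every non-zero exit code to `failed` is an assumption (the docs only state that 0 means success).

```python
def build_cli_argv(system_prompt: str, description: str) -> list[str]:
    """Build the argv for `kilo run --auto "<system prompt + your description>"`.

    The combined prompt is passed as one argv element, so no shell quoting
    is involved. Joining with a blank line is an assumption.
    """
    return ["kilo", "run", "--auto", f"{system_prompt}\n\n{description}"]


def status_from_exit_code(exit_code: int) -> str:
    """Map a process exit code to a run status (0 = success per the docs)."""
    # Assumption: any non-zero exit code is recorded as "failed".
    return "completed" if exit_code == 0 else "failed"
```

Passing an argv list rather than a shell string keeps user-supplied text from being interpreted by a shell, which matters because the description is free-form input.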

### The System Prompt

Your description is wrapped in a system prompt that gives the repair agent full context about the machine's architecture:

- Key file paths (OpenClaw config at `/root/.openclaw/openclaw.json`, MCP servers, workspace, CLI config)
- Controller and gateway health endpoints
- Diagnostic commands (`openclaw doctor`, `jq empty` for config validation, etc.)
- Architecture details (controller on port 18789, gateway on 3001, loopback binding)
- Safety rules (don't expose secrets, preserve managed KiloClaw plugins, use SIGUSR1 for gateway restart)
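
To make the wrapping concrete, here is a rough sketch of how such a prompt might be assembled from the facts listed above. The real system prompt's wording and layout are not shown in this document, so every string template and the `build_system_prompt` name are assumptions; only the path, ports, commands, and rules come from the list.

```python
KEY_PATHS = {"OpenClaw config": "/root/.openclaw/openclaw.json"}
PORTS = {"controller": 18789, "gateway": 3001}
DIAGNOSTICS = ["openclaw doctor", "jq empty /root/.openclaw/openclaw.json"]
SAFETY_RULES = [
    "Do not expose secrets.",
    "Preserve managed KiloClaw plugins.",
    "Use SIGUSR1 to restart the gateway.",
]


def build_system_prompt(description: str) -> str:
    """Wrap the user's description in machine context (hypothetical wording)."""
    lines = ["You are repairing a KiloClaw instance.", "", "Key paths:"]
    lines += [f"- {name}: {path}" for name, path in KEY_PATHS.items()]
    lines += ["", "Ports (loopback binding):"]
    lines += [f"- {name}: {port}" for name, port in PORTS.items()]
    lines += ["", "Diagnostic commands:"]
    lines += [f"- {cmd}" for cmd in DIAGNOSTICS]
    lines += ["", "Safety rules:"]
    lines += [f"- {rule}" for rule in SAFETY_RULES]
    lines += ["", "Problem reported by the user:", description]
    return "\n".join(lines)
```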

## Using Recover with Kilo

### Starting a CLI Run

| Property | Value |
|---|---|
| Input | A description of the problem (1–10,000 characters) |
| Output | `{ ok: true, startedAt, id }` — the run has been initiated |

Example prompts:
- "My Telegram channel stopped receiving messages"
- "Gateway keeps crashing with exit code 1"
- "The GitHub integration is failing to authenticate"
- "Run openclaw doctor and fix any issues it finds"
- "Check if my MCP server config is valid"
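
The input limit can be checked on the client before a run is requested. A minimal sketch, assuming the limit counts characters exactly as typed; whether the server trims whitespace first is not stated here, so this helper does not trim. The function name is hypothetical.

```python
MIN_PROMPT_CHARS = 1
MAX_PROMPT_CHARS = 10_000


def validate_prompt(prompt: str) -> str:
    """Return the prompt unchanged if its length is within 1 to 10,000 characters."""
    if not (MIN_PROMPT_CHARS <= len(prompt) <= MAX_PROMPT_CHARS):
        raise ValueError(
            f"prompt must be {MIN_PROMPT_CHARS} to {MAX_PROMPT_CHARS} characters, "
            f"got {len(prompt)}"
        )
    return prompt
```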

### Monitoring Progress

While the run is in progress, you can poll for its status:

| Field | Description |
|---|---|
| `status` | `running`, `completed`, `failed`, or `cancelled` |
| `output` | Live stdout/stderr output from the agent (capped at ~1MB; when the cap is exceeded, the oldest output is dropped and the newest is kept) |
| `exitCode` | Process exit code (0 = success) |
| `startedAt` | ISO timestamp when the run began |
| `completedAt` | ISO timestamp when the run finished |
| `prompt` | Your original problem description |
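
A polling loop over the fields above might look like the following sketch. The `fetch_status` callable stands in for whatever client call returns the status object; it, the poll interval, and the poll limit are assumptions, not documented values.

```python
import time
from typing import Callable

# Statuses after which the run will not change again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_run(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_polls: int = 300,
) -> dict:
    """Poll until the run leaves `running`, then return the final status object."""
    for _ in range(max_polls):
        run = fetch_status()
        if run["status"] in TERMINAL_STATUSES:
            return run
        time.sleep(interval_s)
    raise TimeoutError("run did not reach a terminal status")
```

Bounding the number of polls avoids waiting forever if the run's outcome is never recorded.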

### Canceling a Run

If a run is taking too long or you want to stop it, you can cancel it. The process receives `SIGTERM`, then `SIGKILL` after 5 seconds if it hasn't exited.
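
The graceful-then-forced shutdown described above is a common pattern. The sketch below shows it with Python's standard `subprocess` module; it illustrates the technique only and is not the controller's actual code. The `stop_process` name is hypothetical.

```python
import subprocess


def stop_process(proc: subprocess.Popen, grace_s: float = 5.0) -> int:
    """Ask a process to exit, then force it if it is still alive after grace_s."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        return proc.wait()
```

The grace period gives the agent a chance to finish writing files before it is killed outright.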

## Error Responses

| Error Code | HTTP Status | Meaning |
|---|---|---|
| `kilo_cli_run_instance_not_running` | 409 | Instance is not in running status — start it first |
| `kilo_cli_run_already_active` | 409 | Another CLI run is already in progress — wait or cancel it first |
| `kilo_cli_run_no_active_run` | 409 | No active run to cancel |
| `controller_route_unavailable` | 404 | Controller is too old — redeploy the instance |
| Kilo CLI not enabled | 400 | `KILOCLAW_KILO_CLI` feature flag is not set |
| API key not configured | 400 | `KILO_API_KEY` is missing from the instance |
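
A client could translate the coded errors above into a suggested next step. This is a sketch: the error codes and HTTP statuses come from the table, while the `next_step` helper and the wording of each suggestion are illustrative. The last two table rows have no machine-readable code listed, so they are left out.

```python
from typing import Optional

# error code -> (HTTP status, suggested action)
ERROR_ACTIONS: dict[str, tuple[int, str]] = {
    "kilo_cli_run_instance_not_running": (409, "Start the instance, then retry."),
    "kilo_cli_run_already_active": (409, "Wait for the active run or cancel it."),
    "kilo_cli_run_no_active_run": (409, "Nothing to cancel."),
    "controller_route_unavailable": (404, "Redeploy the instance to get a newer controller."),
}


def next_step(error_code: str) -> tuple[Optional[int], str]:
    """Look up the HTTP status and a suggested action for an error code."""
    return ERROR_ACTIONS.get(error_code, (None, "Unrecognized error code."))
```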

## Data Persistence

Each CLI run is recorded in the `kiloclaw_cli_runs` database table with:

| Column | Purpose |
|---|---|
| `id` | UUID primary key |
| `user_id` | Owner of the instance |
| `instance_id` | Which instance (null for legacy single-instance users) |
| `prompt` | Your problem description |
| `status` | `running`, `completed`, `failed`, `cancelled` |
| `started_at` | When the run started |
| `completed_at` | When the run finished |
| `exit_code` | Process exit code |
| `output` | Full captured output from the agent |
| `initiated_by_admin_id` | If an admin triggered the run (support troubleshooting) |

You can view your run history through the `listKiloCliRuns` endpoint, which returns up to 50 of your most recent runs.
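
For readers who think in types, the table's columns can be modelled as a record, with the history limit applied on top. This is a sketch only: the Python types are inferred from the column descriptions, and sorting by `started_at` to define "most recent" is an assumption about how `listKiloCliRuns` orders results.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

HISTORY_LIMIT = 50


@dataclass
class CliRun:
    id: str
    user_id: str
    instance_id: Optional[str]          # None for legacy single-instance users
    prompt: str
    status: str                         # running, completed, failed, cancelled
    started_at: datetime
    completed_at: Optional[datetime] = None
    exit_code: Optional[int] = None
    output: str = ""
    initiated_by_admin_id: Optional[str] = None


def recent_runs(runs: list[CliRun], user_id: str) -> list[CliRun]:
    """Return up to 50 of the user's runs, newest first (assumed ordering)."""
    mine = [r for r in runs if r.user_id == user_id]
    mine.sort(key=lambda r: r.started_at, reverse=True)
    return mine[:HISTORY_LIMIT]
```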

### Admin Access

Support admins can also trigger CLI runs on behalf of users. These runs are recorded with `initiated_by_admin_id` set, so they appear in the history with `initiatedBy: 'admin'`. Admins have additional capabilities:

- `forceRetryRecovery` — manually trigger the Fly machine reconciliation recovery
- `cleanupRecoveryPreviousVolume` — delete a retained recovery volume after an instance restore

## Limitations

- **One run at a time**: Only one CLI run can be active per instance. Concurrent requests get a 409 conflict.
- **Output cap**: Agent output is capped at ~1MB. Older output is truncated from the front.
- **Instance must be running**: You cannot start a CLI run on a stopped, restoring, or destroyed instance.
- **Controller memory**: The controller only holds the active run in memory. If you poll for a completed run after a newer run has started, the original output is no longer available. The DB retains the last known state.
- **Lost outcomes**: If the controller reports no active run for a DB row still marked running, the run is recorded as failed with the output "[run state unavailable: controller no longer has an active CLI run for this record]". This doesn't necessarily mean the run failed — it means the outcome couldn't be captured before the controller moved on.
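
Two of the limitations above, the output cap and lost outcomes, can be sketched in a few lines. Assumptions are labelled in the comments: the exact cap value (the docs say "~1MB"), counting characters rather than bytes, and both function names are illustrative.

```python
from typing import Optional

OUTPUT_CAP = 1_000_000  # assumption: exact value and unit; docs say "~1MB"

LOST_OUTCOME_NOTE = (
    "[run state unavailable: controller no longer has an active CLI run "
    "for this record]"
)


def append_capped(buffer: str, chunk: str, cap: int = OUTPUT_CAP) -> str:
    """Append new output, dropping the oldest text once the cap is exceeded."""
    combined = buffer + chunk
    return combined[-cap:] if len(combined) > cap else combined


def reconcile(db_status: str, controller_has_active_run: bool) -> tuple[str, Optional[str]]:
    """Return (status, output note) for a DB row after checking the controller."""
    if db_status == "running" and not controller_has_active_run:
        # The outcome was not captured; the row is closed out as failed.
        return "failed", LOST_OUTCOME_NOTE
    return db_status, None
```

A row closed this way tells you the outcome is unknown, so checking the instance itself is more reliable than trusting the `failed` status.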
34 changes: 31 additions & 3 deletions packages/kilo-docs/pages/kiloclaw/faq/pricing.md
@@ -9,10 +9,15 @@ KiloClaw uses Kilo Gateway credits by default — if you route requests through

## Instance Hosting

KiloClaw hosting is **free during the beta period**. Each user gets a dedicated machine (2 shared vCPUs, 3 GB RAM, 10 GB SSD) at no cost.
KiloClaw hosting is billed as a flat charge per billing period, not per minute or per action. A KiloClaw subscription is a recurring credit deduction tied to a specific instance, taken once at the start of each billing period. There is no metered hourly or per-minute hosting bill.

> ℹ️ **Info**
> Beta pricing is subject to change. Paid hosting tiers may be introduced after the beta period ends. Any changes will be announced in advance.
Each instance is a dedicated machine (`performance-cpu-1x`, 3 GB RAM, 10 GB SSD).

| Plan | Period | Cost per period |
|---|---|---|
| Standard | 1 month | $9.00 |
| Commit | 6 months (paid upfront) | $48.00 |
| Trial | 7 days | $0 |
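
The plan table implies an effective monthly rate for each paid plan. The short calculation below works it out; the `PLANS` structure and function names are illustrative, and the prices are taken from the table.

```python
from decimal import Decimal

# plan -> (months per billing period, cost per period in USD)
PLANS = {
    "Standard": (1, Decimal("9.00")),
    "Commit": (6, Decimal("48.00")),
}


def monthly_cost(plan: str) -> Decimal:
    """Effective cost per month for a plan."""
    months, price = PLANS[plan]
    return price / months


def commit_savings() -> Decimal:
    """Savings from Commit versus paying Standard over the same six months."""
    months, commit_price = PLANS["Commit"]
    return monthly_cost("Standard") * months - commit_price
```

Six months of Standard costs 6 × $9.00 = $54.00, so Commit at $48.00 works out to $8.00 per month and saves $6.00 over the period.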

## Model Inference

@@ -29,3 +34,26 @@ To see which models are currently free, check the [Kilo Leaderboard](https://kil
You can add Gateway credits from your [Kilo account](https://app.kilo.ai). Credits are shared across all Kilo products (VSCode extension, CLI, Cloud Agents, and KiloClaw).

See [Adding Credits](/docs/getting-started/adding-credits) and [Gateway Usage and Billing](/docs/gateway/usage-and-billing) for details.

## Frequently Asked Questions

**How am I billed for a KiloClaw instance?**
A flat $9/month (Standard) or $48 prepaid for 6 months (Commit, which works out to $8/mo). Hardware tier doesn't change the price. Inference (LLM token) usage is charged separately from hosting and is also deducted from your Kilo credit balance.

**Do I pay per minute or only when active?**
No. Hosting is a flat per-period charge. Stopping an instance does not pause hosting billing.

**Is there a free trial?**
Yes — 7 days, no credit card required, automatic on your first instance.

**What if I run out of credits?**
If auto top-up is on, we top up automatically. Otherwise, your subscription goes past-due. After 14 days past-due, your instance is stopped and you have 7 more days to pay before the instance is destroyed. Paying any time before destruction auto-restarts the instance.

**Will my data be lost?**
Only after instance destruction (14 + 7 = up to 21 days after the failed renewal). Records of the instance and subscription are retained for audit, but the underlying compute and volume are torn down.
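
The past-due timeline can be worked out with simple date arithmetic. This sketch assumes both windows are counted in whole calendar days from the date of the failed renewal; the function and key names are illustrative.

```python
from datetime import date, timedelta

PAST_DUE_DAYS_BEFORE_STOP = 14
STOPPED_DAYS_BEFORE_DESTROY = 7


def past_due_timeline(failed_renewal: date) -> dict[str, date]:
    """Return when an unpaid instance is stopped and when it is destroyed."""
    stopped_on = failed_renewal + timedelta(days=PAST_DUE_DAYS_BEFORE_STOP)
    destroyed_on = stopped_on + timedelta(days=STOPPED_DAYS_BEFORE_DESTROY)
    return {"stopped_on": stopped_on, "destroyed_on": destroyed_on}
```

For a renewal that fails on 1 March, the instance would be stopped on 15 March and destroyed on 22 March, 21 days after the failure.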

**Can I see my usage?**
Yes — your Kilo credit ledger shows each KiloClaw hosting deduction as a separate `kiloclaw-subscription:...` entry. You'll also receive emails at every milestone (renewal failed, suspension, destruction warning, destroyed).

**Are there caps?**
One active instance per personal context and per organization.