Beta → 1.0 hardening + data-loss repair (24 hardening items + C1-C6) #10

Merged
senna-lang merged 16 commits into main from release/1.0-hardening on Jun 11, 2026
@senna-lang (Owner) commented Jun 11, 2026

Implements three openspec changes: reliability/safety/performance hardening toward the 1.0.0 release, repair of an ongoing distillation data-loss incident, and a rewrite of the loci prime injected prompt to drive agent-initiated memory usage.

release-1.0-hardening — 24/24

  • B1 WAL + busy_timeout / B2 user_version-based migration mechanism / B3 transactional distillation / B4 atomic settings.json writes + .bak backup / B5 meta table (model/prompt versions) + drift warnings
  • H1 distill_status (pending/skipped/distilled) / H2 flock-based distill lock / H3 server double-start prevention / H4 search connection leak fix / H5 full socket reads / H6 .jsonl cleanup on claude timeout / H7 loci hook uninstall / H8 memory.db 0o600
  • P1 FK indexes / P2 sequential indexer reads / P3 tree-sitter cache
  • Q1 sha256 consolidation / Q2 datetime.now(UTC) / Q3 error reporting in distill_all / Q4 assert→raise / Q5 finer-grained config exceptions / Q6 constant naming / Q7 llm.py tests / Q8 tobytes()

repair-distill-and-revive-context — 6/6

  • C1 collect file paths from tool_use (exchange_files) / C2 many-to-many symbols / C3 repair migration (reset lost distillations to pending + orphan cleanup + drop bm25_text) / C4 drop INSERT OR IGNORE, verify persistence / C5 lighter loci context output + --full / C6 link only symbols appearing in conversation bodies

improve-prime-injection — P1-P3

Rewrites the prompt that loci prime injects at session start, so agents actually reach for loci search / loci context on their own (real-world logs showed near-zero spontaneous usage):

  • P1 Triggers rewritten from user-question-driven ("when asked where X is") to agent-action-driven: before editing/refactoring a function, before starting a new implementation, on encountering a known error
  • P2 loci context (reverse lookup) promoted to an independent section alongside loci search, with the design intent spelled out: touching a symbol = recalling memory about that symbol
  • P3 Concrete, non-placeholder command examples embedded for both commands (loci search "BM25 RRF fusion ranking", loci context --symbol "SymbolResolver.extract")
  • CLAUDE_MD_SECTION collapsed to a minimal redirect — PRIME_TEXT is now the single source of behavioral guidance, removing the duplicated-and-drifting wording between the two
  • New tests/test_prime_cmd.py: 8 tests covering the PRIME_TEXT contract and inject_claude_md idempotency
  • P4 (when-not-to-search guidance) deliberately deferred

Verification

  • All CI gates green: pytest 226 passed / pyright 0 errors / ruff clean
  • C3 migration applied to real DBs (logosyncs / arxiv-newspaper): lost distillations restored to pending, orphaned rooms/symbols removed, legacy bm25_text column dropped

Notes

  • The 1.0.0 tag is not cut yet
  • Re-distilling exchanges reset to pending (claude API cost) is left to each project's hook

🤖 Generated with Claude Code

senna-lang and others added 14 commits June 10, 2026 21:51
…ework

B1: get_connection sets WAL journal mode and 10s busy_timeout for
concurrent hook access. B2: PRAGMA user_version-based sequential
migration mechanism (_MIGRATIONS, _run_migrations) replaces the ad-hoc
ALTER TABLE; new DBs stamp the latest version directly, existing DBs
apply pending migrations transactionally.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
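
The B1/B2 mechanism described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the PR: `MIGRATIONS`, `run_migrations`, and the `exchanges` table are hypothetical stand-ins for `_MIGRATIONS` and `_run_migrations`, and the shortcut where new DBs are stamped with the latest version directly is left out.

```python
import sqlite3

# Hypothetical migration list: entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE exchanges (id TEXT PRIMARY KEY, body TEXT)",
    "ALTER TABLE exchanges ADD COLUMN distill_status TEXT NOT NULL DEFAULT 'pending'",
]


def get_connection(path: str) -> sqlite3.Connection:
    # isolation_level=None selects autocommit mode, so the BEGIN/COMMIT
    # statements below are issued explicitly instead of by the driver.
    con = sqlite3.connect(path, isolation_level=None)
    con.execute("PRAGMA journal_mode=WAL")
    con.execute("PRAGMA busy_timeout=10000")  # wait up to 10 s on a locked DB
    return con


def run_migrations(con: sqlite3.Connection) -> int:
    """Apply pending migrations one by one and return the resulting version."""
    current = con.execute("PRAGMA user_version").fetchone()[0]
    for version in range(current, len(MIGRATIONS)):
        con.execute("BEGIN IMMEDIATE")
        try:
            con.execute(MIGRATIONS[version])
            # PRAGMA takes no bound parameters; the value is an int we control.
            con.execute(f"PRAGMA user_version = {version + 1}")
            con.execute("COMMIT")
        except Exception:
            con.execute("ROLLBACK")
            raise
    return con.execute("PRAGMA user_version").fetchone()[0]
```

Because the schema change and the `user_version` bump share one transaction, a failed migration leaves the DB at the previous version and a rerun picks up where it stopped.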
…fig errors

H4: search connections close on exception paths. H6: claude --print
TimeoutExpired still cleans up side-effect .jsonl files. P2: indexer reads
incrementally and skips parsing rows at or before last_ply_end. Q5: config
catches specific exception types and hoists the sys import.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
H3: loci server start verifies PID liveness + ping before launching, and
cleans up stale socket/PID files. H5: client and server read until newline
so responses larger than the recv buffer are not truncated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
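
For H5, a read-until-newline loop might look like the sketch below. The function name `recv_line` and the newline-delimited framing details are assumptions for illustration; the actual client and server code is not shown in this PR page.

```python
import socket


def recv_line(sock: socket.socket, bufsize: int = 4096) -> bytes:
    """Read from sock until a newline arrives or the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed before sending a newline
            break
        chunks.append(chunk)
        if b"\n" in chunk:
            break
    # Return everything up to (not including) the first newline.
    line, _, _ = b"".join(chunks).partition(b"\n")
    return line
```

A single `recv(bufsize)` call returns at most `bufsize` bytes, which is why a response larger than the buffer was cut off before this change.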
… (Lane A)

B3: save_palace_object writes atomically (rollback leaves status pending).
H1: distill_status column (migration v2) with pending/skipped/distilled;
status command distinguishes all three. B5: meta table (migration v3) records
embedding_model and prompt_version (prompt sha256); check_drift warns in
index/distill/search/status. P1: indexes on rooms/symbols/palace_objects FKs
(migration v4). H8: memory.db chmod 0o600. P3: tree-sitter resolver/cache reuse
across a distill batch. Q2: datetime.now(UTC). Q3: distill_all returns
(count, errors).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
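
The B5 drift check can be illustrated with a small sketch. The `meta` table layout (key/value rows) and the `check_drift` signature below are assumptions; the commit only states that `embedding_model` and a sha256-based `prompt_version` are recorded and compared.

```python
import hashlib
import sqlite3


def prompt_version(prompt: str) -> str:
    """Version a prompt by the sha256 of its text."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def check_drift(con: sqlite3.Connection, embedding_model: str, prompt: str) -> list:
    """Record current values on first use; afterwards return a warning per mismatch."""
    con.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    current = {
        "embedding_model": embedding_model,
        "prompt_version": prompt_version(prompt),
    }
    stored = dict(con.execute("SELECT key, value FROM meta"))
    warnings = []
    for key, value in current.items():
        if key not in stored:
            con.execute("INSERT INTO meta (key, value) VALUES (?, ?)", (key, value))
        elif stored[key] != value:
            warnings.append(f"{key} drift: stored {stored[key]!r}, current {value!r}")
    con.commit()
    return warnings
```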
B4: _write_settings writes via a temp file + os.replace with a .json.bak
backup, so a crash mid-write never corrupts settings.json. install_hooks
now routes through it. H7: uninstall_hooks removes only codeatrium hooks
(Stop index, SessionStart server/distill/prime, legacy SessionEnd distill),
preserves user hooks, prunes emptied entries/sections, and is idempotent;
exposed as loci hook uninstall.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
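
The temp-file-plus-`os.replace` pattern from B4 is sketched below. `write_settings` is a hypothetical stand-in for `_write_settings`; the temp-file naming and the `fsync` call are assumptions rather than details confirmed by the commit.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path


def write_settings(path: Path, settings: dict) -> None:
    """Write settings atomically, keeping the previous file as <name>.bak."""
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists():
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(settings, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_name, path)  # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Readers see either the old file or the new one in full, because the rename is the only step that touches `settings.json` itself.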
…n (H2, Q8)

H2: distill now holds an fcntl.flock(LOCK_EX|LOCK_NB) on distill.lock;
a second invocation hits BlockingIOError and exits 0. The OS releases the
lock on process death, so the stale-PID detection and re-acquire dance is
gone. Q8: embedding bytes come from ndarray.astype(float32).tobytes()
instead of struct.pack, dropping the struct import.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
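
A non-blocking `flock` guard in the spirit of H2 could look like this. The helper name `try_acquire_lock` and the file mode are assumptions, and `fcntl` is only available on Unix-like systems.

```python
import fcntl
import os


def try_acquire_lock(lock_path: str):
    """Return an open fd holding an exclusive lock, or None if another holder exists."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    # Keep the fd open for as long as the lock is needed; the OS drops the
    # lock when the fd is closed or the process exits.
    return fd
```

A caller that receives `None` can simply exit 0, which matches the behaviour described for a second `distill` invocation.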
Drop user_content/agent_content from the default `loci context --json`
output and add verbatim_ref ("{source_path}:ply={ply_start}"), making it
symmetric with `loci search`. Full conversation text is now opt-in via the
new --full flag or fetchable through `loci show <verbatim_ref>`.

- context SQL now JOINs conversations to fetch source_path + ply_start
- default JSON emits 9 fields: symbol_name, symbol_kind, file_path,
  signature, line, exchange_id, exchange_core, specific_context, verbatim_ref
- --full restores user_content/agent_content
- human output shows verbatim_ref instead of full text
- update agent-facing docs (prime_cmd.py, CLAUDE.md) to point at
  `loci show <verbatim_ref>` for full text
- add tests/test_search_cmd.py covering the output contract

openspec: repair-distill-and-revive-context (item C5)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
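
The C5 output contract can be sketched as a small projection over a joined row. The function names and the dict-shaped `row` are assumptions for illustration; the field names and the `verbatim_ref` format come from the commit message above.

```python
# The eight stored fields; verbatim_ref is derived and makes nine in total.
DEFAULT_FIELDS = (
    "symbol_name", "symbol_kind", "file_path", "signature", "line",
    "exchange_id", "exchange_core", "specific_context",
)
FULL_ONLY_FIELDS = ("user_content", "agent_content")


def verbatim_ref(source_path: str, ply_start: int) -> str:
    return f"{source_path}:ply={ply_start}"


def context_record(row: dict, full: bool = False) -> dict:
    """Project a joined row onto the default output, or the --full output."""
    record = {name: row.get(name) for name in DEFAULT_FIELDS}
    record["verbatim_ref"] = verbatim_ref(row["source_path"], row["ply_start"])
    if full:
        for name in FULL_ONLY_FIELDS:
            record[name] = row.get(name)
    return record
```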
… (C1-C4,C6)

Implements the C-core lane of repair-distill-and-revive-context.

C1 — indexer captures tool_use file paths. parse_exchanges now collects
file_path/notebook_path from assistant Edit/Write/Read/MultiEdit/NotebookEdit
tool_use blocks into the new exchange_files(exchange_id, file_path) table
(migration v5 + new-DB executescript). External paths (site-packages,
node_modules, venv) are excluded at capture time. distiller's files_touched
uses exchange_files as the primary source and the regex extraction as fallback.

C2 — symbols are now many-to-many. symbols.id becomes
sha256(symbol_name:file_path:palace_object_id) with dedup_hash kept as
sha256(symbol_name:file_path), so one symbol can link to every conversation
that discusses it. Migration v6 rebuilds existing symbol ids.

C3 — repair migration v7: rebuilds palace_objects to drop a residual legacy
bm25_text column, resets distill_status to pending for exchanges marked
distilled but lacking a palace_objects row (so they re-distill), and deletes
orphan rooms/vec_palace/symbols rows.

C4 — save_palace_object no longer relies on INSERT OR IGNORE. It uses explicit
existence checks for dedup and verifies the palace_objects row exists after
insert, raising on failure so distill_status stays pending instead of silently
losing distilled (and billed) results.

C6 — only symbols whose name appears in the exchange body (user_content +
agent_content) are linked, cutting reverse-lookup noise.

Tests added across test_indexer/test_distiller/test_db; full suite green
(typecheck 0 errors, ruff clean). The unrelated pre-existing
test_status_hook::test_prime_outputs_instructions failure belongs to the C5
lane and is untouched here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
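
The C4 change (explicit existence check, then verify the row after insert) is illustrated below. The two-table `SCHEMA` and the function signature are hypothetical simplifications; the real `save_palace_object` writes more tables than this sketch does.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
SCHEMA = """
CREATE TABLE exchanges (
    id TEXT PRIMARY KEY,
    distill_status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE palace_objects (
    id TEXT PRIMARY KEY,
    exchange_id TEXT NOT NULL,
    summary TEXT NOT NULL
);
"""


def save_palace_object(con, object_id, exchange_id, summary) -> bool:
    """Return True if a new row was written, False if it already existed."""
    with con:  # commits on success, rolls back if an exception escapes
        found = con.execute(
            "SELECT 1 FROM palace_objects WHERE id = ?", (object_id,)
        ).fetchone()
        if found:
            return False  # explicit dedup instead of INSERT OR IGNORE
        con.execute(
            "INSERT INTO palace_objects (id, exchange_id, summary) VALUES (?, ?, ?)",
            (object_id, exchange_id, summary),
        )
        written = con.execute(
            "SELECT 1 FROM palace_objects WHERE id = ?", (object_id,)
        ).fetchone()
        if written is None:
            raise RuntimeError(f"palace_objects row {object_id} was not persisted")
        con.execute(
            "UPDATE exchanges SET distill_status = 'distilled' WHERE id = ?",
            (exchange_id,),
        )
    return True
```

With plain `INSERT`, a constraint violation raises instead of being swallowed, so the status update in the same transaction is rolled back and the exchange stays `pending`.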
…laude (Q1,Q6,Q7)

Q1: extract the duplicated _sha256 helper into codeatrium.utils.sha256;
distiller and indexer import it instead of each defining their own.
Q6: name the server-start polling magic number
(_SERVER_STARTUP_POLL_ATTEMPTS) in server_cmd.py. Q7: add tests/test_llm.py
covering call_claude command flags, JSON parsing, and side-effect .jsonl
cleanup on both success and TimeoutExpired.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The test relied on an ambient .codeatrium/ in the cwd, so it passed
locally (repo has one) but failed in CI's clean checkout where loci prime
exits silently. Set up a tmp .codeatrium/ and chdir into it so the test
no longer depends on the working directory.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@claude (Bot) commented Jun 11, 2026

Code review

Two bugs found and validated.


Bug 1: Ply-skip boundary mismatch after malformed JSON lines

File: src/codeatrium/indexer.py lines 196-204

```python
        continue
    if ply <= last_ply_end:
        raw_entries.append(None)
        ply += 1
    else:
        try:
            raw_entries.append(json.loads(line))
            ply += 1
        except json.JSONDecodeError:
            continue
```

The skip loop increments ply for every non-empty line unconditionally, including malformed JSON. However, last_ply_end was written by the old parser which only advanced its counter on successfully-parsed entries — malformed lines hit except json.JSONDecodeError: continue and were never appended or counted. The two values live in different coordinate systems.

Impact: For any existing DB whose already-indexed region contains malformed JSON lines, the skip boundary fires one position too early per bad line, pushing a previously-indexed entry back into the parse region. The newly-stored ply_end then diverges further on each incremental run, causing progressive boundary drift.

Suggested fix: Only advance ply in the skip region when the line parses successfully, so both regions use the same coordinate system as the stored last_ply_end:

```python
if ply <= last_ply_end:
    try:
        json.loads(line)  # validate only; a malformed line must not occupy a position
    except json.JSONDecodeError:
        continue
    raw_entries.append(None)
    ply += 1
```
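
The coordinate mismatch can be reproduced with a small stand-alone model of the skip loop. This is a simplified sketch, not the indexer itself: `parse_region` is a hypothetical helper, and it assumes `ply` starts at 0 and that `last_ply_end` is the position of the last successfully parsed entry.

```python
import json


def parse_region(lines, last_ply_end, count_malformed):
    """Return the entries that land after the skip boundary.

    count_malformed=True models the reviewed code, where a malformed line in
    the skip region still advances ply; False models the suggested fix.
    """
    ply = 0
    fresh = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            if count_malformed and ply <= last_ply_end:
                ply += 1  # the bad line wrongly occupies a position
            continue
        if ply > last_ply_end:
            fresh.append(entry)
        ply += 1
    return fresh


# Three valid entries (positions 0..2) were indexed earlier; one new line follows.
lines = ['{"a": 1}', "not json", '{"a": 2}', '{"a": 3}', '{"a": 4}']
print(parse_region(lines, last_ply_end=2, count_malformed=True))
print(parse_region(lines, last_ply_end=2, count_malformed=False))
```

With the malformed line counted, the already-indexed `{"a": 3}` entry falls back into the parse region; with the fix, only the new `{"a": 4}` entry does.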


Bug 2: WAL sidecar files not chmod 0o600 — sensitive data potentially world-readable

File: src/codeatrium/db.py lines 197-207

```python
    con.enable_load_extension(False)
    con.execute("PRAGMA journal_mode=WAL")
    con.execute("PRAGMA busy_timeout=10000")
    con.row_factory = sqlite3.Row
    return con


def init_db(db_path: Path) -> None:
    """Initialize the DB and create the schema (idempotent)."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    con = get_connection(db_path)
    os.chmod(db_path, 0o600)
```

`init_db` calls `os.chmod(db_path, 0o600)` only on the main `memory.db` file, but `get_connection` enables WAL mode (`PRAGMA journal_mode=WAL`), which causes SQLite to create `memory.db-wal` and `memory.db-shm` sidecar files containing the same verbatim conversation and code data. These sidecar files are created with permissions derived from the process umask (commonly `0o644` under a `0o022` umask), leaving them readable by other local users on a shared system.

Suggested fix:

```python
os.chmod(db_path, 0o600)
for suffix in ("-wal", "-shm"):
    sidecar = db_path.parent / (db_path.name + suffix)
    if sidecar.exists():
        os.chmod(sidecar, 0o600)
```

senna-lang and others added 2 commits June 11, 2026 12:18
Bug 1 (indexer): the incremental skip loop counted every non-empty line
including malformed JSON, but last_ply_end is in successfully-parsed-line
coordinates. A malformed line in the already-indexed region shifted the
skip boundary one position early, drifting ply_end on each run. Now the
skip region validate-parses lines so malformed ones don't occupy a
position, matching the stored coordinate system. Adds a regression test.

Bug 2 (db): init_db chmod'd only memory.db, but WAL mode creates
memory.db-wal / -shm sidecars holding the same verbatim data with umask
perms (often 0o644). Now chmod the sidecars to 0o600 too.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…te examples

Implement openspec change improve-prime-injection (P1-P3 + dual-source
consolidation):

- P1: rewrite triggers from user-question-driven to agent-action-driven
  (before edit/refactor, before new implementation, on known errors)
- P2: promote loci context (reverse lookup) to an independent section,
  spelling out the symbol-to-memory recall design intent
- P3: embed concrete command examples for loci search and loci context
- Collapse CLAUDE_MD_SECTION to a minimal redirect; PRIME_TEXT becomes
  the single source of behavioral guidance
- Add tests/test_prime_cmd.py covering the PRIME_TEXT contract and
  inject_claude_md idempotency
- Re-inject the CLAUDE.md marker block with the new section

P4 (when-not-to-search guidance) deliberately omitted per review.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

# Conflicts:
#	CLAUDE.md
#	src/codeatrium/cli/prime_cmd.py
@senna-lang senna-lang merged commit de40818 into main Jun 11, 2026
3 checks passed