Beta → 1.0 hardening + data-loss repair (24 hardening items + C1-C6) by senna-lang · Pull Request #10 · senna-lang/Codeatrium

senna-lang · 2026-06-11T03:03:15Z

Implements three openspec changes: reliability/safety/performance hardening toward the 1.0.0 release, repair of an ongoing distillation data-loss incident, and a rewrite of the loci prime injected prompt to drive agent-initiated memory usage.

release-1.0-hardening — 24/24

B1 WAL + busy_timeout / B2 user_version-based migration mechanism / B3 transactional distillation / B4 atomic settings.json writes + .bak backup / B5 meta table (model/prompt versions) + drift warnings
H1 distill_status (pending/skipped/distilled) / H2 flock-based distill lock / H3 server double-start prevention / H4 search connection leak fix / H5 full socket reads / H6 .jsonl cleanup on claude timeout / H7 loci hook uninstall / H8 memory.db 0o600
P1 FK indexes / P2 sequential indexer reads / P3 tree-sitter cache
Q1 sha256 consolidation / Q2 datetime.now(UTC) / Q3 error reporting in distill_all / Q4 assert→raise / Q5 finer-grained config exceptions / Q6 constant naming / Q7 llm.py tests / Q8 tobytes()

repair-distill-and-revive-context — 6/6

C1 collect file paths from tool_use (exchange_files) / C2 many-to-many symbols / C3 repair migration (reset lost distillations to pending + orphan cleanup + drop bm25_text) / C4 drop INSERT OR IGNORE, verify persistence / C5 lighter loci context output + --full / C6 link only symbols appearing in conversation bodies

improve-prime-injection — P1-P3

Rewrites the prompt that loci prime injects at session start, so agents actually reach for loci search / loci context on their own (real-world logs showed near-zero spontaneous usage):

P1 Triggers rewritten from user-question-driven ("when asked where X is") to agent-action-driven: before editing/refactoring a function, before starting a new implementation, on encountering a known error
P2 loci context (reverse lookup) promoted to an independent section alongside loci search, with the design intent spelled out: touching a symbol = recalling memory about that symbol
P3 Concrete, non-placeholder command examples embedded for both commands (loci search "BM25 RRF fusion ranking", loci context --symbol "SymbolResolver.extract")
CLAUDE_MD_SECTION collapsed to a minimal redirect — PRIME_TEXT is now the single source of behavioral guidance, removing the duplicated-and-drifting wording between the two
New tests/test_prime_cmd.py: 8 tests covering the PRIME_TEXT contract and inject_claude_md idempotency
P4 (when-not-to-search guidance) deliberately deferred

Verification

All CI gates green: pytest 226 passed / pyright 0 errors / ruff clean
C3 migration applied to real DBs (logosyncs / arxiv-newspaper): lost distillations restored to pending, orphaned rooms/symbols removed, legacy bm25_text column dropped

Notes

The 1.0.0 tag is not cut yet
Re-distilling exchanges reset to pending (claude API cost) is left to each project's hook

🤖 Generated with Claude Code

…ework B1: get_connection sets WAL journal mode and 10s busy_timeout for concurrent hook access. B2: PRAGMA user_version-based sequential migration mechanism (_MIGRATIONS, _run_migrations) replaces the ad-hoc ALTER TABLE; new DBs stamp the latest version directly, existing DBs apply pending migrations transactionally. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…fig errors H4: search connections close on exception paths. H6: claude --print TimeoutExpired still cleans up side-effect .jsonl files. P2: indexer reads incrementally and skips parsing rows at or before last_ply_end. Q5: config catches specific exception types and hoists the sys import. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

H3: loci server start verifies PID liveness + ping before launching, and cleans up stale socket/PID files. H5: client and server read until newline so responses larger than the recv buffer are not truncated. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
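The H5 idea of reading until a newline can be sketched as follows. This assumes a newline-delimited request/response framing; `recv_line` is a hypothetical helper name, not the project's function.

```python
import socket


def recv_line(sock, bufsize=4096):
    """Read from sock until a newline arrives or the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:          # peer closed before sending a newline
            break
        chunks.append(chunk)
        if b"\n" in chunk:     # end of message reached
            break
    line, _, _ = b"".join(chunks).partition(b"\n")
    return line
```

A single `sock.recv(bufsize)` returns at most `bufsize` bytes, so a response larger than the buffer needs this kind of loop to arrive whole.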

… (Lane A) B3: save_palace_object writes atomically (rollback leaves status pending). H1: distill_status column (migration v2) with pending/skipped/distilled; status command distinguishes all three. B5: meta table (migration v3) records embedding_model and prompt_version (prompt sha256); check_drift warns in index/distill/search/status. P1: indexes on rooms/symbols/palace_objects FKs (migration v4). H8: memory.db chmod 0o600. P3: tree-sitter resolver/cache reuse across a distill batch. Q2: datetime.now(UTC). Q3: distill_all returns (count, errors). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
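The B5 drift check could look roughly like this. The key/value layout of the `meta` table, the `record_meta` helper and the warning text are assumptions; only the recorded fields (`embedding_model`, `prompt_version` as a prompt sha256) and the `check_drift` name come from the commit message.

```python
import hashlib
import sqlite3


def prompt_version(prompt):
    """Version a prompt by the sha256 of its text."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def record_meta(con, embedding_model, prompt):
    # Hypothetical key/value schema for the meta table.
    con.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)")
    con.executemany(
        "INSERT OR REPLACE INTO meta (key, value) VALUES (?, ?)",
        [
            ("embedding_model", embedding_model),
            ("prompt_version", prompt_version(prompt)),
        ],
    )
    con.commit()


def check_drift(con, embedding_model, prompt):
    """Return one warning per recorded value that differs from the current one."""
    stored = dict(con.execute("SELECT key, value FROM meta").fetchall())
    current = {
        "embedding_model": embedding_model,
        "prompt_version": prompt_version(prompt),
    }
    return [
        f"{key} drifted: stored {stored[key]!r}, current {value!r}"
        for key, value in current.items()
        if key in stored and stored[key] != value
    ]
```

Returning warnings instead of raising matches the "warns" wording above: callers such as index or search can print them and carry on.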

B4: _write_settings writes via a temp file + os.replace with a .json.bak backup, so a crash mid-write never corrupts settings.json. install_hooks now routes through it. H7: uninstall_hooks removes only codeatrium hooks (Stop index, SessionStart server/distill/prime, legacy SessionEnd distill), preserves user hooks, prunes emptied entries/sections, and is idempotent; exposed as loci hook uninstall. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
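A sketch of the temp file + `os.replace` pattern described for B4. `write_settings_atomically` is a hypothetical stand-in for `_write_settings`; the backup naming (`settings.json.bak`) follows the commit message, everything else is an assumption.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path


def write_settings_atomically(path, settings):
    """Write settings as JSON so readers see either the old or the new file."""
    path = Path(path)
    if path.exists():
        # Keep a copy of the previous contents before touching anything.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(settings, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())   # make sure the bytes are on disk before the swap
        os.replace(tmp_name, path)   # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)      # never leave a half-written temp file behind
        raise
```

If the process dies before `os.replace`, the original `settings.json` is untouched; if it dies after, the new file is complete.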

…n (H2, Q8) H2: distill now holds an fcntl.flock(LOCK_EX|LOCK_NB) on distill.lock; a second invocation hits BlockingIOError and exits 0. The OS releases the lock on process death, so the stale-PID detection and re-acquire dance is gone. Q8: embedding bytes come from ndarray.astype(float32).tobytes() instead of struct.pack, dropping the struct import. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
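The H2 locking behavior can be sketched like this (Unix only, since it relies on `fcntl`). `try_acquire_lock` is a hypothetical helper; the real distill command is described as exiting 0 when it hits `BlockingIOError`, which the caller would do on a `None` return here.

```python
import fcntl
import os


def try_acquire_lock(lock_path):
    """Return an fd holding an exclusive lock, or None if another holder exists."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # keep this fd open for as long as the lock should be held
```

The lock belongs to the open file, so it disappears when the holding process exits or is killed. No PID file has to be inspected to decide whether a lock is stale.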

Drop user_content/agent_content from the default `loci context --json` output and add verbatim_ref ("{source_path}:ply={ply_start}"), making it symmetric with `loci search`. Full conversation text is now opt-in via the new --full flag or fetchable through `loci show <verbatim_ref>`.

- context SQL now JOINs conversations to fetch source_path + ply_start
- default JSON emits 9 fields: symbol_name, symbol_kind, file_path, signature, line, exchange_id, exchange_core, specific_context, verbatim_ref
- --full restores user_content/agent_content
- human output shows verbatim_ref instead of full text
- update agent-facing docs (prime_cmd.py, CLAUDE.md) to point at `loci show <verbatim_ref>` for full text
- add tests/test_search_cmd.py covering the output contract

openspec: repair-distill-and-revive-context (item C5)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
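The C5 output contract can be illustrated with a small sketch. The field names and the verbatim_ref format are taken from the commit message; the `to_context_record` helper and the shape of the input row are assumptions.

```python
# The nine fields emitted by default, per the commit message.
DEFAULT_FIELDS = (
    "symbol_name", "symbol_kind", "file_path", "signature", "line",
    "exchange_id", "exchange_core", "specific_context", "verbatim_ref",
)
# Extra fields restored by --full.
FULL_FIELDS = ("user_content", "agent_content")


def to_context_record(row, full=False):
    """Build one JSON-ready record from a joined row (hypothetical dict input)."""
    record = dict(row)
    record["verbatim_ref"] = f"{row['source_path']}:ply={row['ply_start']}"
    keys = DEFAULT_FIELDS + (FULL_FIELDS if full else ())
    return {key: record.get(key) for key in keys}
```

With this shape, the default record stays small and the full text remains reachable through the reference string.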

… (C1-C4,C6) Implements the C-core lane of repair-distill-and-revive-context.

C1: indexer captures tool_use file paths. parse_exchanges now collects file_path/notebook_path from assistant Edit/Write/Read/MultiEdit/NotebookEdit tool_use blocks into the new exchange_files(exchange_id, file_path) table (migration v5 + new-DB executescript). External paths (site-packages, node_modules, venv) are excluded at capture time. distiller's files_touched uses exchange_files as the primary source and the regex extraction as fallback.

C2: symbols are now many-to-many. symbols.id becomes sha256(symbol_name:file_path:palace_object_id) with dedup_hash kept as sha256(symbol_name:file_path), so one symbol can link to every conversation that discusses it. Migration v6 rebuilds existing symbol ids.

C3: repair migration v7 rebuilds palace_objects to drop a residual legacy bm25_text column, resets distill_status to pending for exchanges marked distilled but lacking a palace_objects row (so they re-distill), and deletes orphan rooms/vec_palace/symbols rows.

C4: save_palace_object no longer relies on INSERT OR IGNORE. It uses explicit existence checks for dedup and verifies the palace_objects row exists after insert, raising on failure so distill_status stays pending instead of silently losing distilled (and billed) results.

C6: only symbols whose name appears in the exchange body (user_content + agent_content) are linked, cutting reverse-lookup noise.

Tests added across test_indexer/test_distiller/test_db; full suite green (typecheck 0 errors, ruff clean). The unrelated pre-existing test_status_hook::test_prime_outputs_instructions failure belongs to the C5 lane and is untouched here.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
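A sketch of the C4 pattern: explicit existence check, insert, then verify before marking the exchange distilled. The schema, the `save_palace_object_sketch` name and its signature are assumptions made for illustration and do not reproduce the project's `save_palace_object`.

```python
import sqlite3

# Hypothetical, simplified schema.
SCHEMA = """
CREATE TABLE exchanges (id TEXT PRIMARY KEY, distill_status TEXT NOT NULL DEFAULT 'pending');
CREATE TABLE palace_objects (id TEXT PRIMARY KEY, exchange_id TEXT NOT NULL, payload TEXT NOT NULL);
"""


class PersistenceError(RuntimeError):
    """Raised when a row that should have been written is not there."""


def save_palace_object_sketch(con, object_id, exchange_id, payload):
    try:
        exists = con.execute(
            "SELECT 1 FROM palace_objects WHERE id = ?", (object_id,)
        ).fetchone()
        if exists is None:
            # Plain INSERT: a constraint failure raises instead of being swallowed.
            con.execute(
                "INSERT INTO palace_objects (id, exchange_id, payload) VALUES (?, ?, ?)",
                (object_id, exchange_id, payload),
            )
        # Verify the row is really there before flipping the status.
        if con.execute(
            "SELECT 1 FROM palace_objects WHERE id = ?", (object_id,)
        ).fetchone() is None:
            raise PersistenceError(f"palace object {object_id} was not persisted")
        con.execute(
            "UPDATE exchanges SET distill_status = 'distilled' WHERE id = ?",
            (exchange_id,),
        )
        con.commit()
    except Exception:
        con.rollback()  # status stays 'pending', so the exchange is retried later
        raise
```

The point of the design is that the status update and the object write succeed or fail together, so an exchange is never marked distilled without a corresponding row.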

…laude (Q1,Q6,Q7) Q1: extract the duplicated _sha256 helper into codeatrium.utils.sha256; distiller and indexer import it instead of each defining their own. Q6: name the server-start polling magic number (_SERVER_STARTUP_POLL_ATTEMPTS) in server_cmd.py. Q7: add tests/test_llm.py covering call_claude command flags, JSON parsing, and side-effect .jsonl cleanup on both success and TimeoutExpired. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The test relied on an ambient .codeatrium/ in the cwd, so it passed locally (repo has one) but failed in CI's clean checkout where loci prime exits silently. Set up a tmp .codeatrium/ and chdir into it so the test no longer depends on the working directory. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

claude · 2026-06-11T03:11:23Z

Code review

Two bugs found and validated.

Bug 1: Ply-skip boundary mismatch after malformed JSON lines

File: src/codeatrium/indexer.py lines 196-204

Codeatrium/src/codeatrium/indexer.py

Lines 195 to 205 in a4b7425

    
            continue
        if ply <= last_ply_end:
            raw_entries.append(None)
            ply += 1
        else:
            try:
                raw_entries.append(json.loads(line))
                ply += 1
            except json.JSONDecodeError:
                continue

The skip loop increments ply for every non-empty line unconditionally, including malformed JSON. However, last_ply_end was written by the old parser which only advanced its counter on successfully-parsed entries — malformed lines hit except json.JSONDecodeError: continue and were never appended or counted. The two values live in different coordinate systems.

Impact: For any existing DB whose already-indexed region contains malformed JSON lines, the skip boundary fires one position too early per bad line, pushing a previously-indexed entry back into the parse region. The newly-stored ply_end then diverges further on each incremental run, causing progressive boundary drift.

Suggested fix: Only advance ply in the skip region when the line parses successfully, so both regions use the same coordinate system as the stored last_ply_end:

```python
if ply <= last_ply_end:
    try:
        raw_entries.append(json.loads(line))
    except json.JSONDecodeError:
        # The old parser never counted malformed lines, so skip them here too.
        continue
    ply += 1
```
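To make the two coordinate systems concrete, here is a small sketch that counts the same lines both ways. The helper names and the sample lines are hypothetical; the real indexer tracks more state than a bare counter.

```python
import json


def count_parsed(lines):
    """Old-parser coordinates: only non-empty lines that parse as JSON count."""
    ply = 0
    for line in lines:
        if not line.strip():
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed lines do not occupy a position
        ply += 1
    return ply


def count_non_empty(lines):
    """Buggy skip-loop coordinates: every non-empty line counts."""
    return sum(1 for line in lines if line.strip())


sample = ['{"a": 1}', "{broken", '{"b": 2}', "", '{"c": 3}']
gap = count_non_empty(sample) - count_parsed(sample)  # one per malformed line
```

For `sample`, the parsed count is 3 and the non-empty count is 4, so a boundary stored in the first system and compared in the second is off by one for each malformed line.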

Bug 2: WAL sidecar files not chmod 0o600 — sensitive data potentially world-readable

File: src/codeatrium/db.py lines 197-207

Codeatrium/src/codeatrium/db.py

Lines 196 to 208 in a4b7425

    
        con.enable_load_extension(False)
        con.execute("PRAGMA journal_mode=WAL")
        con.execute("PRAGMA busy_timeout=10000")
        con.row_factory = sqlite3.Row
        return con

    def init_db(db_path: Path) -> None:
        """Initialize the DB and create the schema (idempotent)"""
        db_path.parent.mkdir(parents=True, exist_ok=True)
        con = get_connection(db_path)
        os.chmod(db_path, 0o600)

`init_db` calls `os.chmod(db_path, 0o600)` only on the main `memory.db` file, but `get_connection` enables WAL mode (`PRAGMA journal_mode=WAL`), which causes SQLite to create `memory.db-wal` and `memory.db-shm` sidecar files containing the same verbatim conversation and code data. These sidecar files get permissions derived from the process umask (commonly resulting in `0o644`), leaving them readable by other local users on a shared system.

Suggested fix:

```python
os.chmod(db_path, 0o600)
# WAL mode creates sidecar files next to the main database; restrict them as well.
for suffix in ("-wal", "-shm"):
    sidecar = db_path.parent / (db_path.name + suffix)
    if sidecar.exists():
        os.chmod(sidecar, 0o600)
```
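One way to check the effect of such a fix is to read the modes back after tightening them. This is a sketch under assumptions: `restrict_db_files` is a hypothetical helper, and it only covers sidecar files that exist at the time of the call.

```python
import os
import stat
from pathlib import Path


def restrict_db_files(db_path):
    """chmod the DB and any existing WAL sidecars to 0o600; return the resulting modes."""
    db_path = Path(db_path)
    modes = {}
    for suffix in ("", "-wal", "-shm"):
        target = db_path.with_name(db_path.name + suffix)
        if target.exists():
            os.chmod(target, 0o600)
            # S_IMODE strips the file-type bits, leaving only the permission bits.
            modes[target.name] = stat.S_IMODE(target.stat().st_mode)
    return modes
```

A test can then assert that every value in the returned mapping equals `0o600` while a WAL-mode connection is still open.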

Bug 1 (indexer): the incremental skip loop counted every non-empty line including malformed JSON, but last_ply_end is in successfully-parsed-line coordinates. A malformed line in the already-indexed region shifted the skip boundary one position early, drifting ply_end on each run. Now the skip region validate-parses lines so malformed ones don't occupy a position, matching the stored coordinate system. Adds a regression test. Bug 2 (db): init_db chmod'd only memory.db, but WAL mode creates memory.db-wal / -shm sidecars holding the same verbatim data with umask perms (often 0o644). Now chmod the sidecars to 0o600 too. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…te examples Implement openspec change improve-prime-injection (P1-P3 + dual-source consolidation):

- P1: rewrite triggers from user-question-driven to agent-action-driven (before edit/refactor, before new implementation, on known errors)
- P2: promote loci context (reverse lookup) to an independent section, spelling out the symbol-to-memory recall design intent
- P3: embed concrete command examples for loci search and loci context
- Collapse CLAUDE_MD_SECTION to a minimal redirect; PRIME_TEXT becomes the single source of behavioral guidance
- Add tests/test_prime_cmd.py covering the PRIME_TEXT contract and inject_claude_md idempotency
- Re-inject the CLAUDE.md marker block with the new section

P4 (when-not-to-search guidance) deliberately omitted per review.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

# Conflicts:
# CLAUDE.md
# src/codeatrium/cli/prime_cmd.py

senna-lang and others added 14 commits June 10, 2026 21:51

merge: embedding (H3,H5) and search/llm/index (H4,H6,P2,Q5) lanes

bf0fbfd

merge: lighten loci context output, add --full flag (C5)

207f13e

merge: repair silent data loss and revive code reverse-lookup (C1-C4,C6)

df07d8c

docs(readme): document loci hook uninstall and context --full

a4b7425

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

senna-lang and others added 2 commits June 11, 2026 12:18

senna-lang merged commit de40818 into main Jun 11, 2026
3 checks passed

senna-lang merged 16 commits into main from release/1.0-hardening
