fix(memory): ingestion reliability and merge bloat by slvnlrt · Pull Request #604 · spacedriveapp/spacebot

slvnlrt · 2026-06-24T19:47:36Z

Summary

Four related fixes to the memory ingestion pipeline and the maintenance merge. They were found during a real ingestion run that produced a runaway loop of duplicate memories. All are backend-agnostic; the only schema change is an additive migration on ingestion_files.

Fixes

1. Deterministic chunk completion — src/agent/ingestion.rs
The per-chunk ingestion agent was required to call memory_persistence_complete. If it didn't (e.g. a smaller model skipping the signal), the chunk was marked failed even though memory_save had already committed. Chunk completion is now decided by the agent run returning Ok; the persistence-contract state is kept only for observability. This removes the false failure that fed the retry loop below.

2. Bounded retries with backoff + quarantine — src/agent/ingestion.rs (+ migration)
A failed ingestion file was re-processed on every poll cycle indefinitely (no cap, no backoff), continuously re-creating memories. Adds attempts + next_attempt_at to ingestion_files: failures back off exponentially and are quarantined after a maximum attempt count, so the poll loop stops re-processing them.

3. Ingest delete removes the source file — src/api/ingest.rs
Deleting an ingest file only removed the ingestion_files row, leaving the file on disk; the poll loop then re-discovered it and re-created the row ("reappears"). Delete now removes the disk file and purges both ingestion_files and ingestion_progress.

4. Canonical merge instead of concatenation — src/memory/maintenance.rs
Maintenance merged near-duplicate memories by concatenating their content (winner\n\nloser), bloating a single memory with many reworded copies of the same fact. It now keeps the importance-winner's content as the single canonical version.

Migration

migrations/20260623000001_ingestion_retry_budget.sql — additive only (ADD COLUMN attempts, ADD COLUMN next_attempt_at). ingestion_files.status has no CHECK constraint, so the new terminal quarantined value needs no schema change.

Tests

New unit tests: retry/backoff + quarantine decision, the merge canonical-content behavior, and the delete purge (asserts the disk file is removed and both tracking tables are emptied).
Existing maintenance integration tests updated to assert the canonical (non-concatenated) survivor content.
cargo test --lib — all green.

Note

Four fixes addressing a runaway ingestion loop: deterministic chunk completion tracking, bounded retries with exponential backoff and quarantine, deletion of source files on ingest delete, and canonical-content merging instead of concatenation.

_{Written by Tembo for commit d6239885. This will update automatically on new commits.}

A chunk was marked as failed when the per-chunk ingestion agent did not emit the memory_persistence_complete signal, even though memory_save had already committed. Combined with the retry loop this re-saved the same memories indefinitely. Chunk completion is now decided by the agent run returning Ok (the contract state is kept for observability only).

A failed ingestion file was re-processed on every poll cycle forever (no cap, no backoff). Add attempts + next_attempt_at columns to ingestion_files: failed files back off exponentially and are quarantined after a max attempt count, so the poll loop stops re-processing them.

Deleting an ingest file only removed the ingestion_files row, leaving the file on disk so the poll loop re-discovered it and re-created the row. Delete now removes the disk file and purges both ingestion_files and ingestion_progress.

Maintenance merged near-duplicate memories by concatenating their content, bloating a memory with many reworded copies of the same fact. Keep the importance-winner's content as the single canonical version.

coderabbitai · 2026-06-24T19:47:53Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1312bab9-c2a4-449a-9628-a73e767ececa

📥 Commits

Reviewing files that changed from the base of the PR and between e4a2eec and fd987e7.

📒 Files selected for processing (3)

src/agent/ingestion.rs
src/api/ingest.rs
src/memory/maintenance.rs

🚧 Files skipped from review as they are similar to previous changes (3)

src/api/ingest.rs
src/memory/maintenance.rs
src/agent/ingestion.rs

Walkthrough

Adds ingestion retry-budget tracking with quarantine/backoff handling, updates ingest-file deletion to purge disk and database rows, and changes memory merge output to keep the winner text while adjusting tests for the new behavior.

Changes

Ingestion retry flow

Layer / File(s)	Summary
Retry budget contract and state `migrations/20260623000001_ingestion_retry_budget.sql`, `src/agent/ingestion.rs`	Adds retry-budget columns, retry decision helpers, and retry-state loading from `ingestion_files`.
Retry gate and failure handling `src/agent/ingestion.rs`	Skips quarantined or not-yet-due files, then increments attempts and either quarantines or reschedules the file after failure, or completes and cleans up on success.
Chunk completion and tests `src/agent/ingestion.rs`	Changes chunk success to follow the LLM result, logs saved-memory counts, and updates tests for backoff, quarantine, and chunk completion behavior.

Ingest file purge

Layer / File(s)	Summary
Purge helper, endpoint, and test `src/api/ingest.rs`	Adds disk-and-database purge logic for ingest files, wires the delete endpoint to it, and verifies the helper with a SQLite-backed test.

Memory merge content

Layer / File(s)	Summary
Canonical merged content `src/memory/maintenance.rs`	Changes `merged_memory_content` to return the winner text, updates existing merge assertions, and adds tests for non-concatenating near-duplicate merges.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Ingestion: memory_persistence_complete false negatives cause duplicate memories on retry #538 — The PR changes ingestion retry/failure handling in src/agent/ingestion.rs, which matches the issue’s retry-related scope.

Suggested reviewers

jamiepine

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title matches the main changes: ingestion reliability fixes and maintenance merge deduplication.
Description check	✅ Passed	The description accurately summarizes the ingestion and merge changes in the pull request.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

src/api/ingest.rs (1)

241-245: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Avoid abbreviated binding name e.

Use error to match the project convention (and the spelled-out error already used in delete_ingest_file at Line 286).

♻️ Proposed rename

         match tokio::fs::remove_file(&path).await {
             Ok(()) => {}
-            Err(e) if e.kind() == std::io::ErrorKind::NotFound => {}
-            Err(e) => return Err(e.into()),
+            Err(error) if error.kind() == std::io::ErrorKind::NotFound => {}
+            Err(error) => return Err(error.into()),
         }

As per coding guidelines: "Don't abbreviate variable names. Use queue not q, message not msg, channel not ch."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/api/ingest.rs` around lines 241 - 245, Rename the abbreviated error
binding in the `remove_file` match inside `delete_ingest_file` to `error` to
match the project’s naming convention and the existing usage later in the same
function. Update both the `NotFound` guard and the fallback `Err(...)` arm to
use the spelled-out `error` name consistently.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/agent/ingestion.rs`:
- Around line 355-360: The completion flow in ingestion handling deletes
progress before the source file removal can fail, which can leave the source
intact but chunk markers cleared. In the ingestion success path around
complete_ingestion_file, delete_progress, and tokio::fs::remove_file, move the
source-file deletion to happen before clearing ingestion_progress, and only call
delete_progress after remove_file succeeds so a failed delete cannot trigger
chunk reprocessing.

In `@src/memory/maintenance.rs`:
- Around line 318-320: The fallback path in the merge logic returns
loser_trimmed directly when winner_trimmed is empty, which bypasses
MAX_MERGED_MEMORY_CONTENT_BYTES enforcement. Update the trimming flow in the
relevant merge function in maintenance.rs so the loser fallback also goes
through the same max-size truncation used for normal merged content, preserving
the byte limit regardless of which side wins.

---

Nitpick comments:
In `@src/api/ingest.rs`:
- Around line 241-245: Rename the abbreviated error binding in the `remove_file`
match inside `delete_ingest_file` to `error` to match the project’s naming
convention and the existing usage later in the same function. Update both the
`NotFound` guard and the fallback `Err(...)` arm to use the spelled-out `error`
name consistently.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4b014c1b-e1df-4aa2-9ddd-cf84d4556a76

📥 Commits

Reviewing files that changed from the base of the PR and between ac52277 and d623988.

📒 Files selected for processing (4)

migrations/20260623000001_ingestion_retry_budget.sql
src/agent/ingestion.rs
src/api/ingest.rs
src/memory/maintenance.rs

tembo · 2026-06-24T19:53:29Z

-    if loser_trimmed.is_empty() {
-        return winner_trimmed.to_string();
+    if winner_trimmed.is_empty() {
+        return loser_trimmed.to_string();


Edge-case: if winner_trimmed is empty, this early-return skips the MAX_MERGED_MEMORY_CONTENT_BYTES truncation. Might be worth applying the same truncation logic here too.

Suggested change

return loser_trimmed.to_string();

if winner_trimmed.is_empty() {

if loser_trimmed.len() <= MAX_MERGED_MEMORY_CONTENT_BYTES {

return loser_trimmed.to_string();

}

let boundary = loser_trimmed.floor_char_boundary(MAX_MERGED_MEMORY_CONTENT_BYTES);

return loser_trimmed[..boundary].to_string();

}

tembo · 2026-06-24T19:53:29Z

+    .await?
+    {
+        let path = ingest_dir.join(&filename);
+        match tokio::fs::remove_file(&path).await {


Since this deletes a disk path derived from ingestion_files.filename, consider rejecting any non-normal path components (e.g. .. / absolute paths) before join as defense-in-depth.

Suggested change

match tokio::fs::remove_file(&path).await {

let filename_path = Path::new(&filename);

if filename_path

.components()

.any(|c| !matches!(c, std::path::Component::Normal(_)))

{

anyhow::bail!("invalid ingest filename: {filename}");

}

let path = ingest_dir.join(filename_path);

- ingestion: remove the source file before clearing chunk progress, so a failed remove_file cannot drop progress markers and cause already-ingested chunks to be re-processed (and re-create memories). - memory: apply the MAX_MERGED_MEMORY_CONTENT_BYTES cap on the empty-winner fallback path in merged_memory_content. - api: reject non-normal path components (`..`, absolute) before deleting an ingest file (defense-in-depth on a destructive op); rename error binding for consistency.

Cross-backend fixes raised by the upstream PR reviews, applied to the integration branch: ingest delete-before-clear-progress, merged_memory_content size cap on the empty-winner path, ingest delete path-component guard, agent_id on the SQLite backend store, deterministic (weight-descending) edge ordering in graph traversal, chunked IN-list queries to stay under SQLite's bind limit, and FTS-index ensure on set_embedding. Feature-off: 891 tests green. Feature-on: clippy clean.

slvnlrt added 4 commits June 24, 2026 21:45

fix(memory): keep canonical content on merge instead of concatenating

d623988

Maintenance merged near-duplicate memories by concatenating their content, bloating a memory with many reworded copies of the same fact. Keep the importance-winner's content as the single canonical version.

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

Comment thread src/agent/ingestion.rs Outdated

Comment thread src/memory/maintenance.rs Outdated

tembo Bot reviewed Jun 24, 2026

View reviewed changes

style: apply rustfmt

e4a2eec

slvnlrt mentioned this pull request Jun 24, 2026

refactor(memory): introduce a pluggable MemoryBackend trait slvnlrt/spacebot#1

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(memory): ingestion reliability and merge bloat#604

fix(memory): ingestion reliability and merge bloat#604
slvnlrt wants to merge 6 commits into
spacedriveapp:mainfrom
slvnlrt:pr/ingestion-fixes

slvnlrt commented Jun 24, 2026 •

edited by tembo Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

tembo Bot Jun 24, 2026

Uh oh!

tembo Bot Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

-        match tokio::fs::remove_file(&path).await {
+        let filename_path = Path::new(&filename);
+        if filename_path
+            .components()
+            .any(|c| !matches!(c, std::path::Component::Normal(_)))
+        {
+            anyhow::bail!("invalid ingest filename: {filename}");
+        }
+        let path = ingest_dir.join(filename_path);

Uh oh!

Conversation

slvnlrt commented Jun 24, 2026 • edited by tembo Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fixes

Migration

Tests

Uh oh!

coderabbitai Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

tembo Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

tembo Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

slvnlrt commented Jun 24, 2026 •

edited by tembo Bot

Loading

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading