fix(index): refresh PQ storage row ids after fragment-reuse remap#7315
Open
xuanyu-z wants to merge 1 commit into
Open
fix(index): refresh PQ storage row ids after fragment-reuse remap#7315xuanyu-z wants to merge 1 commit into
xuanyu-z wants to merge 1 commit into
Conversation
jackye1995
approved these changes
Jun 17, 2026
jackye1995
left a comment
Contributor
There was a problem hiding this comment.
looks good to me, pending rebase
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
`ProductQuantizationStorage::new` remaps its PQ codes through the fragment-reuse index but left the `row_ids` field bound to the pre-remap addresses. After a deferred compaction (`defer_index_remap`), an IVF_PQ (or IVF_HNSW_PQ) index then returned stale, compacted-away row addresses, and the scanner take failed with "The input to a take operation specified fragment id N but this fragment does not exist". This happened even for merge-only compaction and was only observable when the query fetched row content; the existing `test_read_ivf_pq_index_v3_with_defer_index_remap` projects no columns (it counts row ids without taking), so it missed the bug. Refresh `row_ids` from the remapped batch alongside `pq_code`. Also adds correctness coverage across the full deferred-compaction lifecycle (fragment-reuse window, materialized deletions, and physical remap + reuse-index trim) for IVF_FLAT/PQ/SQ/RQ and merge-only IVF_HNSW_*, plus a scalar bitmap deletion case. Unlike the existing `test_read_*_with_defer_index_remap` tests, these project and fetch row content (a take), which is what surfaces stale addresses. Stacked on lance-format#7217 (IVF_RQ fragment-reuse fix); the tests assume that fix is present.
8eb2712 to
2df3470
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a correctness bug where IVF_PQ (and IVF_HNSW_PQ) return stale row addresses after a deferred compaction (
defer_index_remap), and adds correctness coverage across the deferred-compaction lifecycle for the vector index types.ProductQuantizationStorage::newremaps its PQ codes through the fragment-reuse index but left therow_idsfield bound to the pre-remap addresses. So once an index is loaded from a version carrying a fragment-reuse index, an IVF_PQ / IVF_HNSW_PQ query returns compacted-away row addresses and the scanner take fails with:This happens even for merge-only compaction and is only observable when the query fetches row content (a take) — the existing
test_read_ivf_pq_index_v3_with_defer_index_remapprojects no columns (counts row ids without taking), so it didn't catch it. IVF_FLAT/SQ/RQ already refresh their row ids correctly. The fix refreshesrow_idsfrom the remapped batch alongsidepq_code.Test coverage added
Parameterized helpers that, unlike the existing
test_read_*_with_defer_index_remaptests, project and fetch row content so stale addresses surface:check_vector_defer_compaction— fragment-reuse window correctness, with and without materialized deletions.check_vector_remap_and_trim— merge-only, then physicalremap_column_index+cleanup_frag_reuse_index, asserting the reuse index trims to zero versions and results stay consistent.Covered (passing): IVF_FLAT, IVF_SQ, IVF_RQ, IVF_PQ across window/deletions/remap+trim; IVF_HNSW_SQ/PQ merge-only; a scalar bitmap deletion case.
Two known gaps are intentionally not turned into perpetually-ignored tests (they're noted in code comments instead): IVF_HNSW_* and the inverted/FTS index desync under materialized deletions because their positional internal structures (HNSW graph node ids; FTS
num_tokens[doc_id]) are not realigned when the fragment-reuse drop removes rows. The HNSW case is the deferred #3993; the FTS case is a separate follow-up.Testing