fix: preserve rowid across data branch fallback changes #24174

Open
gouhongshen wants to merge 12 commits into matrixorigin:main from gouhongshen:codex/fix-data-branch-gc-diff

Conversation

@gouhongshen
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #23751

What this PR does / why we need it:

  • preserve rowid in data-branch CollectChanges batches so diff logic can match replayed rows precisely after GC and fallback reads
  • reapply snapshot read policy, retain-rowid, and PK filter on every BranchChangeHandle.Next() so handle rebuilds keep the same batch contract
  • teach hashdiff to read retained-rowid tuples correctly, including hidden physical columns and fake-PK tables
  • downgrade invalid rowid validation from process-fatal to error-return logging and remove debug investigation logs
  • keep regression coverage in place for the original diff_9 GC repro and the new rowid propagation paths

Copilot AI review requested due to automatic review settings April 23, 2026 03:17
@matrix-meow matrix-meow added the size/XL Denotes a PR that changes [1000, 1999] lines label Apr 23, 2026
@mergify mergify Bot added kind/bug Something isn't working kind/test-ci labels Apr 23, 2026
@gouhongshen
Contributor Author

🔍 Multi-Agent Code Review (Seven-Member Jury)

Powered by 7 specialized review agents (architect / bug hunter / security / performance / test / smell / judge), all running on Claude Opus 4.7.

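📊 Overview

| Metric | Value |
|---|---|
| Files reviewed | 18 |
| Total findings | 28 |
| 🔴 Must Fix | 7 |
| 🟡 Should Fix | 13 |
| 🟢 Nit | 8 |
| 🧨 Destructive-test verdict | Insufficient |
| 🐶 Patch chain | Hit (5 commits / 3h, 4 of them fix/restore/guard) |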

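📝 Summary

The PR is headed in the right direction: keeping rowid consistent end to end after GC across data branches, and correcting the snapshot TS semantics of the fallback, does hit the root cause of #23751. But the implementation cuts across at least 4 existing boundaries: it uses context.Value as a core API parameter, makes frontend depend backwards on disttae internals (unwrapTxnTable reflection), adds a materializedSnapshotDataSource that heavily duplicates LocalDataSource, and uses "buffer the whole range + re-inject" to pretend the producer state machine is restartable. The timeline of 5 commits / 3 hours / 4 patches is itself a strong signal. In particular, 595df670 switched 5 TS uses at once from currentTS back to snapshotTS; had the earlier version been merged, snapshots would have read future tombstones, a silent data-correctness bug, so this was a near miss. We recommend resolving at least the 7 🔴 items before merging and adding the destructive tests listed below.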

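🧨 Destructive-Test Verdict

  • Verdict: Insufficient. The current test suite is enough to turn CI green but would not catch regressions in this PR's core changes.
  • Covered:
  • Missing (in descending order of risk):
    1. validateLeadingRowID (fatal→log downgrade) has zero unit tests: no longer panicking ≠ results not being contaminated
    2. The three tuple-width branches of getTupleColumnValue (the PR's core bug fix) are covered only indirectly in NoLCA; visible-only / +commitTS-no-rowid / idx out of range have no unit tests
    3. BranchChangeHandle.Next() idempotency: the PR summary explicitly says "reapply on every Next", but the tests call it only once
    4. All three materializedSnapshotDataSource tests are happy path, and recordingTombstoner always returns (rowsOffset, false), so the tests would not fail even if snapshotTS were changed back to currentTS (i.e. the bug fixed by 595df670 has no regression protection)
    5. No test case triggers the vec mismatch panic after retBatchPool tombVecCnt goes 1→3
    6. ctx cancel / mpool leak / pState=nil / large-input OOM: none covered
  • Ruling: of 9 destructive-test dimensions, 0 hit ❌ + 2 partially covered ⚠️ + 1 N/A; strongly recommend adding (1)(2)(3)(4) before merging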

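🐶 Patch-Chain Analysis

✅ 021b5475  03:17  preserve rowid in data branch changes        (main change)
🚩 0a5503ab  03:32  restore data branch hashmap build            (+15min, hashmap index offset not fully fixed)
🚩 ff96675c  04:01  align no-lca hashdiff stubs                  (stripLeadingRowIDFromBatch, added by the main change, deleted on the spot)
🚩 33257cc7  05:29  guard blockio commit-ts reads                (metadata contract assumption conflicts with the new schema; IO used as fallback)
🚩 595df670  06:27  use snapshot ts in materialized fallback     (all 5 sites wrongly used currentTS; data-correctness bug nearly merged)

Diagnosis: the main commit did not fully scope its impact. The 4 follow-up patches landed on average one every 47 minutes, a rhythm closer to "run CI, patch whatever breaks" than to going back and fixing the design. Among them, 595df670 is the most dangerous latent bug in the PR, and it currently has no review trail and no regression test.

Recommendations:

  • (Strongly recommended) Split the PR:
    • PR-A: rowid schema contract + tests + ctx→explicit parameters
    • PR-B: blockio commit-ts refactor
    • PR-C: snapshot_materialized + correct TS semantics + resource limits
    • PR-D: hashdiff retain-rowid parsing
  • (Fallback) At least squash into 1 clean commit and add a design note to the PR description explaining "why readTS must be snapshotTS".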

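🔴 Must Fix (required before merge)

1. materializedSnapshotDataSource materializes the whole table with no upper bound and can trigger OOM remotely (snapshot_materialized_datasource.go:62-260)

  • Category: Performance / Security / Bug
  • Found by: Lightning + Bug Sister + Iron Wall (three-reviewer consensus)
  • Description: buildCommittedInMemBatches materializes the entire table into mpool at once, with no row/byte cap, no spill, no check of mp.Cap, and no check of ctx.Done(); outBatch.Append copies everything, for a 2× peak; deletedRowsCache has no LRU. 10M × 100B ≈ 1GB resident; on a multi-tenant shared CN, one large table from tenant A can take down other tenants (DoS).
  • Root fix: turn it into a truly streaming DataSource that produces and frees as it goes; add hard totalRows / totalBytes caps; add select { case <-ctx.Done() } inside the loop; clean up the accumulated inMemBatches on failure paths; make deletedRowsCache an LRU.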

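2. tae/blockio/read.go silently changes the commit-ts filtering semantics for non-appendable blocks (read.go:835-944)

  • Category: Bug / Architecture / Smell
  • Found by: Old K + Bug Sister + Patch Dog (three-reviewer consensus)
  • Description: 33257cc7 replaces readABlkColumns with running readColumnsWithCommitTSFilter on every block. On the old path, non-appendable blocks were not filtered by commit-ts; on the new path all blocks are filtered, so any old caller whose block ends with a T_TS column silently loses rows. Also, when the probe fails, deletes returns an empty bitmap and the caller has no way to tell that snapshot filtering was skipped. This is a silent correctness backdoor, and the fix replaces an explicit contract with IO (fallback-on-fallback).
  • Root fix: long term, set BlockInfo.HasCommitTS at write time and have the scheduling layer route by block type; short term, enter commit-ts filtering only when len(idxes) > 0 and let rowid-only reads take the original readColumns; return an error on contract violation instead of an empty bitmap; cache the FastLoadObjectMeta result on info to avoid IO on the hot path.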

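3. Confusing snapshotTS vs currentTS naming; 5 misuses nearly merged a data-correctness bug (snapshot_materialized_datasource.go, entire file)

  • Category: Smell / Bug
  • Found by: Patch Dog (timeline evidence from 595df670)
  • Description: the first version used currentTS at all 5 sites to read tombstones, which counts deletions made after the snapshot, i.e. "snapshot reads future tombstones". Changing all 5 sites at once in 595df670 is a copy-paste adaptation rather than a rewrite of the semantics. No regression test currently protects this edge (see point 4 of the destructive-test verdict).
  • Root fix: have the API accept a single readTS, with the caller defining clearly whether it is snapshot or current; add a dedicated unit test verifying that "running with currentTS fails, running with snapshotTS passes".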

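4. updateCNDataBatch / updateCNTombstoneBatch NPE on zero rows / pk=nil (logtailreplay/change_handle.go:2712, 2741)

  • Category: Bug
  • Found by: Bug Sister
  • Description:
    • updateCNTombstoneBatch: NewConstFixed(..., pk.Length(), mp) dereferences pk and hits an NPE when pk == nil. updateTombstoneBatch has a pkVec == nil guard, but this function does not, so the fatal downgrade turns into an NPE in disguise.
    • updateCNDataBatch: when RowCount == 0, prependRowIDVectorIfNeeded silently returns nil without writing a rowid, and the next statement is vector.NewConstFixed(..., bat.Vecs[0].Length()); if the original batch had only a TS column that was stripped, len(Vecs) == 0 and this panics.
  • Root fix: validate nil/empty input at entry and return InternalError on contract violation instead of an NPE; when RowCount == 0 && retainRowID, construct an empty rowid column to preserve the layout.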

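5. validateLeadingRowID is on the hot path and has zero unit-test coverage (data_branch_hashdiff.go:1468 / 1526-1581 / 2058-2063)

  • Category: Performance / Test
  • Found by: Lightning + Test Sister (two-reviewer consensus)
  • Description: every batch gets a full scan of N rows (~30µs+ per batch); batchSampleRowsForLog pre-generates dozens of strings and clones attrs even when nothing fails. For 10 million rows that is tens of ms of pure wasted work per side, plus GC pressure. Meanwhile the sensitive fatal→log downgrade has 0 tests: wrong rowid vec type / all null / EmptyRowid / length mismatch / nil are all uncovered.
  • Root fix: build failure information lazily (zap.Object + MarshalLogObject, or zap.Stringer); switch to sampling or a build tag; guarantee the invariant on the write side; add negative unit tests verifying that "results are not contaminated after the downgrade".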
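
The lazy-construction idea can be shown with the standard library alone. The sketch below is an illustration, not the PR's validateLeadingRowID: `validateRowIDs` and `buildSample` are hypothetical, rowids are modelled as plain integers, and a zero value stands in for an "empty rowid". The counter `sampleCalls` exists only to make the laziness observable.

```go
package main

import "fmt"

// sampleCalls counts how often the expensive diagnostic builder runs.
var sampleCalls int

// buildSample stands in for an expensive diagnostic, such as formatting
// sample rows for a log message.
func buildSample(rowIDs []uint64) string {
	sampleCalls++
	n := len(rowIDs)
	if n > 3 {
		n = 3
	}
	return fmt.Sprintf("first rows: %v", rowIDs[:n])
}

// validateRowIDs checks the length and rejects zero ("empty") rowids.
func validateRowIDs(rowIDs []uint64, wantLen int) error {
	// describe is only invoked on a failure path, so the success path
	// never formats sample rows.
	describe := func() string { return buildSample(rowIDs) }

	if len(rowIDs) != wantLen {
		return fmt.Errorf("rowid length mismatch: got %d want %d (%s)", len(rowIDs), wantLen, describe())
	}
	for i, id := range rowIDs {
		if id == 0 {
			return fmt.Errorf("empty rowid at row %d (%s)", i, describe())
		}
	}
	return nil
}

func main() {
	fmt.Println(validateRowIDs([]uint64{7, 8, 9}, 3), sampleCalls)
	fmt.Println(validateRowIDs([]uint64{7, 0, 9}, 3), sampleCalls)
}
```

With a structured logger the same effect is usually achieved by passing a value whose marshalling method runs only when the entry is actually written.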

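6. CollectChanges uses context.Value as a core API parameter (change_handle_rowid.go + policy + pk_filter)

  • Category: Architecture / Smell
  • Found by: Old K + Patch Dog + Bug Sister + Lightning (four-reviewer consensus)
  • Description: the 3 core options of one API (snapshotReadPolicy / retainRowID / pkFilter) are split into 3 With/FromContext pairs; BranchChangeHandle.Next additionally re-injects them on every call because it does not trust ctx. But the downstream NewChangesHandler reads RetainRowIDFromContext only once, at construction, so the per-Next ctx wrap is decorative dead work, pure GC garbage, and creates the illusion that "it takes effect again on every Next". retainRowID is a core contract that changes the shape of the output batch, not request-scoped implicit context.
  • Root fix: add an explicit engine.CollectChangesOptions struct parameter; delete the 3 lines in BranchChangeHandle.Next that re-stuff ctx.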

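7. frontend → disttae reverse dependency + reflection to unwrap *txnTable (data_branch_hashdiff.go, snapshot_scan.go)

  • Category: Architecture
  • Found by: Old K
  • Description: frontend directly calls the disttae-internal ScanSnapshotWithCurrentRanges, which internally uses unwrapTxnTable to unwrap *txnTable via reflection. This is a reverse dependency that can never be torn out, and future internal refactors of disttae will be blocked by this reflection layer.
  • Root fix: promote SnapshotScan to an engine.Relation method so that frontend depends only on the engine interface.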

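🟡 Should Fix (strongly recommended)

  1. nextWithSnapshotRecovery / bufferCurrentRange use whole-range buffering to mask the fact that the producer state machine cannot be reset (change_handle.go:177-258, Old K). OOM when the sub-range is large. Suggest having ChangesHandle.Next return a retry hint on failure and letting the caller re-issue the request.
  2. materializedSnapshotDataSource heavily duplicates LocalDataSource and its name is misleading: the 350 new lines differ only in readTS, and "materialized" actually materializes only in-mem rows while persisted data is still streamed. Suggest reuse + renaming to snapshotDataSource (Old K).
  3. getTupleColumnValue accepts 3 tuple widths and silently switches colIdx (data_branch_hashdiff.go:1486, Patch Dog + Test Sister). The upstream producer should emit a constant schema, and violators should panic directly; also add unit tests for visible-only / +commitTS-no-rowid / idx out of range.
  4. filterBatch sniffs the retainRowID layout via Vecs[0] is T_Rowid (change_handle.go:2032-2235, Bug Sister + Patch Dog). If a user table's first column is itself T_Rowid, the non-retainRowID path gets a wrong +1 and the PK resolves to the wrong column. Suggest passing retainRowID in as an explicit parameter.
  5. Synthesized rowid assumes row offset = batch index (change_handle.go:2567-2573, Bug Sister + Iron Wall). If a row-level filter is ever inserted before the prepend, the synthesized rowid drifts from the real physical offset → hashdiff false-positive → wrong deletes / missed reports. Suggest adding an invariant check or an explicit row-offset parameter; also mark it __synthetic_rowid and reject use by non-hashdiff modules.
  6. Error logs leak row-level business data (data_branch_hashdiff.go:210-280, Iron Wall). batchSampleRowsForLog + validateLeadingRowID write whole rows of business data (including hidden columns and plaintext PKs) to the shared service log, an information leak in multi-tenant deployments. Suggest logging only length/type by default and hashing when needed; for rowid, log only the blockid segment.
  7. prependRowIDVectorIfNeeded calls AppendFixed row by row: 8192 rows/batch × N batches × interface calls + ~13 capacity-doubling reallocations per batch (Lightning). Switch to PreExtend + MustFixedColNoTypeCheck and write the slice directly; bulk-fill the constant blockid.
  8. handleDelsOnLCA degrades from bulk to per-row + nested scan (data_branch_hashdiff.go:1217-1252, Lightning). O(N·K²) on PK conflicts. Suggest adding a batch interface PopByVectorsStreamWithPayloadFilter to the hashmap.
  9. CheckpointChangesHandle.Next rebuilds the vec/attrs slices for every batch (change_handle.go:642-700, Lightning). rowIDIdx should be computed once and cached.
  10. h.cache = h.cache[1:] without zeroing pins memory against GC (change_handle.go:303-353, Lightning). Resident memory ≈ 2×. Set h.cache[0] = nil first, then reslice.
  11. retainRowID schema weaving is scattered across 4 update*Batch copies (change_handle.go:2581-2740, Old K + Patch Dog). Extract batchSchema + finalizeBatch so it is generated in one place.
  12. PKFilter "empty but non-nil" fallback, with upstream passing a half-built value (branch_change_handle.go:54,117, Old K). Guarantee at the construction site that nil is returned when segments is empty.
  13. Test coverage to add: consistency of BranchChangeHandle.Next across multiple calls, the retBatchPool tombVecCnt mismatch panic, the reverse case retainRowID=false, whether dataRelease() is called when FastLoadObjectMeta fails, and the boundaries commit_ts all ≤ ts, deleteMask.Count==0, and metaColCnt==0 (Test Sister).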
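
Item 10 above is a general Go slice property and is easy to demonstrate in isolation. In the sketch below, `handle`, `batch`, and `popFront` are hypothetical stand-ins; only the pattern of clearing the slot before reslicing is the point.

```go
package main

import "fmt"

// batch is a stand-in for a cached batch that holds a large payload.
type batch struct{ payload []byte }

// handle is a stand-in for a change handle with a FIFO cache.
type handle struct{ cache []*batch }

// popFront removes and returns the first cached batch. Clearing the slot
// before reslicing drops the backing array's reference to the batch, so
// the garbage collector can reclaim it once the caller is done with it.
func (h *handle) popFront() *batch {
	if len(h.cache) == 0 {
		return nil
	}
	b := h.cache[0]
	h.cache[0] = nil
	h.cache = h.cache[1:]
	return b
}

func main() {
	backing := []*batch{{payload: []byte("a")}, {payload: []byte("b")}}
	h := &handle{cache: backing}
	b := h.popFront()
	// The popped slot in the shared backing array is now nil.
	fmt.Println(string(b.payload), backing[0] == nil, len(h.cache))
}
```

Without the `h.cache[0] = nil` line, the backing array would keep pointing at every popped batch until the whole array is released.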

🟢 Nit

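  • Change snapshotScanReaderConfig.hasBlockFilter bool to *BlockReadFilter, with nil meaning none (Old K)
  • removedLiveRows is counted but never used afterwards; dead code (Patch Dog)
  • shouldSkipCommittedInMemEntry can panic on an out-of-range BFSeqNum; add a bounds check + warn log (Bug Sister)
  • Delete the 3 lines in BranchChangeHandle.Next() that re-stuff ctx (Bug Sister + Lightning)
  • stripLeadingRowIDFromBatch, deleted in ff96675c, was added by the main change and never used; self-check before committing (Patch Dog)
  • The text-diff assertion in BVT diff_9.sql case 4 is brittle; add a fake-PK + GC + base UPDATE combination (Test Sister)
  • TestUpdateDataBatch_RetainsSynthesizedRowID uses a single blockid + fixed rowOffset, and a rowid synthesized from a zero-valued blockid conflicts with validateLeadingRowID's BorrowBlockID().IsEmpty() check (Test Sister)
  • Rename materializedSnapshotDataSource to snapshotDataSource (Old K)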

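💬 Jury Verdict Summary

  1. Right direction, implementation cuts across boundaries: the 6 jurors unanimously endorse the fix direction (end-to-end rowid + snapshot fallback TS) and unanimously note that the implementation bypasses at least 4 existing abstraction boundaries.
  2. The patch chain is a strong signal: 5 commits / 3h / 4 fix-on-fix patches. In particular, 595df670 is the most dangerous latent bug in the PR: had the currentTS version been merged, snapshots would read future tombstones, and there is currently no review trail and no regression test.
  3. Insufficient testing is the main weakness: validateLeadingRowID (fatal downgrade), getTupleColumnValue (the PR's core bug fix), Next() idempotency, and materializedSnapshotDataSource snapshotTS semantics all lack negative verification; the current test suite can only turn CI green.
  4. Consensus 🔴: materializedSnapshotDataSource OOM (3 votes) / blockio silent semantic change (3 votes) / ctx.Value as an API parameter (4 votes).
  5. Recommendation: merge only after completing at least the 7 🔴 items + destructive tests (1)(2)(3)(4) + a squash commit; if resources allow, strongly recommend splitting into 4 PRs so each change point is reviewed on its own.

Thanks to the author 🙏 The fix direction is very valuable; the comments above are meant only to improve merge quality, and discussion is welcome.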

@gouhongshen
Contributor Author

Thanks for the deep review. Pushed 16621ed addressing the actionable items; below is a per-item response.

Adopted (this PR)

  • #3 stronger snapshot regression test: added tsFilteringTombstoner + TestMaterializedSnapshotDataSourcePersistedTombstoneFiltersBySnapshotTS in pkg/vm/engine/disttae/snapshot_materialized_datasource_test.go. The fake tombstoner is TS-aware: a tombstone at TS=25 is invisible to a snapshot at TS=20, so if the code regresses from snapshotTS to currentTS the row gets filtered and the test fails. This locks down exactly the snapshotTS/currentTS distinction you flagged.
  • #4 NPE guards in updateCN{Data,Tombstone}Batch: added explicit nil-batch / missing-pk / empty-vec guards in pkg/vm/engine/disttae/logtailreplay/change_handle.go plus unit tests in change_handle_test.go.
  • #5 validateLeadingRowID coverage + lazy logging: refactored validateLeadingRowID so the batchSampleRowsForLog call (and the field slice) is wrapped in a closure that only runs on the failure path, keeping the hot path allocation-free. Added TestValidateLeadingRowID covering nil/empty/missing-rowid/wrong-type/length-mismatch/null-rowid/empty-rowid/valid. Also bounded batchSampleRowsForLog per-vector length so it can't OOB on length-mismatch input (uncovered by the new test).
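
The visibility rule behind the #3 test can be sketched in a few lines. This is not the actual tsFilteringTombstoner: `tombstone` and `visibleRows` are hypothetical, timestamps are plain integers, and the later "current" timestamp of 30 is an assumed value chosen only to fall after the tombstone at 25.

```go
package main

import "fmt"

// tombstone marks a row as deleted at a given commit timestamp.
type tombstone struct {
	row      int
	commitTS int64
}

// visibleRows returns the offsets in [0, rowCount) that are not deleted
// as of readTS. A tombstone hides a row only when it committed at or
// before readTS.
func visibleRows(rowCount int, tombs []tombstone, readTS int64) []int {
	deleted := make(map[int]bool)
	for _, t := range tombs {
		if t.commitTS <= readTS {
			deleted[t.row] = true
		}
	}
	out := make([]int, 0, rowCount)
	for r := 0; r < rowCount; r++ {
		if !deleted[r] {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	tombs := []tombstone{{row: 1, commitTS: 25}}
	fmt.Println(visibleRows(2, tombs, 20)) // snapshot read: prints [0 1]
	fmt.Println(visibleRows(2, tombs, 30)) // later read: prints [0]
}
```

A test built on this rule fails as soon as the read timestamp is swapped for a later one, which is the regression the adopted test is meant to catch.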

Declined / deferred (with reasoning)

  • #2 blockio short-circuit when len(idxes)==0: the suggested early-return is semantically wrong: even when only rowid is requested, callers in the rowid-retain path still rely on this code to apply delete visibility on appendable blocks. A correct fix needs a per-call HasCommitTS opt-in plumbed at write time, which is a larger refactor and out of scope for this PR. Filing as a follow-up.
  • #6 context-encoded retainRowID policy: this matches the existing convention in pkg/vm/engine/change_handle_policy.go / change_handle_pk_filter.go (WithSnapshotReadPolicy, WithPKFilter, …), and the per-Next() reapply is intentional: see repro_stale_read_3.sql from earlier in this PR. The FileNotFound rebuild path drops the original ctx, so a struct option that's only set once silently loses the policy mid-stream. Switching to a struct option would either break that path or duplicate state in two places.
  • #7 reverse dep / reflection in unwrapTxnTable: preexisting in ScanSnapshotWithCurrentRanges, not introduced by this PR. Worth a dedicated cleanup PR.
  • #1 OOM in materialized fallback: preexisting design of materializedSnapshotDataSource (full-table materialization is its semantic), not a regression here. Spilling/streaming is a separate workstream.

Will pick up #2 in a follow-up once a clean BlockInfo.HasCommitTS-style signal is in place.

@gouhongshen
Contributor Author

🔁 Incremental Re-review — 16621ed

All 7 sub-agents (Opus 4.7) verdict on the fixup commit. Mutation-tested locally.

📊 Status of original 🔴 Must-Fix items

| # | Original 🔴 | Status | Verification |
|---|---|---|---|
| 3 | snapshotTS vs currentTS 0 regression coverage | Fixed | tsFilteringTombstoner + TestMaterializedSnapshotDataSourcePersistedTombstoneFiltersBySnapshotTS; mutation-tested: reverting prod line 141 to &ds.currentTS makes the test fail with expected diff [0,1] → [0] |
| 4 | updateCN{Data,Tombstone}Batch NPE | Fixed | nil-batch / missing-pk / empty-vec guards + retainRowID && rowid==nil guard added; TestUpdateCN{Data,Tombstone}Batch_NilGuard covers all 4 paths; tests pass locally |
| 5 | validateLeadingRowID 0 unit tests + hot-path alloc | Fixed | buildFields lazy closure (only runs on failure path → 0 alloc on green path); TestValidateLeadingRowID covers nil/empty/missing-rowid/wrong-type/length-mismatch/null/empty/valid (8 sub-tests); batchSampleRowsForLog OOB guard added |
| 1 | materialized OOM, no spill, no ctx cancel | Deferred (acceptable) | Author: "preexisting design of materializedSnapshotDataSource, not a regression". Fair scope call for a bug-fix PR; multi-tenant DoS risk should still be tracked as follow-up issue |
| 2 | tae/blockio/read.go silent semantic change for non-appendable blocks | Deferred (partial reasoning) | Author cites that early-return breaks rowid-only path on appendable blocks. ✔ But that doesn't address the original concern about non-appendable blocks silently gaining commit-ts filtering. Needs the BlockInfo.HasCommitTS opt-in plumbing as follow-up; confirm a tracking issue exists before merging |
| 6 | context.Value for retainRowID | Deferred (justified) | Author cites repro_stale_read_3.sql: FileNotFound rebuild path drops original ctx, so a one-shot struct option silently loses policy. Counter-argument: a struct option threaded through the rebuild path would also work and would be type-safe. Acceptable as scope deferral, but the per-Next() ctx re-injection is still dead work today (NewChangesHandler only reads it once at construction). At minimum delete the dead reapply, or document the rebuild-path contract. |
| 7 | Reverse dep + unwrapTxnTable reflection | Pre-existing | Confirmed: ScanSnapshotWithCurrentRanges was added in earlier commits. Acceptable scope deferral |

🧨 Destructive-test verdict — UPDATED

  • Before: Insufficient (0/9 hostile failure modes covered)
  • After: Partially Sufficient: covers (1) hostile input, (2) boundary, (3) error injection, plus the most critical regression (snapshot vs current TS visibility)
  • Still missing (track as follow-up, not blocking):
    • getTupleColumnValue 3-tuple-width direct unit tests
    • BranchChangeHandle.Next() idempotency over multiple calls
    • retBatchPool tombVecCnt mismatch panic case
    • ctx cancel mid-stream / mpool leak

🐶 Patch-chain re-evaluation

16621ed is the 6th commit in 4h but unlike the previous 4, this one is purely additive test coverage + defensive guards on already-identified call sites — not another fix-on-fix patch. Pattern broken.

✅ Verification commands run locally

# All pass on PR head 16621ed
go test ./pkg/vm/engine/disttae/ -run TestMaterializedSnapshotDataSourcePersistedTombstoneFiltersBySnapshotTS -count=1
go test ./pkg/frontend/ -run TestValidateLeadingRowID -count=1
go test ./pkg/vm/engine/disttae/logtailreplay/ -run TestUpdateCN -count=1

# Mutation test on prod line 141 (snapshotTS → currentTS) → test correctly FAILS
# expected [0,1] vs actual [0]  ← confirms genuine destructive coverage

🎯 Final recommendation

🟢 Ready to merge with the following understanding:

  1. Blocking 🔴 items (3, 4, 5) are properly fixed and have real destructive test coverage — the mutation test on item 3 is particularly valuable.
  2. Deferred items (1, 2, 6, 7) need follow-up issues:
  3. The 🟡 / 🟢 items from the original review remain as quality-of-life follow-ups.

Nice turn-around from author. The mutation-tested snapshot regression test is exactly the kind of destructive coverage we asked for.

@LeftHandCold
Contributor

I found one substantive correctness issue in pkg/vm/engine/disttae/snapshot_scan.go that looks worth fixing before merge.

In the new parallel snapshot path, splitSnapshotScanShards(relData, actualParallelism) happens before tombstones are attached to the relation data. Later we call relData.AttachTombstones(tombstones), but the already-created shard objects do not get updated. BlockListRelData.DataSlice() copies the current tombstones pointer into each shard when the split happens, so the shards can still carry nil / stale tombstones while scanSnapshotShardsParallel() reads them.

That means the parallel snapshot scan can miss tombstone filtering and return rows that were already deleted. The serial path does not have this problem because it attaches tombstones before constructing the source.

Suggested fix: attach tombstones before splitting, or explicitly re-attach tombstones to each shard after the split.
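
The ordering problem can be reproduced with a toy model. The types below (`relData`, `tombstones`, `split`) are simplified, hypothetical stand-ins that only mimic the behaviour described above, namely that slicing copies the parent's current tombstones pointer; they are not the disttae types.

```go
package main

import "fmt"

// tombstones is a stand-in for the relation's delete information.
type tombstones struct{ deleted map[int]bool }

// relData is a stand-in for relation data that can be sliced into shards.
type relData struct {
	blocks []int
	tombs  *tombstones
}

// attach sets the tombstones on this relData only; shards that already
// exist are not touched.
func (r *relData) attach(t *tombstones) { r.tombs = t }

// slice copies the current tombstones pointer into the new shard.
func (r *relData) slice(i, j int) *relData {
	return &relData{blocks: r.blocks[i:j], tombs: r.tombs}
}

// split cuts r into at most n shards of roughly equal size.
func split(r *relData, n int) []*relData {
	if n <= 0 {
		n = 1
	}
	step := (len(r.blocks) + n - 1) / n
	shards := make([]*relData, 0, n)
	for i := 0; i < len(r.blocks); i += step {
		j := i + step
		if j > len(r.blocks) {
			j = len(r.blocks)
		}
		shards = append(shards, r.slice(i, j))
	}
	return shards
}

func main() {
	t := &tombstones{deleted: map[int]bool{2: true}}

	wrong := &relData{blocks: []int{1, 2, 3, 4}}
	shardsWrong := split(wrong, 2) // split first
	wrong.attach(t)                // too late: only the parent is updated
	fmt.Println(shardsWrong[0].tombs == nil)

	right := &relData{blocks: []int{1, 2, 3, 4}}
	right.attach(t) // attach first
	shardsRight := split(right, 2)
	fmt.Println(shardsRight[0].tombs == t)
}
```

In this model, attaching before splitting is the smaller change; re-attaching to every shard after the split also works but leaves the ordering hazard in place for future callers.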

gouhongshen and others added 12 commits April 24, 2026 16:23
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Guard updateCNDataBatch / updateCNTombstoneBatch against nil/empty input that previously could NPE

- Lazy-build validateLeadingRowID log fields so success-path no longer pre-allocates sample strings

- Bound batchSampleRowsForLog by per-vector length to avoid out-of-bounds on length-mismatch failures

- Add unit tests for validateLeadingRowID covering nil, empty, wrong type, length mismatch, null and empty rowid

- Add unit tests for updateCN batch nil/empty guards

- Strengthen materializedSnapshotDataSource fallback test with a TS-aware fake tombstoner so reverting from snapshotTS to currentTS would now fail

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add unit coverage for String / SetOrderBy / GetOrderBy / SetFilterZM on
both remote-nil and remote-present branches. Lifts PR coverage above
the 75% gate without behavioural change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…false

Adds TestCDCSchema_NoRowIDWhenRetainRowIDFalse to guard the disttae
NewChangesHandler/NewChangesHandlerWithPartitionStateRange path used by
CDC and ISCP, which leaves retainRowID=false. The four subtests assert
that updateTombstoneBatch, updateDataBatch, fillInDeleteBatch, and
fillInInsertBatch never emit a T_Rowid vec or Row_ID attribute on that
path, so position-indexed CDC sinks cannot regress silently if future
changes flip the rowid plumbing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
BlockListRelData.DataSlice copies the parent's tombstones pointer at split
time. The previous order was:

  shards := splitSnapshotScanShards(relData, ...)   // shards.tombstones = nil
  ...
  relData.AttachTombstones(tombstones)              // only mutates parent
  scanSnapshotShardsParallel(shards, ...)           // each shard sees nil

so the parallel snapshot scan path silently skipped tombstone filtering and
could surface rows that were already deleted. The serial path was unaffected
because it attaches tombstones to relData before constructing the source.

Move AttachTombstones above splitSnapshotScanShards so DataSlice propagates
the pointer into every shard. Add TestSplitSnapshotScanShards_TombstoneInheritance
to pin the ordering invariant.

Reported in PR matrixorigin#24174 review comment 4302553554.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Labels

kind/bug Something isn't working kind/test-ci size/XXL Denotes a PR that changes 2000+ lines
