Skip to content

Fix AOF persistence and WATCH for collection-emptying RMW ops [main]#1678

Merged
badrishc merged 3 commits intomainfrom
badrishc/fix-aof-object-mutation-persistence-main
Apr 8, 2026
Merged

Fix AOF persistence and WATCH for collection-emptying RMW ops [main]#1678
badrishc merged 3 commits intomainfrom
badrishc/fix-aof-object-mutation-persistence-main

Conversation

@badrishc
Copy link
Copy Markdown
Collaborator

@badrishc badrishc commented Apr 7, 2026

Bug 1: Collection-emptying RMW operations (LPOP, ZREM, HDEL, SREM that empty a collection) were not logged to AOF. InPlaceUpdaterWorker and PostCopyUpdater returned false before setting NeedAofLog when HasRemoveKey was true, so PostRMWOperation never wrote the AOF entry. Data reappeared after restart.

Bug 2: InPlaceUpdaterWorker's HasRemoveKey path did not call IncrementVersion on the watch version map, so WATCH/MULTI/EXEC transactions did not detect the key modification and incorrectly committed.

Bug 3: AofProcessor.RecoverReplay disposed respServerSession in its finally block, but in multi-DB recovery it ran once per database, causing double-dispose and NullReferenceException in GarnetLatencyMetricsSession.Return(). Moved dispose to AofProcessor.Dispose() which runs exactly once.

Fixes #1675

Bug 1: Collection-emptying RMW operations (LPOP, ZREM, HDEL, SREM that
empty a collection) were not logged to AOF. InPlaceUpdaterWorker and
PostCopyUpdater returned false before setting NeedAofLog when
HasRemoveKey was true, so PostRMWOperation never wrote the AOF entry.
Data reappeared after restart.

Bug 2: InPlaceUpdaterWorker's HasRemoveKey path did not call
IncrementVersion on the watch version map, so WATCH/MULTI/EXEC
transactions did not detect the key modification and incorrectly
committed.

Bug 3: AofProcessor.RecoverReplay disposed respServerSession in its
finally block, but in multi-DB recovery it ran once per database,
causing double-dispose and NullReferenceException in
GarnetLatencyMetricsSession.Return(). Moved dispose to
AofProcessor.Dispose() which runs exactly once.

Fixes #1675

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 7, 2026 19:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes AOF persistence and WATCH version tracking for collection-emptying RMW operations, and corrects disposal behavior during multi-DB AOF recovery to avoid double-dispose failures.

Changes:

  • Ensure collection-emptying RMW ops set NeedAofLog so AOF records are written and replay correctly.
  • Increment WATCH version on key-removal paths so WATCH/MULTI/EXEC properly aborts when keys are emptied/deleted.
  • Move replay-session disposal into AofProcessor.Dispose() to prevent double-dispose during multi-DB recovery; add regression tests.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/Garnet.test/TransactionTests.cs Adds WATCH regression test for list deletion via LPOP.
test/Garnet.test/RespAofTests.cs Adds AOF recovery tests for deletion/emptying behavior across list, zset, hash, set.
test/Garnet.test/MultiDatabaseTests.cs Adds multi-DB AOF recovery regression test for emptying object collections.
libs/server/Storage/Functions/ObjectStore/RMWMethods.cs Sets AOF logging and WATCH version increment on key-removal/emptying paths.
libs/server/AOF/AofProcessor.cs Adjusts disposal to avoid double-dispose in multi-DB recovery.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@badrishc badrishc merged commit a18c8ff into main Apr 8, 2026
24 of 27 checks passed
@badrishc badrishc deleted the badrishc/fix-aof-object-mutation-persistence-main branch April 8, 2026 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AOF inconsistency: DELETE persists correctly but LPOP/ZREM/HDEL are not durable (data reappears after restart)

3 participants