Skip to content

Misleading error message can occur if a node is shut down during a scavenge #5489

Description

@timothycoleman

v26.0

If a node is shutdown during a scavenge, the scavenge can log an error message that makes it appear that the scavenge was the cause of the shutdown, rather than the shutdown being the cause of the scavenge error.

This is ugly and confusing. It does not have further implications though. There are no risks to data integrity, and the scavenge can be resumed without any problem.

It might be sensible to include stopping the scavenge in the graceful shutdown routine.

{
    "@t": "2026-02-04T06:29:39.2575806+00:00",
    "@mt": "SCAVENGING: Scavenge Failed. Total time taken: {elapsed}.",
    "@l": "Error",
    "@i": 3311317354,
    "@x": "KurrentDB.Core.Exceptions.FileBeingDeletedException: Exception of type 'KurrentDB.Core.Exceptions.FileBeingDeletedException' was thrown.
       at KurrentDB.Core.TransactionLog.Chunks.TFChunk.TFChunk.GetReaderWorkItem() in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Chunks/TFChunk/TFChunk.cs:line 1380
       at KurrentDB.Core.TransactionLog.Chunks.TFChunk.TFChunk.TFChunkReadSideScavenged.TryReadClosestForward(Int64 logicalPosition, CancellationToken token) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Chunks/TFChunk/TFChunkReadSide.cs:line 279
       at KurrentDB.Core.TransactionLog.Scavenging.DbAccess.ChunkReaderForExecutor`1.ReadInto(NonPrepare nonPrepare, Prepare prepare, CancellationToken token)+MoveNext() in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/DbAccess/ChunkReaderForExecutor.cs:line 65
       at KurrentDB.Core.TransactionLog.Scavenging.DbAccess.ChunkReaderForExecutor`1.ReadInto(NonPrepare nonPrepare, Prepare prepare, CancellationToken token)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()
       at KurrentDB.Core.TransactionLog.Scavenging.Stages.ChunkExecutor`2.ExecutePhysicalChunk(Single physicalWeight, ScavengePoint scavengePoint, IScavengeStateForChunkExecutorWorker`1 state, ITFChunkScavengerLog scavengerLogger, IChunkReaderForExecutor`2 sourceChunk, Stopwatch sw, CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Stages/ChunkExecutor.cs:line 277
       at KurrentDB.Core.TransactionLog.Scavenging.Stages.ChunkExecutor`2.ExecutePhysicalChunk(Single physicalWeight, ScavengePoint scavengePoint, IScavengeStateForChunkExecutorWorker`1 state, ITFChunkScavengerLog scavengerLogger, IChunkReaderForExecutor`2 sourceChunk, Stopwatch sw, CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Stages/ChunkExecutor.cs:line 277
       at KurrentDB.Core.TransactionLog.Scavenging.Stages.ChunkExecutor`2.<>c__DisplayClass12_0.<<Execute>b__2>d.MoveNext() in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Stages/ChunkExecutor.cs:line 156
    --- End of stack trace from previous location ---
       at KurrentDB.Core.TransactionLog.Scavenging.ParallelLoop.<RunWithTrailingCheckpointAsync>g__SpawnProcess|2_1[T](Func`4 process, Int32 slot, T item, CancellationToken token) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/ParallelLoop.cs:line 66
       at KurrentDB.Core.TransactionLog.Scavenging.ParallelLoop.RunWithTrailingCheckpointAsync[T](IAsyncEnumerable`1 source, Int32 degreeOfParallelism, Func`2 getCheckpointInclusive, Func`2 getCheckpointExclusive, Func`4 process, Action`1 emitCheckpoint, Action onConsiderEmit, CancellationToken token) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/ParallelLoop.cs:line 99
       at KurrentDB.Core.TransactionLog.Scavenging.ParallelLoop.RunWithTrailingCheckpointAsync[T](IAsyncEnumerable`1 source, Int32 degreeOfParallelism, Func`2 getCheckpointInclusive, Func`2 getCheckpointExclusive, Func`4 process, Action`1 emitCheckpoint, Action onConsiderEmit, CancellationToken token)
       at KurrentDB.Core.TransactionLog.Scavenging.Stages.ChunkExecutor`2.Execute(ExecutingChunks checkpoint, IScavengeStateForChunkExecutor`1 state, ITFChunkScavengerLog scavengerLogger, CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Stages/ChunkExecutor.cs:line 99
       at KurrentDB.Core.TransactionLog.Scavenging.Scavenger`1.Time(Stopwatch stopwatch, String name, Func`2 f, CancellationToken token) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Scavenger.cs:line 229
       at KurrentDB.Core.TransactionLog.Scavenging.Scavenger`1.AfterCalculation(ScavengePoint scavengePoint, ITFChunkScavengerLog scavengerLogger, Stopwatch stopwatch, CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Scavenger.cs:line 324
       at KurrentDB.Core.TransactionLog.Scavenging.Scavenger`1.AfterAccumulation(ScavengePoint scavengepoint, ITFChunkScavengerLog scavengerLogger, Stopwatch stopwatch, CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Scavenger.cs:line 315
       at KurrentDB.Core.TransactionLog.Scavenging.Scavenger`1.StartNewAsync(ScavengePoint prevScavengePoint, ITFChunkScavengerLog scavengerLogger, Stopwatch stopwatch, CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Scavenger.cs:line 302
       at KurrentDB.Core.TransactionLog.Scavenging.Scavenger`1.RunInternal(ITFChunkScavengerLog scavengerLogger, Stopwatch stopwatch, CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Scavenger.cs:line 167
       at KurrentDB.Core.TransactionLog.Scavenging.Scavenger`1.ScavengeAsync(CancellationToken cancellationToken) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/TransactionLog/Scavenging/Scavenger.cs:line 96",
    "elapsed": "21:10:48.6795153",
    "ScavengeId": "290ddf2c-6b27-46c1-a39f-84de5111fa83",
    "SourceContext": "KurrentDB.Core.Services.Storage.StorageScavenger",
    "ProcessId": 24002,
    "ThreadId": 43
}

{
    "@t": "2026-02-04T06:29:40.4978425+00:00",
    "@mt": "SCAVENGING: Unexpected error when releasing the chunks lock",
    "@l": "Error",
    "@i": 1772716653,
    "@x": "System.ObjectDisposedException: Cannot access a disposed object.
    Object name: 'System.Threading.SemaphoreSlim'.
       at System.Threading.SemaphoreSlim.Release()
       at KurrentDB.Core.Synchronization.SemaphoreSlimLock.TryRelease(Guid acquisitionId) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/Synchronization/SemaphoreSlimLock.cs:line 51
       at KurrentDB.Core.Services.Storage.StorageScavenger.HandleCleanupWhenFinished(Task`1 newScavengeTask, IScavenger newScavenge, ILogger logger) in /home/runner/work/TrainStation/TrainStation/KurrentDB/src/KurrentDB.Core/Services/Storage/StorageScavenger.cs:line 189",
    "ScavengeId": "290ddf2c-6b27-46c1-a39f-84de5111fa83",
    "SourceContext": "KurrentDB.Core.Services.Storage.StorageScavenger",
    "ProcessId": 24002,
    "ThreadId": 43
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions