Skip to content

Fix semaphore permit drain, burst gate race, migration issues#204

Merged
vadv merged 8 commits intomasterfrom
fix/migration-connection-id-collision
Apr 21, 2026
Merged

Fix semaphore permit drain, burst gate race, migration issues#204
vadv merged 8 commits intomasterfrom
fix/migration-connection-id-collision

Conversation

@vadv
Copy link
Copy Markdown
Collaborator

@vadv vadv commented Apr 20, 2026

Summary

  • Semaphore permit drain on direct handoff. Each return_object handoff permanently consumed one semaphore permit. After max_size handoffs the pool could not create connections and stabilized at 3-4 out of 40 servers. Root cause: wrap_checkout calls permit.forget(), but the handoff path never called add_permits(1). Now both handoff and idle-queue paths restore the permit. Compensating add_permits in pre_replace_one removed.

  • Burst gate tokio::select! race. Without biased;, the select randomly picked among ready branches — a connection delivered via oneshot could be silently dropped when create_done or the backoff timer won. Fixed with biased; (oneshot first) and a try_recv drain that pushes orphaned connections to idle without double-counting permits.

  • Client ID collision after migration. New process counter started at 0, colliding with migrated IDs. Now advances past the highest migrated ID via fetch_max.

  • SCRAM passthrough state preserved. Migration payload v2 (backward compatible) carries the ClientKey so the new process skips ScramPending fallback.

  • Session mode xact_time fix. Previously recorded the entire session duration. Now per-transaction at each ReadyForQuery(Idle).

  • Adaptive anticipation budget. Scales with xact_p99 * 2 +/- 20% jitter, clamped to [5ms, 500ms]. Cold start: 100ms.

  • Diagnostic logging. Slow checkout warnings include full pool state. Phase-specific warnings for semaphore, burst gate, coordinator, and create failures.

Test plan

  • cargo test --lib pool::inner::tests — 45 tests pass (18 new: semaphore invariant across handoff/idle/mixed/concurrent paths, negative test for the old drain, pre_replace inflation guard, burst gate try_recv drain, anticipation budget edge cases)
  • cargo clippy -- --deny warnings — clean
  • Staging: servers=40, wait=0, avg_wait=0.1ms, qps=5000 (was: servers=3, wait=50, avg_wait=800ms, qps=400)
  • Staging with pool coordinator: servers=40, wait=0, antic_notify=61k
  • BDD: pool-pressure scenarios

@vadv vadv force-pushed the fix/migration-connection-id-collision branch 6 times, most recently from 1a98d43 to 31cf0f3 Compare April 20, 2026 16:44
Comment thread src/client/migration.rs Fixed
Comment thread src/client/migration.rs Fixed
Comment thread src/stats/mod.rs Dismissed
@vadv vadv force-pushed the fix/migration-connection-id-collision branch from 31cf0f3 to 3e2e345 Compare April 20, 2026 19:27
Migration fixes:
- Client ID collision: fetch_max counter after reconstruct
- SCRAM passthrough: serialize BackendAuthMethod in migration v2

Adaptive anticipation budget:
- Static 300-500ms replaced with xact_p99 × 2 ± 20% jitter
- Cold start default: 100ms ± 20%
- Clamped [5ms, 500ms]

Session mode stats:
- xact_time recorded per-transaction via session_xact_start field
- query_start_at not leaked through function signatures

Checkout diagnostics:
- Timeout logs include phase (semaphore/burst_gate/coordinator/create)
- Slow checkout (>500ms) logs pool state snapshot
@vadv vadv force-pushed the fix/migration-connection-id-collision branch from 3e2e345 to 86a67aa Compare April 20, 2026 19:31
Comment thread src/pool/inner.rs Fixed
Comment thread src/pool/inner.rs Dismissed
Comment thread src/pool/inner.rs Dismissed
Comment thread src/pool/inner.rs Fixed
dmitrivasilyev added 2 commits April 21, 2026 14:29
What was needed: pools stabilized at 3-4 connections out of 40 under load,
with 800ms checkout latency and 50+ waiting clients. The pool could not
grow because the semaphore that gates timeout_get was permanently drained.

What changed: return_object now restores the semaphore permit on both the
handoff (deliver to waiter) and idle-queue paths. Previously only the idle
path called add_permits(1) — each handoff permanently consumed one permit.
After max_size handoffs the semaphore was empty, blocking all new checkouts.

Additional fixes: biased select in burst gate prevents silent connection
loss on tokio::select! races; try_recv drain recovers orphaned connections
without double-counting permits; pre_replace_one no longer adds a
compensating permit (the drain it compensated for is gone). Migration
protocol v2 preserves SCRAM passthrough state. Session-mode xact_time
recorded per-transaction. Adaptive anticipation budget scales with xact_p99.
Diagnostic logging on slow checkouts and phase-level failures.
- restore_backend_auth_if_pending: &Option<T> → Option<&T>
- crate::config::BackendAuthMethod → use import
- Magic 50ms in burst gate → BURST_GATE_EXHAUSTED_BACKOFF constant
@vadv vadv changed the title Fix duplicate client_id after migration Fix semaphore permit drain, burst gate race, migration issues Apr 21, 2026
Regression test for the permit drain bug. 200 clients share 10 slots
for 5 seconds with 5ms hold time — thousands of handoff cycles. The
pool must remain at full size throughout; if the semaphore drains, the
pool shrinks and p99 latency exceeds the 600ms bound.
Comment thread src/client/migration.rs Dismissed
Comment thread src/pool/inner.rs Dismissed
Comment thread src/pool/inner.rs Dismissed
dmitrivasilyev added 4 commits April 21, 2026 15:11
Replace Rust internals (VecDeque, return_object, slots.size, yield_now,
tokio::select!, AtomicU64) with behavioral descriptions. Translate all
section headings. Add Russian-first definitions for foreign terms at
first use: anticipation, burst gate, thundering herd, direct handoff,
eviction, jitter, timeout cliff. Fix unnatural Russian: дропается,
протаймаутился, recycl'ится, апгрейднуть, спайк, backend-spawn'ов.
Rewrite glossary with behavioral descriptions.
Same bug as xact_time: query_start_at was set once before the inner
loop and never reset. In session mode the inner loop runs for the
entire session, so each query reported cumulative elapsed time instead
of individual query duration (observed as query_ms p50=102s).

Fix: reset query_start_at at the top of each inner-loop iteration
in session mode. One if-statement, same pattern as the xact_time fix.
New step 'server query_p99 is below N ms' reads the address stats
histogram directly. The scenario runs 20 session-mode clients for 3s
with 5ms hold time. Without the query_start_at reset fix, p99 would
be in the 100-second range. Bound set at 500ms to catch the bug
while tolerating scheduling noise.
@vadv vadv merged commit fa68a3b into master Apr 21, 2026
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants