Fix semaphore permit drain, burst gate race, migration issues by vadv · Pull Request #204 · ozontech/pg_doorman

vadv · 2026-04-20T10:45:27Z

Summary

Semaphore permit drain on direct handoff. Each return_object handoff permanently consumed one semaphore permit. After max_size handoffs the pool could not create connections and stabilized at 3-4 out of 40 servers. Root cause: wrap_checkout calls permit.forget(), but the handoff path never called add_permits(1). Now both handoff and idle-queue paths restore the permit. Compensating add_permits in pre_replace_one removed.
Burst gate tokio::select! race. Without biased;, the select randomly picked among ready branches — a connection delivered via oneshot could be silently dropped when create_done or the backoff timer won. Fixed with biased; (oneshot first) and a try_recv drain that pushes orphaned connections to idle without double-counting permits.
Client ID collision after migration. New process counter started at 0, colliding with migrated IDs. Now advances past the highest migrated ID via fetch_max.
SCRAM passthrough state preserved. Migration payload v2 (backward compatible) carries the ClientKey so the new process skips ScramPending fallback.
Session mode xact_time fix. Previously recorded the entire session duration. Now per-transaction at each ReadyForQuery(Idle).
Adaptive anticipation budget. Scales with xact_p99 * 2 +/- 20% jitter, clamped to [5ms, 500ms]. Cold start: 100ms.
Diagnostic logging. Slow checkout warnings include full pool state. Phase-specific warnings for semaphore, burst gate, coordinator, and create failures.

Test plan

cargo test --lib pool::inner::tests — 45 tests pass (18 new: semaphore invariant across handoff/idle/mixed/concurrent paths, negative test for the old drain, pre_replace inflation guard, burst gate try_recv drain, anticipation budget edge cases)
cargo clippy -- --deny warnings — clean
Staging: servers=40, wait=0, avg_wait=0.1ms, qps=5000 (was: servers=3, wait=50, avg_wait=800ms, qps=400)
Staging with pool coordinator: servers=40, wait=0, antic_notify=61k
BDD: pool-pressure scenarios

Migration fixes: - Client ID collision: fetch_max counter after reconstruct - SCRAM passthrough: serialize BackendAuthMethod in migration v2 Adaptive anticipation budget: - Static 300-500ms replaced with xact_p99 × 2 ± 20% jitter - Cold start default: 100ms ± 20% - Clamped [5ms, 500ms] Session mode stats: - xact_time recorded per-transaction via session_xact_start field - query_start_at not leaked through function signatures Checkout diagnostics: - Timeout logs include phase (semaphore/burst_gate/coordinator/create) - Slow checkout (>500ms) logs pool state snapshot

What was needed: pools stabilized at 3-4 connections out of 40 under load, with 800ms checkout latency and 50+ waiting clients. The pool could not grow because the semaphore that gates timeout_get was permanently drained. What changed: return_object now restores the semaphore permit on both the handoff (deliver to waiter) and idle-queue paths. Previously only the idle path called add_permits(1) — each handoff permanently consumed one permit. After max_size handoffs the semaphore was empty, blocking all new checkouts. Additional fixes: biased select in burst gate prevents silent connection loss on tokio::select! races; try_recv drain recovers orphaned connections without double-counting permits; pre_replace_one no longer adds a compensating permit (the drain it compensated for is gone). Migration protocol v2 preserves SCRAM passthrough state. Session-mode xact_time recorded per-transaction. Adaptive anticipation budget scales with xact_p99. Diagnostic logging on slow checkouts and phase-level failures.

- restore_backend_auth_if_pending: &Option<T> → Option<&T> - crate::config::BackendAuthMethod → use import - Magic 50ms in burst gate → BURST_GATE_EXHAUSTED_BACKOFF constant

Regression test for the permit drain bug. 200 clients share 10 slots for 5 seconds with 5ms hold time — thousands of handoff cycles. The pool must remain at full size throughout; if the semaphore drains, the pool shrinks and p99 latency exceeds the 600ms bound.

Replace Rust internals (VecDeque, return_object, slots.size, yield_now, tokio::select!, AtomicU64) with behavioral descriptions. Translate all section headings. Add Russian-first definitions for foreign terms at first use: anticipation, burst gate, thundering herd, direct handoff, eviction, jitter, timeout cliff. Fix unnatural Russian: дропается, протаймаутился, recycl'ится, апгрейднуть, спайк, backend-spawn'ов. Rewrite glossary with behavioral descriptions.

Same bug as xact_time: query_start_at was set once before the inner loop and never reset. In session mode the inner loop runs for the entire session, so each query reported cumulative elapsed time instead of individual query duration (observed as query_ms p50=102s). Fix: reset query_start_at at the top of each inner-loop iteration in session mode. One if-statement, same pattern as the xact_time fix.

New step 'server query_p99 is below N ms' reads the address stats histogram directly. The scenario runs 20 session-mode clients for 3s with 5ms hold time. Without the query_start_at reset fix, p99 would be in the 100-second range. Bound set at 500ms to catch the bug while tolerating scheduling noise.

vadv force-pushed the fix/migration-connection-id-collision branch 6 times, most recently from 1a98d43 to 31cf0f3 Compare April 20, 2026 16:44

github-advanced-security AI found potential problems Apr 20, 2026

View reviewed changes

Comment thread src/client/migration.rs Fixed

Comment thread src/client/migration.rs Fixed

Comment thread src/stats/mod.rs Dismissed

vadv force-pushed the fix/migration-connection-id-collision branch from 31cf0f3 to 3e2e345 Compare April 20, 2026 19:27

vadv force-pushed the fix/migration-connection-id-collision branch from 3e2e345 to 86a67aa Compare April 20, 2026 19:31

github-advanced-security AI found potential problems Apr 20, 2026

View reviewed changes

Comment thread src/pool/inner.rs Fixed

Comment thread src/pool/inner.rs Dismissed

Comment thread src/pool/inner.rs Dismissed

Comment thread src/pool/inner.rs Fixed

dmitrivasilyev added 2 commits April 21, 2026 14:29

Address review: idiomatic Option<&T>, named constant, shorter import

602ee08

- restore_backend_auth_if_pending: &Option<T> → Option<&T> - crate::config::BackendAuthMethod → use import - Magic 50ms in burst gate → BURST_GATE_EXHAUSTED_BACKOFF constant

vadv changed the title ~~Fix duplicate client_id after migration~~ Fix semaphore permit drain, burst gate race, migration issues Apr 21, 2026

github-advanced-security AI found potential problems Apr 21, 2026

View reviewed changes

Comment thread src/client/migration.rs Dismissed

Comment thread src/pool/inner.rs Dismissed

Comment thread src/pool/inner.rs Dismissed

dmitrivasilyev added 4 commits April 21, 2026 15:11

Fix corrupted Unicode in RU pool-pressure docs

4086205

vadv merged commit fa68a3b into master Apr 21, 2026
45 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix semaphore permit drain, burst gate race, migration issues#204

Fix semaphore permit drain, burst gate race, migration issues#204
vadv merged 8 commits intomasterfrom
fix/migration-connection-id-collision

vadv commented Apr 20, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

vadv commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vadv commented Apr 20, 2026 •

edited

Loading