Implement pool shutdown for ChannelDbConnectionPool and harden WaitHandleDbConnectionPool shutdown#4302
Conversation
…ndleDbConnectionPool shutdown (AB#44828) ChannelDbConnectionPool: - Shutdown() now transitions State to ShuttingDown (FR-001), completes the idle channel writer to unblock async waiters (FR-002, FR-007), and drains and destroys buffered idle connections (FR-003). Implementation is idempotent via Interlocked.CompareExchange (FR-006). - Startup() emits a trace and is a structural counterpart to Shutdown. - ReturnInternalConnection no longer asserts on TryWrite; if the channel was completed by a concurrent shutdown, the returning connection is destroyed (FR-004 race-window guard). - IdleConnectionChannel.Complete() exposes channel completion to the pool. WaitHandleDbConnectionPool: - Shutdown() is now idempotent (FR-006), disposes both _cleanupTimer and _errorTimer (FR-005), wakes threads parked in WaitHandle.WaitAny by releasing the pool semaphore (FR-007), and drains _stackNew/_stackOld (FR-003). - CleanupCallback and ErrorCallback short-circuit when State == ShuttingDown so an in-flight callback racing with Shutdown does not re-arm work. - The private TryGetConnection wait loop re-checks State after WaitHandle.WaitAny and bails out, so waiters cannot spin against a drained pool. Tests: - ChannelDbConnectionPoolShutdownTest (7 tests) covers state transition, drain, return-after-shutdown, idempotency, async waiter unblock, post-shutdown return, and Startup-after-Shutdown. - WaitHandleDbConnectionPoolShutdownTest (9 tests) covers state transition, cleanup/error timer disposal, drain, idempotency, callback no-op after shutdown, sync TryGetConnection short-circuit, and sync waiter unblock. Verified by running targeted suite: 340 tests passing across net8.0, net9.0, net10.0, and net462. Spec: specs/004-pool-shutdown.
There was a problem hiding this comment.
Pull request overview
This PR implements the “pool shutdown” contract (spec 004-pool-shutdown) across both connection pool implementations so that retiring a pool (via SqlConnectionFactory.QueuePoolForRelease) reliably stops background work, unblocks waiters, and destroys idle/returning connections.
Changes:
- Implement
ChannelDbConnectionPool.Shutdown()by transitioning toShuttingDown, completing/draining the idle channel, and making shutdown idempotent. - Harden
WaitHandleDbConnectionPool.Shutdown()by making it idempotent, disposing timers, draining idle stacks, and attempting to wake blocked synchronous waiters; guard timer callbacks during shutdown. - Add unit tests covering shutdown behavior for both pool types.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/ConnectionPool/ChannelDbConnectionPool.cs | Implements real shutdown behavior (state transition, channel completion/drain, safer return-on-shutdown) and traces Startup(). |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/ConnectionPool/IdleConnectionChannel.cs | Adds Complete() to allow the pool to terminate the channel writer on shutdown. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/ConnectionPool/WaitHandleDbConnectionPool.cs | Adds shutdown idempotency, disposes both timers, drains idle stacks, attempts to unblock sync waiters, and guards timer callbacks during shutdown. |
| src/Microsoft.Data.SqlClient/tests/UnitTests/ConnectionPool/ChannelDbConnectionPoolShutdownTest.cs | Adds unit coverage for channel-pool shutdown semantics (drain, idempotency, unblock waiters, destroy-on-return). |
| src/Microsoft.Data.SqlClient/tests/UnitTests/ConnectionPool/WaitHandleDbConnectionPoolShutdownTest.cs | Adds unit coverage for wait-handle-pool shutdown semantics (timer disposal, drain, idempotency, unblock waiter, callback no-op). |
Code review fixes (WaitHandleDbConnectionPool):
- TryGetConnection shutdown short-circuit now releases the PoolSemaphore
slot when WaitAny was woken by SEMAPHORE_HANDLE (or
WAIT_ABANDONED+SEMAPHORE_HANDLE), preventing a semaphore-slot leak that
would starve future waiters.
- Shutdown() releases Volatile.Read(ref _waitCount) slots instead of
MaxPoolSize, fixing ArgumentOutOfRangeException when MaxPoolSize == 0
(unlimited pool) and ensuring every parked waiter is woken under high
contention. Kept SemaphoreFullException guard.
Test fix:
- Shutdown_UnblocksSyncWaiter replaces Thread.Sleep(200) with a bounded
poll on the private _waitCount field (via the existing reflection
helper) so the test is deterministic on slow CI agents.
Documentation cleanup:
- Removed inline 'FR-00x' spec-traceability labels from comments and test
section banners across both pool implementations and both shutdown
test files. Explanatory text is preserved; the spec coverage will live
in the PR description instead of the source.
…tionPool shutdown tests Replace reflection-based access of WaitHandleDbConnectionPool internals from the shutdown unit tests with direct field/method access. The pool's _waitCount, _errorTimer, _cleanupTimer, CleanupCallback, and ErrorCallback are now declared internal so the UnitTests assembly (covered by InternalsVisibleTo) can reach them without GetField/Invoke. The GetPrivateField<T> helper is removed. Per @mdaigle review feedback on dotnet#4302.
…pools Both pool implementations now delegate the connection drain in Shutdown() to the existing Clear() routine instead of duplicating the drain inline. ChannelDbConnectionPool.Shutdown(): keeps the CompareExchange election, State transition and _idleChannel.Complete() call, then calls Clear() (which bumps _clearGeneration and best-effort drains the idle channel). A final unbounded drain loop is retained as a mop-up because Clear() may short-circuit when another Clear() is concurrently in flight; the final loop is safe because the channel is already completed and no new writes can succeed. WaitHandleDbConnectionPool.Shutdown(): keeps the trace, idempotent State check, timer disposal and waiter wake-up, then replaces the inline _stackNew / _stackOld drain loops with a single Clear() call. Clear() additionally dooms every entry in _objectList, decrements the ExitFreeConnection metric per drained connection, and runs ReclaimEmancipatedObjects() — strictly more cleanup than the previous inline drain, with no behaviour change for normal Running-state operations. Per @mdaigle review feedback on dotnet#4302.
…ryGetConnection in shutdown tests PR dotnet#4295 (merged to main) added a required TimeoutTimer parameter to ChannelDbConnectionPool.TryGetConnection. The shutdown tests added in this PR still call the pre-dotnet#4295 3-arg overload, which compiles locally against PoolShutdown's older base but fails in the CI pipeline because the pipeline merges this PR's head into the post-dotnet#4295 main. Pass TimeoutTimer.StartNew(TimeSpan.FromSeconds(15)) at all 5 call sites to match the new signature.
…ryGetConnection in shutdown tests PR 4295 (merged to main) added a required TimeoutTimer parameter to ChannelDbConnectionPool.TryGetConnection. The shutdown tests added in this PR still call the pre-4295 3-arg overload, which compiles locally against PoolShutdown's older base but fails in the CI pipeline because the pipeline merges this PR's head into the post-4295 main. Pass TimeoutTimer.StartNew(TimeSpan.FromSeconds(15)) at all 5 call sites to match the new signature.
db3d616 to
9a45a66
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4302 +/- ##
==========================================
- Coverage 65.78% 63.93% -1.86%
==========================================
Files 286 281 -5
Lines 43696 66650 +22954
==========================================
+ Hits 28745 42610 +13865
- Misses 14951 24040 +9089
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
The previous comment claimed pruning was not implemented, but PR dotnet#4295 introduced PoolPruner which is constructed eagerly in the ctor (when MinPoolSize < MaxPoolSize) and arms/disarms its timer via UpdateTimer() calls from OpenNewInternalConnection and RemoveConnection. Update the comment so future maintainers see Startup correctly described as a no-op symmetrical counterpart of Shutdown, while accurately documenting that pruning's background timer is owned by PoolPruner and managed by the connection lifecycle paths, not by Startup.
Two new shutdown-state checks I added (async-path short-circuit and post-WaitAny re-check) used 'State != Running' while the rest of the file consistently uses 'State is not Running'. Align them, plus the adjacent comments, with the prevailing style. Trace strings are left as '!=' since they are human-readable log output.
…mment WaitHandleDbConnectionPool: when the shutdown fast-path after WaitAny returns early, the outer finally that normally releases CreationSemaphore on the CREATION_HANDLE path is bypassed. Mirror the existing PoolSemaphore compensation: on waitResult == CREATION_HANDLE (and the WAIT_ABANDONED variant), Release(1) on CreationSemaphore inside a try/catch SemaphoreFullException so accounting stays balanced. ChannelDbConnectionPool: the comment on _idleChannel.Complete() referenced InvalidOperationException, but IdleConnectionChannel.Complete wraps ChannelWriter.TryComplete which is idempotent and never throws. Rewrite the comment to describe TryComplete's actual semantics so future maintainers aren't misled about why this call is safe.
|
Some of the tests are a bit flaky in the pipeline. Please double check any tests that rely on timing. |
Commit c5a5544 added a CreationSemaphore.Release(1) in the shutdown fast-path based on the premise that the early 'return false' bypassed the outer finally that normally releases CreationSemaphore on CREATION_HANDLE. That premise was wrong: C# try/finally executes the finally on return, so the outer finally at line ~1196 already handled the release. My extra Release therefore double-released, tripping SemaphoreFullException (CreationSemaphore max=1) whenever the fast-path was taken with waitResult == CREATION_HANDLE. CI's WaitHandleDbConnectionPoolShutdownTest.Shutdown_UnblocksSyncWaiter hit this reliably in Debug config. Remove the CreationSemaphore else-if branch and rewrite the comment to explain why compensation is intentionally absent (outer finally handles it). Keep the PoolSemaphore compensation - that path has no outer finally, so the manual Release is genuinely required. Verified locally: Shutdown_UnblocksSyncWaiter and 24 other Shutdown-related tests pass on net9.0; 20/20 stress iterations of the previously-failing test pass.
9419ee7
Implements the Shutdown lifecycle method on both connection-pool implementations (ChannelDbConnectionPool and WaitHandleDbConnectionPool) so callers can deterministically stop a pool, drain its idle connections, and wake any parked waiters.
Compatibility and behavior:
• No public API surface change for end users —
Shutdown()is an internal pool lifecycle method exposed via IDbConnectionPool. Existing pooling behavior is unchanged untilShutdown()is invoked. •Shutdown()is idempotent and terminal: repeated calls are no-ops and the pool cannot be restarted (Startup()afterShutdown()does not resurrect the pool). • Active checked-out connections are not aborted; they are destroyed instead of pooled when their owners return them. • Parked waiters (syncWaitHandle.WaitAny` and async channel readers) are woken with a non-blocking failure signal rather than left to time out.Additional changes:
• ChannelDbConnectionPool: adds an
Interlocked.CompareExchange-guardedShutdown()that completes the idle channel writer, drains buffered idle connections, and routes in-flight returners throughRemoveConnectionvia arace-window guard in
ReturnInternalConnection.• WaitHandleDbConnectionPool: adds
Shutdown()that disposes the cleanup and error timers, releases the pool semaphore to wake parked waiters (usingVolatile.Read(ref _waitCount)so unlimited pools — MaxPoolSize == 0— work correctly), and drains
_stackNew/_stackOld. Cleanup-callback, error-callback, and theTryGetConnectionwait loop now short-circuit when the pool is shutting down. The wait-loop short-circuit also releases thesemaphore slot it consumed so accounting stays balanced.
• IdleConnectionChannel: adds a
Complete()helper wrappingChannelWriter.TryComplete()so the pool can close its idle channel without exposing the raw writer.• Adds 16 deterministic xUnit tests (7 channel + 9 wait-handle) covering state transition, idle drain, in-flight-returner destruction, timer disposal, callback no-op, idempotency, and waiter wake-up. All unit tests pass on net8.0/net9.0/net10.0/net462.
Out of scope (deferred to a follow-up):
• Transaction-root stasis on shutdown for the channel pool — connections enlisted in active distributed transactions still need to be staked to the transaction so the transaction-end callback can destroy them deterministically.