Increase the worker event-loop TTL to prevent thrashing — Closes #262#271
Merged
conradbzura merged 2 commits intoJul 2, 2026
Merged
Conversation
The worker service runs each dispatch's routine on a loop taken from a ResourcePool keyed by the constant "worker". The pool used ttl=0, so the loop's reference count fell to zero at the end of every dispatch and the finalizer tore the loop and its daemon thread down, only for the next dispatch to build them again. That per-dispatch churn, and the task drain the teardown runs, dominate dispatch latency and are implicated in a cross-loop lock error under concurrency. Give the pool a positive time-to-live, _WORKER_LOOP_TTL, so one warm loop and thread serve dispatches within the window and are torn down only once the loop sits idle past the TTL. An explicit stop still clears the pool at once, so no loop or thread outlives the service.
Assert one warm worker loop and daemon thread serve two sequential dispatches, that stopping the service reaps the warm thread with no factory-displacement warning, and that an idle loop is reaped once its time-to-live elapses. Add an autouse teardown that reaps any loop a test leaves warm, draining residual tasks across successive generations as the production finalizer does, so a cancelled task's follow-up cleanup is not stranded.
f08ebfe to
64d0f33
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Reuse a single warm worker event-loop across dispatches instead of creating and destroying an OS thread + event loop on every dispatch. The worker-loop
ResourcePoolpreviously usedttl=0, so under sequential dispatch its refcount went 0→1→0 on every call — recreating and tearing down the loop (and running a full task drain) per dispatch. Give the pool a positive TTL (_WORKER_LOOP_TTL = 30.0) so the warm loop is reused within the window and torn down only when idle past the TTL or onstop. This removes the per-dispatch churn (~31% of the 0.9.3 → 0.10.0 dispatch-latency regression), cuts the latency tail, and clears the intermittent cross-loopLock ... bound to a different event loopcrash under concurrency.stopstill clears the pool immediately, so no loop or daemon thread leaks. Closes #262Proposed changes
Reuse the worker loop via a positive TTL (
runtime/worker/service.py)Add a documented module constant
_WORKER_LOOP_TTL = 30.0and pass it as the loop pool'sttl(was0). The pool is keyed by the constant"worker", so a positive TTL keeps one warm loop + thread across dispatches instead of churning one per call. The TTL bounds only how long an idle loop lingers; an explicitstopclears the pool at once.Reap warm loops left by unstopped test services (
tests/runtime/worker/conftest.py)Add an autouse
_reap_worker_loopsteardown: a persisted loop is observable, so a test that dispatches without stopping its service would otherwise leave a running daemon-thread loop until interpreter exit (where the task-factory finalizer logs a spurious displacement warning). The fixture reaps such loops after each test, draining residual tasks across successive generations — matching production_destroy_worker_loop's multi-generation drain — before stopping.Test cases
test_servicetest_servicestopis calledstoppedis set, and no displacement warning is loggedtest_service