Increase the worker event-loop TTL to prevent thrashing (Closes #262) #271

Merged
conradbzura merged 2 commits into wool-labs:master from conradbzura:262-reuse-worker-event-loop on Jul 2, 2026

@conradbzura


Summary

Reuse a single warm worker event loop across dispatches instead of creating and destroying an OS thread and event loop on every dispatch. The worker-loop ResourcePool previously used ttl=0, so under sequential dispatch its refcount went 0→1→0 on every call, recreating and tearing down the loop (and running a full task drain) each time. Give the pool a positive TTL (_WORKER_LOOP_TTL = 30.0) so the warm loop is reused within the window and torn down only when it has sat idle past the TTL or when the service stops. This removes the per-dispatch churn (~31% of the 0.9.3 → 0.10.0 dispatch-latency regression), cuts the latency tail, and clears the intermittent cross-loop "Lock ... bound to a different event loop" crash under concurrency. stop still clears the pool immediately, so no loop or daemon thread leaks. Closes #262

Proposed changes

Reuse the worker loop via a positive TTL (runtime/worker/service.py)

Add a documented module constant _WORKER_LOOP_TTL = 30.0 and pass it as the loop pool's ttl (was 0). The pool is keyed by the constant "worker", so a positive TTL keeps one warm loop + thread across dispatches instead of churning one per call. The TTL bounds only how long an idle loop lingers; an explicit stop clears the pool at once.
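To make the ttl=0 churn concrete, here is a minimal sketch of a refcounted pool with an idle TTL. TinyPool, count_builds, the injected clock, and the lazy reap-on-access strategy are all assumptions made for illustration; this is not wool's ResourcePool, which may well reap on a timer instead.

```python
import time


class TinyPool:
    """Hypothetical refcounted pool with an idle TTL (not wool's ResourcePool)."""

    def __init__(self, factory, finalizer, ttl, clock=time.monotonic):
        self._factory = factory
        self._finalizer = finalizer
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> [resource, refcount, idle_since]

    def acquire(self, key):
        self.reap()  # drop anything that has already expired
        entry = self._entries.get(key)
        if entry is None:
            entry = [self._factory(key), 0, None]
            self._entries[key] = entry
        entry[1] += 1
        entry[2] = None  # in use, so no longer idle
        return entry[0]

    def release(self, key):
        entry = self._entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            entry[2] = self._clock()  # start the idle window
            self.reap()  # with ttl=0 this finalizes immediately

    def reap(self):
        now = self._clock()
        for key, (resource, refs, idle_since) in list(self._entries.items()):
            if refs == 0 and idle_since is not None and now - idle_since >= self._ttl:
                del self._entries[key]
                self._finalizer(resource)

    def clear(self):
        # Explicit stop: finalize everything at once, regardless of TTL.
        for key, (resource, _, _) in list(self._entries.items()):
            del self._entries[key]
            self._finalizer(resource)


def count_builds(ttl, dispatches=3):
    """Run sequential acquire/release cycles and count builds and teardowns."""
    built, destroyed = [], []
    now = [0.0]  # fake clock so the example is deterministic

    def factory(key):
        resource = object()
        built.append(resource)
        return resource

    pool = TinyPool(factory, destroyed.append, ttl, clock=lambda: now[0])
    for _ in range(dispatches):
        pool.acquire("worker")
        now[0] += 1.0  # the dispatch runs for one second
        pool.release("worker")
        now[0] += 1.0  # one second of idle time before the next dispatch
    pool.clear()
    return len(built), len(destroyed)
```

Under these assumptions, count_builds(0.0) builds and destroys the resource once per dispatch, while count_builds(30.0) builds it once and destroys it only when clear is called.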

Reap warm loops left by unstopped test services (tests/runtime/worker/conftest.py)

Add an autouse _reap_worker_loops teardown fixture. A persisted loop is observable: a test that dispatches without stopping its service would otherwise leave a running daemon-thread loop until interpreter exit, where the task-factory finalizer logs a spurious displacement warning. The fixture reaps such loops after each test, draining residual tasks across successive generations (matching the multi-generation drain in production's _destroy_worker_loop) before stopping them.
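The need for a multi-generation drain can be sketched with plain asyncio. The helper below is a hypothetical illustration, not the body of _destroy_worker_loop or of the fixture: drain_generations, demo, and the max_generations cap are names and choices assumed here. It shows a task whose cancellation handler spawns a follow-up task, which a single cancel-and-gather pass would leave stranded.

```python
import asyncio


async def drain_generations(max_generations=8):
    """Cancel and await pending tasks until none remain.

    Returns the number of generations that had to be drained.
    """
    current = asyncio.current_task()
    generations = 0
    while generations < max_generations:
        pending = [
            t for t in asyncio.all_tasks() if t is not current and not t.done()
        ]
        if not pending:
            break
        for task in pending:
            task.cancel()
        # return_exceptions=True keeps one task's CancelledError from aborting the drain.
        await asyncio.gather(*pending, return_exceptions=True)
        generations += 1
    return generations


async def demo():
    spawned = []  # keep strong references to follow-up tasks

    async def cleanup():
        await asyncio.sleep(3600)  # stands in for slow follow-up cleanup

    async def worker():
        try:
            await asyncio.sleep(3600)
        except asyncio.CancelledError:
            # Cancellation spawns a second-generation task.
            spawned.append(asyncio.create_task(cleanup()))
            raise

    first = asyncio.create_task(worker())
    await asyncio.sleep(0)  # let the worker start and suspend
    generations = await drain_generations()
    current = asyncio.current_task()
    leftover = [t for t in asyncio.all_tasks() if t is not current]
    return generations, len(leftover), first.cancelled()
```

Running asyncio.run(demo()) in this sketch needs two generations: the first pass cancels the worker, and the second catches the cleanup task that the worker's cancellation created.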

Test cases

| # | Test Suite | Given | When | Then | Coverage Target |
|---|---|---|---|---|---|
| 1 | test_service | A single worker service | Two routines are dispatched sequentially | Both run on the same worker loop + daemon thread (identity fingerprint matches) | Loop reuse |
| 2 | test_service | A service with a warm worker loop | stop is called | The warm daemon thread is reaped, stopped is set, and no displacement warning is logged | Clean stop-time teardown |
| 3 | test_service | A service whose worker-loop TTL is shortened | A dispatch completes and the loop sits idle past the TTL | The idle worker loop is reaped automatically | TTL expiry |
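One way the "identity fingerprint" check in case 1 could look is sketched below with the standard library alone. The helpers start_worker_loop, fingerprint, dispatch, stop_worker_loop, and run_demo are hypothetical stand-ins and do not use the worker service's real API; the fingerprint is assumed to be the pair of loop identity and thread identity.

```python
import asyncio
import threading


def start_worker_loop():
    """Start an event loop on a daemon thread (stand-in for the warm worker loop)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True, name="worker-loop")
    thread.start()
    return loop, thread


async def fingerprint():
    """Identify the loop and the OS thread this coroutine runs on."""
    return id(asyncio.get_running_loop()), threading.get_ident()


def dispatch(loop, coro):
    """Run a coroutine on the worker loop from the calling thread."""
    return asyncio.run_coroutine_threadsafe(coro, loop).result(timeout=5)


def stop_worker_loop(loop, thread):
    """Stop the loop, wait for its thread to exit, then close the loop."""
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout=5)
    loop.close()


def run_demo():
    loop, thread = start_worker_loop()
    try:
        first = dispatch(loop, fingerprint())
        second = dispatch(loop, fingerprint())
    finally:
        stop_worker_loop(loop, thread)
    return first, second, thread.is_alive()
```

In this sketch both dispatches report the same fingerprint because they share one loop and one thread, and the thread is no longer alive after the explicit stop, mirroring cases 1 and 2 in spirit.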

@conradbzura conradbzura self-assigned this Jul 2, 2026
@conradbzura conradbzura marked this pull request as ready for review July 2, 2026 15:38
The worker service runs each dispatch's routine on a loop taken from a
ResourcePool keyed by the constant "worker". The pool used ttl=0, so
the loop's reference count fell to zero at the end of every dispatch
and the finalizer tore the loop and its daemon thread down, only for
the next dispatch to build them again. That per-dispatch churn, and the
task drain the teardown runs, dominate dispatch latency and are
implicated in a cross-loop lock error under concurrency.

Give the pool a positive time-to-live, _WORKER_LOOP_TTL, so one warm
loop and thread serve dispatches within the window and are torn down
only once the loop sits idle past the TTL. An explicit stop still
clears the pool at once, so no loop or thread outlives the service.

Assert one warm worker loop and daemon thread serve two sequential
dispatches, that stopping the service reaps the warm thread with no
factory-displacement warning, and that an idle loop is reaped once its
time-to-live elapses. Add an autouse teardown that reaps any loop a
test leaves warm, draining residual tasks across successive generations
as the production finalizer does, so a cancelled task's follow-up
cleanup is not stranded.
@conradbzura conradbzura force-pushed the 262-reuse-worker-event-loop branch from f08ebfe to 64d0f33 Compare July 2, 2026 23:01
@conradbzura conradbzura merged commit c925b35 into wool-labs:master Jul 2, 2026
11 checks passed
@conradbzura conradbzura linked an issue Jul 2, 2026 that may be closed by this pull request