Problem
When running Spawn+Gather with many tasks (500+), the doeff trace accumulates entries without bound. Observed TraceEntries:9699 in a run with 500 tasks.
Each spawned task generates multiple trace entries (semaphore acquire/release, slog, cache effects, etc.), and these are never pruned during execution.
Observed behavior
In the doeff Traceback output for a single failed task (task 7), ReleaseSemaphore appears 8 times consecutively:
── in task 7 ──
_wrapped() pipeline.py:98
yield ReleaseSemaphore(Semaphore(2))
⇢ SchedulerHandler transferred to None
_wrapped() pipeline.py:98
yield ReleaseSemaphore(Semaphore(2))
⇢ SchedulerHandler transferred to None
_wrapped() pipeline.py:98
yield ReleaseSemaphore(Semaphore(2))
⇢ SchedulerHandler transferred to None
... (8 times total)
_dummy_sllm() test_spawn_gather_sizes.py:35
raise RuntimeError('Simulated LLM error')
Questions
- Why does ReleaseSemaphore appear 8 times for a single task that should acquire/release once?
- Is the trace accumulating entries from ALL spawned tasks into a single trace buffer?
- Is there a limit or pruning mechanism for trace entries?
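For reference on the last question, this is roughly what a capped trace buffer looks like in plain Python. It is only a sketch of the general idea: it does not use doeff, the names MAX_TRACE_ENTRIES and record are hypothetical, and the cap of 1000 is an arbitrary value chosen for illustration. Whether doeff has anything like this is exactly what is being asked.

```python
from collections import deque

# Hypothetical cap; not a doeff setting.
MAX_TRACE_ENTRIES = 1000

# A deque with maxlen silently drops the oldest entry once it is full.
trace = deque(maxlen=MAX_TRACE_ENTRIES)


def record(entry: str) -> None:
    """Append one trace entry, evicting the oldest if the cap is reached."""
    trace.append(entry)


# Feed in as many entries as were observed in the 500-task run.
for i in range(9699):
    record(f"entry-{i}")

kept = len(trace)     # bounded by MAX_TRACE_ENTRIES
oldest = trace[0]     # first entry that survived eviction
newest = trace[-1]    # most recent entry
```

With a cap like this the buffer keeps only the most recent entries, so memory stays bounded at the cost of losing early history.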
Reproduction
# In proboscis-ema repo (uses real index_news_events pipeline)
uv run doeff run \
--program proboscis_ema.doeff.experiments.test_spawn_gather_sizes.p_direct_500 \
--interpreter proboscis_ema.doeff.pipeline_interpreter
Self-contained reproduction in doeff repo was not achieved — the same pattern with mock effects does not exhibit the trace growth. The difference is the number of effects per task: the real index_news_events yields ~20 effects per task (slog, cache_get, cache_put, LLM call, etc.) vs 1-2 in mock tests.
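A rough sanity check of that explanation, in plain Python with no doeff APIs. It assumes one trace entry per yielded effect and no pruning; fake_task and run_all are hypothetical stand-ins, not the real pipeline or scheduler.

```python
from typing import Iterator, List, Tuple


def fake_task(effects_per_task: int) -> Iterator[str]:
    """Stand-in for one spawned task: yields a fixed number of effects."""
    for i in range(effects_per_task):
        yield f"effect-{i}"


def run_all(num_tasks: int, effects_per_task: int) -> int:
    """Record every yielded effect in one shared, never-pruned trace list."""
    trace: List[Tuple[int, str]] = []
    for task_id in range(num_tasks):
        for effect in fake_task(effects_per_task):
            trace.append((task_id, effect))
    return len(trace)


mock_entries = run_all(500, 2)    # mock tests: 1-2 effects per task
real_entries = run_all(500, 20)   # real pipeline: ~20 effects per task
observed_per_task = 9699 / 500    # from the observed TraceEntries count
```

Under these assumptions 500 tasks at ~20 effects each give about 10000 entries, and the observed 9699 works out to roughly 19.4 entries per task. That is consistent with every effect from every task landing in one trace, but it does not prove it.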
Impact
- TraceEntries growing to 9699+ may cause memory pressure
- The repeated ReleaseSemaphore in the trace suggests something unexpected in how Gather handles cleanup of spawned tasks