Summary
When cache(lifecycle="persistent") wraps a @do function containing yield Await(asyncio.sleep(...)), and many instances are dispatched via Spawn+Gather, the program hangs indefinitely once N exceeds a threshold somewhere between 100 and 200 tasks.
The Await inside compute_and_cache() never returns: tasks log compute start but never reach compute done.
Reproduction
Self-contained test added in tests/test_cache_await_spawn_hang.py:
    import asyncio

    # The framework names used below (cache, do, EffectGenerator, slog, Await,
    # Spawn, Gather, WithHandler, sqlite_cache_handler, run, default_handlers)
    # are assumed to be imported already; their import lines are omitted here.

    @cache(lifecycle="persistent")
    @do
    def _cached_async_task(i: int) -> EffectGenerator[str]:
        yield slog(msg=f"[{i}] compute start", level="info")
        yield Await(asyncio.sleep(0.01))
        yield slog(msg=f"[{i}] compute done", level="info")
        return f"result-{i}"

    @do
    def _spawn_gather_n(task_factory, n: int) -> EffectGenerator[list]:
        # Spawn all n tasks first, then wait for all of them with Gather.
        tasks = []
        for i in range(n):
            t = yield Spawn(task_factory(i), daemon=False)
            tasks.append(t)
        return list((yield Gather(*tasks)))

    # Run with sqlite_cache_handler
    wrapped = WithHandler(sqlite_cache_handler(None), _spawn_gather_n(_cached_async_task, 500))
    r = run(wrapped, handlers=default_handlers())

Run the test with:

    uv run pytest tests/test_cache_await_spawn_hang.py -v --timeout=60
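As a control, the same task shape can be run on plain asyncio with no effect framework and no cache. This is only a sketch for ruling out the event loop itself as the limiting factor; the names plain_async_task, gather_n, and run_baseline are hypothetical and are not part of the test file.

```python
import asyncio


async def plain_async_task(i: int) -> str:
    # Same body as the cached task, minus the cache and the logging effects.
    await asyncio.sleep(0.01)
    return f"result-{i}"


async def gather_n(n: int) -> list[str]:
    # Schedule all n tasks up front, then wait for every one of them.
    tasks = [asyncio.create_task(plain_async_task(i)) for i in range(n)]
    return list(await asyncio.gather(*tasks))


def run_baseline(n: int = 500, timeout: float = 60.0) -> list[str]:
    # wait_for raises a timeout error instead of blocking forever if the
    # gather never completes.
    return asyncio.run(asyncio.wait_for(gather_n(n), timeout=timeout))


if __name__ == "__main__":
    print(len(run_baseline(500)))
```

If this control completes at N=500 while the cached variant does not, that is consistent with the hang originating in the cache and Await interaction rather than in asyncio scheduling.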
Results
| Test | N | Result |
|---|---|---|
| no cache + Await + Spawn | 500 | ✅ PASS (46s) |
| cache + Await + Spawn | 50 | ✅ PASS (4s) |
| cache + Await + Spawn | 100 | ✅ PASS |
| cache + Await + Spawn | 200 | ❌ HANG (timeout) |
| cache + Await + Spawn | 500 | ❌ HANG (timeout) |
Observations
- Without cache(), the same Await + Spawn+Gather pattern works, including at N=500
- With cache(), tasks that miss the cache enter compute_and_cache(), and the inner yield Await(asyncio.sleep(0.01)) never completes
- The log shows [i] compute start but never [i] compute done
- This blocks real-world pipelines (e.g., 754 cached LLM calls via Spawn+Gather)
- The threshold is somewhere between N=100 and N=200
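The threshold between N=100 and N=200 could be narrowed with a binary search over N. The following is a minimal sketch, assuming the hang is deterministic and monotonic in N (every N above the threshold also hangs), which this report has not verified. The helper first_failing_n and the stand-in predicate fake_passes are hypothetical; the value 137 is made up purely to exercise the search.

```python
from typing import Callable


def first_failing_n(passes: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest n in (lo, hi] for which passes(n) is False.

    Assumes passes(lo) is True, passes(hi) is False, and that failure is
    monotonic in n.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(mid):
            lo = mid  # mid still completes, so the threshold is above it
        else:
            hi = mid  # mid hangs, so the threshold is at or below it
    return hi


def fake_passes(n: int) -> bool:
    # Stand-in predicate with an invented threshold; replace with a real check.
    return n < 137


if __name__ == "__main__":
    print(first_failing_n(fake_passes, 100, 200))
```

In practice, passes(n) would run the reproduction at size n under a timeout and return False when the timeout fires. Starting from the known bounds of 100 (pass) and 200 (hang), this takes at most seven runs.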
Impact
This blocks any pipeline that uses cache() on async operations (LLM calls, HTTP requests) with Spawn+Gather at scale. Discovered while running a news indexing pipeline with ~754 events.