cache() + Await + Spawn/Gather hangs at N>~100 tasks #342

@proboscis

Summary

When cache(lifecycle="persistent") wraps a @do function containing yield Await(asyncio.sleep(...)), and many instances are dispatched via Spawn+Gather, the program hangs indefinitely once N exceeds roughly 100 tasks (N=100 passes, N=200 hangs).

The Await inside compute_and_cache() never returns: tasks log "compute start" but never reach "compute done".

Reproduction

Self-contained test added in tests/test_cache_await_spawn_hang.py:

@cache(lifecycle="persistent")
@do
def _cached_async_task(i: int) -> EffectGenerator[str]:
    yield slog(msg=f"[{i}] compute start", level="info")
    yield Await(asyncio.sleep(0.01))
    yield slog(msg=f"[{i}] compute done", level="info")
    return f"result-{i}"

@do
def _spawn_gather_n(task_factory, n: int) -> EffectGenerator[list]:
    tasks = []
    for i in range(n):
        t = yield Spawn(task_factory(i), daemon=False)
        tasks.append(t)
    return list((yield Gather(*tasks)))

# Run with sqlite_cache_handler
wrapped = WithHandler(sqlite_cache_handler(None), _spawn_gather_n(_cached_async_task, 500))
r = run(wrapped, handlers=default_handlers())

Run the test with:

uv run pytest tests/test_cache_await_spawn_hang.py -v --timeout=60
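
The --timeout flag above comes from the pytest-timeout plugin. For checking a reproduction script outside pytest, a minimal standard-library sketch such as the following could classify a run as PASS, FAIL, or HANG. The helper name classify and the idea of passing the reproduction as a source string are assumptions for illustration, not part of the test file.

```python
import subprocess
import sys


def classify(script: str, timeout_s: float) -> str:
    """Run `script` in a fresh interpreter and return "PASS", "FAIL", or "HANG".

    Hypothetical helper: `script` is Python source for one reproduction run.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", script],
            timeout=timeout_s,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising,
        # so a hung run cannot block the caller.
        return "HANG"
    # Exit code 0 means the script ran to completion without an error.
    return "PASS" if proc.returncode == 0 else "FAIL"
```

Running each N in its own interpreter keeps a hung run from contaminating the next one, at the cost of interpreter start-up time per run.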

Results

| Test | N | Result |
|---|---|---|
| no cache + Await + Spawn | 500 | ✅ PASS (46s) |
| cache + Await + Spawn | 50 | ✅ PASS (4s) |
| cache + Await + Spawn | 100 | ✅ PASS |
| cache + Await + Spawn | 200 | ❌ HANG (timeout) |
| cache + Await + Spawn | 500 | ❌ HANG (timeout) |
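
The results bracket the failure between N=100 (pass) and N=200 (hang). One way to narrow that range is a binary search over N, sketched below. The function name first_hanging_n and the hangs callback are hypothetical, and the search assumes the behaviour is monotonic in N (every N above the threshold hangs), which the results suggest but do not establish.

```python
from typing import Callable


def first_hanging_n(last_pass: int, first_hang: int,
                    hangs: Callable[[int], bool]) -> int:
    """Return the smallest N in (last_pass, first_hang] for which hangs(N) is True.

    Assumes hangs(last_pass) is False, hangs(first_hang) is True, and that
    hanging is monotonic in N. `hangs` is a caller-supplied probe, e.g. one
    that runs the reproduction at size N under a timeout.
    """
    lo, hi = last_pass, first_hang
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if hangs(mid):
            hi = mid  # threshold is at or below mid
        else:
            lo = mid  # threshold is above mid
    return hi
```

Starting from the 100..200 bracket, this needs at most 7 probes, each of which costs up to one timeout when the run hangs.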

Observations

  • Without cache(), the same Await + Spawn+Gather pattern works (tested at N=500)
  • With cache(), tasks that miss the cache enter compute_and_cache(), where the inner yield Await(asyncio.sleep(0.01)) never completes
  • The log shows [i] compute start but never [i] compute done
  • This blocks real-world pipelines (e.g., 754 cached LLM calls via Spawn+Gather)
  • The threshold is somewhere between N=100 and N=200
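
To see which task indices are stuck, the log can be scanned for tasks that logged "compute start" without a matching "compute done". The sketch below assumes each log line contains the message text emitted by the reproduction ("[i] compute start" / "[i] compute done"), possibly with a prefix such as a timestamp or level. The function name stuck_tasks is hypothetical.

```python
import re
from typing import Iterable

# Matches the slog messages from the reproduction, e.g. "[17] compute start".
_EVENT = re.compile(r"\[(\d+)\] compute (start|done)")


def stuck_tasks(lines: Iterable[str]) -> list[int]:
    """Return sorted task indices that started computing but never finished."""
    started: set[int] = set()
    done: set[int] = set()
    for line in lines:
        m = _EVENT.search(line)
        if m is None:
            continue  # unrelated log line
        idx = int(m.group(1))
        (started if m.group(2) == "start" else done).add(idx)
    return sorted(started - done)


sample = [
    "INFO [0] compute start",
    "INFO [1] compute start",
    "INFO [0] compute done",
    "INFO [2] compute start",
]
print(stuck_tasks(sample))  # prints [1, 2]
```

Whether all tasks stall or only those beyond some index would be a useful detail to attach to the report.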

Impact

This blocks any pipeline that uses cache() on async operations (LLM calls, HTTP requests) with Spawn+Gather at scale. Discovered while running a news indexing pipeline with ~754 events.
