[Feature 6/N] PdL2Adapter: Proxy Notification Async Flow & Protocol Compliance #225

@hlin99

Label
Please label your issue with "new feature", "l2-adapter", "multiprocess", "pd-backend", "pr/6", and other relevant labels.

Is your feature request related to a problem? Please describe.
With the full store/load path in place (PR 5/N), the PdL2Adapter still lacks an async, protocol-compliant mechanism for notifying the proxy (the central orchestration service) once a transfer completes, for example after the last prefill request or a dedicated store. Without proxy notification (the PD ProxyNotif protocol), the end-to-end pipeline, client progress polling, and prefetch/decode orchestration cannot function reliably or be tested in CI.

Describe the solution you'd like

  • In lmcache/v1/distributed/l2_adapters/pd_l2_adapter.py:
    • Implement async (non-blocking) proxy notification using the PD protocol (ProxyNotif, msgspec encoding) over a ZMQ PUSH socket to the proxy specified in the config.
    • Notification must trigger only on completion (all data landed), and only for the last batch/last prefill (see PR 5/N/is_last_prefill).
    • Handle and log errors, retrying where possible, but never block the main event loop.
    • All notification/message structures and docstrings must match and reference ProxyNotif in storage_backend/pd_backend.py ([Feature 5/N] PdL2Adapter: Store/Load Full Data Path (L1 ↔ Staging ↔ RDMA) #223).
    • All socket/context setup must be tested for correct threading and teardown (CI must pass with a mock proxy and strict msgspec validation).
    • Cover:
      • test_proxy_notif_sent_on_last_prefill
      • test_notif_not_sent_on_nonlast
      • test_notif_exception_logs
      • test_proxy_socket_lifecycle
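
The notification flow described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the adapter's actual implementation: `ProxyNotifSketch`, `ProxyNotifier`, `notify_if_last`, and the `req_id` field are hypothetical names, the real `ProxyNotif` fields are defined in storage_backend/pd_backend.py, `json` stands in for msgspec encoding, and an injected `send` callable stands in for the ZMQ PUSH socket so the sketch stays stdlib-only.

```python
import asyncio
import json
import logging
from dataclasses import asdict, dataclass
from typing import Callable

logger = logging.getLogger("pd_l2_adapter_sketch")


@dataclass(frozen=True)
class ProxyNotifSketch:
    """Hypothetical stand-in for ProxyNotif; the real fields may differ."""
    req_id: str


class ProxyNotifier:
    """Schedules proxy notifications without blocking the caller (illustrative)."""

    def __init__(self, send: Callable[[bytes], None],
                 max_retries: int = 2, backoff_s: float = 0.05):
        self._send = send              # assumed to wrap the real transport
        self._max_retries = max_retries
        self._backoff_s = backoff_s
        self._tasks = set()            # keep references so tasks are not collected

    def notify_if_last(self, req_id: str, is_last_prefill: bool) -> bool:
        """Schedule a notification only for the last prefill; never awaits."""
        if not is_last_prefill:
            return False
        task = asyncio.get_running_loop().create_task(self._send_with_retry(req_id))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return True

    async def _send_with_retry(self, req_id: str) -> bool:
        payload = json.dumps(asdict(ProxyNotifSketch(req_id))).encode()
        for attempt in range(1, self._max_retries + 2):
            try:
                # Run the possibly blocking send off the event loop thread.
                await asyncio.to_thread(self._send, payload)
                return True
            except Exception:
                logger.exception("proxy notification for %s failed (attempt %d)",
                                 req_id, attempt)
                await asyncio.sleep(self._backoff_s * attempt)
        return False

    async def close(self) -> None:
        """Wait for in-flight notifications before tearing down the transport."""
        if self._tasks:
            await asyncio.gather(*list(self._tasks), return_exceptions=True)
```

Failures are logged and retried inside the background task, so an unreachable proxy never raises into the store path. Because ZMQ sockets are not thread-safe, a real `send` callable would need to confine the socket to a single thread rather than rely on the default thread pool as this sketch does.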

Describe alternatives you've considered

  • Synchronous notification, or notification embedded in the transfer: blocks the caller, is not composable, and is error-prone and hard to test end to end.
  • Adding proxy logic to the controller instead of the adapter: breaks modularity and results in a nonstandard protocol/integration.
  • Polling only, or an out-of-band DB mark: cannot trigger timely coordination and does not serve the fast path.

Additional context
