[Feature 1/N] PdL2AdapterConfig: Typed config & registration for PD L2 Adapter (PR 1/7) #217

@hlin99

Description

Label
Please label your issue with "new feature", "l2-adapter", "multiprocess", "pd-backend", "pr/1", and any other relevant labels.

Is your feature request related to a problem? Please describe.
In LMCache multi-process (MP) mode, using the PD (Prefill/Decode) backend as an L2 adapter for high-performance LLM KV cache pipelines requires a typed, thoroughly validated config object with CLI/JSON integration. Without it, downstream code (store, load, event triggering) cannot be developed or tested independently. Async proxy notification will be handled in later PRs.

Describe the solution you'd like

Please strictly follow the structure below so that any new contributor or agent can start on a real code PR.

  1. Location & Naming

    • Implementation file: lmcache/v1/distributed/l2_adapters/pd_l2_adapter.py (the name must end with _l2_adapter.py so that it is auto-registered)
    • Unit test: tests/v1/distributed/l2_adapters/test_l2_pd_adapter_config.py
  2. Class: PdL2AdapterConfig

    • Inherits: L2AdapterConfigBase (from lmcache.v1.distributed.l2_adapters.config import L2AdapterConfigBase)
    • Every field must have a type annotation, a default (if any), and a descriptive docstring.

    Field            | Type      | Default  | Required | Description
    role             | str       | -        | Y        | "sender" or "receiver"; must be validated in from_dict
    peer_host        | str       | -        | Y        | Remote peer hostname/IP
    peer_init_port   | List[int] | -        | Y        | Per-TP-rank init ports (e.g. NIXL handshakes); [9051] for TP=1
    peer_alloc_port  | List[int] | -        | Y        | Per-TP-rank alloc ports; [9052] for TP=1
    proxy_host       | str       | ""       | N        | Proxy notification host (sender only)
    proxy_port       | int       | 0        | N        | Proxy notification port (sender only)
    buffer_size      | int       | 67108864 | N        | Staging buffer size per rank, in bytes (default 64 MB)
    buffer_device    | str       | "cpu"    | N        | "cpu" or "cuda"
    transfer_channel | str       | "nixl"   | N        | "nixl" or "mock_memory"
    nixl_backends    | List[str] | ["tcp"]  | N        | NIXL transport backends

    • Use Python @dataclass(frozen=True).
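
The field table can be read as a dataclass declaration. The following is a minimal sketch under stated assumptions, not the final implementation: L2AdapterConfigBase here is an empty stand-in for the real LMCache base class (whose definition is not shown in this issue), and the per-field docstrings requested above are abbreviated to comments.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from typing import List


class L2AdapterConfigBase:
    """Hypothetical stand-in; the real base class lives in LMCache."""


@dataclass(frozen=True)
class PdL2AdapterConfig(L2AdapterConfigBase):
    # Required fields have no default, so they must be declared first.
    role: str                   # "sender" or "receiver"
    peer_host: str              # remote peer hostname/IP
    peer_init_port: List[int]   # per-TP-rank init ports, e.g. [9051] for TP=1
    peer_alloc_port: List[int]  # per-TP-rank alloc ports, e.g. [9052] for TP=1
    # Optional fields, with the defaults listed in the table.
    proxy_host: str = ""
    proxy_port: int = 0
    buffer_size: int = 64 * 1024 * 1024  # 67108864 bytes
    buffer_device: str = "cpu"
    transfer_channel: str = "nixl"
    # Mutable defaults need default_factory so instances do not share a list.
    nixl_backends: List[str] = field(default_factory=lambda: ["tcp"])


cfg = PdL2AdapterConfig(
    role="sender",
    peer_host="10.0.0.1",
    peer_init_port=[9051],
    peer_alloc_port=[9052],
)

try:
    cfg.role = "receiver"  # frozen=True rejects plain attribute assignment
except FrozenInstanceError:
    print("config is immutable")
```

One consequence of frozen=True worth checking against the real base class: on a frozen instance, a plain assignment such as cfg.eviction_config = ... raises FrozenInstanceError, so from_dict may need object.__setattr__ or constructor arguments to attach those values.
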
  3. Registration & Factory

    • At the end of your pd_l2_adapter.py, add:
      from lmcache.v1.distributed.l2_adapters.config import register_l2_adapter_type
      register_l2_adapter_type("pd", PdL2AdapterConfig)
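
To illustrate what the registration call is for, here is a rough sketch of a name-to-class registry. Everything in it is a local stand-in written for this example: the two functions only borrow the names used in this issue, config_from_dict is a hypothetical dispatcher, and dispatching on the "type" key is an assumption. LMCache's actual registry may behave differently.

```python
from typing import Dict, List

# Stand-in registry; NOT the LMCache implementation.
_REGISTRY: Dict[str, type] = {}


def register_l2_adapter_type(name: str, config_cls: type) -> None:
    """Stand-in: remember which config class handles a given type name."""
    _REGISTRY[name] = config_cls


def get_registered_l2_adapter_types() -> List[str]:
    """Stand-in: list the registered type names."""
    return sorted(_REGISTRY)


def config_from_dict(d: dict):
    """Hypothetical dispatcher: choose the config class from d["type"]."""
    try:
        config_cls = _REGISTRY[d["type"]]
    except KeyError:
        raise ValueError(f"unknown L2 adapter type: {d.get('type')!r}") from None
    return config_cls.from_dict(d)


class PdL2AdapterConfig:
    """Placeholder carrying only what this demo needs."""

    def __init__(self, role: str) -> None:
        self.role = role

    @classmethod
    def from_dict(cls, d: dict) -> "PdL2AdapterConfig":
        return cls(d["role"])


# Mirrors the call placed at the end of pd_l2_adapter.py.
register_l2_adapter_type("pd", PdL2AdapterConfig)
```

With a registry of this shape, test_registered_in_factory reduces to checking that "pd" appears in get_registered_l2_adapter_types().
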
  4. Required methods & base logic:

    • @classmethod def from_dict(cls, d: dict) -> "PdL2AdapterConfig":
      • Must check for all required fields.
      • role must be "sender" or "receiver"; otherwise raise ValueError.
      • Must set eviction_config = cls._parse_eviction_config(d) and persist_config = cls._parse_persist_config(d) (follow the way existing L2AdapterConfigBase subclasses use them).
    • @classmethod def help(cls) -> str:
      • Should state whether each field is required or optional; the text is intended for CLI/JSON usage.
  5. Example reference (do not copy code, see style):

    @dataclass(frozen=True)
    class FsL2AdapterConfig(L2AdapterConfigBase):
        path: str
        persist_enabled: bool = True
        ...
        @classmethod
        def from_dict(cls, d: dict) -> "FsL2AdapterConfig":
            ...
            cfg.eviction_config = cls._parse_eviction_config(d)
            ...
  6. Unit Tests Required:

    • Place in tests/v1/distributed/l2_adapters/test_l2_pd_adapter_config.py
    • UT checklist:
      • test_parse_minimal_sender_config (just required fields)
      • test_parse_minimal_receiver_config
      • test_fail_on_missing_required (any required field omitted should raise)
      • test_fail_on_invalid_role
      • test_help_contains_all_field_names
      • test_registered_in_factory (ensure "pd" in get_registered_l2_adapter_types())
      • test_all_fields_round_trip (pass all fields and check they set/are reflected in the instance)
      • test_defaults_applied

    Example UT (pytest):

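    from lmcache.v1.distributed.l2_adapters.pd_l2_adapter import PdL2AdapterConfig


    def test_parse_valid_sender_config():
        # Only the required fields are supplied; defaults fill in the rest.
        d = {"type": "pd", "role": "sender", "peer_host": "10.0.0.1",
             "peer_init_port": [9051], "peer_alloc_port": [9052]}
        cfg = PdL2AdapterConfig.from_dict(d)
        assert cfg.role == "sender"
        assert cfg.peer_host == "10.0.0.1"
        assert cfg.buffer_size == 67108864  # 64 MB default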
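
For test_fail_on_missing_required, one possible approach is to generate one input per required field, each with exactly that field removed. The helper below is a hypothetical sketch (drop_each_required, REQUIRED, and MINIMAL_SENDER are names invented for this example); the commented-out pytest usage assumes the PdL2AdapterConfig class from this issue and does not assume a specific exception type, since the issue leaves it open.

```python
import copy
from typing import Any, Dict, Iterator, Tuple

REQUIRED = ("role", "peer_host", "peer_init_port", "peer_alloc_port")

MINIMAL_SENDER: Dict[str, Any] = {
    "type": "pd",
    "role": "sender",
    "peer_host": "10.0.0.1",
    "peer_init_port": [9051],
    "peer_alloc_port": [9052],
}


def drop_each_required(base: Dict[str, Any]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (field_name, config_dict) pairs, each missing one required key."""
    for name in REQUIRED:
        d = copy.deepcopy(base)  # leave the shared base dict untouched
        del d[name]
        yield name, d


# Intended use in the real test module (not executed here):
#
# @pytest.mark.parametrize("name,d", list(drop_each_required(MINIMAL_SENDER)))
# def test_fail_on_missing_required(name, d):
#     with pytest.raises(Exception):
#         PdL2AdapterConfig.from_dict(d)
```

Parametrizing per field makes a failure report name the exact field whose absence was not detected.
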

Describe alternatives you've considered

  • Reimplementing config logic for each new L2Adapter breaks pattern reuse and increases onboarding cost.
  • Using generic config dicts would lose type safety and schema enforcement.

Additional context

  • This is Patch 1/N for full PD L2Adapter integration.
  • Real store/load logic and network or async proxy notification (an async approach is recommended) will come in later PRs/issues.
  • All new classes & UTs should reference or match existing L2 adapter config best practices.
  • For agents/automation: do not skip the _parse_eviction_config/_parse_persist_config hooks!
