[Feature 3/N] ObjectKey <-> CacheEngineKey Mapping Utilities #220

@hlin99

Label
Please label your issue with "new feature", "l2-adapter", "multiprocess", "pd-backend", "pr/3", and any other relevant labels.

Is your feature request related to a problem? Please describe.
MP (multiprocess) mode and the PD backend require a reliable, lossless two-way mapping between ObjectKey and CacheEngineKey string representations. Right now, adapter, backend, and controller code each have their own implementation (or hardcoded logic) for key serialization/deserialization, which is error-prone and hard to test. To guarantee key identity, compatibility with the NIXL and PD wire protocols, and robustness in edge cases, all of this logic should be centralized, strictly typed, and thoroughly unit-tested.

Describe the solution you'd like

The information below should be sufficient for a zero-context implementation.

1. Location

  • Implementation: lmcache/v1/distributed/l2_adapters/key_mapper.py
  • Unit test: tests/v1/distributed/l2_adapters/test_key_mapper.py

2. Key Data Structures Reference

ObjectKey (lmcache/v1/distributed/api.py):

@dataclass(frozen=True)
class ObjectKey:
    chunk_hash: bytes        # content hash of the chunk
    model_name: str          # model identifier (no '@')
    kv_rank: int             # computed via ComputeKVRank()
    cache_salt: str = ""     # per-user isolation (no '@/\\\x00')

CacheEngineKey (lmcache/utils.py):

@dataclass(slots=True)
class CacheEngineKey:
    model_name: str
    world_size: int
    worker_id: int
    chunk_hash: int          # integer, serialized as hex
    dtype: torch.dtype
    request_configs: Optional[dict] = ...

3. API to Implement

def objectkey_to_cachekey_str(obj_key: ObjectKey) -> str:
    """Convert ObjectKey to a CacheEngineKey-compatible string.

    Format: "{model_name}@{kv_rank}@{chunk_hash_hex}[@{cache_salt}]"
    - chunk_hash: bytes → hex string (e.g. b'\xde\xad' → "dead")
    - kv_rank: decimal int
    - cache_salt: omitted (no trailing @) when empty
    """

def cachekey_str_to_objectkey(s: str) -> ObjectKey:
    """Parse a CacheEngineKey-compatible string back to ObjectKey.

    Must be strict inverse:
        cachekey_str_to_objectkey(objectkey_to_cachekey_str(x)) == x
    Raises ValueError on malformed input.
    """
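
As a starting point, the two functions above could be sketched as follows. This is a minimal sketch under stated assumptions, not the final implementation: the ObjectKey class here is a local stand-in so the snippet runs on its own (the real one lives in lmcache/v1/distributed/api.py and also validates its fields), and rejecting a trailing "@" with an empty salt is one possible reading of "malformed input".

```python
from dataclasses import dataclass

SEP = "@"


@dataclass(frozen=True)
class ObjectKey:
    """Local stand-in for lmcache.v1.distributed.api.ObjectKey (assumption)."""
    chunk_hash: bytes
    model_name: str
    kv_rank: int
    cache_salt: str = ""


def objectkey_to_cachekey_str(obj_key: ObjectKey) -> str:
    # Defensive re-check; per the issue, ObjectKey.__post_init__ already enforces it.
    if SEP in obj_key.model_name or SEP in obj_key.cache_salt:
        raise ValueError("'@' is not allowed in model_name or cache_salt")
    parts = [obj_key.model_name, str(obj_key.kv_rank), obj_key.chunk_hash.hex()]
    if obj_key.cache_salt:
        # Only a non-empty salt adds a fourth field, so there is no trailing '@'.
        parts.append(obj_key.cache_salt)
    return SEP.join(parts)


def cachekey_str_to_objectkey(s: str) -> ObjectKey:
    parts = s.split(SEP)
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 '@'-separated fields, got {len(parts)}")
    model_name, rank_str, hash_hex = parts[:3]
    cache_salt = parts[3] if len(parts) == 4 else ""
    if len(parts) == 4 and not cache_salt:
        # Assumption: the serializer never emits this form, so treat it as malformed.
        raise ValueError("trailing '@' with empty cache_salt")
    try:
        kv_rank = int(rank_str)
        chunk_hash = bytes.fromhex(hash_hex)
    except ValueError as exc:
        raise ValueError(f"malformed key string {s!r}") from exc
    return ObjectKey(
        chunk_hash=chunk_hash,
        model_name=model_name,
        kv_rank=kv_rank,
        cache_salt=cache_salt,
    )
```

Because "@" is rejected in model_name and cache_salt, a plain split on "@" is unambiguous, and the field count alone distinguishes the salted from the unsalted form.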

4. Key Design Rules

  • Round-trip guarantee: cachekey_str_to_objectkey(objectkey_to_cachekey_str(key)) == key
  • Separator: @ (consistent with existing CacheEngineKey.to_string())
  • chunk_hash encoding: bytes.hex() / bytes.fromhex()
  • cache_salt: if empty → omit trailing @; if present → append as last field
  • Validation: reject @ in model_name/cache_salt (ObjectKey __post_init__ already enforces)
  • No torch dependency: pure Python + ObjectKey import only
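
One subtlety in the encoding rules above may be worth covering in tests: Python's int() and bytes.fromhex() accept more inputs than str() and bytes.hex() ever produce (uppercase hex, embedded whitespace, a leading "+", underscores). The issue only requires the ObjectKey → str → ObjectKey direction, so lenient parsing would still satisfy it. If the string is also compared byte-for-byte on the wire (an assumption), a stricter parse along these lines could help. The helper names are hypothetical, and treating kv_rank as non-negative is also an assumption.

```python
import re

# Canonical forms only: what bytes.hex() and str(int) emit for valid inputs.
_HEX_RE = re.compile(r"(?:[0-9a-f]{2})*")   # lowercase, whole bytes, may be empty
_RANK_RE = re.compile(r"0|[1-9][0-9]*")     # non-negative decimal, no leading zeros


def parse_chunk_hash_strict(hex_str: str) -> bytes:
    """Decode hex, rejecting forms that bytes.fromhex() would tolerate."""
    if not _HEX_RE.fullmatch(hex_str):
        raise ValueError(f"non-canonical chunk_hash hex: {hex_str!r}")
    return bytes.fromhex(hex_str)


def parse_kv_rank_strict(rank_str: str) -> int:
    """Decode a decimal rank, rejecting forms that int() would tolerate."""
    if not _RANK_RE.fullmatch(rank_str):
        raise ValueError(f"non-canonical kv_rank: {rank_str!r}")
    return int(rank_str)


# The lenient built-ins accept all of these non-canonical spellings:
assert bytes.fromhex("DE AD") == b"\xde\xad"
assert int("+42") == int(" 42 ") == int("4_2") == 42
```

With these helpers, any string that parses successfully re-serializes to exactly the same string, which gives the mapping a round-trip property in both directions.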

5. Unit Tests Required

Test                                  Description
test_round_trip_basic                 Standard ObjectKey → str → ObjectKey equality
test_round_trip_with_cache_salt       Non-empty cache_salt survives round-trip
test_round_trip_empty_salt            cache_salt="" serializes identically to an omitted cache_salt (no trailing @) and round-trips
test_chunk_hash_hex_encoding          Verify hex encoding/decoding of bytes
test_error_on_missing_fields          Malformed string raises ValueError
test_error_on_too_many_fields         Strings with extra @ raise ValueError
test_kv_rank_range                    Large kv_rank values survive round-trip
test_known_pdbackend_compatibility    Known PD wire format strings parse correctly

Example:

from lmcache.v1.distributed.api import ObjectKey
from lmcache.v1.distributed.l2_adapters.key_mapper import (
    objectkey_to_cachekey_str, cachekey_str_to_objectkey
)

def test_round_trip_basic():
    key = ObjectKey(
        chunk_hash=b'\xde\xad\xbe\xef',
        model_name="meta-llama/Llama-3-8B",
        kv_rank=42,
    )
    s = objectkey_to_cachekey_str(key)
    assert s == "meta-llama/Llama-3-8B@42@deadbeef"
    assert cachekey_str_to_objectkey(s) == key

def test_round_trip_with_cache_salt():
    key = ObjectKey(
        chunk_hash=b'\x01\x02',
        model_name="my-model",
        kv_rank=0,
        cache_salt="user-abc",
    )
    s = objectkey_to_cachekey_str(key)
    assert s == "my-model@0@0102@user-abc"
    assert cachekey_str_to_objectkey(s) == key

Describe alternatives you've considered

  • Letting each adapter duplicate serialization logic (error-prone, untestable).
  • Using CacheEngineKey.to_string() directly (incompatible: includes dtype/tags which ObjectKey lacks).
  • Embedding mapping inside PdL2Adapter (breaks reuse for other adapters).

Additional context
