Label
Please label your issue with "new feature", "l2-adapter", "multiprocess", "pd-backend", "pr/3", and any other relevant labels.
Is your feature request related to a problem? Please describe.
MP (multiprocess) mode and the PD backend require a reliable, lossless two-way mapping between ObjectKey and CacheEngineKey string representations. Right now, adapter, backend, and controller code each carry their own implementation (or hardcoded logic) for key serialization/deserialization, which is error-prone and hard to test. To guarantee key identity, compatibility with the NIXL and PD wire protocols, and robustness in edge cases, all of this logic should be centralized, strictly typed, and thoroughly unit-tested.
Describe the solution you'd like
The information below is intended to be sufficient for implementation without additional context.
1. Location
- Implementation: lmcache/v1/distributed/l2_adapters/key_mapper.py
- Unit test: tests/v1/distributed/l2_adapters/test_key_mapper.py
2. Key Data Structures Reference
ObjectKey (lmcache/v1/distributed/api.py):
    @dataclass(frozen=True)
    class ObjectKey:
        chunk_hash: bytes     # content hash of the chunk
        model_name: str       # model identifier (no '@')
        kv_rank: int          # computed via ComputeKVRank()
        cache_salt: str = ""  # per-user isolation (no '@/\\\x00')
CacheEngineKey (lmcache/utils.py):
    @dataclass(slots=True)
    class CacheEngineKey:
        model_name: str
        world_size: int
        worker_id: int
        chunk_hash: int       # integer, serialized as hex
        dtype: torch.dtype
        request_configs: Optional[dict] = ...
3. API to Implement
    def objectkey_to_cachekey_str(obj_key: ObjectKey) -> str:
        """Convert ObjectKey to a CacheEngineKey-compatible string.
        Format: "{model_name}@{kv_rank}@{chunk_hash_hex}[@{cache_salt}]"
        - chunk_hash: bytes → hex string (e.g. b'\xde\xad' → "dead")
        - kv_rank: decimal int
        - cache_salt: omitted (no trailing @) when empty
        """

    def cachekey_str_to_objectkey(s: str) -> ObjectKey:
        """Parse a CacheEngineKey-compatible string back to ObjectKey.
        Must be strict inverse:
        cachekey_str_to_objectkey(objectkey_to_cachekey_str(x)) == x
        Raises ValueError on malformed input.
        """
4. Key Design Rules
- Round-trip guarantee: cachekey_str_to_objectkey(objectkey_to_cachekey_str(key)) == key
- Separator: @ (consistent with existing CacheEngineKey.to_string())
- chunk_hash encoding: bytes.hex() / bytes.fromhex()
- cache_salt: if empty → omit trailing @; if present → append as last field
- Validation: reject @ in model_name/cache_salt (ObjectKey __post_init__ already enforces)
- No torch dependency: pure Python + ObjectKey import only
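The rules above can be sketched roughly as follows. This is a minimal illustration, not the final implementation: the ObjectKey class here is a local stand-in for lmcache.v1.distributed.api.ObjectKey (its __post_init__ validation is omitted), and rejecting a trailing @ with an empty salt is an assumption about what "strict" parsing should mean.

```python
from dataclasses import dataclass

SEP = "@"  # separator named in the design rules


@dataclass(frozen=True)
class ObjectKey:
    """Stand-in for lmcache.v1.distributed.api.ObjectKey (validation omitted)."""

    chunk_hash: bytes
    model_name: str
    kv_rank: int
    cache_salt: str = ""


def objectkey_to_cachekey_str(obj_key: ObjectKey) -> str:
    """Serialize as "{model_name}@{kv_rank}@{chunk_hash_hex}[@{cache_salt}]"."""
    fields = [obj_key.model_name, str(obj_key.kv_rank), obj_key.chunk_hash.hex()]
    if obj_key.cache_salt:
        # Only a non-empty salt adds a fourth field, so no trailing "@" appears.
        fields.append(obj_key.cache_salt)
    return SEP.join(fields)


def cachekey_str_to_objectkey(s: str) -> ObjectKey:
    """Parse the string form back into an ObjectKey; ValueError if malformed."""
    parts = s.split(SEP)
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 '@'-separated fields, got {len(parts)}: {s!r}")
    model_name, kv_rank_str, hash_hex = parts[:3]
    cache_salt = parts[3] if len(parts) == 4 else ""
    if len(parts) == 4 and not cache_salt:
        # Assumption: a trailing "@" is never produced by the serializer, so reject it.
        raise ValueError(f"trailing separator with empty cache_salt: {s!r}")
    if not model_name:
        raise ValueError(f"empty model_name: {s!r}")
    try:
        kv_rank = int(kv_rank_str)
        chunk_hash = bytes.fromhex(hash_hex)
    except ValueError as exc:
        raise ValueError(f"malformed kv_rank or chunk_hash in {s!r}") from exc
    return ObjectKey(
        chunk_hash=chunk_hash,
        model_name=model_name,
        kv_rank=kv_rank,
        cache_salt=cache_salt,
    )
```

Note that int() and bytes.fromhex() are lenient about some inputs (for example surrounding whitespace or uppercase hex), so the sketch guarantees key → str → key but not str → key → str. Whether the parser should also reject such non-canonical strings is left open here.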
5. Unit Tests Required
| Test | Description |
| --- | --- |
| test_round_trip_basic | Standard ObjectKey → str → ObjectKey equality |
| test_round_trip_with_cache_salt | Non-empty cache_salt survives round-trip |
| test_round_trip_empty_salt | cache_salt="" produces same result |
| test_chunk_hash_hex_encoding | Verify hex encoding/decoding of bytes |
| test_error_on_missing_fields | Malformed string raises ValueError |
| test_error_on_too_many_fields | Strings with extra @ raise ValueError |
| test_kv_rank_range | Large kv_rank values survive round-trip |
| test_known_pdbackend_compatibility | Known PD wire format strings parse correctly |
Example:
    from lmcache.v1.distributed.api import ObjectKey
    from lmcache.v1.distributed.l2_adapters.key_mapper import (
        cachekey_str_to_objectkey,
        objectkey_to_cachekey_str,
    )

    def test_round_trip_basic():
        key = ObjectKey(
            chunk_hash=b'\xde\xad\xbe\xef',
            model_name="meta-llama/Llama-3-8B",
            kv_rank=42,
        )
        s = objectkey_to_cachekey_str(key)
        assert s == "meta-llama/Llama-3-8B@42@deadbeef"
        assert cachekey_str_to_objectkey(s) == key

    # Named to match the entry in the required-tests table.
    def test_round_trip_with_cache_salt():
        key = ObjectKey(
            chunk_hash=b'\x01\x02',
            model_name="my-model",
            kv_rank=0,
            cache_salt="user-abc",
        )
        s = objectkey_to_cachekey_str(key)
        assert s == "my-model@0@0102@user-abc"
        assert cachekey_str_to_objectkey(s) == key
Describe alternatives you've considered
- Letting each adapter duplicate serialization logic (error-prone, untestable).
- Using CacheEngineKey.to_string() directly (incompatible: includes dtype/tags which ObjectKey lacks).
- Embedding mapping inside PdL2Adapter (breaks reuse for other adapters).
Additional context