Releases: xPyD-hub/xPyD-proxy
v1.3.0
v1.2.0
What's Changed
- chore: bump version to v1.2.0 by @hlin99 in #127
- ci: make release workflow idempotent for existing releases by @hlin99 in #128
- feat: add PD disaggregation Prometheus metrics with KV transfer time by @hlin99 in #130
- fix: TTFT calculation handles both P-first and D-first token paths by @hlin99 in #131
- refactor: replace dummy_nodes with real xpyd-sim by @hlin99 in #135
- chore: rename PyPI package from xpyd to xpyd-proxy by @hlin99 in #136
Full Changelog: v1.1.0...v1.2.0
v1.1.0
xPyD Proxy v1.1.0
A lightweight Prefill-Decode disaggregated proxy for LLM serving.
✨ Features
- CLI tooling — xpyd proxy for serving, xpyd fix-config for config auto-correction, xpyd validate-config for validation
- YAML configuration — Flexible config with validation, auto-fix, and template generation
- Structured logging — OpenAI-compatible error format with structured context fields
- Full type annotations — py.typed marker, all public methods annotated
- Prometheus metrics — Request counts, latencies, per-instance stats at /metrics
- 5 scheduling policies — Load balanced, round robin, consistent hash, power of two choices, cache-aware routing
- Circuit breaker & health monitor — Auto-isolate failing instances, gradual recovery with probe requests
- Multi-model routing — Single proxy serves multiple models, routes by model field, OpenAI-compatible /v1/models endpoint
- Dual-role instances — Single-pass mode for instances handling both prefill and decode, zero KV transfer overhead
- Per-model scheduler — Each model can use its own scheduling strategy with automatic fallback
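As an illustration of the "power of two choices" policy named in the feature list, here is a minimal sketch of the general technique. It is not the xPyD-proxy implementation: the function name `pick_power_of_two` and the plain dict of per-instance active request counts are assumptions made for this example.

```python
import random


def pick_power_of_two(active_requests, rng=random):
    """Pick an instance by sampling two at random and keeping the less loaded one.

    `active_requests` maps a (hypothetical) instance name to its current
    number of in-flight requests. This is a sketch of the general technique,
    not the proxy's actual scheduler.
    """
    names = list(active_requests)
    if not names:
        raise ValueError("no instances available")
    if len(names) == 1:
        return names[0]
    # Sample two distinct candidates, then compare their load.
    first, second = rng.sample(names, 2)
    if active_requests[first] <= active_requests[second]:
        return first
    return second


if __name__ == "__main__":
    load = {"decode-0": 7, "decode-1": 2, "decode-2": 4}
    print(pick_power_of_two(load))
```

Comparing only two random candidates avoids scanning every instance on each request while still steering traffic away from the busiest ones.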
📊 Stats
- 4,400+ lines of source, 9,400+ lines of tests, 480+ test cases
- Python 3.10, 3.11, 3.12
What's Changed
- refactor: unify completion and chat-completion handlers by @hlin99 in #115
- refactor: extract route handlers into core/routes/ module by @hlin99 in #116
- refactor: rename core/ to xpyd/, unify package imports by @hlin99 in #117
- feat: YAML-only config with xpyd proxy subcommand by @hlin99 in #119
- chore: bump version to 1.1.0 by @hlin99 in #120
- refactor: quality hardening — unified errors, structured logging, type annotations by @hlin99 in #121
- feat: multi-model routing support by @hlin99 in #122
- feat: quality hardening (error handling, type annotations) by @hlin99 in #124
- feat: dual-role instance support with per-model scheduler by @hlin99 in #125
- feat: fix-config CLI and README update by @hlin99 in #126
Full Changelog: v1.0.0...v1.1.0
v1.0.0
What's Changed
- feat: add /metrics Prometheus endpoint (Task 5) by @hlin99 in #49
- feat: centralized ProxyConfig with Pydantic validation (Task 6) by @hlin99 in #57
- docs: mark Task 6 as DONE, Task 7 as IN PROGRESS by @hlin99 in #61
- docs: add Task 8 (pdproxy CLI + startup discovery), renumber resilience to Task 9 by @hlin99 in #62
- feat: YAML config file support (Task 7) by @hlin99 in #60
- docs: clean up tasklist — reorder, update all statuses by @hlin99 in #63
- docs: add mandatory fetch-latest rule to BOT_POLICY.md by @hlin99 in #65
- feat: YAML config with topology expansion (Task 7) by @hlin99 in #66
- docs: rewrite Task 9 with detailed goals, examples, and verification by @hlin99 in #69
- docs: add Task 10 — advanced load balancing strategies by @hlin99 in #68
- docs: add configuration, scheduling, resilience, and CLI documentation by @hlin99 in #71
- feat: pdproxy CLI packaging + startup node discovery (Task 8) by @hlin99 in #70
- ci: improve build job — wheel build, install, smoke test, and tests by @hlin99 in #73
- refactor: rename CLI from pdproxy to xpyd by @hlin99 in #75
- docs: rename node to instance in tasklist Task 9a spec by @hlin99 in #78
- feat: Instance Registry for instance state management (Task 9a) by @hlin99 in #74
- docs: add commit identity rules — noreply email only, no Co-authored-by by @hlin99 in #80
- docs: add re-requested review priority rule to BOT_POLICY.md by @hlin99 in #81
- feat: per-instance Circuit Breaker with state machine (Task 9b) by @hlin99 in #76
- feat: background Health Monitor with async node probing (Task 9d) by @hlin99 in #77
- docs: replace per-node with per-instance in resilience docs by @hlin99 in #82
- feat: resilience module with retry, exponential backoff and jitter (Task 9c) by @hlin99 in #83
- feat: integrate registry, circuit breaker, and health monitor into proxy (Task 9 integration) by @hlin99 in #84
- docs: update task status markers in tasklist by @hlin99 in #87
- feat: Policy Registry for extensible scheduling strategies (Task 10d) by @hlin99 in #85
- feat: Consistent Hash scheduling policy with virtual nodes (Task 10a) by @hlin99 in #86
- feat: Power of Two Choices scheduling policy (Task 10b) by @hlin99 in #88
- feat: Cache-Aware Routing scheduling policy (Task 10c) by @hlin99 in #89
- feat: wire advanced scheduling policies into proxy router by @hlin99 in #91
- test: add e2e benchmark tests for CI by @hlin99 in #94
- refactor: reorganize test directory structure by @hlin99 in #95
- chore: retire tasklist_openclaw.md in favor of GitHub Issues by @hlin99 in #107
- ci: add automatic release workflow on tag push by @hlin99 in #108
- chore: bump version to 1.0.0 by @hlin99 in #109
Full Changelog: v0.1.0...v1.0.0
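The v1.0.0 list above mentions a resilience module with retry, exponential backoff and jitter (#83). The sketch below shows one common form of that idea, capped exponential backoff with "full jitter". The release notes do not say which jitter variant the project uses, so the function name `backoff_delays` and its parameters and defaults are assumptions for illustration only.

```python
import random


def backoff_delays(retries, base=0.1, cap=5.0, rng=random):
    """Yield one sleep duration (in seconds) per retry attempt.

    The upper bound doubles on each attempt, starting at `base` and capped
    at `cap`. The actual delay is drawn uniformly from [0, bound] so that
    concurrent clients do not retry in lockstep. All names and defaults
    here are hypothetical, not taken from xPyD-proxy.
    """
    for attempt in range(retries):
        bound = min(cap, base * (2 ** attempt))
        yield rng.uniform(0.0, bound)


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(5), start=1):
        print(f"retry {attempt}: sleep up to {delay:.3f}s")
```

A caller would sleep for each yielded delay between attempts and give up once the generator is exhausted.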