Releases: xPyD-hub/xPyD-proxy

v1.3.0

06 Apr 15:51

What's Changed

  • refactor: remove integration/stress tests and sim_adapter by @hlin99 in #141
  • docs: add strict bot policies (non-loop) + update trigger workflow by @hlin99 in #143
  • docs: unified structure — LICENSE, README, CONTRIBUTING by @hlin99 in #144

Full Changelog: v1.2.0...v1.3.0

v1.2.0

05 Apr 11:54
0e12dc3

What's Changed

  • chore: bump version to v1.2.0 by @hlin99 in #127
  • ci: make release workflow idempotent for existing releases by @hlin99 in #128
  • feat: add PD disaggregation Prometheus metrics with KV transfer time by @hlin99 in #130
  • fix: TTFT calculation handles both P-first and D-first token paths by @hlin99 in #131
  • refactor: replace dummy_nodes with real xpyd-sim by @hlin99 in #135
  • chore: rename PyPI package from xpyd to xpyd-proxy by @hlin99 in #136

Full Changelog: v1.1.0...v1.2.0

v1.1.0

03 Apr 16:08
0db5804

xPyD Proxy v1.1.0

A lightweight Prefill-Decode disaggregated proxy for LLM serving.

✨ Features

  • CLI tooling — xpyd proxy for serving, xpyd fix-config for config auto-correction, xpyd validate-config for validation
  • YAML configuration — Flexible config with validation, auto-fix, and template generation
  • Structured logging — OpenAI-compatible error format with structured context fields
  • Full type annotations — py.typed marker, all public methods annotated
  • Prometheus metrics — Request counts, latencies, per-instance stats at /metrics
  • 5 scheduling policies — Load balanced, round robin, consistent hash, power of two choices, cache-aware routing
  • Circuit breaker & health monitor — Auto-isolate failing instances, gradual recovery with probe requests
  • Multi-model routing — Single proxy serves multiple models, routes by model field, OpenAI-compatible /v1/models endpoint
  • Dual-role instances — Single-pass mode for instances handling both prefill and decode, zero KV transfer overhead
  • Per-model scheduler — Each model can use its own scheduling strategy with automatic fallback
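To illustrate one of the scheduling policies listed above, here is a minimal sketch of the general "power of two choices" technique: sample two instances at random and route to the less loaded one. This is not xPyD's actual implementation; the function name `pick_power_of_two`, the `loads` mapping (instance name to in-flight request count), and the instance names are all hypothetical.

```python
import random
from typing import Mapping, Optional


def pick_power_of_two(
    loads: Mapping[str, int],
    rng: Optional[random.Random] = None,
) -> str:
    """Return the less-loaded of two randomly sampled instances.

    `loads` maps a (hypothetical) instance name to its current number
    of in-flight requests. Ties go to the first sampled instance.
    """
    if not loads:
        raise ValueError("no instances available")
    rng = rng or random.Random()
    names = list(loads)
    if len(names) == 1:
        return names[0]
    # Sample two distinct candidates, then compare only those two.
    first, second = rng.sample(names, 2)
    return first if loads[first] <= loads[second] else second


if __name__ == "__main__":
    # Hypothetical decode instances and their in-flight request counts.
    print(pick_power_of_two({"decode-0": 3, "decode-1": 7, "decode-2": 1}))
```

Comparing only two candidates keeps each decision cheap while still steering traffic away from the busiest instance: with three or more instances, the single most loaded one is never selected.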

📊 Stats

  • 4,400+ lines of source, 9,400+ lines of tests, 480+ test cases
  • Python 3.10, 3.11, 3.12

What's Changed

  • refactor: unify completion and chat-completion handlers by @hlin99 in #115
  • refactor: extract route handlers into core/routes/ module by @hlin99 in #116
  • refactor: rename core/ to xpyd/, unify package imports by @hlin99 in #117
  • feat: YAML-only config with xpyd proxy subcommand by @hlin99 in #119
  • chore: bump version to 1.1.0 by @hlin99 in #120
  • refactor: quality hardening — unified errors, structured logging, type annotations by @hlin99 in #121
  • feat: multi-model routing support by @hlin99 in #122
  • feat: quality hardening (error handling, type annotations) by @hlin99 in #124
  • feat: dual-role instance support with per-model scheduler by @hlin99 in #125
  • feat: fix-config CLI and README update by @hlin99 in #126

Full Changelog: v1.0.0...v1.1.0

v1.0.0

01 Apr 16:41
a21ead6

What's Changed

  • feat: add /metrics Prometheus endpoint (Task 5) by @hlin99 in #49
  • feat: centralized ProxyConfig with Pydantic validation (Task 6) by @hlin99 in #57
  • docs: mark Task 6 as DONE, Task 7 as IN PROGRESS by @hlin99 in #61
  • docs: add Task 8 (pdproxy CLI + startup discovery), renumber resilience to Task 9 by @hlin99 in #62
  • feat: YAML config file support (Task 7) by @hlin99 in #60
  • docs: clean up tasklist — reorder, update all statuses by @hlin99 in #63
  • docs: add mandatory fetch-latest rule to BOT_POLICY.md by @hlin99 in #65
  • feat: YAML config with topology expansion (Task 7) by @hlin99 in #66
  • docs: rewrite Task 9 with detailed goals, examples, and verification by @hlin99 in #69
  • docs: add Task 10 — advanced load balancing strategies by @hlin99 in #68
  • docs: add configuration, scheduling, resilience, and CLI documentation by @hlin99 in #71
  • feat: pdproxy CLI packaging + startup node discovery (Task 8) by @hlin99 in #70
  • ci: improve build job — wheel build, install, smoke test, and tests by @hlin99 in #73
  • refactor: rename CLI from pdproxy to xpyd by @hlin99 in #75
  • docs: rename node to instance in tasklist Task 9a spec by @hlin99 in #78
  • feat: Instance Registry for instance state management (Task 9a) by @hlin99 in #74
  • docs: add commit identity rules — noreply email only, no Co-authored-by by @hlin99 in #80
  • docs: add re-requested review priority rule to BOT_POLICY.md by @hlin99 in #81
  • feat: per-instance Circuit Breaker with state machine (Task 9b) by @hlin99 in #76
  • feat: background Health Monitor with async node probing (Task 9d) by @hlin99 in #77
  • docs: replace per-node with per-instance in resilience docs by @hlin99 in #82
  • feat: resilience module with retry, exponential backoff and jitter (Task 9c) by @hlin99 in #83
  • feat: integrate registry, circuit breaker, and health monitor into proxy (Task 9 integration) by @hlin99 in #84
  • docs: update task status markers in tasklist by @hlin99 in #87
  • feat: Policy Registry for extensible scheduling strategies (Task 10d) by @hlin99 in #85
  • feat: Consistent Hash scheduling policy with virtual nodes (Task 10a) by @hlin99 in #86
  • feat: Power of Two Choices scheduling policy (Task 10b) by @hlin99 in #88
  • feat: Cache-Aware Routing scheduling policy (Task 10c) by @hlin99 in #89
  • feat: wire advanced scheduling policies into proxy router by @hlin99 in #91
  • test: add e2e benchmark tests for CI by @hlin99 in #94
  • refactor: reorganize test directory structure by @hlin99 in #95
  • chore: retire tasklist_openclaw.md in favor of GitHub Issues by @hlin99 in #107
  • ci: add automatic release workflow on tag push by @hlin99 in #108
  • chore: bump version to 1.0.0 by @hlin99 in #109
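The resilience entry above (#83) mentions retry with exponential backoff and jitter. As a rough sketch of that general technique, and not the code merged in that PR, the following computes a retry delay using one common variant, often called "full jitter". The function name `backoff_delay` and the `base` and `cap` defaults are assumptions chosen for illustration.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 0.1,
    cap: float = 5.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a randomized delay in seconds for a zero-based retry attempt.

    The upper bound doubles with each attempt (base * 2**attempt) and is
    clamped to `cap`; the delay is drawn uniformly from [0, upper bound].
    `base` and `cap` are illustrative defaults, not xPyD settings.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    rng = rng or random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    for attempt in range(5):
        print(attempt, round(backoff_delay(attempt), 3))
```

Randomizing the whole interval spreads out retries from many clients that failed at the same moment, so a recovering instance is not hit by a synchronized burst.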

Full Changelog: v0.1.0...v1.0.0