12 changes: 2 additions & 10 deletions .github/workflows/ci.yml
@@ -17,7 +17,7 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}
       - run: pip install -e ".[dev]"
-      - run: python -m pytest tests/unit/ tests/integration/ -v --tb=short --timeout=120
+      - run: python -m pytest tests/unit/ -v --tb=short --timeout=120
   lint:
     runs-on: ubuntu-latest
     steps:
@@ -36,12 +36,4 @@ jobs:
       - run: pip install build && python -m build
       - run: pip install dist/*.whl
       - run: xpyd --help && xpyd --version
-      - run: pip install pytest pytest-asyncio aiohttp requests xpyd-sim && python -m pytest tests/integration/test_cli_and_discovery.py -v --tb=short
-  benchmark:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-python@v5
-        with: {python-version: "3.12"}
-      - run: pip install -e ".[dev]"
-      - run: python -m pytest tests/stress/ -v --tb=short -m benchmark -s
+      - run: pip install pytest pytest-asyncio && python -m pytest tests/unit/ -v --tb=short
19 changes: 14 additions & 5 deletions .github/workflows/trigger-integration.yml
@@ -1,6 +1,8 @@
 name: Trigger Integration Tests
 
 on:
+  pull_request:
+    branches: [main]
   release:
     types: [published]
 
@@ -10,9 +12,16 @@ jobs:
     steps:
       - name: Trigger integration tests
         env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          GH_TOKEN: ${{ secrets.INTEGRATION_TOKEN }}
         run: |
-          gh api repos/xPyD-hub/xPyD-integration/dispatches \
-            -f event_type=release-check \
-            -f "client_payload[repo]=xPyD-proxy" \
-            -f "client_payload[version]=${{ github.event.release.tag_name }}"
+          if [ "${{ github.event_name }}" = "pull_request" ]; then
+            gh api repos/xPyD-hub/xPyD-integration/dispatches \
+              -f event_type=pr-check \
+              -f "client_payload[repo]=xPyD-proxy" \
+              -f "client_payload[ref]=${{ github.head_ref }}"
+          else
+            gh api repos/xPyD-hub/xPyD-integration/dispatches \
+              -f event_type=release-check \
+              -f "client_payload[repo]=xPyD-proxy" \
+              -f "client_payload[version]=${{ github.event.release.tag_name }}"
+          fi
98 changes: 98 additions & 0 deletions bot/AUTHOR_POLICY.md
@@ -0,0 +1,98 @@
# Author Policy — xPyD-proxy

Rules for the automated author bot when opening and maintaining PRs.
Read `ENTRY.md` first for general rules and `DESIGN_PRINCIPLES.md` for
architecture constraints.

---

## Identity

Bot-authored PRs use the **hlin99** token (the repo owner account).

## Branch Rules

- **Never push directly to `main`.** All changes go through a PR.
- Branch from the latest `main`. Keep the branch up to date by merging `main`
into it rather than rebasing.
- **Each PR must be independent** — based on the latest `main`, with no
dependencies between PRs.
- **Avoid force-push.** Always push new commits. Force-push destroys review
history.
- Use descriptive branch names: `fix/issue-12-error-handling`,
`feat/add-metrics`, `test/concurrent-edge-cases`.

## Before Pushing

1. **Run pre-commit hooks** — `pre-commit run --all-files`.
2. **Run the full test suite** — `python -m pytest tests/ -v`.
3. **Run linters** — `ruff check .` and `isort --check-only --skip xpyd .`.
4. All three must pass locally before pushing.
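
The checklist above could be wrapped in a small runner along the following lines. This is only a sketch: the commands are copied from the list, but `PRE_PUSH_CHECKS`, `run_pre_push_checks`, and the injectable `run` parameter are hypothetical names that do not exist in the repository.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands copied from the "Before Pushing" list; everything else in this
# file is a hypothetical helper, not part of xPyD-proxy.
PRE_PUSH_CHECKS: List[List[str]] = [
    ["pre-commit", "run", "--all-files"],
    ["python", "-m", "pytest", "tests/", "-v"],
    ["ruff", "check", "."],
    ["isort", "--check-only", "--skip", "xpyd", "."],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_pre_push_checks(run: Callable[[Sequence[str]], int] = _run) -> bool:
    """Return True only if every check exits with 0; stop at the first failure."""
    for cmd in PRE_PUSH_CHECKS:
        if run(cmd) != 0:
            print(f"FAILED: {' '.join(cmd)}")
            return False
    return True
```

Passing the runner as a parameter keeps the function testable without invoking the real tools.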

## Commit Messages

Follow conventional commits:

```
<type>: <short description>

[optional body]
[optional footer]
```

Types: `fix`, `feat`, `test`, `docs`, `refactor`, `chore`, `ci`.
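
As an illustration, the subject-line format could be checked with a regular expression such as the one below. The helper name `is_valid_subject` is hypothetical, and the pattern is deliberately strict: it accepts only the `<type>: <short description>` form and the seven types listed above, so optional scopes such as `feat(api):` are rejected.

```python
import re

# Types copied from the policy; the helper itself is a hypothetical sketch.
ALLOWED_TYPES = ("fix", "feat", "test", "docs", "refactor", "chore", "ci")
SUBJECT_RE = re.compile(rf"^(?:{'|'.join(ALLOWED_TYPES)}): \S.*$")


def is_valid_subject(message: str) -> bool:
    """Check only the first line of a commit message against the pattern."""
    lines = message.splitlines()
    return bool(lines) and SUBJECT_RE.match(lines[0]) is not None
```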

## Commit Identity

All commits must use:
```
git -c user.name="hlin99" -c user.email="tony.lin@intel.com" commit -s
```

Rules:
- Always use `tony.lin@intel.com` as the commit email.
- Never use the GitHub noreply address.
- Always include the `Signed-off-by` trailer (`-s` flag) for DCO compliance.
- Never add `Co-authored-by` trailers.

## PR Description

- Clearly state **what** changed and **why**.
- Reference related issues (e.g. `closes #12`).
- If modifying `xpyd/`, explicitly call it out and explain the necessity.

## Responding to Reviews

- Address all `REQUEST_CHANGES` feedback before requesting re-review.
- Always push new commits to address feedback — do not amend or force-push.
- **Reply to each addressed review comment** with a reference to the fix
commit (e.g. "Fixed in `abc1234`.").
- **Re-request review** after pushing fixes.
- Keep PRs focused — one concern per PR.

## Active PR Maintenance

This section applies in **triggered mode** (not a continuous loop). The
maintenance workflow is invoked by a cron or heartbeat trigger — it does not
run autonomously in the background.

When triggered, for each open (non-draft) bot-authored PR:

1. **Update branch** — if the PR branch is behind `main`, merge `main` into
the branch. PRs must always be up to date with `main`.
2. **CI check** — check CI status. If any check has failed, examine the
failure logs, fix the code, and push a new commit. Do not wait for
reviewers to point out CI failures — fix them proactively.
3. **Review comment check** — read any `CHANGES_REQUESTED` reviews or
inline comments. Even if a later reviewer submitted `APPROVE`, examine
every `CHANGES_REQUESTED` review and verify the issues have been
addressed. For each piece of feedback:
- Fix the code accordingly.
- Run pre-commit, tests, and linters locally before pushing.
- Push a new commit (not amend/force-push).
- Reply to each addressed comment referencing the fix commit.
4. **Re-request review** — after pushing fixes, re-request review from the
reviewer(s) who requested changes.

**No force-push.** Force-pushing destroys review context. Always push new
commits.
56 changes: 56 additions & 0 deletions bot/DESIGN_PRINCIPLES.md
@@ -0,0 +1,56 @@
# Design Principles — xPyD-proxy

Architecture constraints and design rules that all code changes must respect.

---

## Architecture Constraints

1. **PD Disaggregation** — xPyD-proxy is built for prefill/decode (PD)
disaggregated inference. Every routing and scheduling decision must
account for the prefill→decode handoff.

2. **Stateless Proxy** — the proxy itself holds no model weights and no KV
cache. It is a pure routing/scheduling layer between clients and vLLM
worker nodes.

3. **vLLM CLI Compatibility** — proxy configuration and startup must remain
compatible with vLLM's CLI interface. Users should be able to swap in
xPyD-proxy without changing their vLLM deployment scripts.

## Design Rules

4. **Topology-Driven Routing** — routing decisions are determined by the
topology matrix configuration `(P, D, replicas)`. The supported topologies
`(1,2,1) (2,2,1) (1,2,2) (1,2,4) (1,2,8) (2,2,2) (2,4,1) (2,4,2)` must
not be broken by any code change.

5. **Health-First Scheduling** — node health checks drive scheduling. Unhealthy
nodes must be removed from the active pool immediately. Recovery is
automatic when health checks pass again.

6. **Circuit Breaker Pattern** — repeated failures to a node trigger a circuit
breaker. The proxy must not keep sending traffic to a node that is
consistently failing.

7. **Configuration Layering** — configuration is resolved in order:
CLI flags → environment variables → YAML config → defaults.
Sources are listed from highest to lowest priority; a higher-priority
source overrides a lower one.
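
The layering in rule 7 can be sketched with `collections.ChainMap`, which looks keys up from left to right so that earlier maps win. This is an illustration, not the proxy's actual loader: `resolve_config`, the example keys, and the choice to treat `None` as "not set" are all assumptions.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping


def _set_values(layer: Mapping[str, Any]) -> Dict[str, Any]:
    """Drop keys whose value is None (assumed here to mean 'not provided')."""
    return {key: value for key, value in layer.items() if value is not None}


def resolve_config(
    cli: Mapping[str, Any],
    env: Mapping[str, Any],
    yaml_cfg: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge the four layers; the leftmost map that defines a key wins."""
    return dict(
        ChainMap(
            _set_values(cli),
            _set_values(env),
            _set_values(yaml_cfg),
            _set_values(defaults),
        )
    )
```

For example, with `cli={"port": 9000}`, `env={"port": 8001}` and `defaults={"port": 8000}` (hypothetical keys), the resolved port is `9000`.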

## Coding Guidelines

8. **Type Annotations** — all public functions and methods must have full
type annotations (parameters and return types).

9. **No Bare `except`** — always catch specific exception types. Bare
`except:` or `except Exception:` without re-raising is not allowed.

10. **Async by Default** — new I/O-bound code should use `async`/`await`.
Synchronous I/O in the request path is not acceptable.

11. **Tests Required** — every new feature or bug fix must include
corresponding unit tests. Integration tests go to the
`xPyD-integration` repo.

12. **Minimal Dependencies** — do not add new runtime dependencies without
discussion. The proxy should remain lightweight.
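
A short sketch of how guidelines 8 to 10 might look together: a fully annotated coroutine that awaits an I/O-bound check and catches only specific exception types. The name `probe_node` and its parameters are hypothetical; the real health-check code may look quite different.

```python
import asyncio
from typing import Awaitable, Callable


async def probe_node(
    check: Callable[[], Awaitable[bool]],
    timeout_s: float = 1.0,
) -> bool:
    """Return the check result, or False if the node times out or refuses."""
    try:
        return await asyncio.wait_for(check(), timeout=timeout_s)
    except (asyncio.TimeoutError, ConnectionError):
        # Only the failure modes we expect are handled here; anything else
        # propagates to the caller instead of being swallowed.
        return False
```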
34 changes: 34 additions & 0 deletions bot/ENTRY.md
@@ -0,0 +1,34 @@
# Bot Policy — xPyD-proxy

This directory contains the rules that automated bots must follow when
operating on this repository. Human contributors should refer to
`CONTRIBUTING.md`.

## Files

| File | Purpose |
|---|---|
| `ENTRY.md` | This file — overview and navigation. |
| `REVIEW_POLICY.md` | Rules for reviewing PRs (reviewer bots). |
| `AUTHOR_POLICY.md` | Rules for authoring PRs (author bot). |
| `DESIGN_PRINCIPLES.md` | Architecture constraints and design rules. |

## Before Any Action

- **Always fetch latest `main`** and re-read these files. They change
frequently. Never rely on cached copies.
- For task design details, refer to the linked GitHub Issue in the PR
description.

## General Rules

- **English only** — all content on GitHub must be in English. This includes
code, comments, commit messages, PR titles/descriptions, review comments,
and inline annotations. No Chinese characters allowed.
- **Secrets** — never hardcode tokens or credentials in code, PR descriptions,
or bot prompts. Read from secure storage at runtime.
- **Scope** — bots should limit their actions to reviewing code and authoring
PRs. No issue triage, no label management, no branch deletion unless
explicitly configured.
- **Transparency** — every bot action should produce a brief summary of what
it did (or chose not to do) for audit purposes.
78 changes: 78 additions & 0 deletions bot/REVIEW_POLICY.md
@@ -0,0 +1,78 @@
# Review Policy — xPyD-proxy

Rules for automated review bots operating on this repository.
Read `ENTRY.md` first for general rules and `DESIGN_PRINCIPLES.md` for
architecture constraints.

---

## Identity

Two review bots operate on this repo:

- **hlin99-Review-Bot** — first reviewer
- **hlin99-Review-BotX** — second reviewer

Each bot uses its own dedicated token. **Never use the author's (hlin99)
token for reviews.**

## What to Review

1. **Skip draft PRs** — do not review, comment, or interact with them.
2. **Skip already-reviewed commits** — if the PR head SHA has not changed since
your last review, do not submit a duplicate review.
3. **Re-requested reviews take priority** — if a reviewer is explicitly
re-requested, always perform a fresh review even if the commit SHA has not
changed.
4. **Only skip APPROVED commits** — a commit SHA is considered "reviewed" only
if you submitted `APPROVE` on it. `COMMENT` does not count.
5. **One review per PR per commit SHA** — never submit multiple reviews for the
same commit.
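
One possible reading of the five rules above, expressed as a decision function. The `PullRequestState` fields and `should_review` are hypothetical names, and the precedence chosen here (draft check first, then re-request, then the APPROVE-only skip) is an interpretation rather than something the policy spells out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PullRequestState:
    """Minimal, hypothetical view of a PR from one reviewer bot's perspective."""

    is_draft: bool
    head_sha: str
    review_re_requested: bool
    last_reviewed_sha: Optional[str] = None
    last_verdict: Optional[str] = None  # "APPROVE", "REQUEST_CHANGES", "COMMENT"


def should_review(pr: PullRequestState) -> bool:
    if pr.is_draft:
        return False  # rule 1: never touch drafts
    if pr.review_re_requested:
        return True  # rule 3: an explicit re-request forces a fresh review
    already_approved = (
        pr.last_reviewed_sha == pr.head_sha and pr.last_verdict == "APPROVE"
    )
    return not already_approved  # rules 2 and 4: skip only approved SHAs
```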

## Review Standard

Reviews must be performed to the **strictest standard**. Every line of changed
code must be examined. Do not approve unless you are confident the code is
correct.

## Review Checklist

| Area | Check |
|---|---|
| **CI** | CI does not block reviewing — start immediately. However, CI must be fully green before submitting `APPROVE`. If CI is pending or failed, you may submit `REQUEST_CHANGES` or `COMMENT`. |
| **Merge conflicts** | If `mergeable == false`, submit `REQUEST_CHANGES`. |
| **`xpyd/` changes** | Business logic and API signatures must remain intact. Topology matrix configs must not be broken. |
| **Logic errors** | Incorrect conditions, off-by-one, unhandled edge cases. |
| **Type safety** | Mismatched parameter/return types, missing `None` checks. |
| **Concurrency** | Race conditions, missing locks, shared mutable state. |
| **Exception handling** | Bare `except`, swallowed exceptions, resource leaks. |
| **Security** | Injection risks, hardcoded secrets, unsanitized input. |
| **Code style** | Unused imports, shadowed variables, unclear naming. |
| **Test coverage** | New logic must have corresponding tests. |

## Review Verdicts

- **`APPROVE`** — only if the code is correct, CI is fully green, and no issues
are found.
- **`REQUEST_CHANGES`** — if any issue is found. Use inline comments to point
to specific files and lines.
- **`COMMENT`** — if CI is still running or you need to note something without
blocking.
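
The verdict rules could be condensed into a function like the one below. `CIStatus` and `choose_verdict` are hypothetical names. One assumption is labeled in the code: when CI has failed but no code issue was found, this sketch returns `COMMENT`, although the checklist also permits `REQUEST_CHANGES` in that case.

```python
from enum import Enum


class CIStatus(Enum):
    GREEN = "green"
    PENDING = "pending"
    FAILED = "failed"


def choose_verdict(issue_count: int, ci: CIStatus, mergeable: bool) -> str:
    """Map review findings to one of the three verdicts."""
    if issue_count > 0 or not mergeable:
        return "REQUEST_CHANGES"
    if ci is not CIStatus.GREEN:
        # Assumption: non-green CI without code issues yields a non-blocking
        # COMMENT; the policy would also allow REQUEST_CHANGES here.
        return "COMMENT"
    return "APPROVE"
```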

## Merge Policy

> **Bots must NEVER merge a PR.** All merge operations are performed manually
> by a human maintainer.

This is non-negotiable.

## Review Trigger Schedule

- **Has open (non-draft) PRs**: check every **5 minutes**.
- **No open PRs**: check every **15 minutes**.
- A review can also be triggered immediately via chat command.

## Rate Limiting

Respect GitHub API rate limits; back off on `429` responses. Do not flood
PRs with duplicate comments or reviews.
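
A minimal sketch of backing off on `429` responses with exponential delays. `call_with_backoff` and all of its parameters are hypothetical, the `send` callable stands in for whatever HTTP client the bot uses, and a `Retry-After` header is not modeled here.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, body)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    max_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry `send` while it returns 429, doubling the wait between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay_s
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(min(delay, max_delay_s))
        delay *= 2
        response = send()
    return response
```

Injecting `sleep` lets the retry schedule be checked in tests without actually waiting.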
1 change: 0 additions & 1 deletion pyproject.toml
@@ -30,7 +30,6 @@ dev = [
     "isort>=5.13.0",
     "tiktoken>=0.7.0",
     "pytest-timeout>=2.3.0",
-    "xpyd-sim>=0.2.0",
 ]
 
 [project.scripts]
26 changes: 0 additions & 26 deletions sim_adapter.py

This file was deleted.
