From 02aca59736abe870094997e46b450c0dced7a339 Mon Sep 17 00:00:00 2001
From: hlin99 <hlin99@users.noreply.github.com>
Date: Mon, 6 Apr 2026 21:49:03 +0800
Subject: [PATCH] =?UTF-8?q?docs:=20unified=20structure=20=E2=80=94=20LICEN?=
 =?UTF-8?q?SE,=20README,=20CONTRIBUTING,=20bot/?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
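Note on the commit convention: CONTRIBUTING.md and bot/BOT_POLICY.md in this patch require Conventional Commits. A minimal sketch of a subject-line check follows, not part of the patch itself. The function name `check_subject` and the allowed-type list are assumptions for illustration: `feat`, `fix`, and `docs` come from the CONTRIBUTING.md examples, and `refactor` from the branch prefixes in bot/AUTHOR_POLICY.md. The project may accept other types.

```python
import re

# Hypothetical allow-list (an assumption, not defined by the project).
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor")

# Shape from the Conventional Commits spec: type, optional "(scope)",
# optional "!" for breaking changes, then ": " and a description.
SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[^()]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def check_subject(subject: str) -> bool:
    """Return True if the subject line looks like a Conventional Commit."""
    match = SUBJECT_RE.match(subject)
    return bool(match) and match.group("type") in ALLOWED_TYPES


if __name__ == "__main__":
    for line in ("feat: add new analyzer", "update README"):
        print(line, "->", check_subject(line))
```

A check like this could run from a commit-msg hook, though wiring it into pre-commit is left out here.
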
 CONTRIBUTING.md                     |  31 +++++
 LICENSE                             | 202 ++++++++++++++++++++++++++++
 README.md                           | 170 +----------------------
 REVIEW_POLICY.md                    |  62 ---------
 bot/AUTHOR_POLICY.md                |  23 ++++
 bot/BOT_POLICY.md                   |  34 +++++
 bot/DESIGN_PRINCIPLES.md            |  41 ++++++
 bot/DEV_LOOP.md                     |  28 ++++
 bot/ENTRY.md                        |  14 ++
 bot/REVIEW_POLICY.md                |  38 ++++++
 {docs => bot}/iterations/current.md |   0
 docs/DESIGN_PRINCIPLES.md           |  43 ------
 docs/DEV_LOOP.md                    |  85 ------------
 13 files changed, 416 insertions(+), 355 deletions(-)
 create mode 100644 CONTRIBUTING.md
 create mode 100644 LICENSE
 delete mode 100644 REVIEW_POLICY.md
 create mode 100644 bot/AUTHOR_POLICY.md
 create mode 100644 bot/BOT_POLICY.md
 create mode 100644 bot/DESIGN_PRINCIPLES.md
 create mode 100644 bot/DEV_LOOP.md
 create mode 100644 bot/ENTRY.md
 create mode 100644 bot/REVIEW_POLICY.md
 rename {docs => bot}/iterations/current.md (100%)
 delete mode 100644 docs/DESIGN_PRINCIPLES.md
 delete mode 100644 docs/DEV_LOOP.md
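
Note on the review flow: the Decision Rules and Auto-Merge Requirements in bot/REVIEW_POLICY.md can be read as a small decision function. The sketch below is one possible reading, not the bots' actual implementation. The names `Verdict` and `decide`, the returned action strings, and the order in which the rules are checked are all assumptions. The policy does not say which rule wins when several apply, nor whether the timeouts are inclusive.

```python
from enum import Enum


class Verdict(Enum):
    PENDING = "pending"
    APPROVE = "approve"
    REQUEST_CHANGES = "request_changes"
    CLOSE = "close"


def decide(verdicts, ci_green, unresolved_comments, minutes_open, minutes_unreviewed):
    """Map the state of a PR to an action (hypothetical action names)."""
    # Either reviewer closing the PR rejects it outright.
    if Verdict.CLOSE in verdicts:
        return "close"
    # Total timeout: 1 hour (assumed inclusive).
    if minutes_open >= 60:
        return "close"
    # Review timeout: 15 minutes with no review. Assumed to apply while
    # at least one reviewer has not responded.
    if Verdict.PENDING in verdicts and minutes_unreviewed >= 15:
        return "close"
    # Any request for changes sends the PR back to the implementer.
    if Verdict.REQUEST_CHANGES in verdicts:
        return "fix_and_re_review"
    # Auto-merge needs 2 approvals, green CI, and no unresolved comments.
    approvals = sum(v is Verdict.APPROVE for v in verdicts)
    if approvals == 2 and ci_green and unresolved_comments == 0:
        return "auto_merge"
    return "wait"


if __name__ == "__main__":
    both = [Verdict.APPROVE, Verdict.APPROVE]
    print(decide(both, ci_green=True, unresolved_comments=0,
                 minutes_open=10, minutes_unreviewed=0))
```

Checking the close and timeout rules first is a conservative choice: under this ordering a PR that has run past the 1-hour limit is closed even if both approvals have arrived.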

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..eee3b77
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,31 @@
+# Contributing
+
+## Setup
+
+```bash
+pip install -e ".[dev]"
+pre-commit install
+```
+
+## Workflow
+
+1. Fork & branch from `main`
+2. Write code + tests
+3. Lint: `ruff check xpyd_plan tests`
+4. Open a PR — CI must pass, two reviewer bots must approve
+
+## Commit Messages
+
+Use [Conventional Commits](https://www.conventionalcommits.org/):
+
+```
+feat: add new analyzer
+fix: correct SLA threshold check
+docs: update README
+```
+
+## Code Style
+
+- Formatter/linter: `ruff`
+- Type hints required for public APIs
+- Tests required for new features
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..d645695
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,202 @@
+
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
diff --git a/README.md b/README.md
index 74ef64e..a6d6693 100644
--- a/README.md
+++ b/README.md
@@ -1,18 +1,13 @@
 # xPyD-plan
 
-Benchmark-data analysis toolkit for [xPyD-proxy](https://github.com/xPyD-hub/xPyD-proxy) — find the optimal **Prefill:Decode instance ratio** from real benchmark results.
-
-> **Core principle:** No guessing, no modeling, no simulation — everything is based on actual benchmark data.
+PD ratio planner — recommend optimal **Prefill:Decode** allocation from real benchmark data.
 
 ## Install
 
 ```bash
-# Base install
 pip install xpyd-plan
-
-# With HTML report generation
+# With HTML reports
 pip install "xpyd-plan[report]"
-
 # Development
 pip install "xpyd-plan[dev]"
 ```
@@ -20,171 +15,16 @@ pip install "xpyd-plan[dev]"
 ## Quick Start
 
 ```bash
-# Analyze a benchmark file — find the optimal P:D ratio
+# Find optimal P:D ratio from benchmark results
 xpyd-plan analyze --benchmark results.json --sla-ttft 200 --sla-tpot 50
 
 # Compare two benchmark runs
 xpyd-plan compare --baseline baseline.json --current current.json
 
-# Generate a Markdown report
+# Generate report
 xpyd-plan report --format markdown --benchmark results.json --output report.md
-
-# Export results as JSON for automation
-xpyd-plan analyze --benchmark results.json --output-format json
 ```
 
-## Features
-
-### Core Analysis
-
-| Command | Description |
-|---------|-------------|
-| `analyze` | SLA compliance check, utilization analysis, optimal P:D ratio finder |
-| `export` | Batch export analysis results as JSON/CSV/table |
-| `confidence` | Bootstrap confidence intervals for latency percentiles |
-| `sensitivity` | P:D ratio vs SLA satisfaction curves with cliff detection |
-| `decompose` | Per-request latency decomposition into prefill/decode/overhead phases |
-| `tail` | Extended percentile analysis (P99.9, P99.99) with tail classification |
-
-### Comparison & Testing
-
-| Command | Description |
-|---------|-------------|
-| `compare` | Benchmark comparison with regression detection |
-| `ab-test` | Statistical A/B test analysis (Welch's t-test, Mann-Whitney U, effect size) |
-| `drift` | Distribution drift detection via Kolmogorov-Smirnov test |
-| `model-compare` | Multi-model side-by-side latency and cost-efficiency comparison |
-
-### Planning & Optimization
-
-| Command | Description |
-|---------|-------------|
-| `plan-capacity` | Capacity planning with linear scaling model and confidence levels |
-| `plan-benchmarks` | Generate prioritized list of P:D ratios to benchmark |
-| `what-if` | Scenario simulation — scale QPS or instances and compare |
-| `fleet` | Multi-GPU-type fleet sizing with budget constraints |
-| `pareto` | Pareto frontier analysis across latency, cost, and waste |
-| `recommend` | Ranked recommendations combining SLA, cost, Pareto, and trend data |
-| `interpolate` | Performance interpolation/extrapolation for untested P:D ratios |
-| `threshold-advisor` | SLA threshold tuning — find thresholds for target pass rates |
-| `forecast` | Capacity forecasting from historical trend data |
-
-### Cost & Budgeting
-
-| Command | Description |
-|---------|-------------|
-| `budget` | SLA budget allocation across TTFT/TPOT stages |
-| `scorecard` | Composite efficiency score (SLA compliance + utilization + waste) |
-| `sla-tier` | Multi-SLA tier analysis for different service levels |
-
-### Data Management
-
-| Command | Description |
-|---------|-------------|
-| `validate` | Data quality scoring, outlier detection (IQR/Z-score) |
-| `filter` | Token/latency/time-window filtering and random sampling |
-| `merge` | Merge multiple benchmark files (union/intersection strategies) |
-| `annotate` | Tag benchmark files with metadata (non-destructive sidecar) |
-| `discover` | Recursive directory scanning for benchmark files |
-| `generate` | Generate synthetic benchmark data for testing |
-
-### Monitoring & Alerting
-
-| Command | Description |
-|---------|-------------|
-| `alert` | YAML-defined alert rules with CI/CD-friendly exit codes |
-| `trend` | Historical trend tracking with SQLite storage |
-| `saturation` | Saturation point detection across QPS levels |
-| `metrics` | Prometheus/OpenMetrics text format export |
-| `dashboard` | Real-time Rich TUI dashboard |
-| `timeline` | Time-window analysis with warmup detection and trend regression |
-
-### Workload Characterization
-
-| Command | Description |
-|---------|-------------|
-| `workload` | Request clustering into categories (prefill-heavy, decode-heavy, etc.) |
-| `correlation` | Pearson correlation between request characteristics and latency |
-| `heatmap` | 2D latency heatmap (prompt × output tokens) with hotspot detection |
-| `root-cause` | Anomaly root cause analysis — SLA-failing vs passing request comparison |
-| `scaling` | Throughput scaling efficiency with knee-point detection |
-
-### Reporting & Configuration
-
-| Command | Description |
-|---------|-------------|
-| `report` | HTML report with inline SVG visualizations |
-| `report --format markdown` | Markdown report for GitHub PRs/wikis |
-| `config init` | Generate starter YAML config file |
-| `config show` | Display resolved configuration |
-| `pipeline` | YAML-defined multi-step batch pipeline runner |
-
-## Benchmark Data Format
-
-xPyD-plan consumes JSON benchmark files (native or [xpyd-bench](https://github.com/xPyD-hub/xPyD-bench) format) containing per-request measurements:
-
-```json
-{
-  "metadata": {
-    "num_prefill_instances": 2,
-    "num_decode_instances": 6,
-    "total_instances": 8,
-    "measured_qps": 10.5
-  },
-  "requests": [
-    {
-      "request_id": "req-001",
-      "prompt_tokens": 512,
-      "output_tokens": 128,
-      "ttft_ms": 45.2,
-      "tpot_ms": 12.1,
-      "total_latency_ms": 1593.0,
-      "timestamp": 1700000000.0
-    }
-  ]
-}
-```
-
-## Programmatic API
-
-Every feature is available as a Python function:
-
-```python
-from xpyd_plan import analyze, compare_benchmarks, analyze_ab_test
-
-# Analyze benchmark data
-result = analyze("benchmark.json", sla_ttft_ms=200, sla_tpot_ms=50)
-
-# Compare two runs
-comparison = compare_benchmarks("baseline.json", "current.json")
-```
-
-## Configuration
-
-Create `xpyd-plan.yaml` (or `~/.config/xpyd-plan/config.yaml`) for persistent defaults:
-
-```bash
-xpyd-plan config init    # generates a commented starter file
-xpyd-plan config show    # shows resolved config
-```
-
-CLI flags always override config file values.
-
-## How It Works
-
-1. **Input**: Benchmark results from `xpyd-bench` (real measured data)
-2. **Analyze**: Compute latency distributions, SLA compliance, utilization per P:D ratio
-3. **Optimize**: Find the ratio with minimum resource waste while meeting SLA constraints
-4. **Report**: Output tables, JSON/CSV, Markdown/HTML reports, or Prometheus metrics
-
-## Documentation
-
-- **[使用指南 (Guide)](docs/guide.md)** — 完整使用指南：安装、子命令说明、典型工作流、结果解读
-- [设计原则](docs/DESIGN_PRINCIPLES.md) — 架构与设计决策
-- [开发循环](docs/DEV_LOOP.md) — 开发流程
-- [Roadmap](ROADMAP.md) — 完整里程碑列表
-- [当前迭代状态](docs/iterations/current.md) — M104 里程碑总结
-
 ## License
 
-TBD
+[Apache 2.0](LICENSE)
diff --git a/REVIEW_POLICY.md b/REVIEW_POLICY.md
deleted file mode 100644
index 4e3ee8a..0000000
--- a/REVIEW_POLICY.md
+++ /dev/null
@@ -1,62 +0,0 @@
-# Review Policy
-
-## Roles
-
-| Role | GitHub Account | Action |
-|------|---------------|--------|
-| Implementer | `hlin99` | Write code, submit PRs, fix issues |
-| Reviewer 1 | `hlin99-Review-Bot` | Review PRs: approve / request changes / close |
-| Reviewer 2 | `hlin99-Review-BotX` | Review PRs: approve / request changes / close |
-
-## Timing
-
-| Parameter | Value |
-|-----------|-------|
-| Iteration interval | 10 minutes |
-| PR wait for review | max 15 minutes |
-| Fix after request changes | max 10 minutes |
-| Reviewer check frequency | every 5 minutes |
-| Reviewer response deadline | 15 minutes after assign |
-| Reviewer timeout action | close PR (iteration failed) |
-| Total round timeout | 1 hour from PR creation |
-| Round timeout action | close PR (iteration failed) |
-
-## Review Criteria
-
-Reviewers evaluate each PR on two dimensions:
-
-### 1. Idea Value
-- Is the direction/approach valuable for the project?
-- Does it align with the project goals?
-- **If NO → close PR immediately** (one close = PR rejected)
-
-### 2. Code Quality
-- Is the code correct?
-- Are tests included/passing?
-- Is `docs/iterations/current.md` updated with clear description?
-- Does `docs/guide.md` reflect changes (if applicable)?
-- **If idea is good but code has issues → request changes**
-
-## Decision Rules
-
-| Scenario | Action |
-|----------|--------|
-| Both reviewers approve | Auto-merge |
-| One approves, one requests changes | Implementer fixes, reviewers re-review |
-| Either reviewer closes | PR closed, iteration failed |
-| Both approve after fixes | Auto-merge |
-| Timeout (15min no review) | PR closed, iteration failed |
-| Total timeout (1 hour) | PR closed, iteration failed |
-
-## Iteration Record
-
-Every PR MUST update `docs/iterations/current.md` with:
-- What was done this iteration
-- Result: merged / closed (with reason)
-- Reviewer scores/comments summary
-
-## Auto-Merge Requirements
-
-- 2 approvals from designated reviewers
-- CI passes (all checks green)
-- No unresolved review comments
diff --git a/bot/AUTHOR_POLICY.md b/bot/AUTHOR_POLICY.md
new file mode 100644
index 0000000..54c6d36
--- /dev/null
+++ b/bot/AUTHOR_POLICY.md
@@ -0,0 +1,23 @@
+<!-- ⚠️ DO NOT COMPRESS, SUMMARIZE, OR SKIP ANY PART OF THIS FILE ⚠️ -->
+
+# Author Policy — xPyD-plan
+
+## Code Standards
+
+- **Language:** Python 3.10+
+- **Lint:** `ruff check xpyd_plan tests`
+- **Type hints:** Required for all public APIs
+- **Tests:** Required for every new feature or bug fix
+
+## Git
+
+- **Committer:** `hlin99 <hlin99@users.noreply.github.com>`
+- **Branch naming:** `feat/`, `fix/`, `docs/`, `refactor/`
+- **Commit format:** Conventional Commits
+
+## PR Checklist
+
+- [ ] Code + tests
+- [ ] `ruff check xpyd_plan tests` passes
+- [ ] `bot/iterations/current.md` updated
+- [ ] PR body contains `Closes #N` (if applicable)
diff --git a/bot/BOT_POLICY.md b/bot/BOT_POLICY.md
new file mode 100644
index 0000000..48cd598
--- /dev/null
+++ b/bot/BOT_POLICY.md
@@ -0,0 +1,34 @@
+<!-- ⚠️ DO NOT COMPRESS, SUMMARIZE, OR SKIP ANY PART OF THIS FILE ⚠️ -->
+
+# Bot Policy — xPyD-plan
+
+## Identity
+
+- **Project:** xPyD-plan
+- **Repo:** `xPyD-hub/xPyD-plan`
+- **Architecture:** Offline planning tool — analyze vLLM/SGLang/TRT-LLM benchmark data to recommend optimal P:D instance ratios.
+
+## Accounts
+
+| Role | GitHub Account |
+|------|---------------|
+| Implementer | `hlin99` |
+| Reviewer 1 | `hlin99-Review-Bot` |
+| Reviewer 2 | `hlin99-Review-BotX` |
+
+## Rules
+
+1. **Never push directly to `main`.** All changes go through PRs.
+2. **Rebase onto latest `main`** before every push.
+3. **Run pre-commit** before every commit.
+4. **Never self-merge.** Wait for both reviewer bots to approve.
+5. **All code, docs, issues, PRs in English.**
+6. **Conventional Commits** format for all commit messages.
+7. **CI must be green** before merge.
+
+## References
+
+- `bot/DESIGN_PRINCIPLES.md` — what to build and why
+- `bot/DEV_LOOP.md` — how to iterate
+- `bot/REVIEW_POLICY.md` — review process
+- `ROADMAP.md` — milestone tracker
diff --git a/bot/DESIGN_PRINCIPLES.md b/bot/DESIGN_PRINCIPLES.md
new file mode 100644
index 0000000..9076b86
--- /dev/null
+++ b/bot/DESIGN_PRINCIPLES.md
@@ -0,0 +1,41 @@
+<!-- ⚠️ DO NOT COMPRESS, SUMMARIZE, OR SKIP ANY PART OF THIS FILE ⚠️ -->
+
+# xPyD-plan Design Principles
+
+## Core Positioning
+
+Offline planning tool: analyze benchmark data from real hardware to recommend the optimal P:D configuration.
+
+## Methodology
+
+1. Users run benchmarks on real hardware with specified P:D configurations
+2. Collect measured data points (TTFT, TPOT/ITL, throughput, etc.)
+3. Analyze data, fit/interpolate, build performance model
+4. Extrapolate predicted performance for all possible P:D ratios
+5. Find the P:D combination that meets SLA with minimum waste
+
+## Principles
+
+### Data-Driven
+- Everything based on real hardware benchmark data
+- No pure theoretical simulation, no guessing
+- Supports vLLM, SGLang, TensorRT-LLM formats
+
+### Independent Thinking
+- Reference industry solutions for ideas, never copy blindly
+- Every technical decision must have its own analysis
+
+### User-Friendly
+- Minimize required benchmark runs
+- Clearly specify which configurations to benchmark
+- Include confidence/reliability assessment
+- Be honest about uncertainty
+
+### Rigorous Definitions
+- "Optimal" = minimum idle waste while meeting SLA
+- SLA checks use measured percentiles (P95/P99), not averages
+- Granularity: instances, not individual GPU cards
+
+## References
+- ai-dynamo planner: reference its ideas on profiling data interpolation
+- dynamo is an online autoscaler, whereas we are an offline planner; the scenarios differ
diff --git a/bot/DEV_LOOP.md b/bot/DEV_LOOP.md
new file mode 100644
index 0000000..adad447
--- /dev/null
+++ b/bot/DEV_LOOP.md
@@ -0,0 +1,28 @@
+<!-- ⚠️ DO NOT COMPRESS, SUMMARIZE, OR SKIP ANY PART OF THIS FILE ⚠️ -->
+
+# Development Loop
+
+Autonomous infinite loop. Runs until explicitly stopped.
+
+## Each Iteration
+
+1. Pull latest `main`, rebase branch
+2. Read `ROADMAP.md` — find next incomplete milestone
+3. Read `bot/DESIGN_PRINCIPLES.md` — follow the rules
+4. Check open issues/PRs — handle unmerged PRs first
+5. Create GitHub Issue: problem, solution, acceptance criteria, tests
+6. Create branch, implement code + tests
+7. Lint: `ruff check xpyd_plan tests`
+8. Update `bot/iterations/current.md`
+9. Create PR (body contains `Closes #N`)
+10. Wait for CI green. Fix failures.
+11. Wait for reviewer bots (see `bot/REVIEW_POLICY.md`)
+12. Handle review result:
+    - **2 approvals** → auto-merge → update ROADMAP → step 1
+    - **request changes** → fix, push → wait for re-review
+    - **closed** → record failure in `bot/iterations/current.md` → step 1
+
+## Deliverables (every iteration)
+
+- Code changes + tests
+- Updated `bot/iterations/current.md`
diff --git a/bot/ENTRY.md b/bot/ENTRY.md
new file mode 100644
index 0000000..473c2db
--- /dev/null
+++ b/bot/ENTRY.md
@@ -0,0 +1,14 @@
+<!-- ⚠️ DO NOT COMPRESS, SUMMARIZE, OR SKIP ANY PART OF THIS FILE ⚠️ -->
+
+# Bot Entry Point — xPyD-plan
+
+Read these files **in order** before every iteration:
+
+1. `bot/BOT_POLICY.md` — project rules and constraints
+2. `bot/AUTHOR_POLICY.md` — coding standards
+3. `bot/DESIGN_PRINCIPLES.md` — architectural guidelines
+4. `bot/DEV_LOOP.md` — iteration steps
+5. `bot/REVIEW_POLICY.md` — PR review process
+6. `bot/iterations/current.md` — current state
+
+**Do not skip any file. Do not summarize from memory.**
diff --git a/bot/REVIEW_POLICY.md b/bot/REVIEW_POLICY.md
new file mode 100644
index 0000000..cfd721f
--- /dev/null
+++ b/bot/REVIEW_POLICY.md
@@ -0,0 +1,38 @@
+<!-- ⚠️ DO NOT COMPRESS, SUMMARIZE, OR SKIP ANY PART OF THIS FILE ⚠️ -->
+
+# Review Policy
+
+## Roles
+
+| Role | GitHub Account |
+|------|---------------|
+| Implementer | `hlin99` |
+| Reviewer 1 | `hlin99-Review-Bot` |
+| Reviewer 2 | `hlin99-Review-BotX` |
+
+## Review Criteria
+
+### 1. Idea Value
+- Does the direction align with project goals?
+- **If NO → close PR immediately** (one close = PR rejected)
+
+### 2. Code Quality
+- Correct code, tests included/passing
+- `bot/iterations/current.md` updated
+- **If the idea is good but the code has issues → request changes**
+
+## Decision Rules
+
+| Scenario | Action |
+|----------|--------|
+| Both approve | Auto-merge |
+| One approves, one requests changes | Fix, re-review |
+| Either closes | PR closed, iteration failed |
+| Timeout (15 min no review) | PR closed |
+| Total timeout (1 hour) | PR closed |
+
+## Auto-Merge Requirements
+
+- 2 approvals from designated reviewers
+- CI green
+- No unresolved review comments
diff --git a/docs/iterations/current.md b/bot/iterations/current.md
similarity index 100%
rename from docs/iterations/current.md
rename to bot/iterations/current.md
diff --git a/docs/DESIGN_PRINCIPLES.md b/docs/DESIGN_PRINCIPLES.md
deleted file mode 100644
index 17627c0..0000000
--- a/docs/DESIGN_PRINCIPLES.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# xPyD-plan Design Principles
-
-## Core Positioning
-Offline planning tool: analyze vLLM benchmark data from real hardware to recommend the optimal P:D configuration.
-
-## Methodology
-1. Users run vLLM benchmarks on real hardware with several P:D configurations we specify
-2. Collect multiple measured data points (TTFT, TPOT/ITL, throughput, etc.)
-3. Analyze data, fit/interpolate, build performance model
-4. Extrapolate predicted performance for all possible P:D ratios
-5. Find the P:D combination that meets SLA with minimum waste (best cost-efficiency)
-
-## Principles
-
-### Data-Driven
-- Everything is based on real hardware benchmark data
-- No pure theoretical simulation, no guessing hardware performance
-- Data format: vLLM benchmark standard output
-
-### Independent Thinking
-- Reference industry solutions (e.g., dynamo planner) for ideas, but never copy
-- Every technical decision must have its own analysis and reasoning
-- Algorithm choices based on actual data characteristics, not "because others do it"
-
-### User-Friendly
-- Minimize the number of benchmark runs users need to perform
-- Clearly tell users which configurations to benchmark
-- Recommendations must include confidence/reliability assessment
-- Be honest about uncertainty when sample points are few
-
-### Rigorous Definitions
-- "Optimal" = minimum idle waste on P and D instances while meeting SLA
-- "Waste" must have a strict mathematical definition, no ambiguity
-- SLA checks use measured percentiles (P95/P99), not averages
-
-### Granularity
-- Measured in instances, not individual GPU cards
-- QPS is not optimized — it is a given from the benchmark
-
-## References
-- ai-dynamo/dynamo planner module: reference its ideas on profiling data interpolation and SLA checking
-- However, dynamo is an online autoscaler; we are an offline planning tool — different scenarios
-- Do not copy dynamo's code or data formats
diff --git a/docs/DEV_LOOP.md b/docs/DEV_LOOP.md
deleted file mode 100644
index a889be4..0000000
--- a/docs/DEV_LOOP.md
+++ /dev/null
@@ -1,85 +0,0 @@
-# Development Loop
-
-Autonomous infinite loop. Runs until explicitly stopped.
-
-## Setup (every iteration)
-```
-git config user.email "tony.lin@intel.com"
-git config user.name "hlin99"
-```
-
-## Each Iteration
-
-1. Pull latest code
-2. Read `ROADMAP.md` — find the next incomplete milestone
-3. Read `DESIGN_PRINCIPLES.md` — follow the rules
-4. Check open issues/PRs — handle unmerged PRs first (fix CI failures, address review comments)
-5. If no milestone left, create new ones (see Phase 2 below)
-6. Create GitHub Issue: problem, solution, acceptance criteria, tests
-7. Create branch, implement code + tests
-8. Pass lint: `ruff check src tests && isort --check src tests`
-9. Update `docs/iterations/current.md` with what you did this iteration
-10. Create PR (body contains `Closes #N`)
-11. Wait for CI green. Fix failures. Never merge red CI.
-12. **Wait for reviewer bots** — do NOT self-merge. Two reviewer bots (`hlin99-Review-Bot` and `hlin99-Review-BotX`) will be auto-assigned.
-13. Handle review result:
-    - **2 approvals** → auto-merge → update ROADMAP.md → go to step 1
-    - **request changes** → fix code, push to same PR → wait for re-review (max 10 min to fix)
-    - **closed by reviewer** → iteration failed → push update to `docs/iterations/current.md` on main recording the failure (what was attempted, why rejected, reviewer comments) → go to step 1 with a different task
-14. Go to step 1
-
-## Review Rules (see REVIEW_POLICY.md)
-
-- 2 reviewer bots are auto-assigned on PR creation
-- Either reviewer can close the PR (idea rejected) — one close = PR dead
-- Both must approve for merge
-- Reviewer timeout: 15 minutes → PR auto-closed
-- Total round timeout: 1 hour → PR auto-closed
-- Implementer (hlin99) must NEVER approve or merge their own PR
-
-## Timing
-
-| Parameter | Value |
-|-----------|-------|
-| Iteration interval | 10 minutes |
-| PR wait for review | max 15 minutes |
-| Fix after request changes | max 10 minutes |
-| Total round timeout | 1 hour |
-
-## Deliverables (every iteration)
-
-Every PR MUST include:
-- Code changes (if any)
-- Tests for new code
-- Updated `docs/iterations/current.md` describing what was done
-
-## Rules
-- Committer must be `hlin99 <tony.lin@intel.com>` — always set git config before any commit
-- All code, docs, issues, PRs in English
-- Commit messages: conventional commits format
-- Never self-merge — wait for reviewer bots
-
-## Phase 1: Roadmap-Driven
-Follow ROADMAP.md milestones in order.
-
-## Phase 2: Continuous Evolution
-When all milestones are done:
-1. Review the project — find limitations, improvements, new scenarios
-2. Create new milestones in ROADMAP.md
-3. Return to Phase 1
-
-## Iteration Tracking
-
-`docs/iterations/current.md` must maintain a running log at the bottom:
-
-```markdown
-## Iteration History
-
-| # | Date | Task | Result | Reviewer Comments |
-|---|------|------|--------|-------------------|
-| 1 | 2026-04-06 | Added X feature | ✅ merged | Both approved |
-| 2 | 2026-04-06 | Refactored Y | ❌ closed | BotX: idea not valuable |
-| 3 | 2026-04-06 | Fixed Z bug | ✅ merged | Bot requested changes, fixed |
-```
-
-This table is the source of truth for iteration success/failure rate.