PlanExeOrg
diff --git a/docs/mcp/autonomous_agent_guide.md b/docs/mcp/autonomous_agent_guide.md
Lines changed: 5 additions & 5 deletions
diff --git a/docs/mcp/mcp_details.md b/docs/mcp/mcp_details.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/mcp/planexe_mcp_interface.md b/docs/mcp/planexe_mcp_interface.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/proposals/06-adopt-on-the-fly.md b/docs/proposals/06-adopt-on-the-fly.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/proposals/101-luigi-resume-enhancements.md b/docs/proposals/101-luigi-resume-enhancements.md
Lines changed: 4 additions & 4 deletions
diff --git a/docs/proposals/107-domain-aware-normalizer.md b/docs/proposals/107-domain-aware-normalizer.md
Lines changed: 3 additions & 3 deletions
diff --git a/docs/proposals/112-end-to-end-test-plan.md b/docs/proposals/112-end-to-end-test-plan.md
Lines changed: 2 additions & 2 deletions
diff --git a/docs/proposals/114-mcp-interface-feedback-stress-test.md b/docs/proposals/114-mcp-interface-feedback-stress-test.md
Lines changed: 3 additions & 3 deletions
diff --git a/docs/proposals/117-system-prompt-optimizer.md b/docs/proposals/117-system-prompt-optimizer.md
Lines changed: 6 additions & 6 deletions
diff --git a/docs/proposals/133-dag-and-rca.md b/docs/proposals/133-dag-and-rca.md
Lines changed: 13 additions & 13 deletions
@@ -123,11 +123,11 @@ An advanced pattern: use PlanExe to plan the agent's own work.
 4. Agent executes the plan step by step, tracking progress against the WBS
 
 Key files in the zip for agent consumption:
-- `018-2-wbs_level1.json` — High-level work packages
-- `018-5-wbs_level2.json` — Detailed tasks within each package
-- `023-2-wbs_level3.json` — Sub-tasks with effort estimates
-- `004-2-pre_project_assessment.json` — Feasibility assessment
-- `003-6-distill_assumptions_raw.json` — Key assumptions to validate
+- `wbs_level1.json` — High-level work packages
+- `wbs_level2.json` — Detailed tasks within each package
+- `wbs_level3.json` — Sub-tasks with effort estimates
+- `pre_project_assessment.json` — Feasibility assessment
+- `distill_assumptions_raw.json` — Key assumptions to validate
 
 ## Prompt writing tips for agents
 
 
@@ -223,7 +223,7 @@ curl -H "X-API-Key: pex_0123456789abcdef" -O "https://mcp.planexe.org/download/2
 
 Download report:
 ```bash
-curl -H "X-API-Key: pex_0123456789abcdef" -O "https://mcp.planexe.org/download/2d57a448-1b09-45aa-ad37-e69891ff6ec7/030-report.html"
+curl -H "X-API-Key: pex_0123456789abcdef" -O "https://mcp.planexe.org/download/2d57a448-1b09-45aa-ad37-e69891ff6ec7/report.html"
 ```
 
 ## Tool Catalog, `mcp_local`
@@ -248,7 +248,7 @@ Example call:
 - Save directory is `PLANEXE_PATH`, or current working directory if unset.
 - Non-existing directories are created automatically.
 - If `PLANEXE_PATH` points to a file, download fails.
-- Filename is prefixed with plan id (for example `<plan_id>-030-report.html`).
+- Filename is prefixed with plan id (for example `<plan_id>-report.html`).
 - Response includes `saved_path` with the exact local file location.
 
 ## Minimal error-handling contract
 
@@ -522,7 +522,7 @@ Use `plan_resume` when `plan_status` shows `failed` or `stopped` and plan genera
 
 **Required semantics**
 
-- The MCP tool only accepts plans in `failed` state. However, the underlying Luigi mechanism is more general: Luigi skips any task whose output file already exists and re-executes any task whose output file is missing. This means a completed plan can be partially re-run by deleting `999-pipeline_complete.txt` and the output files of the tasks you want to regenerate — Luigi will re-execute those tasks and all their downstream dependents. The MCP API does not yet expose this capability; it is available when running the pipeline locally via `run_plan_pipeline.py`.
+- The MCP tool only accepts plans in `failed` state. However, the underlying Luigi mechanism is more general: Luigi skips any task whose output file already exists and re-executes any task whose output file is missing. This means a completed plan can be partially re-run by deleting `pipeline_complete.txt` and the output files of the tasks you want to regenerate — Luigi will re-execute those tasks and all their downstream dependents. The MCP API does not yet expose this capability; it is available when running the pipeline locally via `run_plan_pipeline.py`.
 - On success, the same plan_id is reset to `pending` and requeued.
 - Prior artifacts are **preserved** — the worker restores the output directory from the stored zip snapshot.
 - `resume_count` tracks how many times the plan has been resumed.
@@ -577,7 +577,7 @@ Bump `PIPELINE_VERSION` whenever the pipeline changes in a way that would break
 - Save directory is `PLANEXE_PATH`.
 - If `PLANEXE_PATH` is unset, save to current working directory.
 - If `PLANEXE_PATH` points to a file (not a directory), return an error.
-- Filenames are `<plan_id>-030-report.html` or `<plan_id>-run.zip`.
+- Filenames are `<plan_id>-report.html` or `<plan_id>-run.zip`.
 - If a filename already exists, append `-1`, `-2`, ... before extension.
 - Successful responses include `saved_path`.
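
The save-location rules above can be sketched in a few lines of Python. This is a minimal illustration, not the actual `mcp_local` implementation: the function names `resolve_save_dir` and `unique_path` are hypothetical, and creating a missing directory is an assumption taken from the `mcp_details.md` notes rather than from this list.

```python
import os
from pathlib import Path


def resolve_save_dir(env=None) -> Path:
    """Return PLANEXE_PATH if set, otherwise the current working directory."""
    env = os.environ if env is None else env
    raw = env.get("PLANEXE_PATH")
    save_dir = Path(raw) if raw else Path.cwd()
    if save_dir.is_file():
        # PLANEXE_PATH points to a file, not a directory: report an error.
        raise NotADirectoryError(f"PLANEXE_PATH points to a file: {save_dir}")
    # Assumption: missing directories are created automatically.
    save_dir.mkdir(parents=True, exist_ok=True)
    return save_dir


def unique_path(save_dir: Path, filename: str) -> Path:
    """Append -1, -2, ... before the extension until the name is unused."""
    candidate = save_dir / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = save_dir / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```

With this sketch, a second download of `<plan_id>-report.html` into the same directory would be saved as `<plan_id>-report-1.html`, and the returned path is what a response would report as `saved_path`.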
 
 
@@ -6,9 +6,9 @@ This is a concrete implementation plan for making PlanExe's agent behavior adapt
 
 PlanExe already has multiple "early classification" concepts and quality gates that we can build on:
 
-- **Purpose classification (business/personal/other)**: `worker_plan/worker_plan_internal/assume/identify_purpose.py` produces `002-6-identify_purpose.md` and is already used downstream (e.g., SWOT prompt selection).
+- **Purpose classification (business/personal/other)**: `worker_plan/worker_plan_internal/assume/identify_purpose.py` produces `identify_purpose.md` and is already used downstream (e.g., SWOT prompt selection).
 
-- **Plan type classification (digital/physical)**: `worker_plan/worker_plan_internal/assume/identify_plan_type.py` produces `002-8-plan_type.md`. Note: it intentionally labels most software development as "physical" (because it assumes a physical workspace/devices).
+- **Plan type classification (digital/physical)**: `worker_plan/worker_plan_internal/assume/identify_plan_type.py` produces `plan_type.md`. Note: it intentionally labels most software development as "physical" (because it assumes a physical workspace/devices).
 
 - **Levers pipeline**: `worker_plan/worker_plan_internal/lever/*` produces potential levers -> deduped -> enriched -> "vital few" -> scenarios/strategic decisions.
 
 
@@ -85,9 +85,9 @@ Behavior:
 ```
 $ planexe invalidate SelectScenarioTask --run-dir ./run/Qwen_Clean_v1
 Would delete:
-  run/Qwen_Clean_v1/002-17-selected_scenario_raw.json
-  run/Qwen_Clean_v1/002-18-selected_scenario.json
-  run/Qwen_Clean_v1/002-19-scenarios.md
+  run/Qwen_Clean_v1/selected_scenario_raw.json
+  run/Qwen_Clean_v1/selected_scenario.json
+  run/Qwen_Clean_v1/scenarios.md
 Proceed? [y/N]
 ```
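
The dry-run behavior shown above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the `TASK_OUTPUTS` mapping and the function names are hypothetical, and a real implementation would presumably derive the output files from the Luigi task definitions rather than from a hard-coded table.

```python
from pathlib import Path

# Hypothetical mapping from task name to the files that task writes.
TASK_OUTPUTS = {
    "SelectScenarioTask": [
        "selected_scenario_raw.json",
        "selected_scenario.json",
        "scenarios.md",
    ],
}


def plan_invalidation(task_name: str, run_dir: Path) -> list[Path]:
    """Return the existing output files that invalidating the task would delete."""
    return [
        run_dir / name
        for name in TASK_OUTPUTS[task_name]
        if (run_dir / name).exists()
    ]


def invalidate(task_name: str, run_dir: Path, confirm: bool = False) -> list[Path]:
    """List the files to delete; delete them only when confirm is True."""
    targets = plan_invalidation(task_name, run_dir)
    if confirm:
        for path in targets:
            path.unlink()
    return targets
```

Calling `invalidate("SelectScenarioTask", Path("./run/Qwen_Clean_v1"))` only lists candidates, which corresponds to the "Would delete:" output; passing `confirm=True` corresponds to answering `y` at the prompt.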
 
@@ -101,7 +101,7 @@ Tonight we needed to re-run `SelectScenarioTask` after applying a fix. Without k
 
 ### The problem
 
-The input plan (`001-2-plan.txt`) is locked in at run start. If a user wants to refine the plan description mid-run — clarify scope, correct a factual error, tighten the framing — there is no supported path. The only option is to start a new run from scratch.
+The input plan (`plan.txt`) is locked in at run start. If a user wants to refine the plan description mid-run — clarify scope, correct a factual error, tighten the framing — there is no supported path. The only option is to start a new run from scratch.
 
 ### What we want
 
 
@@ -401,9 +401,9 @@ MakeAssumptions → [QuantifiedAssumptionExtractor] → [FermiSanityCheck] → [
 
 The three new tasks (in brackets) are inserted between the existing MakeAssumptions and DistillAssumptions tasks. Each produces output files following PlanExe's standard naming convention:
 
-- `003-12-fermi_sanity_check_report.json` — detailed per-assumption verdicts
-- `003-13-fermi_sanity_check_summary.md` — human-readable summary of findings
-- `003-14-normalized_assumptions.json` — all assumptions in standard representation
+- `fermi_sanity_check_report.json` — detailed per-assumption verdicts
+- `fermi_sanity_check_summary.md` — human-readable summary of findings
+- `normalized_assumptions.json` — all assumptions in standard representation
 
 The FermiSanityCheck report includes a section on ethical flags, making it visible to both the downstream pipeline tasks and human reviewers.
 
 
@@ -25,7 +25,7 @@ These tests exercise the MCP server, database, and worker interactions without i
 
 **Variant — worker-side check:**
 1. Bypass the MCP-layer check (e.g. manually set `parameters["pipeline_version"]` to match current).
-2. But ensure the `001-3-planexe_metadata.json` in the zip snapshot has a different version.
+2. But ensure the `planexe_metadata.json` in the zip snapshot has a different version.
 3. Let the worker pick up the resumed plan.
 4. Assert: worker sets plan to failed with progress_message containing "Not resumable".
 
@@ -87,7 +87,7 @@ These tests invoke real LLMs and are non-deterministic, slow (~10-20 min per pla
 4. Call `plan_file_info` with `artifact: "report"` — assert `download_url` is present.
 5. Call `plan_file_info` with `artifact: "zip"` — assert `download_url` is present.
 6. Download the report and verify it is valid HTML containing expected sections.
-7. Download the zip and verify `001-3-planexe_metadata.json` is present with correct `pipeline_version`.
+7. Download the zip and verify `planexe_metadata.json` is present with correct `pipeline_version`.
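
The metadata check in the last step might look like the sketch below. It assumes that `planexe_metadata.json` sits at the root of the zip and stores the version under a top-level `pipeline_version` key; neither detail is confirmed here, and the helper name `read_pipeline_version` is hypothetical.

```python
import json
import zipfile


def read_pipeline_version(zip_path, member="planexe_metadata.json"):
    """Return the pipeline_version stored in the zip, or None if the file is absent."""
    with zipfile.ZipFile(zip_path) as zf:
        if member not in zf.namelist():
            return None
        with zf.open(member) as fh:
            # Assumption: the version is a top-level key in the metadata JSON.
            return json.load(fh).get("pipeline_version")
```

A test would then compare the returned value with the version the server is expected to run, and fail if the helper returns `None`.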
 
 ### 7. Resume after mid-generation failure
 
 
@@ -68,7 +68,7 @@ During the stress test, Plan 1 (20f1cfac) stalled at 5.5% with zero diagnostic i
   "state": "failed",
   "error": {
     "failure_reason": "generation_error",
-    "failed_step": "016-expert_criticism",
+    "failed_step": "expert_criticism",
     "message": "LLM provider returned 503",
     "recoverable": true
   }
@@ -248,7 +248,7 @@ This is a trust gap: the agent cannot confidently tell the user "your plan is re
   "sections_complete": 108,
   "sections_partial": 2,
   "partial_details": [
-    {"step": "016-expert_criticism", "note": "2/8 experts provided feedback"}
+    {"step": "expert_criticism", "note": "2/8 experts provided feedback"}
   ]
 }
 ```
@@ -507,7 +507,7 @@ No stale error information leaked between states.
 
 ### Files list ordering fix
 
-The files list in `plan_status` now shows the most recent 10 files instead of the first 10. When the plan completed, the agent saw `029-2-self_audit.md`, `030-report.html`, `999-pipeline_complete.txt` etc. instead of the same early pipeline files every time. Much more useful for monitoring progress.
+The files list in `plan_status` now shows the most recent 10 files instead of the first 10. When the plan completed, the agent saw `self_audit.md`, `report.html`, `pipeline_complete.txt` etc. instead of the same early pipeline files every time. Much more useful for monitoring progress.
 
 ### Agent-server capability mismatch (systemic observation)
 
 
@@ -303,10 +303,10 @@ populate_baseline.py                # script to populate baseline from zip files
 baseline/                           # current outputs (extracted from dataset zips)
   train/
     20260310_hong_kong_game/
-      001-1-start_time.json
-      001-2-plan.txt
+      start_time.json
+      plan.txt
       ...
-      030-report.html
+      report.html
     20250329_gta_game/
       ...
     20250321_silo/
@@ -338,8 +338,8 @@ history/                                      # captured output, global run coun
       outputs.jsonl
       outputs/
         20250321_silo/
-          002-9-potential_levers_raw.json
-          002-10-potential_levers.json
+          potential_levers_raw.json
+          potential_levers.json
           activity_overview.json
           usage_metrics.jsonl
         20260310_hong_kong_game/
@@ -382,7 +382,7 @@ scores/                             # longitudinal tracking
 full_plan_comparisons/              # Stage 3 periodic full-plan regenerations
   2026-03-20/
     hong_kong_game/
-      030-report.html
+      report.html
     kpi_comparison.json
 ```
 
 
@@ -81,7 +81,7 @@ Example:
 
 {
   "id": "executive_summary_markdown",
-  "path": "025-2-executive_summary.md",
+  "path": "executive_summary.md",
   "format": "md",
   "role": "summary_markdown"
 }
@@ -132,7 +132,7 @@ A stronger format could allow fields like:
 
 {
   "from_node": "executive_summary",
-  "artifact_path": "025-2-executive_summary.md",
+  "artifact_path": "executive_summary.md",
   "used_for": "decision-maker summary section"
 }
 
@@ -143,7 +143,7 @@ How RCA can work with the current format
 Goal
 
 The goal of RCA is to answer questions like:
-	•	Why is a false claim shown in 030-report.html?
+	•	Why is a false claim shown in report.html?
 	•	Which upstream artifact first contained it?
 	•	Which node likely introduced it?
 	•	Which source file should be inspected first?
@@ -153,7 +153,7 @@ Investigation strategy
 Step 1: Start from the final artifact
 
 Begin with the final output artifact, such as:
-	•	030-report.html
+	•	report.html
 
 Find the node that produces it.
 
@@ -210,15 +210,15 @@ Suppose the final report contains the false claim:
 The project requires 12 full-time engineers.
 
 A practical investigation would look like this:
-	1.	search 030-report.html for the claim
+	1.	search report.html for the claim
 	2.	inspect the report node inputs
-	3.	search 025-2-executive_summary.md
-	4.	search 024-2-review_plan.md
-	5.	search 013-team.md
-	6.	if the claim appears in 013-team.md, inspect the team_markdown node
+	3.	search executive_summary.md
+	4.	search review_plan.md
+	5.	search team.md
+	6.	if the claim appears in team.md, inspect the team_markdown node
 	7.	inspect that node’s inputs:
-	•	011-2-enrich_team_members_environment_info.json
-	•	012-review_team_raw.json
+	•	enrich_team_members_environment_info.json
+	•	review_team_raw.json
 	8.	search those artifacts for the same claim or the numeric value
 	9.	continue upstream until the earliest occurrence is found
 	10.	inspect the producing node’s source_files
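
The upstream search described in these steps can be sketched as a small graph walk. Everything structural here is an assumption made for illustration: the in-memory `dag` shape (node id mapped to one `artifact` and a list of `inputs`), the producer node ids, and the use of exact substring matching, which would miss a paraphrased claim.

```python
from pathlib import Path


def contains(run_dir: Path, artifact: str, claim: str) -> bool:
    """True if the artifact file exists and contains the claim text verbatim."""
    path = run_dir / artifact
    return path.is_file() and claim in path.read_text(encoding="utf-8", errors="replace")


def trace_claim(dag: dict, run_dir: Path, start: str, claim: str) -> list[str]:
    """Walk upstream from `start` through nodes whose artifact contains the claim.

    Returns the node ids whose artifact contains the claim while none of their
    inputs do, i.e. the earliest occurrences found.
    """
    origins, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if not contains(run_dir, dag[node]["artifact"], claim):
            continue
        upstream_hits = [
            n for n in dag[node]["inputs"]
            if contains(run_dir, dag[n]["artifact"], claim)
        ]
        if upstream_hits:
            stack.extend(upstream_hits)
        else:
            origins.append(node)
    return origins


# Hypothetical DAG mirroring the walkthrough; node ids are illustrative.
EXAMPLE_DAG = {
    "report": {"artifact": "report.html",
               "inputs": ["executive_summary", "review_plan", "team_markdown"]},
    "executive_summary": {"artifact": "executive_summary.md", "inputs": []},
    "review_plan": {"artifact": "review_plan.md", "inputs": []},
    "team_markdown": {"artifact": "team.md", "inputs": ["review_team"]},
    "review_team": {"artifact": "review_team_raw.json", "inputs": []},
}
```

The node ids returned by `trace_claim` are the candidates whose `source_files` should be inspected first. Searching for the numeric value alone (for example `12`) instead of the full sentence is a looser variant of the same walk.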
@@ -255,7 +255,7 @@ Example:
 
 {
   "id": "review_plan_markdown",
-  "path": "024-2-review_plan.md",
+  "path": "review_plan.md",
   "format": "md",
   "role": "review_output"
 }
@@ -266,7 +266,7 @@ Example:
 
 {
   "from_node": "review_plan",
-  "artifact_path": "024-2-review_plan.md",
+  "artifact_path": "review_plan.md",
   "used_for": "quality review section"
 }