Activer007 · May 13, 2026
diff --git a/API.md b/API.md
@@ -1253,7 +1253,7 @@ data: {"type":"message_stop"}
   - `token_budget_used` / `token_budget_limit` / `token_budget_overflow`
   - `warnings` (if any)
 
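-This endpoint is meaningful only when `parser.context_engine.mode = shadow`; in other modes the buffer is always empty.
+This endpoint is meaningful only when `context_engine.mode = shadow`; in other modes the buffer is always empty.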
 
 ### `DELETE /admin/context-plans`
 

diff --git a/README.en.md b/README.en.md
@@ -332,8 +332,12 @@ Common fields:
 - `model_aliases`: one shared alias map for OpenAI / Claude / Gemini model names.
 - `runtime`: account concurrency, queueing, and token refresh behavior, hot-reloadable via Admin Settings.
 - `auto_delete.mode`: remote session cleanup after each request, supporting `none` / `single` / `all`.
-- `current_input_file`: the global context split/upload mode; it is enabled by default and uploads the full context as a `DS2API_HISTORY.txt` context file once the character threshold is reached.
+- `current_input_file`: the global context split/upload mode. It is enabled by default and uses an inline-first path; when the full context is at or below `inline_max_tokens=30000`, DS2API keeps the complete prompt inline and does not upload generated context/tool files.
+- When the full context exceeds the inline threshold and the latest user turn satisfies `min_chars`, DS2API generates conversation context / tool reference files. The default `filename_policy` is `neutral_random`; legacy implementation filenames are used only when `filename_policy=legacy`.
 - If you turn off `current_input_file`, requests pass through directly without uploading any split context file.
+- `thinking_injection`: disabled by default. It appends the enhanced reasoning prompt to the latest user message only when `thinking_injection.enabled=true`.
+- `parser_v2.mode`: Tool Parser v2 gradual switch, supporting `off` / `shadow` / `enforce`; it is safely off by default and can be overridden with `DS2API_PARSER_V2`.
+- `context_engine.mode`: Context Engine gradual switch, supporting `off` / `shadow` / `enforce`; it defaults to `enforce`, with `context_engine.strategy=hybrid_recent` by default.
 
 For the full environment variable list, see [docs/DEPLOY.en.md](docs/DEPLOY.en.md). For auth behavior, see [API.en.md](API.en.md#authentication).
 

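The inline-first rule described in the README.en.md hunk above can be sketched as a small decision function. This is a minimal illustration, not the project's implementation: the `CurrentInputFileConfig` struct, the `Delivery` type, and `decideDelivery` are hypothetical names, "satisfies `min_chars`" is assumed to mean "at least that many characters", and staying inline when the latest user turn is too short is also an assumption, since the docs do not state that case.

```go
package main

import "fmt"

// CurrentInputFileConfig is a hypothetical mirror of the documented fields.
type CurrentInputFileConfig struct {
	Enabled         bool
	InlineMaxTokens int // documented default: 30000
	MinChars        int // threshold on the latest user turn; default not stated in the docs
}

// Delivery names the three outcomes described in the README.
type Delivery string

const (
	DeliverPassthrough Delivery = "passthrough" // current_input_file turned off
	DeliverInline      Delivery = "inline"      // complete prompt stays inline
	DeliverFiles       Delivery = "files"       // context / tool reference files are generated
)

// decideDelivery applies the inline-first rule.
func decideDelivery(cfg CurrentInputFileConfig, contextTokens, latestUserChars int) Delivery {
	if !cfg.Enabled {
		return DeliverPassthrough
	}
	// "At or below" the threshold keeps everything inline.
	if contextTokens <= cfg.InlineMaxTokens {
		return DeliverInline
	}
	// Above the threshold, files are generated only if the latest user turn is long enough.
	if latestUserChars >= cfg.MinChars {
		return DeliverFiles
	}
	// Assumption: a too-short latest turn stays inline; the docs do not say.
	return DeliverInline
}

func main() {
	// MinChars: 1 is an illustrative value, not a documented default.
	cfg := CurrentInputFileConfig{Enabled: true, InlineMaxTokens: 30000, MinChars: 1}
	fmt.Println(decideDelivery(cfg, 30000, 500)) // inline
	fmt.Println(decideDelivery(cfg, 30001, 500)) // files
}
```

Keeping the decision in one pure function makes the boundary case (exactly `inline_max_tokens`) easy to pin down in a unit test.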
diff --git a/README.md b/README.md
@@ -151,7 +151,7 @@ flowchart LR
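 | Logging and security | Supports log file output (default `logs/ds2api.log`), rotation, and `max_size_mb` validation; log paths are safety-checked, and redaction of sensitive tokens, email addresses, and phone numbers is further tightened |
 | WebUI config experience | The WebUI preserves `_comment` fields and keeps the raw JSON in sync when saving config; the Feature Flags area shows the parser v2 status read-only, and the config template still comes from `config.example.json` |
 | Vercel streaming path | The Vercel Node streaming bridge adds account switching, lease release, auto-delete cleanup before release, and upstream security hardening, so `/v1/chat/completions` real-time streaming is closer to the semantics of the main Go path |
-| Governance and release discipline | CORS allowlist, the Gemini query-key fallback switch, admin fail-closed, and the M3/M4 planning and audit docs have landed; the default policy leans toward "collect evidence in shadow first, then enforce" |
+| Governance and release discipline | CORS allowlist, the Gemini query-key fallback switch, admin fail-closed, and the M3/M4 planning and audit docs have landed; new high-risk capabilities still collect evidence shadow-first, while the accepted Context Engine defaults to `enforce` |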
 
 Related entry points:
 
@@ -368,11 +368,12 @@ go run ./cmd/ds2api
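 - `model_aliases`: one alias map shared by OpenAI / Claude / Gemini model names.
 - `runtime`: account concurrency, queueing, and token refresh policy, hot-reloadable via Admin Settings.
 - `auto_delete.mode`: remote session cleanup policy after each request, supporting `none` / `single` / `all`.
-- `current_input_file`: the globally applied context split/upload policy; it is enabled by default with a threshold of `0`, and when triggered it merges the full context and uploads it as a `DS2API_HISTORY.txt` context file.
+- `current_input_file`: the globally applied context split/upload policy, enabled by default and inline-first; when the full context does not exceed `inline_max_tokens=30000`, DS2API keeps the complete inline prompt and does not upload generated context or tool files.
+- Conversation context / tool reference files are generated only when the full context exceeds the inline threshold and the latest user turn satisfies `min_chars`; the default is `filename_policy=neutral_random`, and the old filenames are used only when it is explicitly set to `legacy`.
 - If you turn off `current_input_file`, requests pass through directly without uploading any split context file.
-- `thinking_injection`: enabled by default; it appends a thinking-enhancement prompt to the end of the latest user message to stabilize reasoning before intensive inference and tool calls; when `prompt` is empty, the built-in default prompt is used.
+- `thinking_injection`: disabled by default; the enhancement prompt is appended to the end of the latest user message only when `thinking_injection.enabled=true` is set explicitly; when `prompt` is empty, the built-in default prompt is used.
 - `parser_v2.mode`: Tool Parser v2 gradual switch, supporting `off` / `shadow` / `enforce`; safely off by default and overridable with the `DS2API_PARSER_V2` environment variable.
-- `context_engine.mode`: Context Engine gradual switch, supporting `off` / `shadow` / `enforce`; safely off by default and overridable with the `DS2API_CONTEXT_ENGINE` environment variable.
+- `context_engine.mode`: Context Engine gradual switch, supporting `off` / `shadow` / `enforce`; it defaults to `enforce`, and `context_engine.strategy` defaults to `hybrid_recent`; both can be overridden with the `DS2API_CONTEXT_ENGINE` / `DS2API_CONTEXT_ENGINE_STRATEGY` environment variables.
 - `log`: log file output and rotation settings; when `file` is empty the default `logs/ds2api.log` is used, and `max_size_mb` is validated against an upper bound.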
 
 For the full environment variable list, see the [deployment guide](docs/DEPLOY.md). For auth rules, see [API.md](API.md#鉴权规则).

diff --git a/docs/C01.md b/docs/C01.md
@@ -1,4 +1,7 @@
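-In the project's current LLM prompt exchange, the `DS2API_HISTORY.txt` + `DS2API_TOOLS.txt` form does feel very "middleware-like" and unlike natural conversational input. The code now explicitly uploads generated files and uses a short prompt asking the model to continue from `DS2API_HISTORY.txt`; if there is a tools file, it also notes that `DS2API_TOOLS.txt` contains tool descriptions and schemas.
+> Status: historical analysis and planning note.
+> This document records the early problem analysis and refactoring ideas; for current behavior, [prompt-compatibility.md](./prompt-compatibility.md) and [context-engine-strategies.md](./context-engine-strategies.md) are authoritative. The default path that has landed is `current_input_file` inline-first, `inline_max_tokens=30000`, `filename_policy=neutral_random`, `context_engine.mode=enforce`, `context_engine.strategy=hybrid_recent`, with no thinking-enhancement prompt injected by default. Mentions of `DS2API_HISTORY.txt` / `DS2API_TOOLS.txt` below mainly refer to the early problem form or the `filename_policy=legacy` rollback mode, and do not represent the current default visible input.
+
+In the project's LLM prompt exchange, the early `DS2API_HISTORY.txt` + `DS2API_TOOLS.txt` form did feel very "middleware-like" and unlike natural conversational input. The old path uploads generated files and uses a short prompt asking the model to continue from `DS2API_HISTORY.txt`; if there is a tools file, it also notes that `DS2API_TOOLS.txt` contains tool descriptions and schemas.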
 
 A better direction is not "disguise" but "context naturalization + semantic compression + fewer abnormal input shapes".
 
@@ -308,4 +311,3 @@ Context = System Rules
 ```
 
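 For ds2api, the most valuable next step is to upgrade `DS2API_HISTORY.txt` from a "full transcript file" to an "Agent Coding Context Package". This supports programming tools such as Claude Code / Codex CLI while reducing context bloat, abnormal input shapes, and model misjudgment.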
-
diff --git a/docs/DEPLOY.en.md b/docs/DEPLOY.en.md
@@ -301,6 +301,9 @@ VERCEL_TEAM_ID=team_xxxxxxxxxxxx   # optional for personal accounts
 | `DS2API_RAW_STREAM_SAMPLE_ROOT` | Raw stream sample root for saving/reading samples | `tests/raw_stream_samples` |
 | `DS2API_STATIC_ADMIN_DIR` | WebUI static asset directory | `static/admin` |
 | `DS2API_AUTO_BUILD_WEBUI` | Whether local startup auto-builds missing WebUI assets (`1/true/yes/on` or `0/false/no/off`) | Enabled outside Vercel |
+| `DS2API_CONTEXT_ENGINE` | Context Engine mode: `off`, `shadow`, or `enforce`; `shadow` records ContextPlan summaries without changing responses, while `enforce` uses the default context path | `enforce` |
+| `DS2API_CONTEXT_ENGINE_STRATEGY` | Context rendering strategy: `raw_transcript`, `natural_context`, `context_capsule`, `hybrid_recent`, or reserved `auto` | `hybrid_recent` |
+| `DS2API_PARSER_V2` | Tool Parser v2 mode: `off`, `shadow`, or `enforce`; keep disabled unless parser shadow evidence supports promotion | `off` |
 | `VERCEL_TOKEN` | Vercel sync token | — |
 | `VERCEL_PROJECT_ID` | Vercel project ID | — |
 | `VERCEL_TEAM_ID` | Vercel team ID | — |
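
The override order implied by the new `DS2API_CONTEXT_ENGINE` and `DS2API_PARSER_V2` rows (environment variable first, then config file, then built-in default) can be sketched as follows. This is an illustration under stated assumptions: `resolveMode` and `validMode` are hypothetical helpers, and both the case/whitespace normalization and the fall-through on an invalid value are assumptions that the docs do not confirm.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// validMode reports whether s is one of the documented mode values.
func validMode(s string) bool {
	switch s {
	case "off", "shadow", "enforce":
		return true
	}
	return false
}

// resolveMode returns the first valid mode among the environment value and the
// config file value, falling back to the built-in default.
// Assumption: empty or unrecognized values are skipped rather than rejected.
func resolveMode(envValue, configValue, builtinDefault string) string {
	for _, candidate := range []string{envValue, configValue} {
		m := strings.ToLower(strings.TrimSpace(candidate))
		if validMode(m) {
			return m
		}
	}
	return builtinDefault
}

func main() {
	// The empty config value stands in for "not set in the config file".
	ctxMode := resolveMode(os.Getenv("DS2API_CONTEXT_ENGINE"), "", "enforce")
	parserMode := resolveMode(os.Getenv("DS2API_PARSER_V2"), "", "off")
	fmt.Println("context_engine.mode =", ctxMode)
	fmt.Println("parser_v2.mode =", parserMode)
}
```

With neither variable set, this sketch yields `enforce` for the Context Engine and `off` for Parser v2, which matches the documented defaults; setting `DS2API_CONTEXT_ENGINE=off` is the explicit rollback.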

diff --git a/docs/capability-router.md b/docs/capability-router.md
@@ -16,7 +16,7 @@
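 | `internal/config/models.go` | Model aliases, model list, and the base thinking/search/model_type mapping |
 | `promptcompat.StandardRequest` | Already includes requested/resolved/response model, thinking, search, tool choice, ref files |
 | Upload model type | File upload already passes through `default` / `expert` / `vision` based on the model |
-| current-input file mode | A DS2API_HISTORY / DS2API_TOOLS upload path already exists |
+| current-input file mode | A conversation context / tool reference upload path already exists, using neutral random filenames by default; legacy mode keeps the old filenames for rollback |
 | observe metrics | Capability-routing trace fields can be added |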
 
 ## 3. Profile model

diff --git a/docs/history-analyzer-design.md b/docs/history-analyzer-design.md
@@ -112,7 +112,7 @@ P3 rule constraints:
 | `HA_TOOL_FALSE_POSITIVE` | tool | Markdown / XML examples are misidentified, or body text is missing | Add a false-positive fixture |
 | `HA_CONTEXT_TOOL_PAIR_ORPHAN` | context | The tool_call / tool_result chain is broken in history / ContextPlan | Fix the Context Engine pair policy |
 | `HA_CONTEXT_REASONING_BLOAT` | context | Reasoning history is too large and crowds out the current request | Adjust the reasoning summary |
-| `HA_CONTEXT_CURRENT_INPUT_MISMATCH` | context | DS2API_HISTORY / DS2API_TOOLS file missing or hash anomaly | Check current-input injection |
+| `HA_CONTEXT_CURRENT_INPUT_MISMATCH` | context | Generated conversation context / tool reference file missing or hash anomaly; legacy mode stays compatible with the old filenames | Check current-input injection |
 | `HA_CONTINUE_CANDIDATE` | continue | Truncated output, unclosed code block, unclosed JSON, INCOMPLETE/AUTO_CONTINUE status | Add to the Auto Continue shadow samples |
 | `HA_CAPABILITY_SEARCH_THINKING_CONFLICT` | capability | Search request accompanied by thinking/current-input conflict signals | Check the capability policy |
 | `HA_ACCOUNT_RETRY_RECOVERED` | account_runtime | Recovered via account switch after empty output / 429 | Record account health; not necessarily an error |

diff --git a/docs/m4-development-plan.md b/docs/m4-development-plan.md
@@ -32,7 +32,8 @@ M4.0 Release Readiness baseline
 
 Core constraints:
 
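-- Any capability that changes response behavior defaults to `off`; collect evidence in `shadow` first, then move to `enforce`.
+- Any newly introduced response-behavior capability that has not yet closed its evidence loop defaults to `off`; collect evidence in `shadow` first, then move to `enforce`.
+- Capabilities that have been accepted and are the current default path must record their default value, evidence, and rollback method; for example, Context Engine currently defaults to `enforce` with the `hybrid_recent` strategy.
 - `History Analyzer` and `Release Readiness` come first because they provide samples and promotion criteria for later changes.
 - The first version of `Auto Continue` handles plain-text continuation only; tool call / JSON mode / structured output cases are skipped by default.
 - Phase one of `Capability Router` does only profile + trace + warning and does not rewrite model selection directly.
@@ -180,7 +181,33 @@ DoD:
 - The tool_call / tool_result chain stays intact.
 - A shadow report is required before Agent profile enforce.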
 
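-## 4. Suggested PR split
+## 4. Recent M4 phase completion check (2026-05-13)
+
+This section reviews how recent M4-related commits deviate from this plan. It does not replace PR review; it records only "what has been completed, whether the completion has problems, and what still needs to be added".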
+
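+| Task | Current check result | Issue / fix |
+|---|---|---|
+| M4.0-P1 Release Readiness docs and decision criteria | Done. `docs/release-readiness.md` defines the report template, gates, feature flag readiness, missing-evidence handling, and the no-account verification criteria. | This change fixes the old wording "no high-risk capability may be enforced" so that it distinguishes "capabilities already promoted to default" from "new capabilities not yet promoted". |
+| M4.0-P2 readiness data model and Markdown rendering | Done. `internal/readiness` has a baseline, Markdown rendering, and tests. | It is currently a lightweight baseline that does not automatically aggregate real History Analyzer / shadow reports; M4.2 needs to wire in structured input later. |
+| M4.0-P3 local generator and scripts | Done. `cmd/release-readiness` and `tests/scripts/run-release-readiness.sh` can generate reports. | An `offline current-input smoke` evidence entry was added for runs without a real account; offline smoke still must not be treated as equivalent to the live gate. |
+| M4.1-P1 History Analyzer core | Done. The core model, rule interface, report structure, and redacted evidence have landed. | Real-sample coverage still needs to grow, but the main framework is closed. |
+| M4.1-P2 history data import and normalization | Done. Local history, response history, and dev capture / raw sample inputs are wired in, and path validation has been patched. | Sample coverage still needs to be verified on real deployment data. |
+| M4.1-P3 first batch of HA_* rules | Done. Deterministic rules and synthetic-sample unit tests have landed. | This change fixes the doc description of context current-input mismatch so the default path no longer exposes the old implementation filenames. |
+| M4.1-P4 offline CLI and report output | Done. `cmd/history-analyzer` and its scripts can output Markdown / JSON. | When real history samples are insufficient, the output should be `PENDING` or a low-sample note; no strong promotion conclusion may be drawn. |
+| M4.2 Parser / Context Shadow Report | Partially done. On the Context side, the strategy docs, the `hybrid_recent` default, renderer/golden/protocol default coverage, and the newline stability fix are complete. | The unified shadow report aggregation entry for parser and context is not done; for now only History Analyzer, golden, protocol tests, and offline smoke are available. |
+| M4.2 Context defaults and visible input check | Done, but the docs had drifted. The implementation is already `context_engine.mode=enforce`, `strategy=hybrid_recent`, `current_input_file.inline_max_tokens=30000`, `filename_policy=neutral_random`, `thinking_injection=false`. | This change fixes the old defaults and old filename wording in README / README.en / prompt compatibility / release readiness / C01. |
+| M4.3 Auto Continue MVP | Not started. Only a design doc exists. | Should proceed as three stacked PRs (config, non-stream, stream); no stream enforce without live smoke. |
+| M4.4 Capability Router | Not started. Only a design doc exists. | This change fixes the current-input baseline description in the design doc; profile, trace, warning, and the WebUI matrix are still needed. |
+| M4.5 WebUI v2 Diagnostics | Not started. Only a design doc exists. | Productize only after the M4.2-M4.4 structured reports stabilize, so the WebUI does not show incomplete conclusions prematurely. |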
+
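+Overall conclusions:
+
+- The recently completed M4.0 / M4.1 tasks largely follow the plan and did not change the main request path ahead of schedule.
+- The recently added Context Engine default-path hardening is part of M4.2, but it is not the full M4.2 shadow report deliverable.
+- The largest current deviation is that the docs had lagged behind the implementation; this change prioritizes fixing README, API, Release Readiness, C01, and the related design docs.
+- The largest current evidence gap is still live smoke with a real account; without an account, offline smoke can only show that the local protocol and context paths have not regressed.
+
+## 5. Suggested PR split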
 
 M4.0 is delivered as stacked PRs:
 
@@ -218,7 +245,7 @@ main
 | 13 | `feat/m4-webui-diagnostics` | WebUI diagnostics page and report display |
 | 14 | `feat/m5-agent-context-shadow` | Agent profile / Task Memory shadow |
 
-## 5. Gates
+## 6. Gates
 
 Every code PR runs at least:
 

diff --git a/docs/prompt-compatibility.md b/docs/prompt-compatibility.md
@@ -516,7 +516,7 @@ level=INFO msg="[context_engine_shadow]"
 
 ### Rollback
 
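-Setting `DS2API_CONTEXT_ENGINE` to `off` (or leaving it unset) at any time fully disables shadow, and main-path behavior is identical to M1.
+Setting `DS2API_CONTEXT_ENGINE` to `off` at any time disables the Context Engine path and falls back to the legacy promptcompat rendering. The current default is `enforce`, so leaving the environment variable unset uses the config file or built-in default rather than disabling the capability.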
 
 ### New in M3
 

diff --git a/docs/release-readiness.md b/docs/release-readiness.md
@@ -43,7 +43,7 @@ M4.0 establishes only the release readiness baseline and does not change any main request path behavior
 
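 - Does not implement History Analyzer rules.
 - Does not switch feature flags automatically.
-- Does not push parser, context, auto continue, or capability router to `enforce`.
+- Capabilities that are accepted and currently default to `enforce` should record their current state and rollback path; new capabilities that have not closed the evidence loop must not be advanced to `enforce`.
 - Does not read or output unredacted prompts, tokens, account credentials, or complete real requests.
 - Does not disguise a live test failure or missing credentials as a pass.
 - Without an account, offline stand-in smoke results may be recorded but must not be treated as passing the live gate.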
@@ -110,7 +110,7 @@ Required follow-ups:
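 | Decision | Meaning | Required conditions |
 |---|---|---|
 | `GO` | Release with the target configuration | PR Gate passes; no unexplained critical finding; target feature flags have matching evidence |
-| `GO-WITH-FLAGS-OFF` | Release, but high-risk capabilities stay `off` or `shadow` | Main-path changes are low risk; high-risk capabilities are not enabled by default; missing evidence is listed as follow-up |
+| `GO-WITH-FLAGS-OFF` | Release, but unpromoted high-risk capabilities stay `off` or `shadow` | Main-path changes are low risk; unpromoted high-risk capabilities are not enabled by default; missing evidence is listed as follow-up |
 | `NO-GO` | Release not recommended | PR Gate fails; an unexplained critical finding exists; or a high-risk capability lacks a rollback path |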
 
 Prefer `GO-WITH-FLAGS-OFF` by default unless this release explicitly needs to enable a high-risk capability.
@@ -122,7 +122,7 @@ Required follow-ups:
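 - Without shadow data, going directly to `enforce` is not allowed.
 - With a critical analyzer finding, release is not allowed unless the finding is explicitly shown not to affect the scope of this change.
 - Without live smoke, Auto Continue stream enforce is not allowed.
-- Without human review of the Parser / Context diff, enabling by default is not allowed.
+- Without human review of the Parser / Context diff, a not-yet-promoted path must not be enabled by default; changes to the Context Engine, which already defaults to `enforce`, must still list regression evidence and a rollback method.
 - When the live gate is `SKIP` for lack of an account, the report must state whether offline stand-in smoke was run; that evidence shows only that the local protocol and context paths have not regressed.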
 
 ### 6.1 Feature Flag promotion matrix
@@ -168,7 +168,7 @@ Required follow-ups:
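 1. Does it still match the DeepSeek-specific Agent API engine direction in `docs/v2-prd.md`?
 2. Is the main request path left unchanged?
 3. Are the M4.1/M4.2 analysis rules not implemented ahead of schedule?
-4. Do all high-risk capabilities still default to `off` or allow only `shadow`?
+4. Do all not-yet-promoted high-risk capabilities still default to `off` or allow only `shadow`? Do promoted capabilities record their current default, evidence, and rollback method?
 5. Are the responsibility boundaries among report, model, and CLI clear?
 6. Are no unredacted artifacts output?
 7. Does it satisfy this repository's PR Gate and stacked PR branch discipline?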
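
The `GO` / `GO-WITH-FLAGS-OFF` / `NO-GO` table in the `docs/release-readiness.md` hunks above reads naturally as a small decision function. The sketch below is a simplification under stated assumptions: the `Evidence` struct and `decide` function are hypothetical and not taken from `internal/readiness`, and the "main-path changes are low risk" and "missing evidence listed as follow-up" conditions are left to human review rather than modeled.

```go
package main

import "fmt"

// Evidence is a hypothetical summary of the inputs named in the decision table.
type Evidence struct {
	PRGatePassed            bool // PR Gate result
	UnexplainedCritical     bool // an unexplained critical finding exists
	HighRiskHasRollback     bool // every high-risk capability has a rollback path
	TargetFlagsHaveEvidence bool // target feature flags have matching evidence
	NeedsHighRiskEnabled    bool // this release explicitly needs a high-risk capability on
}

// decide maps evidence to a release decision.
func decide(e Evidence) string {
	// Any NO-GO condition wins.
	if !e.PRGatePassed || e.UnexplainedCritical || !e.HighRiskHasRollback {
		return "NO-GO"
	}
	// GO only when the release needs the capability and the evidence supports it.
	if e.NeedsHighRiskEnabled && e.TargetFlagsHaveEvidence {
		return "GO"
	}
	// Otherwise follow the documented default preference.
	return "GO-WITH-FLAGS-OFF"
}

func main() {
	e := Evidence{PRGatePassed: true, HighRiskHasRollback: true}
	fmt.Println(decide(e)) // GO-WITH-FLAGS-OFF
}
```

Checking the `NO-GO` conditions first keeps a missing rollback path from being masked by otherwise complete evidence.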