lintsinghua · lanshi17 · Apr 3, 2026
diff --git a/.gitignore b/.gitignore
@@ -1 +1,2 @@
 .DS_Store
+command.log
diff --git a/00-前言.md b/00-前言.md
@@ -43,50 +43,8 @@
 
 These three waves can be summarized in a concise evolution diagram:
 
-```mermaid
-flowchart LR
-    subgraph wave1["2021-2022"]
-        A["Code Completion\nInline Complete"]
-    end
-    subgraph wave2["2023-2024"]
-        B["Chat Assistant\nChat Assisted"]
-    end
-    subgraph wave3["2025 to present"]
-        C["Autonomous Agent\nAgent Autonomy"]
-    end
-
-    wave1 --> wave2 --> wave3
-
-    classDef era fill:#e8f4f8,stroke:#2196F3,stroke-width:2px,color:#1565C0
-    class A,B,C era
-```
-
-```mermaid
-flowchart LR
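-    subgraph w1["First Wave"]
-        direction TB
-        A1["Tab to accept suggestions"]
-        A2["Single-file local completion"]
-        A3["Passively waiting for triggers"]
-        A4["No execution capability"]
-    end
-    subgraph w2["Second Wave"]
-        direction TB
-        B1["Ctrl+L conversational queries"]
-        B2["Multi-file cross-file generation"]
-        B3["Proactive intent understanding"]
-        B4["No execution capability"]
-    end
-    subgraph w3["Third Wave"]
-        direction TB
-        C1["Direct task delegation"]
-        C2["Cross-tool autonomous orchestration"]
-        C3["Proactive execution path planning"]
-        C4["Read/Write/Execute/Verify"]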
-    end
-
-    w1 --> w2 --> w3
-```
+![Evolution of the AI Programming Paradigm](assets/render_evolution_graph.png)
+
 
 ### Agent Harness: The Birth of a New Architectural Concept
 
@@ -164,32 +122,7 @@ flowchart LR
 
 This book is divided into four parts, organized from macro to micro, from concept to implementation:
 
-```mermaid
-flowchart TD
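-    subgraph part1["Part 1: Foundations (Chapters 1-4)"]
-        direction LR
-        ch1["Chapter 1\nNew Paradigm\nPanoramic Overview"] --> ch2["Chapter 2\nDialog Loop\nAgent Heartbeat"] --> ch3["Chapter 3\nTool System\nAgent Hands"] --> ch4["Chapter 4\nPermission Pipeline\nAgent Guardrails"]
-    end
-
-    subgraph part2["Part 2: Core (Chapters 5-8)"]
-        core["Context Management · Cache Strategy · Streaming Architecture · Error Recovery"]
-    end
-
-    subgraph part3["Part 3: Extensions (Chapters 9-12)"]
-        ext["MCP Protocol · Sub-Agents · Plugin System · Hook Mechanism"]
-    end
-
-    subgraph part4["Part 4: Practice (Chapters 13-15)"]
-        practice["Mini Agent Harness Build · Debugging Tips · Production Deployment"]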
-    end
-
-    part1 --> part2 --> part3 --> part4
-
-    classDef section fill:#f3f9ff,stroke:#3b82f6,stroke-width:2px,color:#1e40af
-    classDef chapter fill:#eff6ff,stroke:#60a5fa,stroke-width:1px,color:#2563eb
-    class part1,part2,part3,part4 section
-    class ch1,ch2,ch3,ch4 chapter
-```
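+![Book Structure Overview](assets/book_structure_overview.png)
 
 **If you are short on time (fast path):** Read at least Chapter 1 (to build a mental model) and Chapter 2 (to understand the core loop), then spend 15 minutes skimming the key takeaways sections of Chapters 3-4. These two chapters are the foundation for understanding everything that follows.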
 

diff --git a/README.md b/README.md
@@ -162,7 +162,7 @@
 
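 ## Background
 
-On March 31, 2026, security researcher [Chaofan Shou (@Fried_rice)](https://x.com/Fried_rice) discovered a build configuration mistake in the `@anthropic-ai/claude-code` package on the npm registry: its source map files referenced a Cloudflare R2 bucket with no access control. The disclosure tweet drew more than 17 million views and sparked unprecedented discussion of Agent architecture in the technical community.
+On March 31, 2026, security researcher [Chaofan Shou (@Fried_rice)](https://x.com/Fried_rice) discovered a build configuration mistake in the `@anthropic-ai/claude-code` package on the npm registry. The disclosure tweet drew more than 17 million views and sparked unprecedented discussion of Agent architecture in the technical community.
 
 This book was inspired by that discussion: once Agent architecture became a hot topic, we realized that a systematic book was needed to explain the design principles of Agent Harness.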
 

diff --git a/assets/book_structure_overview.png b/assets/book_structure_overview.png
diff --git a/assets/fundamentals/ai_programming_tools_evolution_timeline.png b/assets/fundamentals/ai_programming_tools_evolution_timeline.png
diff --git a/assets/fundamentals/async_generator_diagram.png b/assets/fundamentals/async_generator_diagram.png
diff --git a/assets/fundamentals/claude_code_architecture_overview.png b/assets/fundamentals/claude_code_architecture_overview.png
diff --git a/assets/fundamentals/llm_human_interaction.png b/assets/fundamentals/llm_human_interaction.png
diff --git a/assets/fundamentals/message_types_diagram.png b/assets/fundamentals/message_types_diagram.png
diff --git a/assets/fundamentals/six_engineering_challenges.png b/assets/fundamentals/six_engineering_challenges.png
diff --git a/assets/fundamentals/state_transition_diagram.jpeg b/assets/fundamentals/state_transition_diagram.jpeg
diff --git a/assets/fundamentals/termination_reasons_diagram.png b/assets/fundamentals/termination_reasons_diagram.png
diff --git a/assets/fundamentals/tools_partition_example.png b/assets/fundamentals/tools_partition_example.png
diff --git a/assets/fundamentals/tools_state_machine.png b/assets/fundamentals/tools_state_machine.png
diff --git a/assets/fundamentals/turn_lifecycle_diagram.png b/assets/fundamentals/turn_lifecycle_diagram.png
diff --git a/assets/render_evolution_graph.png b/assets/render_evolution_graph.png
diff --git a/en/00-Foreword.md b/en/00-Foreword.md
@@ -14,50 +14,7 @@ Looking back over the past few years, AI-assisted programming has gone through t
 
 These three waves can be summarized in a concise evolution diagram:
 
-```mermaid
-flowchart LR
-    subgraph wave1["2021-2022"]
-        A["Code Completion\nInline Complete"]
-    end
-    subgraph wave2["2023-2024"]
-        B["Chat Assistant\nChat Assisted"]
-    end
-    subgraph wave3["2025 to present"]
-        C["Autonomous Agent\nAgent Autonomy"]
-    end
-
-    wave1 --> wave2 --> wave3
-
-    classDef era fill:#e8f4f8,stroke:#2196F3,stroke-width:2px,color:#1565C0
-    class A,B,C era
-```
-
-```mermaid
-flowchart LR
-    subgraph w1["First Wave"]
-        direction TB
-        A1["Tab to accept suggestions"]
-        A2["Single-file local completion"]
-        A3["Passively waiting for triggers"]
-        A4["No execution capability"]
-    end
-    subgraph w2["Second Wave"]
-        direction TB
-        B1["Ctrl+L conversational queries"]
-        B2["Multi-file cross-file generation"]
-        B3["Proactive intent understanding"]
-        B4["No execution capability"]
-    end
-    subgraph w3["Third Wave"]
-        direction TB
-        C1["Direct task delegation"]
-        C2["Cross-tool autonomous orchestration"]
-        C3["Proactive execution path planning"]
-        C4["Read/Write/Execute/Verify"]
-    end
-
-    w1 --> w2 --> w3
-```
+![Evolution of the AI Programming Paradigm](assets/render_evolution_graph.png)
 
 ### Agent Harness: The Birth of a New Architectural Concept
 
@@ -135,32 +92,7 @@ This book is suitable for the following readers, each of whom can gain unique va
 
 This book is divided into four parts, organized from macro to micro, from concept to implementation:
 
-```mermaid
-flowchart TD
-    subgraph part1["Part 1: Foundations (Chapters 1-4)"]
-        direction LR
-        ch1["Chapter 1\nNew Paradigm\nPanoramic Overview"] --> ch2["Chapter 2\nDialog Loop\nAgent Heartbeat"] --> ch3["Chapter 3\nTool System\nAgent Hands"] --> ch4["Chapter 4\nPermission Pipeline\nAgent Guardrails"]
-    end
-
-    subgraph part2["Part 2: Core (Chapters 5-8)"]
-        core["Context Management · Cache Strategy · Streaming Architecture · Error Recovery"]
-    end
-
-    subgraph part3["Part 3: Extensions (Chapters 9-12)"]
-        ext["MCP Protocol · Sub-Agents · Plugin System · Hook Mechanism"]
-    end
-
-    subgraph part4["Part 4: Practice (Chapters 13-15)"]
-        practice["Mini Agent Harness Build · Debugging Tips · Production Deployment"]
-    end
-
-    part1 --> part2 --> part3 --> part4
-
-    classDef section fill:#f3f9ff,stroke:#3b82f6,stroke-width:2px,color:#1e40af
-    classDef chapter fill:#eff6ff,stroke:#60a5fa,stroke-width:1px,color:#2563eb
-    class part1,part2,part3,part4 section
-    class ch1,ch2,ch3,ch4 chapter
-```
+![Book Structure Overview](assets/book_structure_overview.png)
 
 **If you are short on time (Fast Path):** Read at least Chapter 1 (to establish a mental model) and Chapter 2 (to understand the core loop), then spend 15 minutes browsing the key takeaways sections of Chapters 3-4. These two chapters are the foundation for understanding everything that follows.
 

diff --git a/en/Part-1-Foundations/01-The-New-Paradigm-of-Agent-Programming.md b/en/Part-1-Foundations/01-The-New-Paradigm-of-Agent-Programming.md
@@ -22,16 +22,8 @@ In 2023, most developers interacted with LLMs like this: open a web page, enter
 
 This mode can be described with a simple model:
 
-```mermaid
-sequenceDiagram
-    participant Human Send
-    participant LLM
-    participant Human Receive
-    Human Send->>LLM: Prompt
-    LLM->>Human Receive: Response (Text)
-    Note over Human Receive,Human Send: Human copies, pastes, executes manually
-    Human Receive->>Human Send: Manually transport results
-```
+![llm_human_interaction](../assets/fundamentals/llm_human_interaction.png)
+
 
 The core limitation of this mode is that the LLM can only "speak," not "act." It cannot read your file system, execute test commands, create Git branches, or autonomously adjust its strategy when encountering errors. Every interaction with the outside world requires a human intermediary to manually complete -- copying code to the editor, switching to the terminal to run commands, and then copying the output back to the dialog box. This "human glue" pattern is not only inefficient but also error-prone.
 
@@ -58,31 +50,7 @@ But tool calling also introduces new engineering challenges. These challenges ar
 
 These questions gave birth to a new architectural concept: **Agent Harness**.
 
-```mermaid
-flowchart TD
-    subgraph challenges["Six Engineering Challenges from Tool Calling"]
-        c1["Tool Registration & Discovery"]
-        c2["Parameter Validation"]
-        c3["Permission Control"]
-        c4["Error Recovery"]
-        c5["State Consistency"]
-        c6["Concurrency & Scheduling"]
-    end
-
-    harness["Agent Harness<br/>Unified Runtime Framework"]
-
-    c1 --> harness
-    c2 --> harness
-    c3 --> harness
-    c4 --> harness
-    c5 --> harness
-    c6 --> harness
-
-    classDef challenge fill:#fef9f0,stroke:#f59e0b,stroke-width:1px,color:#92400e
-    classDef center fill:#e8f4f8,stroke:#2196F3,stroke-width:2px,color:#1565C0
-    class c1,c2,c3,c4,c5,c6 challenge
-    class harness center
-```
+![Six Engineering Challenges](../assets/fundamentals/six_engineering_challenges.png)
 
 ### Why Agent Harness Instead of Simple Wrappers
 
@@ -126,28 +94,7 @@ Before diving into design philosophy, let's first examine Claude Code's codebase
 
 Before analyzing Claude Code's architecture, let's widen our perspective and look at the evolution of AI programming tools. Understanding this timeline helps us see where Claude Code sits in the technological lineage:
 
-```mermaid
-flowchart TD
-    A["2021.06 GitHub Copilot Technical Preview<br/>First LLM integration into editor"] -->
-    B["2022.12 ChatGPT Launch<br/>Proves LLM's general conversational ability"] -->
-    C["2023.03 GPT-4 + Function Calling<br/>LLM transforms from text generator to instruction orchestrator"] -->
-    D["2023.06 OpenAI Code Interpreter<br/>LLM gains code execution capability for the first time"] -->
-    E["2023.11 Claude 2.1 + Tool Use<br/>200K context window"] -->
-    F["2024.01 Devin Launch<br/>First AI software engineer"] -->
-    G["2024.08 Cursor Agent Mode<br/>Editor-integrated Agent"] -->
-    H["2024.10 Anthropic Computer Use<br/>LLM can operate GUI"] -->
-    I["2025.02 Claude Code Launch<br/>Terminal-native Agent"] -->
-    J["2025.11 Anthropic Publishes MCP<br/>Standardized Agent communication protocol"] -->
-    K["2026.03 Source Code Accidentally Exposed<br/>Community deeply examines Agent Harness"] -->
-    L["Now ← You are here"]
-
-    classDef event fill:#f0f7ff,stroke:#3b82f6,stroke-width:1px,color:#1e3a5f
-    classDef milestone fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
-    classDef current fill:#fef3c7,stroke:#f59e0b,stroke-width:2px,color:#92400e
-    class A,B,C,D,E,F,G,H event
-    class I,J,K milestone
-    class L current
-```
+![AI Programming Tools Evolution Timeline](../assets/fundamentals/ai_programming_tools_evolution_timeline.png)
 
 This timeline reveals an important pattern: the direction of AI programming tool evolution has always been "giving LLMs more agency." From only being able to see the current file, to seeing the entire project; from only being able to generate suggestions, to being able to execute commands; from single-step operations to multi-step autonomous planning. Agent Harness is the inevitable architectural product of this evolutionary direction.
 
@@ -161,35 +108,7 @@ But this does not mean the code is bloated. On the contrary, Claude Code's code
 
 Let's use an architectural overview diagram to visually illustrate Claude Code's module organization:
 
-```mermaid
-flowchart TD
-    entry["Entry Module<br/>CLI Parsing · Startup Optimization · React/Ink Initialization"]
-    query["Query Engine<br/>Session State Management · Message History · File Cache · Usage Statistics"]
-    loop["Dialog Main Loop (AsyncGenerator)<br/>Preprocessing Pipeline → API Call → Tool Detection → State Construction"]
-
-    subgraph loop_inner[" "]
-        direction LR
-        p1["Preprocessing Pipeline<br/>Compression/Trimming"] --> p2["API Call<br/>Streaming Reception"] --> p3["Tool Detection<br/>Permission Check"] --> p4["State Construction<br/>Message Backfill"]
-    end
-
-    tools["Tool System<br/>45+ Tools · Orchestration Engine · Concurrency Partitioning"]
-    perm["Permission Pipeline<br/>Four-Stage Check · Five Modes · Rule Persistence"]
-    ext["Extension Layer<br/>MCP Protocol · Sub-Agent Dispatch · Plugin System · Hook Mechanism"]
-
-    entry --> query --> loop
-    loop --- loop_inner
-    loop --> tools
-    loop --> perm
-    perm -.->|Permission Constraints| tools
-    tools --> ext
-
-    classDef module fill:#f0f7ff,stroke:#3b82f6,stroke-width:2px,color:#1e3a5f
-    classDef inner fill:#f8fafc,stroke:#93c5fd,stroke-width:1px,color:#475569
-    classDef extmod fill:#fef9f0,stroke:#f59e0b,stroke-width:2px,color:#78350f
-    class entry,query,loop,module module
-    class p1,p2,p3,p4 inner
-    class tools,perm,ext extmod
-```
+![Claude Code Architecture Overview](../assets/fundamentals/claude_code_architecture_overview.png)
 
 This diagram reveals the layered nature of Claude Code's architecture: from the user entry point at the top to the extension layer at the bottom, each layer has clear responsibilities and interface boundaries. The dialog main loop is the "heart" of the entire system, driving data flow between the various subsystems.