Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1 +1,2 @@
.DS_Store
command.log
73 changes: 3 additions & 70 deletions 00-前言.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,50 +43,8 @@

这三次浪潮可以用一张简明的演进图来概括:

```mermaid
flowchart LR
subgraph wave1["2021-2022"]
A["代码补全\nInline Complete"]
end
subgraph wave2["2023-2024"]
B["对话助手\nChat Assisted"]
end
subgraph wave3["2025 至今"]
C["自主智能体\nAgent Autonomy"]
end

wave1 --> wave2 --> wave3

classDef era fill:#e8f4f8,stroke:#2196F3,stroke-width:2px,color:#1565C0
class A,B,C era
```

```mermaid
flowchart LR
subgraph w1["第一次浪潮"]
direction TB
A1["Tab 接受建议"]
A2["单文件局部补全"]
A3["被动等待触发"]
A4["无执行能力"]
end
subgraph w2["第二次浪潮"]
direction TB
B1["Ctrl+L 对话提问"]
B2["多文件跨文件生成"]
B3["主动理解意图"]
B4["无执行能力"]
end
subgraph w3["第三次浪潮"]
direction TB
C1["直接交付任务"]
C2["跨工具自主编排"]
C3["主动规划执行路径"]
C4["读/写/执行/验证"]
end

w1 --> w2 --> w3
```
![AI 编程范式演进](assets/render_evolution_graph.png)


### Agent Harness:一个新架构概念的诞生

Expand Down Expand Up @@ -164,32 +122,7 @@ flowchart LR

本书分为四个部分,按照从宏观到微观、从概念到实现的组织方式:

```mermaid
flowchart TD
subgraph part1["第一部分:基础篇(第 1-4 章)"]
direction LR
ch1["第1章\n新范式\n全景导览"] --> ch2["第2章\n对话循环\nAgent心跳"] --> ch3["第3章\n工具系统\nAgent双手"] --> ch4["第4章\n权限管线\nAgent护栏"]
end

subgraph part2["第二部分:核心篇(第 5-8 章)"]
core["上下文管理 · 缓存策略 · 流式架构 · 错误恢复"]
end

subgraph part3["第三部分:扩展篇(第 9-12 章)"]
ext["MCP 协议 · 子智能体 · 插件系统 · Hook 机制"]
end

subgraph part4["第四部分:实战篇(第 13-15 章)"]
practice["Mini Agent Harness 构建 · 调试技巧 · 生产部署"]
end

part1 --> part2 --> part3 --> part4

classDef section fill:#f3f9ff,stroke:#3b82f6,stroke-width:2px,color:#1e40af
classDef chapter fill:#eff6ff,stroke:#60a5fa,stroke-width:1px,color:#2563eb
class part1,part2,part3,part4 section
class ch1,ch2,ch3,ch4 chapter
```
![书籍结构概述](assets/book_structure_overview.png)

**如果你时间有限(快速路径):** 至少阅读第 1 章(建立心智模型)和第 2 章(理解核心循环),然后用 15 分钟浏览第 3-4 章的关键要点部分。这两章是理解后续所有内容的基础。

Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -162,7 +162,7 @@

## 背景

2026 年 3 月 31 日,安全研究员 [Chaofan Shou (@Fried_rice)](https://x.com/Fried_rice) 发现 npm registry 中的 `@anthropic-ai/claude-code` 包存在构建配置失误,source map 文件引用了未设访问控制的 Cloudflare R2 存储桶。披露推文获得超 1700 万次浏览,引发了技术社区对 Agent 架构的空前讨论。
2026 年 3 月 31 日,安全研究员 [Chaofan Shou (@Fried_rice)](https://x.com/Fried_rice) 发现 npm registry 中的 `@anthropic-ai/claude-code` 包存在构建配置失误。披露推文获得超 1700 万次浏览,引发了技术社区对 Agent 架构的空前讨论。

这本书的诞生正是受到这场讨论的启发——当 Agent 架构成为热门话题,我们意识到需要一本系统性的书来讲解 Agent Harness 的设计原理。

Expand Down
Binary file added assets/book_structure_overview.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added assets/fundamentals/async_generator_diagram.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added assets/fundamentals/llm_human_interaction.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added assets/fundamentals/message_types_diagram.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added assets/fundamentals/tools_partition_example.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added assets/fundamentals/tools_state_machine.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added assets/fundamentals/turn_lifecycle_diagram.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added assets/render_evolution_graph.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
72 changes: 2 additions & 70 deletions en/00-Foreword.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,50 +14,7 @@ Looking back over the past few years, AI-assisted programming has gone through t

These three waves can be summarized in a concise evolution diagram:

```mermaid
flowchart LR
subgraph wave1["2021-2022"]
A["Code Completion\nInline Complete"]
end
subgraph wave2["2023-2024"]
B["Chat Assistant\nChat Assisted"]
end
subgraph wave3["2025 to present"]
C["Autonomous Agent\nAgent Autonomy"]
end

wave1 --> wave2 --> wave3

classDef era fill:#e8f4f8,stroke:#2196F3,stroke-width:2px,color:#1565C0
class A,B,C era
```

```mermaid
flowchart LR
subgraph w1["First Wave"]
direction TB
A1["Tab to accept suggestions"]
A2["Single-file local completion"]
A3["Passively waiting for triggers"]
A4["No execution capability"]
end
subgraph w2["Second Wave"]
direction TB
B1["Ctrl+L conversational queries"]
B2["Multi-file cross-file generation"]
B3["Proactive intent understanding"]
B4["No execution capability"]
end
subgraph w3["Third Wave"]
direction TB
C1["Direct task delegation"]
C2["Cross-tool autonomous orchestration"]
C3["Proactive execution path planning"]
C4["Read/Write/Execute/Verify"]
end

w1 --> w2 --> w3
```
![Evolution of the AI Programming Paradigm](assets/render_evolution_graph.png)

### Agent Harness: The Birth of a New Architectural Concept

Expand Down Expand Up @@ -135,32 +92,7 @@ This book is suitable for the following readers, each of whom can gain unique va

This book is divided into four parts, organized from macro to micro, from concept to implementation:

```mermaid
flowchart TD
subgraph part1["Part 1: Foundations (Chapters 1-4)"]
direction LR
ch1["Chapter 1\nNew Paradigm\nPanoramic Overview"] --> ch2["Chapter 2\nDialog Loop\nAgent Heartbeat"] --> ch3["Chapter 3\nTool System\nAgent Hands"] --> ch4["Chapter 4\nPermission Pipeline\nAgent Guardrails"]
end

subgraph part2["Part 2: Core (Chapters 5-8)"]
core["Context Management · Cache Strategy · Streaming Architecture · Error Recovery"]
end

subgraph part3["Part 3: Extensions (Chapters 9-12)"]
ext["MCP Protocol · Sub-Agents · Plugin System · Hook Mechanism"]
end

subgraph part4["Part 4: Practice (Chapters 13-15)"]
practice["Mini Agent Harness Build · Debugging Tips · Production Deployment"]
end

part1 --> part2 --> part3 --> part4

classDef section fill:#f3f9ff,stroke:#3b82f6,stroke-width:2px,color:#1e40af
classDef chapter fill:#eff6ff,stroke:#60a5fa,stroke-width:1px,color:#2563eb
class part1,part2,part3,part4 section
class ch1,ch2,ch3,ch4 chapter
```
![Book Structure Overview](assets/book_structure_overview.png)

**If you are short on time (Fast Path):** Read at least Chapter 1 (to establish a mental model) and Chapter 2 (to understand the core loop), then spend 15 minutes browsing the key takeaways sections of Chapters 3-4. These two chapters are the foundation for understanding everything that follows.

Expand Down
91 changes: 5 additions & 86 deletions en/Part-1-Foundations/01-The-New-Paradigm-of-Agent-Programming.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,16 +22,8 @@ In 2023, most developers interacted with LLMs like this: open a web page, enter

This mode can be described with a simple model:

```mermaid
sequenceDiagram
participant Human Send
participant LLM
participant Human Receive
Human Send->>LLM: Prompt
LLM->>Human Receive: Response (Text)
Note over Human Receive,Human Send: Human copies, pastes, executes manually
Human Receive->>Human Send: Manually transport results
```
![llm_human_interaction](../assets/fundamentals/llm_human_interaction.png)


The core limitation of this mode is that the LLM can only "speak," not "act." It cannot read your file system, execute test commands, create Git branches, or autonomously adjust its strategy when encountering errors. Every interaction with the outside world requires a human intermediary to manually complete -- copying code to the editor, switching to the terminal to run commands, and then copying the output back to the dialog box. This "human glue" pattern is not only inefficient but also error-prone.

Expand All @@ -58,31 +50,7 @@ But tool calling also introduces new engineering challenges. These challenges ar

These questions gave birth to a new architectural concept: **Agent Harness**.

```mermaid
flowchart TD
subgraph challenges["Six Engineering Challenges from Tool Calling"]
c1["Tool Registration & Discovery"]
c2["Parameter Validation"]
c3["Permission Control"]
c4["Error Recovery"]
c5["State Consistency"]
c6["Concurrency & Scheduling"]
end

harness["Agent Harness<br/>Unified Runtime Framework"]

c1 --> harness
c2 --> harness
c3 --> harness
c4 --> harness
c5 --> harness
c6 --> harness

classDef challenge fill:#fef9f0,stroke:#f59e0b,stroke-width:1px,color:#92400e
classDef center fill:#e8f4f8,stroke:#2196F3,stroke-width:2px,color:#1565C0
class c1,c2,c3,c4,c5,c6 challenge
class harness center
```
![Six Engineering Challenges](../assets/fundamentals/six_engineering_challenges.png)

### Why Agent Harness Instead of Simple Wrappers

Expand Down Expand Up @@ -126,28 +94,7 @@ Before diving into design philosophy, let's first examine Claude Code's codebase

Before analyzing Claude Code's architecture, let's widen our perspective and look at the evolution of AI programming tools. Understanding this timeline helps us see where Claude Code sits in the technological lineage:

```mermaid
flowchart TD
A["2021.06 GitHub Copilot Technical Preview<br/>First LLM integration into editor"] -->
B["2022.12 ChatGPT Launch<br/>Proves LLM's general conversational ability"] -->
C["2023.03 GPT-4 + Function Calling<br/>LLM transforms from text generator to instruction orchestrator"] -->
D["2023.06 OpenAI Code Interpreter<br/>LLM gains code execution capability for the first time"] -->
E["2023.11 Claude 2.1 + Tool Use<br/>200K context window"] -->
F["2024.01 Devin Launch<br/>First AI software engineer"] -->
G["2024.08 Cursor Agent Mode<br/>Editor-integrated Agent"] -->
H["2024.10 Anthropic Computer Use<br/>LLM can operate GUI"] -->
I["2025.02 Claude Code Launch<br/>Terminal-native Agent"] -->
J["2025.11 Anthropic Publishes MCP<br/>Standardized Agent communication protocol"] -->
K["2026.03 Source Code Accidentally Exposed<br/>Community deeply examines Agent Harness"] -->
L["Now ← You are here"]

classDef event fill:#f0f7ff,stroke:#3b82f6,stroke-width:1px,color:#1e3a5f
classDef milestone fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af
classDef current fill:#fef3c7,stroke:#f59e0b,stroke-width:2px,color:#92400e
class A,B,C,D,E,F,G,H event
class I,J,K milestone
class L current
```
![AI Programming Tools Evolution Timeline](../assets/fundamentals/ai_programming_tools_evolution_timeline.png)

This timeline reveals an important pattern: the direction of AI programming tool evolution has always been "giving LLMs more agency." From only being able to see the current file, to seeing the entire project; from only being able to generate suggestions, to being able to execute commands; from single-step operations to multi-step autonomous planning. Agent Harness is the inevitable architectural product of this evolutionary direction.

Expand All @@ -161,35 +108,7 @@ But this does not mean the code is bloated. On the contrary, Claude Code's code

Let's use an architectural overview diagram to visually illustrate Claude Code's module organization:

```mermaid
flowchart TD
entry["Entry Module<br/>CLI Parsing · Startup Optimization · React/Ink Initialization"]
query["Query Engine<br/>Session State Management · Message History · File Cache · Usage Statistics"]
loop["Dialog Main Loop (AsyncGenerator)<br/>Preprocessing Pipeline → API Call → Tool Detection → State Construction"]

subgraph loop_inner[" "]
direction LR
p1["Preprocessing Pipeline<br/>Compression/Trimming"] --> p2["API Call<br/>Streaming Reception"] --> p3["Tool Detection<br/>Permission Check"] --> p4["State Construction<br/>Message Backfill"]
end

tools["Tool System<br/>45+ Tools · Orchestration Engine · Concurrency Partitioning"]
perm["Permission Pipeline<br/>Four-Stage Check · Five Modes · Rule Persistence"]
ext["Extension Layer<br/>MCP Protocol · Sub-Agent Dispatch · Plugin System · Hook Mechanism"]

entry --> query --> loop
loop --- loop_inner
loop --> tools
loop --> perm
perm -.->|Permission Constraints| tools
tools --> ext

classDef module fill:#f0f7ff,stroke:#3b82f6,stroke-width:2px,color:#1e3a5f
classDef inner fill:#f8fafc,stroke:#93c5fd,stroke-width:1px,color:#475569
classDef extmod fill:#fef9f0,stroke:#f59e0b,stroke-width:2px,color:#78350f
class entry,query,loop,module module
class p1,p2,p3,p4 inner
class tools,perm,ext extmod
```
![Claude Code Architecture Overview](../assets/fundamentals/claude_code_architecture_overview.png)

This diagram reveals the layered nature of Claude Code's architecture: from the user entry point at the top to the extension layer at the bottom, each layer has clear responsibilities and interface boundaries. The dialog main loop is the "heart" of the entire system, driving data flow between the various subsystems.

Expand Down
Loading