Mingye-Lu · Jun 11, 2026 · Jun 10, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -8,7 +8,7 @@
 
 ```bash
 cargo build --release                                        # produce ./target/release/acrawl
-cargo test --workspace                                       # run full test suite (~770 tests)
+cargo test --workspace                                       # run full test suite (~1,100 tests)
 cargo test -p <crate> <test_name>                            # run a single test (e.g. -p agent mvp_tool_specs_contains_expected_21_tools)
 cargo clippy --workspace --all-targets -- -D warnings        # lints must be clean (workspace lints set pedantic = warn)
 cargo fmt --check                                            # format check
@@ -78,6 +78,48 @@ Default model comes from the `default_model` field in the active provider's `Sto
 
 `agent::mvp_tool_specs()` returns the canonical 21-tool list with JSON schemas and required permission. When you add or rename a tool, update `mvp_tool_specs`, add a handler in `tools/mod.rs`, and adjust the count assertion in `crates/agent/src/lib.rs` tests.
 
+## Optimization layer
+
+14 vendor-derived optimizations live in `crates/agent/src/` and `crates/runtime/src/`. Each is gated by a `settings.optimization.*` field and is OFF by default. Every optimization builds on the shared infrastructure described below.
+
+### Shared infrastructure (must understand before touching any optimization)
+
+**`DynamicPromptContext`** (`crates/agent/src/prompt.rs`) — four optional string fields (`stagnation_alert`, `planning_guidance`, `budget_warning`, `loop_nudge`). `build_system_prompt(specs, Some(&ctx))` appends the context as section 9 of the system prompt.
+
+**Arc slot pattern** — `CrawlerAgent` and `ConversationRuntime` share three Arc slots created in `run_with_system_prompt()`:
+- `prompt_override: Arc<Mutex<Option<Vec<String>>>>` — agent writes a new full system prompt here after any tool execution; runtime applies it before the next API call in `prepare_iteration()`.
+- `last_assistant_text: Arc<Mutex<Option<String>>>` — runtime writes the latest assistant response text here; agent reads it for confidence parsing.
+- `cumulative_cost: Arc<AtomicU64>` (millicents) — runtime updates it after each usage record; agent reads it for budget enforcement.
+
+All three slots are internal to `ConversationRuntime` (not constructor parameters) but accessible via getters. The agent gets the cost counter via `runtime.cumulative_cost_counter()` after construction.
+
+### Per-optimization modules
+
+| Module | Location | What it adds to `CrawlState` / `CrawlerAgent` |
+|--------|----------|-----------------------------------------------|
+| `page_fingerprint` | `crates/agent/src/page_fingerprint.rs` | `CrawlState.page_fingerprints: Vec<PageFingerprint>` |
+| `tools/html_diff` | `crates/agent/src/tools/html_diff.rs` | `CrawlState.html_diff_tracker: Option<HtmlDiffTracker>` |
+| `loop_detector` | `crates/agent/src/loop_detector.rs` | `CrawlState.loop_detector: Option<LoopDetector>` |
+| `failure_classifier` | `crates/agent/src/failure_classifier.rs` | (pure function — no state) |
+| `self_healing` | `crates/agent/src/self_healing.rs` | (pure function — no state) |
+| `action_cache` | `crates/agent/src/action_cache.rs` | `CrawlState.action_cache: Option<ActionCache>` |
+| `confidence` | `crates/agent/src/confidence.rs` | `CrawlerAgent.confidence_tracker: Option<ConfidenceTracker>` |
+| `budget` | `crates/runtime/src/budget.rs` | `CrawlerAgent.cumulative_cost_slot: SharedCostCounter` |
+
+### Where optimizations run
+
+All optimization logic runs inside `CrawlerAgent::execute()` in `crates/agent/src/implementation/mod.rs`. The execution order (each guarded by its settings flag):
+1. **Action cache lookup** — before the tool runs (returns cached result if hit)
+2. **Tool execution** — normal handler dispatch
+3. **Self-healing retry** — on SelectorNotFound/SelectorAmbiguous
+4. **Loop detection** — records action + fingerprint, writes nudge to prompt_override_slot
+5. **Planning interval** — injects planning/execution guidance at step N
+6. **Confidence tracking** — reads last_assistant_text slot, parses `[confidence: ...]`
+7. **Budget enforcement** — reads cumulative_cost_slot, warns or blocks
+8. **Action cache store** — stores result after successful read-only tool call
+
+`CrawlState` fields are ephemeral (never persisted to session files). Adding a new field requires no serde changes.
+
 ## Conventions specific to this repo
 
 - **Always run `cargo fmt` before committing.** CI checks formatting with `cargo fmt --check` — commits that fail this check will be rejected.

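The Arc slot pattern described in the AGENTS.md hunk above can be sketched with the standard library alone. This is a minimal illustration, not the code in `crates/runtime`: the `SharedSlots` struct and its method names are hypothetical, and having the runtime consume the override with `take()` is an assumption about how "applies it before the next API call" might work. Only the three field names and their types come from the diff.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical bundle of the three shared slots; field names and types follow AGENTS.md.
#[derive(Clone, Default)]
pub struct SharedSlots {
    /// The agent writes a full replacement system prompt here after a tool runs.
    pub prompt_override: Arc<Mutex<Option<Vec<String>>>>,
    /// The runtime writes the latest assistant response text here.
    pub last_assistant_text: Arc<Mutex<Option<String>>>,
    /// Running session cost in millicents, updated by the runtime.
    pub cumulative_cost: Arc<AtomicU64>,
}

impl SharedSlots {
    /// Agent side: queue a new system prompt for the next iteration.
    pub fn queue_prompt(&self, sections: Vec<String>) {
        *self.prompt_override.lock().unwrap() = Some(sections);
    }

    /// Runtime side: consume a pending override, leaving the slot empty (assumed behavior).
    pub fn take_prompt(&self) -> Option<Vec<String>> {
        self.prompt_override.lock().unwrap().take()
    }

    /// Runtime side: add the cost of one usage record and return the new total.
    pub fn record_cost(&self, millicents: u64) -> u64 {
        self.cumulative_cost.fetch_add(millicents, Ordering::Relaxed) + millicents
    }

    /// Agent side: read the running total for budget checks.
    pub fn cost_millicents(&self) -> u64 {
        self.cumulative_cost.load(Ordering::Relaxed)
    }
}
```

Cloning `SharedSlots` clones the `Arc` handles, not the data, so the agent and the runtime each hold a handle to the same three slots. The cost counter uses an atomic instead of a mutex because it is a single integer that only needs lock-free addition and reads.
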
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,25 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [Unreleased]
+
+### Added
+
+- **HTML Diff Mode** (`optimization.html_diff_mode`) — on repeated visits to the same URL, only changed content sections are returned with `[unchanged: N sections]` markers, reducing token usage 50–70% on multi-turn sessions.
+- **Action Loop Detection** (`optimization.loop_detection`) — rolling-window action hash detects repeated identical actions with escalating nudges (soft at 5, medium at 8, strong at 12 repeats); page stagnation detection after 5 consecutive identical page fingerprints.
+- **Page Fingerprinting** (`optimization.page_fingerprinting`) — lightweight FNV-1a fingerprint (url + element_count + first-1000-char text hash) stored in CrawlState; used by loop detection and action caching for cache invalidation.
+- **Planning Interval** (`optimization.planning_interval`) — every N steps injects planning-checkpoint or execution-mode guidance into the dynamic prompt; disabled by default (interval=0).
+- **Failure Classification** (`optimization.failure_classification`) — 16-category keyword-based error taxonomy (zero LLM cost); `classify()` maps error messages to SelectorNotFound, CaptchaDetected, RateLimited, etc.; `retry_strategy()` returns RetryWithHealing, RetryWithDelay, NoRetry, or ResetAndRetry per category.
+- **Self-Healing Selectors** (`optimization.self_healing`) — on SelectorNotFound/SelectorAmbiguous, fetches a fresh page_map and text-matches to the correct element ref; logs `[healed: @eOLD → @eNEW]`; zero LLM calls; max retries configurable (default 2).
+- **Action Caching** (`optimization.action_caching`) — in-memory FNV-1a keyed cache for read-only tools (`page_map`, `read_content`, `list_resources`, `execute_js`); invalidated on page fingerprint change; TTL-based expiry (default 30s); interaction tools never cached.
+- **Confidence Tracking** (`optimization.confidence_tracking`) — parses `[confidence: HIGH/MEDIUM/LOW]` from assistant responses; 2+ consecutive LOWs triggers stagnation alert via DynamicPromptContext; advisory only, never blocks.
+- **Compound Component Enrichment** (`optimization.compound_enrichment`) — extends interactive element JSON with an `enrichment` field for complex form controls: date format hints, range min/max/step/value, number bounds, select option lists (max 20 + overflow count), file accept types, textarea maxlength. Max 200 bytes/element.
+- **Content-Aware Cleaning Profiles** (`optimization.content_aware_profiles`) — `CleaningProfile` enum (Default/Minimal/Aggressive/ReadingMode) auto-selected by task keyword and content size; `select_profile()` picks ReadingMode for extraction tasks, Minimal for interaction tasks, Aggressive for content > 50KB.
+- **Budget Enforcement** (`optimization.budget_max_session_cost_usd`, `optimization.budget_enforcement`) — `BudgetEnforcer` with Warn/Block modes; Warn injects budget warning into the dynamic prompt at configurable threshold (default 80%); Block terminates the agent loop cleanly when the cost limit is reached.
+- **Per-Agent Cost Attribution** (`optimization.per_agent_cost_tracking`) — `build_cost_breakdown()` walks flat child sessions and reconstructs per-child cost via UsageTracker; `/cost` command shows per-agent breakdown when flag is ON.
+- **Dynamic System Prompt Infrastructure** — `DynamicPromptContext` struct with four optional fields (stagnation_alert, planning_guidance, budget_warning, loop_nudge); injected as section 9 of the system prompt via a shared `Arc<Mutex<>>` slot; all optimizations write to this slot, runtime picks up on the next iteration.
+- **Optimization Settings Schema** — nested `OptimizationSettings` struct in `Settings` with 18 fields, all `Option<T>` and defaulting to OFF for backward compatibility; 18 `settings_get_*` getter functions.
+
 ## [0.9.1] - 2026-06-10
 
 ### Changed

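The loop-detection entry in the CHANGELOG hunk above combines an action hash, a rolling window, and escalating nudges. The sketch below shows one way those pieces could fit together. The FNV-1a constants are the standard 64-bit ones and the thresholds (5, 8, 12) come from the changelog. The `LoopWindow` type, the `Nudge` enum, the tool-plus-input hash key, and counting repeats as occurrences inside the window are all assumptions made for illustration.

```rust
use std::collections::VecDeque;

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Standard 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(FNV_OFFSET, |hash, byte| (hash ^ u64::from(*byte)).wrapping_mul(FNV_PRIME))
}

/// Hypothetical nudge levels matching the changelog's soft/medium/strong escalation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Nudge {
    None,
    Soft,
    Medium,
    Strong,
}

/// Hypothetical rolling window of recent action hashes.
pub struct LoopWindow {
    window: VecDeque<u64>,
    capacity: usize,
}

impl LoopWindow {
    pub fn new(capacity: usize) -> Self {
        let capacity = capacity.max(1);
        Self { window: VecDeque::with_capacity(capacity), capacity }
    }

    /// Records one action and returns the nudge level for its repeat count in the window.
    pub fn record(&mut self, tool: &str, input: &str) -> Nudge {
        // Assumed key: tool name and raw input joined by a NUL separator.
        let hash = fnv1a(format!("{}\u{0}{}", tool, input).as_bytes());
        if self.window.len() == self.capacity {
            self.window.pop_front(); // drop the oldest action once the window is full
        }
        self.window.push_back(hash);
        let repeats = self.window.iter().filter(|h| **h == hash).count();
        match repeats {
            0..=4 => Nudge::None,
            5..=7 => Nudge::Soft,
            8..=11 => Nudge::Medium,
            _ => Nudge::Strong,
        }
    }
}
```

With the default window of 20 from the README table, the fifth identical `record("click", "@e12")` call would return `Nudge::Soft` under these assumptions, and an unrelated action in between would still return `Nudge::None`.
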
diff --git a/Cargo.lock b/Cargo.lock
diff --git a/README.md b/README.md
@@ -722,6 +722,29 @@ Created with defaults on first run.
 | `browser_backend` | `null` | Active browser backend: `"extension"` or `null` (CloakBrowser) |
 | `extension_bridge_port` | `19876` | Port for Chrome extension bridge WebSocket server |
 
+All fields are optional; omitting a field uses the default. The `optimization` block accepts a nested object with the following fields. Every optimization is disabled by default (`false`, `0`, or `null`), tuning fields use the defaults listed, and the whole block is safe to omit:
+
+| Field | Default | Description |
+|-------|---------|-------------|
+| `html_diff_mode` | `false` | On repeated visits to the same URL, returns only changed content sections with `[unchanged: N sections]` markers. 50 to 70% token reduction on multi-turn sessions. No behavior change on first visit. |
+| `content_aware_profiles` | `false` | Auto-selects a cleaning profile based on the task keyword and content size: ReadingMode for extraction tasks, Minimal for interaction tasks, Aggressive for content > 50KB. |
+| `loop_detection` | `false` | Detects repeated identical actions and injects escalating nudges (soft, medium, strong). Also detects page stagnation. |
+| `loop_detection_window` | `20` | Rolling window size for action hash comparison. |
+| `loop_nudge_threshold` | `5` | Number of repeated actions before first nudge fires. |
+| `page_fingerprinting` | `false` | Enables lightweight page fingerprints used by loop detection and action caching. |
+| `failure_classification` | `false` | Classifies errors into 16 categories (SelectorNotFound, CaptchaDetected, RateLimited, etc.) using keyword matching. Zero LLM cost. |
+| `self_healing` | `false` | On SelectorNotFound/SelectorAmbiguous, fetches a fresh page_map and text-matches to a replacement element ref. Logs `[healed: @eOLD -> @eNEW]`. Zero LLM calls. |
+| `self_healing_max_retries` | `2` | Max healing attempts per failed action. |
+| `action_caching` | `false` | Caches results of read-only tools (`page_map`, `read_content`, `list_resources`, `execute_js`) keyed by tool + input + page fingerprint. Cache is invalidated when the page changes. |
+| `action_cache_ttl_secs` | `30` | Cache entry TTL in seconds. |
+| `planning_interval` | `0` | Every N steps, injects a planning checkpoint into the system prompt. 0 = disabled. |
+| `confidence_tracking` | `false` | Asks the LLM to self-report confidence after each action (`[confidence: HIGH/MEDIUM/LOW]`). Two consecutive LOWs trigger a stagnation alert. |
+| `compound_enrichment` | `false` | Adds `enrichment` metadata to complex form controls in page_map: date format hints, range min/max/value, select option lists (max 20 + overflow count), file accept types, textarea maxlength. Max 200 bytes per element. |
+| `budget_max_session_cost_usd` | `null` | Session cost limit in USD. Null = no limit. |
+| `budget_enforcement` | `null` | How to enforce the budget: `warn` injects a warning into the prompt; `block` terminates the session when the limit is reached. |
+| `budget_warn_threshold_pct` | `80` | Percentage of budget at which warnings start. |
+| `per_agent_cost_tracking` | `false` | When ON, `/cost` shows a per-child-agent cost breakdown. |
+
 ### Environment Variables
 
 | Variable | Description |
@@ -730,6 +753,43 @@ Created with defaults on first run.
 
 Provider-specific env vars (see [provider table](#24-llm-providers) above) are read as fallbacks when no `credentials.json` entry exists.
 
+### Performance Optimizations
+
+acrawl ships 14 vendor-derived optimizations (sourced from browser-use, Stagehand, crawl4ai, Skyvern, Spider, nanobrowser, and ZeroClaw). All are **disabled by default**; enable them selectively via `settings.json`.
+
+Example `settings.json` with a cost-optimized profile:
+
+```json
+{
+  "optimization": {
+    "html_diff_mode": true,
+    "action_caching": true,
+    "page_fingerprinting": true,
+    "loop_detection": true,
+    "self_healing": true,
+    "budget_max_session_cost_usd": 0.50,
+    "budget_enforcement": "warn"
+  }
+}
+```
+
+| Optimization | Flag | Benefit |
+|--------------|------|---------|
+| **HTML Diff Mode** | `html_diff_mode` | Reduces tokens by 50 to 70% on repeated visits by returning only changed content. |
+| **Content-Aware Profiles** | `content_aware_profiles` | Auto-selects cleaning profiles (ReadingMode, Minimal, Aggressive) based on task. |
+| **Loop Detection** | `loop_detection` | Prevents infinite loops by detecting repeated actions and injecting nudges. |
+| **Page Fingerprinting** | `page_fingerprinting` | Generates lightweight page fingerprints for loop detection and action caching. |
+| **Failure Classification** | `failure_classification` | Classifies errors into 16 categories using keyword matching with zero LLM cost. |
+| **Self-Healing** | `self_healing` | Automatically heals broken selectors using text-matching with zero LLM calls. |
+| **Action Caching** | `action_caching` | Caches read-only tool results to avoid redundant tool executions while the page is unchanged. |
+| **Planning Interval** | `planning_interval` | Injects periodic planning checkpoints to keep the agent focused. |
+| **Confidence Tracking** | `confidence_tracking` | Tracks LLM self-reported confidence to alert on stagnation. |
+| **Compound Enrichment** | `compound_enrichment` | Enriches complex form controls in the page map with metadata. |
+| **Budget Limit** | `budget_max_session_cost_usd` | Sets the session cost limit in USD; `budget_enforcement` decides whether reaching it warns or blocks. |
+| **Budget Enforcement** | `budget_enforcement` | Controls whether to warn or block when the session budget is reached. |
+| **Budget Warning** | `budget_warn_threshold_pct` | Triggers warnings when a percentage of the budget is consumed. |
+| **Per-Agent Cost Tracking** | `per_agent_cost_tracking` | Breaks down costs per child agent in the `/cost` command. |
+
 ## Known Limitations
 
 acrawl works well on most public web content, but some situations are outside what the agent can reliably handle:
@@ -780,7 +840,7 @@ crates/
   commands/     17 slash commands with resume-safety annotations
 ```
 
-11 crates, ~38K lines of Rust, 770 tests.
+11 crates, ~40K lines of Rust, 1,097 tests.
 
 ## Development
 

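The three budget fields in the README hunk above (`budget_max_session_cost_usd`, `budget_enforcement`, `budget_warn_threshold_pct`) interact in a way that a small sketch makes easier to follow. The code below is an illustration under stated assumptions, not the `BudgetEnforcer` from `crates/runtime/src/budget.rs`: the `BudgetSketch` type, the `BudgetDecision` enum, and the choice to emit warnings in both modes are hypothetical. The millicent unit comes from AGENTS.md; 1 USD is 100,000 millicents.

```rust
/// Hypothetical mirror of the `budget_enforcement` setting values `warn` and `block`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Enforcement {
    Warn,
    Block,
}

/// Hypothetical outcome of one budget check.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BudgetDecision {
    Continue,
    /// Text that would be placed in the dynamic prompt's budget warning field.
    Warn(String),
    /// The agent loop should end cleanly.
    Stop,
}

pub struct BudgetSketch {
    limit_millicents: u64,
    warn_threshold_pct: u64,
    mode: Enforcement,
}

impl BudgetSketch {
    pub fn new(max_usd: f64, warn_threshold_pct: u64, mode: Enforcement) -> Self {
        // 1 USD = 100 cents = 100_000 millicents.
        let limit_millicents = (max_usd * 100_000.0).round() as u64;
        Self { limit_millicents, warn_threshold_pct, mode }
    }

    /// Compares the running cost against the limit and the warning threshold.
    pub fn check(&self, spent_millicents: u64) -> BudgetDecision {
        if self.mode == Enforcement::Block && spent_millicents >= self.limit_millicents {
            return BudgetDecision::Stop;
        }
        // Assumption: both modes warn once the threshold percentage is crossed.
        let warn_at = self.limit_millicents.saturating_mul(self.warn_threshold_pct) / 100;
        if spent_millicents >= warn_at {
            return BudgetDecision::Warn(format!(
                "budget warning: spent {} of {} millicents",
                spent_millicents, self.limit_millicents
            ));
        }
        BudgetDecision::Continue
    }
}
```

Using the example profile above (`0.50` USD, `warn`) with the default threshold of 80, the limit is 50,000 millicents and warnings would start at 40,000. In this sketch only `Block` mode ever returns `Stop`; `Warn` mode keeps warning past the limit.
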
diff --git a/crates/agent/Cargo.toml b/crates/agent/Cargo.toml
@@ -14,6 +14,7 @@ regex = "1"
 runtime = { path = "../runtime" }
 script = { path = "../script" }
 serde_json = "1"
+sha2 = "0.10"
 time = { version = "0.3", features = ["formatting"] }
 tokio = { version = "1", features = ["sync", "time", "fs"] }
 tokio-util = { version = "0.7", default-features = false }