feat(script): Autonomous Script Protocol by Mingye-Lu · Pull Request #38 · Mingye-Lu/AgenticCrawler

Mingye-Lu · 2026-06-09T08:15:48Z

Summary

Adds a new Autonomous Script Protocol that lets the LLM (or an external MCP client) execute deterministic multi-step browser automation without per-step LLM round-trips, which cuts latency and token cost for repetitive page patterns.

What's new

New crate: crates/script/

AST grammar (ScriptDefinition, ScriptNode, Expression)
Parser + validator (parse_script, validate_script)
Persistence layer (save_script_to_disk, load_script_from_disk, list_scripts_on_disk)

7 new tools (exposed via agent loop and MCP server):

| Tool | Description |
| --- | --- |
| `run_script` | Execute an inline script or load one by name; returns `script_id` immediately |
| `wait_for_scripts` | Block until script(s) complete; returns full `ScriptResult` |
| `script_status` | Non-blocking poll of running script state |
| `cancel_script` | Abort a running script |
| `save_script` | Persist a script definition to `~/.acrawl/scripts/<name>.json` |
| `list_scripts` | List all saved scripts with ISO 8601 timestamps |
| `read_script` | Read back a saved script definition |

Script nodes: tool_call, assign, collect, yield, for_loop, for_each, while_loop, if_else, try_catch, parallel

Execution engine (crates/agent/src/script_executor/):

Step counter, wall-clock timeout, per-step timeout, output byte limit, cancellation token
Parallel branches share step counter + cancel token; each branch opens its own browser page
errors_caught and output_bytes propagated back from parallel branches to parent
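
As a rough illustration of how parallel branches might share one step counter and one cancel flag, here is a minimal std-only sketch. It is not the executor's actual code: `SharedBudget`, `take_step`, and `run_parallel` are hypothetical names, and OS threads stand in for the async tasks and browser pages the real branches use.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the state shared by all parallel branches.
#[derive(Clone)]
struct SharedBudget {
    steps: Arc<AtomicUsize>,
    cancelled: Arc<AtomicBool>,
    max_steps: usize,
}

impl SharedBudget {
    fn new(max_steps: usize) -> Self {
        Self {
            steps: Arc::new(AtomicUsize::new(0)),
            cancelled: Arc::new(AtomicBool::new(false)),
            max_steps,
        }
    }

    /// Sets the shared cancel flag; every clone observes it.
    fn cancel(&self) {
        self.cancelled.store(true, Ordering::SeqCst);
    }

    /// Claims one step from the shared budget and returns the new total.
    fn take_step(&self) -> Result<usize, &'static str> {
        if self.cancelled.load(Ordering::SeqCst) {
            return Err("cancelled");
        }
        let max = self.max_steps;
        // Atomic compare-and-swap loop, so the counter never exceeds `max`.
        self.steps
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| {
                if n < max { Some(n + 1) } else { None }
            })
            .map(|prev| prev + 1)
            .map_err(|_| "step limit exceeded")
    }
}

/// Runs `branches` threads that each attempt `attempts` steps and
/// returns how many steps succeeded across all branches.
fn run_parallel(budget: &SharedBudget, branches: usize, attempts: usize) -> usize {
    let handles: Vec<_> = (0..branches)
        .map(|_| {
            let b = budget.clone();
            thread::spawn(move || (0..attempts).filter(|_| b.take_step().is_ok()).count())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Because the counter lives behind an `Arc`, four branches attempting five steps each against a budget of ten complete exactly ten steps in total, regardless of scheduling.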

Bugs fixed (post code-review)

| Severity | Fix |
| --- | --- |
| Critical | `Expression` serde tag changed from internally-tagged to adjacently-tagged (`#[serde(tag="kind", content="value")]`); `Literal`, `Variable`, `JsEval` now deserialize correctly from JSON |
| Important | `run_script` ToolSpec schema now exposes `name`, `save_as`, `limits`; removed internal `__load_from_disk` |
| Important | `max_output_bytes` enforced via `push_extracted`/`push_yielded` helpers |
| Important | `cleanup_completed()` removed from `spawn_script`; completed scripts survive until explicitly `wait_for_scripts`-ed |
| Important | `validate_script_name` consolidated into `persistence.rs`; rejects leading dash, dots, path traversal |
| Important | `list_scripts` `modified_at` now returns ISO 8601 UTC (e.g. `2026-06-09T13:39:09Z`) |
| Bug | `spawn_script` in MCP server wrapped in `rt.block_on`; was panicking with `no reactor running` and killing the server process |

Test coverage

17 parser unit tests (including parse_script_expression_round_trip covering all 5 Expression variants through serde round-trip)
31 script executor unit tests (including collect_over_output_byte_limit_fails, yield_over_output_byte_limit_fails)
1 script manager unit test (completed_script_survives_subsequent_spawn_check)
11 script integration tests
1 MCP stdio integration test (stdio_server_run_script_returns_script_id_and_survives)

E2E verified (live MCP session)

All 7 tools exercised end-to-end against the running MCP server:

run_script  → script_id returned, server alive        ✓
wait_for_scripts → status:Completed, extracted_data:[42], yielded_data:["done"]  ✓
run_script (navigate + literal + variable) → extracted_data:["Example Domain"]  ✓
script_status → live step/items_collected/elapsed_secs  ✓
cancel_script → status:Cancelled, items_collected:0  ✓
save_script → saved to disk  ✓
list_scripts → ISO 8601 modified_at  ✓
read_script → full definition round-tripped including field_access expression  ✓

Change Expression enum from internally-tagged (#[serde(tag = "kind")]) to adjacently-tagged (#[serde(tag = "kind", content = "value")]) so that newtype variants (Literal, Variable, JsEval) deserialize correctly from JSON.

Update expression_to_value in parser.rs to wrap FieldAccess and ArrayIndex struct fields under a value key, matching the new wire format.

Add parse_script_expression_round_trip test covering all five Expression variants through serde_json::to_value -> parse_script.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
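
To make the adjacently-tagged shape concrete, the sketch below hand-writes the JSON that such a layout would produce, using only the standard library. It is an illustration, not the crate's code: the payload types, the snake_case `kind` strings, and the `object`/`field`/`array`/`index` field names are all assumptions, and `to_wire` is a hypothetical helper (the PR relies on serde derives instead).

```rust
/// Simplified stand-in for the PR's `Expression` enum.
/// Payload types and struct field names are assumptions for illustration.
enum Expression {
    Literal(i64),
    Variable(String),
    JsEval(String),
    FieldAccess { object: Box<Expression>, field: String },
    ArrayIndex { array: Box<Expression>, index: usize },
}

impl Expression {
    /// Renders the adjacently-tagged form: the variant name goes under
    /// "kind" and the payload, whatever its shape, goes under "value".
    /// `{:?}` quoting approximates JSON string escaping for plain ASCII.
    fn to_wire(&self) -> String {
        let (kind, value) = match self {
            Expression::Literal(n) => ("literal", n.to_string()),
            Expression::Variable(name) => ("variable", format!("{name:?}")),
            Expression::JsEval(src) => ("js_eval", format!("{src:?}")),
            Expression::FieldAccess { object, field } => (
                "field_access",
                format!(r#"{{"object":{},"field":{:?}}}"#, object.to_wire(), field),
            ),
            Expression::ArrayIndex { array, index } => (
                "array_index",
                format!(r#"{{"array":{},"index":{}}}"#, array.to_wire(), index),
            ),
        };
        format!(r#"{{"kind":"{kind}","value":{value}}}"#)
    }
}
```

With a separate `value` key, a payload that is a bare number or string has somewhere to live. An internally-tagged layout has to merge the tag into the payload itself, which only works when the payload is a map or struct.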

…mestamp

Strengthen persistence::validate_script_name to reject leading dashes, dots, and non-normal path components (matching the stricter rules previously duplicated in save_script.rs and read_script.rs).

Remove the duplicated local validate_script_name from save_script.rs and read_script.rs; both now delegate to script::persistence::validate_script_name.

Fix format_system_time in list_scripts.rs: it was using Debug format ({:?}), producing unreadable output; it now returns Unix epoch seconds.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
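
A name check along these lines could be sketched as follows. This is an assumption-laden illustration rather than the contents of persistence.rs: it reads "leading dashes, dots" as rejecting a leading `-` or a leading `.`, and the error strings are made up.

```rust
use std::path::{Component, Path};

/// Sketch of a script-name validator. The exact rules in the PR may differ.
fn validate_script_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("script name is empty".to_string());
    }
    // A leading dash could be mistaken for a CLI flag; a leading dot
    // would create a hidden file or alias "." / "..".
    if name.starts_with('-') || name.starts_with('.') {
        return Err(format!("script name {name:?} starts with '-' or '.'"));
    }
    // The name must be exactly one normal path component that round-trips
    // unchanged, which rules out "a/b", "/etc", "../x", and "a/".
    let mut components = Path::new(name).components();
    match (components.next(), components.next()) {
        (Some(Component::Normal(part)), None) if part.to_str() == Some(name) => Ok(()),
        _ => Err(format!("script name {name:?} is not a plain file name")),
    }
}
```

Comparing the single component back against the input catches inputs such as `a/`, which `Path::components` would otherwise normalize to a single `a`.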

The run_script handler already accepted name (load saved script), save_as (persist after run), and limits (override defaults), but the ToolSpec input_schema only declared script and the internal __load_from_disk field, hiding the other params from the LLM.

Replace __load_from_disk with the three user-facing properties and update the instructions field to reference name instead of the internal marker.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

…helpers

ScriptLimits::max_output_bytes was set by effective_limits() but never checked during execution; extracted_data and yielded_data could grow without bound.

Add output_bytes: usize field to ScriptExecutor. Replace the inline Collect/Yield node handling in mod.rs with calls to push_extracted and push_yielded from data.rs; both helpers now check and accumulate the byte count, returning ScriptExecutionError::ToolError on overflow.

Remove the #[allow(dead_code)] on the data.rs impl block and delete the three helpers that were never called (store_variable, variables, extracted_data).

Fix the doc-comment Expression examples to match the adjacently-tagged wire format (kind + value).

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
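
The check-then-accumulate pattern described above might look roughly like this. It is a sketch under assumptions: `OutputBuffer` is a hypothetical type, items are modeled as already-serialized strings, and a plain `String` error stands in for ScriptExecutionError::ToolError.

```rust
/// Hypothetical holder for collected output and its byte budget.
struct OutputBuffer {
    extracted: Vec<String>,
    yielded: Vec<String>,
    output_bytes: usize,
    max_output_bytes: usize,
}

impl OutputBuffer {
    fn new(max_output_bytes: usize) -> Self {
        Self { extracted: Vec::new(), yielded: Vec::new(), output_bytes: 0, max_output_bytes }
    }

    /// Reserves `len` bytes of the shared budget, or fails without
    /// changing the running total.
    fn reserve(&mut self, len: usize) -> Result<(), String> {
        let total = self
            .output_bytes
            .checked_add(len)
            .filter(|t| *t <= self.max_output_bytes)
            .ok_or_else(|| "output size limit exceeded".to_string())?;
        self.output_bytes = total;
        Ok(())
    }

    /// Both helpers draw from the same budget, so Collect and Yield
    /// nodes cannot exceed the limit in combination.
    fn push_extracted(&mut self, item: String) -> Result<(), String> {
        self.reserve(item.len())?;
        self.extracted.push(item);
        Ok(())
    }

    fn push_yielded(&mut self, item: String) -> Result<(), String> {
        self.reserve(item.len())?;
        self.yielded.push(item);
        Ok(())
    }
}
```

Routing every write through one `reserve` step is what keeps the limit from being bypassed by whichever node type happens to be handled inline.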

…eted spawn race

Two executor-layer safety fixes:

1. ParallelBranchResult now carries errors_caught and output_bytes. After all branches complete, the parent merges both counters back, so TryCatch nodes inside parallel branches correctly contribute to the overall errors_caught tally and the output byte budget.
2. Remove the cleanup_completed() call from spawn_script(). It was removing finished scripts from the map before the caller could retrieve results via wait_for_scripts, producing spurious NotFound errors for fast-completing scripts. check_can_spawn() already counts only running (non-finished) handles, so the concurrent cap is unaffected.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
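
The second fix can be pictured with a small std-only model of the manager. Everything here is a simplification: the `u64` ids, the two-state `ScriptStatus`, `mark_completed`, and the non-blocking `wait_for_script` are illustrative assumptions, not the real ScriptManager API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum ScriptStatus {
    Running,
    Completed,
}

/// Simplified model of a script registry with a concurrency cap.
struct ScriptManager {
    scripts: HashMap<u64, ScriptStatus>,
    max_concurrent: usize,
    next_id: u64,
}

impl ScriptManager {
    fn new(max_concurrent: usize) -> Self {
        Self { scripts: HashMap::new(), max_concurrent, next_id: 0 }
    }

    /// Counts only running entries, so finished ones never block a spawn.
    fn check_can_spawn(&self) -> Result<(), String> {
        let running = self.scripts.values().filter(|s| **s == ScriptStatus::Running).count();
        if running >= self.max_concurrent {
            Err(format!("concurrent script limit reached ({running} running)"))
        } else {
            Ok(())
        }
    }

    /// Registers a new script. Note there is no cleanup of finished
    /// entries here, so their results stay retrievable.
    fn spawn_script(&mut self) -> Result<u64, String> {
        self.check_can_spawn()?;
        let id = self.next_id;
        self.next_id += 1;
        self.scripts.insert(id, ScriptStatus::Running);
        Ok(id)
    }

    fn mark_completed(&mut self, id: u64) {
        if let Some(status) = self.scripts.get_mut(&id) {
            *status = ScriptStatus::Completed;
        }
    }

    fn get_status(&self, id: u64) -> Option<ScriptStatus> {
        self.scripts.get(&id).copied()
    }

    /// Hands back a finished entry and removes it; this is the only
    /// place entries leave the map in this sketch.
    fn wait_for_script(&mut self, id: u64) -> Option<ScriptStatus> {
        if self.scripts.get(&id) == Some(&ScriptStatus::Completed) {
            self.scripts.remove(&id)
        } else {
            None
        }
    }
}
```

In this model a script that finishes before the next spawn is still visible to `get_status` and `wait_for_script`, while the cap continues to apply to running scripts only.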

The wait_for_subagents bullet inside section_parallel_exploration had extra leading spaces (9) vs its sibling bullets (7), producing slightly misaligned output. Normalize to the consistent 7-space indent.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

…mpleted race

Two tests that directly exercise the fix behaviours that cannot be driven through the MCP server in a headless shell (run_script requires an active Playwright browser):

collect_over_output_byte_limit_fails / yield_over_output_byte_limit_fails
Verify that push_extracted / push_yielded return ScriptStatus::Failed with an 'output size limit exceeded' message when the accumulated output exceeds ScriptLimits::max_output_bytes.

completed_script_survives_subsequent_spawn_check
Pre-populates ScriptManager with a finished entry, calls check_can_spawn (formerly spawn_script would call cleanup_completed here), and asserts the completed entry is still retrievable via get_status, proving the cleanup race is gone.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

…me context

spawn_script calls tokio::task::spawn internally but was being invoked from synchronous code outside any block_on, causing an immediate panic:

'there is no reactor running, must be called from the context of a Tokio 1.x runtime'

Wrap the call in rt.block_on(async { ... }) to enter the runtime context before spawning, matching the pattern already used by wait_for_scripts.

Add stdio integration test (stdio_server_run_script_returns_script_id_and_survives) that drives the full run_script -> wait_for_scripts flow through the MCP binary, verifying the server stays alive and returns correct extracted_data / yielded_data.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

…ix epoch

format_system_time previously returned a bare Unix epoch integer (e.g. '1780991949'). Now uses time::OffsetDateTime + Rfc3339 to return a human-readable UTC timestamp (e.g. '2026-06-09T13:39:09Z'). Uses i64::try_from to avoid the clippy::cast_possible_wrap lint on u64->i64 conversion.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
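
For readers without the `time` crate at hand, the same conversion can be sketched with the standard library alone. This is a swapped-in technique, not what the commit does: the commit uses time::OffsetDateTime with Rfc3339, whereas the sketch below uses the well-known days-to-civil-date arithmetic. `civil_from_days` and `format_epoch_secs` are hypothetical helper names.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Converts days since 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar, using 400-year eras that start on March 1.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month index, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the following civil year.
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Formats Unix epoch seconds as an RFC 3339 / ISO 8601 UTC timestamp.
fn format_epoch_secs(secs: i64) -> String {
    let (y, m, d) = civil_from_days(secs.div_euclid(86_400));
    let rem = secs.rem_euclid(86_400);
    format!("{y:04}-{m:02}-{d:02}T{:02}:{:02}:{:02}Z", rem / 3_600, rem % 3_600 / 60, rem % 60)
}

/// Mirrors the commit's use of `i64::try_from` for the u64 -> i64 step.
/// Returns None for times before the epoch or beyond i64 range.
fn format_system_time(t: SystemTime) -> Option<String> {
    let secs = t.duration_since(UNIX_EPOCH).ok()?.as_secs();
    Some(format_epoch_secs(i64::try_from(secs).ok()?))
}
```

In the crate itself the `time` dependency remains the simpler choice; the arithmetic is shown only to make explicit what an RFC 3339 formatter has to compute.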

Mingye-Lu and others added 16 commits June 8, 2026 21:01

feat(script): add crate scaffolding and core types

08a6f1d

feat(script): implement parser and executor

45f76da

feat(script): add tool handlers and agent integration

7a083f7

feat(script): MCP exposure and system prompt update

7828d68

test(script): add unit and integration tests

1b440af

fix(script): resolve clippy warnings and fmt violations

2454171

fix(cli): add digit separators to large integer literals in mcp_stdio…

fb29544

… test Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Mingye-Lu merged commit 0eb6c8f into main Jun 9, 2026
4 checks passed

Mingye-Lu deleted the feat/script-protocol branch June 9, 2026 08:25

Mingye-Lu merged 16 commits into main from feat/script-protocol
