feat(script): Autonomous Script Protocol#38
Merged
Change Expression enum from internally-tagged (#[serde(tag = "kind")]) to
adjacently-tagged (#[serde(tag = "kind", content = "value")]) so that
newtype variants (Literal, Variable, JsEval) deserialize correctly from JSON.
Update expression_to_value in parser.rs to wrap FieldAccess and ArrayIndex
struct fields under a value key, matching the new wire format.
Add parse_script_expression_round_trip test covering all five Expression
variants through serde_json::to_value -> parse_script.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
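To illustrate why the tagging change matters, here is a std-only sketch that models the two layouts by hand. It does not use serde; the Json type, both function names, the wire spelling of the variant names, and the "field" key are assumptions made for illustration only. The point it shows: an internally-tagged layout has to merge the tag into the payload's own map, which is impossible when the payload is a bare string or number, while an adjacently-tagged layout always has room for both.

```rust
// Minimal JSON model, just enough to show the two tagging layouts.
// (Hypothetical type; the real code uses serde and serde_json.)
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Str(String),
    Num(f64),
    Obj(Vec<(String, Json)>),
}

/// Internally-tagged layout: the tag is merged into the payload's own map,
/// so it only works when the payload is an object.
pub fn internally_tagged(kind: &str, payload: Json) -> Option<Json> {
    match payload {
        Json::Obj(mut fields) => {
            fields.insert(0, ("kind".to_string(), Json::Str(kind.to_string())));
            Some(Json::Obj(fields))
        }
        // A bare string or number has no map that could hold the tag.
        _ => None,
    }
}

/// Adjacently-tagged layout: tag and payload sit side by side under
/// the "kind" and "value" keys, whatever shape the payload has.
pub fn adjacently_tagged(kind: &str, payload: Json) -> Json {
    Json::Obj(vec![
        ("kind".to_string(), Json::Str(kind.to_string())),
        ("value".to_string(), payload),
    ])
}
```

In this model a Literal wrapping a plain string cannot be represented with internal tagging at all, while struct-like variants such as FieldAccess work in both layouts, with their fields moving under the value key in the adjacent one.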
…mestamp
Strengthen persistence::validate_script_name to reject leading dashes,
dots, and non-normal path components (matching the stricter rules
previously duplicated in save_script.rs and read_script.rs).
Remove the duplicated local validate_script_name from save_script.rs and
read_script.rs; both now delegate to script::persistence::validate_script_name.
Fix format_system_time in list_scripts.rs: was using Debug format
({:?}) producing unreadable output; now returns Unix epoch seconds.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
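The stricter name check described above could look roughly like the following. This is a sketch under stated assumptions, not the code in persistence.rs: the String error type, the error messages, the empty-name check, and the explicit separator check are all guesses, and "dots" is interpreted here as a leading dot.

```rust
use std::path::{Component, Path};

/// Sketch of a script-name validator: the name must be a single, normal
/// path component that does not start with '-' or '.'.
/// (Return type and messages are assumptions for illustration.)
pub fn validate_script_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("script name must not be empty".to_string());
    }
    if name.starts_with('-') || name.starts_with('.') {
        return Err(format!("script name must not start with '-' or '.': {name}"));
    }
    // Reject separators outright so that "a/" or "a\\b" cannot slip through.
    if name.contains('/') || name.contains('\\') {
        return Err(format!("script name must not contain path separators: {name}"));
    }
    // Exactly one component, and it must be a normal one
    // (not a root, prefix, "." or "..").
    let mut components = Path::new(name).components();
    match (components.next(), components.next()) {
        (Some(Component::Normal(_)), None) => Ok(()),
        _ => Err(format!("script name must be a single path component: {name}")),
    }
}
```

Keeping one validator in the persistence module means save_script and read_script cannot drift apart in what they accept.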
The run_script handler already accepted name (load saved script), save_as
(persist after run), and limits (override defaults), but the ToolSpec
input_schema only declared script and the internal __load_from_disk field,
hiding the other params from the LLM.
Replace __load_from_disk with the three user-facing properties and update
the instructions field to reference name instead of the internal marker.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…helpers
ScriptLimits::max_output_bytes was set by effective_limits() but never
checked during execution; extracted_data and yielded_data could grow
without bound.
Add output_bytes: usize field to ScriptExecutor. Replace the inline
Collect/Yield node handling in mod.rs with calls to push_extracted and
push_yielded from data.rs; both helpers now check and accumulate the byte
count, returning ScriptExecutionError::ToolError on overflow.
Remove the #[allow(dead_code)] on the data.rs impl block and delete the
three helpers that were never called (store_variable, variables,
extracted_data).
Fix the doc-comment Expression examples to match the adjacently-tagged
wire format (kind + value).
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
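A minimal sketch of the byte-budget idea behind push_extracted and push_yielded follows. The type and method names come from the commit message, but everything else is assumed: the real payloads are unlikely to be plain String values, the constructor and the charge helper are hypothetical, and the exact error text is a guess based on the 'output size limit exceeded' message mentioned elsewhere in this PR.

```rust
#[derive(Debug, PartialEq)]
pub enum ScriptExecutionError {
    ToolError(String),
}

pub struct ScriptExecutor {
    max_output_bytes: usize, // assumed to come from ScriptLimits
    output_bytes: usize,     // running total across collect and yield
    extracted_data: Vec<String>,
    yielded_data: Vec<String>,
}

impl ScriptExecutor {
    pub fn new(max_output_bytes: usize) -> Self {
        Self {
            max_output_bytes,
            output_bytes: 0,
            extracted_data: Vec::new(),
            yielded_data: Vec::new(),
        }
    }

    /// Hypothetical shared helper: check the budget first, then commit.
    fn charge(&mut self, len: usize) -> Result<(), ScriptExecutionError> {
        let total = self.output_bytes.saturating_add(len);
        if total > self.max_output_bytes {
            return Err(ScriptExecutionError::ToolError(format!(
                "output size limit exceeded: {total} > {} bytes",
                self.max_output_bytes
            )));
        }
        self.output_bytes = total;
        Ok(())
    }

    pub fn push_extracted(&mut self, item: String) -> Result<(), ScriptExecutionError> {
        self.charge(item.len())?;
        self.extracted_data.push(item);
        Ok(())
    }

    pub fn push_yielded(&mut self, item: String) -> Result<(), ScriptExecutionError> {
        self.charge(item.len())?;
        self.yielded_data.push(item);
        Ok(())
    }
}
```

Both helpers draw on the same counter, so collected and yielded data share one budget, and a rejected push leaves the counter and the stored data unchanged.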
…eted spawn race
Two executor-layer safety fixes:
1. ParallelBranchResult now carries errors_caught and output_bytes. After
   all branches complete the parent merges both counters back, so TryCatch
   nodes inside parallel branches correctly contribute to the overall
   errors_caught tally and the output byte budget.
2. Remove the cleanup_completed() call from spawn_script(). It was removing
   finished scripts from the map before the caller could retrieve results
   via wait_for_scripts, producing spurious NotFound errors for
   fast-completing scripts. check_can_spawn() already counts only running
   (non-finished) handles so the concurrent cap is unaffected.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
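The counter merge in the first fix can be sketched as below. ParallelBranchResult, errors_caught, and output_bytes are names from the commit message; ParentCounters, run_parallel, the use of std::thread::scope (the real executor most likely runs on Tokio), and the post-merge budget check are assumptions for illustration.

```rust
use std::thread;

#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct ParallelBranchResult {
    pub errors_caught: usize,
    pub output_bytes: usize,
}

/// Hypothetical stand-in for the parent executor's counters.
#[derive(Debug, Default, PartialEq)]
pub struct ParentCounters {
    pub errors_caught: usize,
    pub output_bytes: usize,
}

/// Run every branch to completion, then fold each branch's counters back
/// into the parent. The budget check after merging is an assumption.
pub fn run_parallel<F>(
    parent: &mut ParentCounters,
    max_output_bytes: usize,
    branches: Vec<F>,
) -> Result<(), String>
where
    F: FnOnce() -> ParallelBranchResult + Send,
{
    let results: Vec<ParallelBranchResult> = thread::scope(|s| {
        // Spawn all branches before joining any of them.
        let handles: Vec<_> = branches.into_iter().map(|b| s.spawn(b)).collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("branch panicked"))
            .collect()
    });

    for r in results {
        parent.errors_caught += r.errors_caught;
        parent.output_bytes = parent.output_bytes.saturating_add(r.output_bytes);
    }
    if parent.output_bytes > max_output_bytes {
        return Err("output size limit exceeded".to_string());
    }
    Ok(())
}
```

Without the merge step, errors caught by TryCatch nodes inside a branch and bytes produced there would be invisible to the parent's tally and budget.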
The wait_for_subagents bullet inside section_parallel_exploration had extra
leading spaces (9) vs its sibling bullets (7), producing slightly misaligned
output. Normalize to the consistent 7-space indent.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…mpleted race
Two tests that directly exercise the fix behaviours that cannot be driven
through the MCP server in a headless shell (run_script requires an active
Playwright browser):
collect_over_output_byte_limit_fails / yield_over_output_byte_limit_fails
  Verify that push_extracted / push_yielded return ScriptStatus::Failed with
  an 'output size limit exceeded' message when the accumulated output
  exceeds ScriptLimits::max_output_bytes.
completed_script_survives_subsequent_spawn_check
  Pre-populates ScriptManager with a finished entry, calls check_can_spawn
  (formerly spawn_script would call cleanup_completed here), and asserts the
  completed entry is still retrievable via get_status, proving the cleanup
  race is gone.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
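The behaviour that completed_script_survives_subsequent_spawn_check verifies can be modelled with a small std-only sketch. ScriptManager, check_can_spawn, spawn_script, and get_status are names from the commit messages; the u64 ids, the Completed variant, mark_completed, take_result (standing in for retrieval via wait_for_scripts), and the String errors are assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScriptStatus {
    Running,
    Completed, // assumed variant name
}

pub struct ScriptManager {
    max_concurrent: usize,
    next_id: u64,
    scripts: HashMap<u64, ScriptStatus>,
}

impl ScriptManager {
    pub fn new(max_concurrent: usize) -> Self {
        Self { max_concurrent, next_id: 0, scripts: HashMap::new() }
    }

    /// Only running scripts count against the cap, so finished entries
    /// can stay in the map without blocking new spawns.
    pub fn check_can_spawn(&self) -> Result<(), String> {
        let running = self
            .scripts
            .values()
            .filter(|s| **s == ScriptStatus::Running)
            .count();
        if running >= self.max_concurrent {
            return Err("too many running scripts".to_string());
        }
        Ok(())
    }

    /// Note what is absent: no cleanup of finished entries here.
    pub fn spawn_script(&mut self) -> Result<u64, String> {
        self.check_can_spawn()?;
        let id = self.next_id;
        self.next_id += 1;
        self.scripts.insert(id, ScriptStatus::Running);
        Ok(id)
    }

    /// Hypothetical hook called when a script finishes.
    pub fn mark_completed(&mut self, id: u64) {
        if let Some(status) = self.scripts.get_mut(&id) {
            *status = ScriptStatus::Completed;
        }
    }

    pub fn get_status(&self, id: u64) -> Option<ScriptStatus> {
        self.scripts.get(&id).copied()
    }

    /// Entries leave the map only when the caller collects them.
    pub fn take_result(&mut self, id: u64) -> Option<ScriptStatus> {
        self.scripts.remove(&id)
    }
}
```

In this model a script that finishes before the next spawn_script call is still visible to get_status afterwards, which is the property the removed cleanup_completed() call used to break.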
…me context
spawn_script calls tokio::task::spawn internally but was being invoked
from synchronous code outside any block_on, causing an immediate panic:
'there is no reactor running, must be called from the context of
a Tokio 1.x runtime'
Wrap the call in rt.block_on(async { ... }) to enter the runtime
context before spawning, matching the pattern already used by
wait_for_scripts.
Add stdio integration test (stdio_server_run_script_returns_script_id_and_survives)
that drives the full run_script -> wait_for_scripts flow through the
MCP binary, verifying the server stays alive and returns correct
extracted_data / yielded_data.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…ix epoch
format_system_time previously returned a bare Unix epoch integer
(e.g. '1780991949'). Now uses time::OffsetDateTime + Rfc3339 to return a
human-readable UTC timestamp (e.g. '2026-06-09T13:39:09Z').
Uses i64::try_from to avoid the clippy::cast_possible_wrap lint on
u64->i64 conversion.
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
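For readers curious what the epoch-to-RFC 3339 conversion involves, here is a std-only sketch. It is not the PR's implementation, which relies on the time crate; format_epoch_rfc3339 and civil_from_days are hypothetical names, and the date arithmetic is the widely used days-to-civil algorithm for the proleptic Gregorian calendar. It does mirror the i64::try_from conversion mentioned above.

```rust
use std::convert::TryFrom;

/// Convert days since 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month index, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Format Unix epoch seconds as an RFC 3339 UTC timestamp.
/// Returns None if the value does not fit in an i64.
pub fn format_epoch_rfc3339(epoch_secs: u64) -> Option<String> {
    // Checked conversion instead of an `as` cast, which could wrap.
    let secs = i64::try_from(epoch_secs).ok()?;
    let (year, month, day) = civil_from_days(secs.div_euclid(86_400));
    let rem = secs.rem_euclid(86_400);
    Some(format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        year,
        month,
        day,
        rem / 3_600,
        rem % 3_600 / 60,
        rem % 60
    ))
}
```

In practice a maintained crate such as time is the better choice, since it also covers offsets, sub-second precision, and parsing.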
… test Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Summary
Adds a new Autonomous Script Protocol that lets the LLM (or an external MCP client) execute deterministic multi-step browser automation without per-step LLM round-trips — dramatically faster and cheaper for repetitive page patterns.
What's new
New crate: crates/script/
- Types (ScriptDefinition, ScriptNode, Expression)
- Parsing and validation (parse_script, validate_script)
- Persistence (save_script_to_disk, load_script_from_disk, list_scripts_on_disk)

7 new tools (exposed via agent loop and MCP server):
- run_script: returns a script_id immediately
- wait_for_scripts: returns ScriptResult
- script_status
- cancel_script
- save_script: ~/.acrawl/scripts/<name>.json
- list_scripts
- read_script

Script nodes: tool_call, assign, collect, yield, for_loop, for_each, while_loop, if_else, try_catch, parallel

Execution engine (crates/agent/src/script_executor/):
- errors_caught and output_bytes propagated back from parallel branches to parent

Bugs fixed (post code-review)
- Expression serde tag changed from internally-tagged to adjacently-tagged (#[serde(tag="kind", content="value")]); Literal, Variable, JsEval now deserialize correctly from JSON
- run_script ToolSpec schema now exposes name, save_as, limits; removed internal __load_from_disk
- max_output_bytes enforced via push_extracted / push_yielded helpers
- cleanup_completed() removed from spawn_script; completed scripts survive until explicitly retrieved via wait_for_scripts
- validate_script_name consolidated into persistence.rs; rejects leading dash, dots, path traversal
- list_scripts modified_at now returns ISO 8601 UTC (e.g. 2026-06-09T13:39:09Z)
- spawn_script in MCP server wrapped in rt.block_on; it was panicking with "no reactor running" and killing the server process

Test coverage
- parse_script_expression_round_trip, covering all 5 Expression variants through a serde round-trip
- collect_over_output_byte_limit_fails, yield_over_output_byte_limit_fails
- completed_script_survives_subsequent_spawn_check
- stdio_server_run_script_returns_script_id_and_survives

E2E verified (live MCP session)
All 7 tools exercised end-to-end against the running MCP server: