Deterministic workflow orchestration for command steps, provider-driven agent loops, and reusable design -> plan -> implementation stacks.
This repo is built around three ideas:
- workflows are authored in a strict YAML DSL;
- runtime state, outputs, artifacts, and routing are contract-checked;
- every run leaves filesystem-native evidence under `.orchestrate/runs/<run_id>/` so it can be inspected, resumed, and reported.
The project is useful when an agent workflow needs more structure than an ad hoc shell script: typed inputs and outputs, reusable subworkflows, bounded review/fix loops, artifact lineage, resumable state, and local observability.
This project is licensed under the Functional Source License, Version 1.1,
MIT Future License (FSL-1.1-MIT). See LICENSE.md.
| Goal | Read or run |
|---|---|
| Understand the repo map | docs/index.md |
| Learn the execution model | docs/orchestration_start_here.md |
| Author or revise workflow YAML | docs/workflow_drafting_guide.md |
| Check the normative DSL contract | specs/index.md and specs/dsl.md |
| Find runnable examples | workflows/README.md |
| Compare the Lisp MVP to YAML | docs/workflow_lisp_mvp_comparison.md |
If you are new to the repo, first validate the call-based design -> plan ->
implementation example. It is self-contained and does not execute provider
commands when run with --dry-run.
Requirements:

- Python 3.11+
- `bash`
- a checkout of this repository

Optional for real provider execution:

- the `codex` CLI available in your shell;
- whatever authentication your `codex exec` setup requires.

Install from a checkout:

    git clone <repo-url> agent-orchestration
    cd agent-orchestration
    python3.11 -m venv .venv
    source .venv/bin/activate
    pip install -e ".[dev]"

Sanity check:

    python -m orchestrator --help

The CLI program name is `orchestrate`, but the examples use `python -m orchestrator` so they work directly from a checkout.
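The requirements listed above can be checked from Python before creating the virtual environment. The following is a minimal sketch and not part of the repository: the function names `check_prerequisites` and `missing_required` are hypothetical, and the check only confirms that `bash` and `codex` are on `PATH`, not that `codex` is authenticated.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 11), required=("bash",), optional=("codex",)):
    """Return a dict mapping each prerequisite name to True or False.

    Hypothetical helper: the tool names mirror the requirements list,
    but this function is not provided by the orchestrator package.
    """
    results = {"python": sys.version_info >= min_python}
    for tool in (*required, *optional):
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


def missing_required(results, required=("python", "bash")):
    """List the required prerequisites that were not found."""
    return [name for name in required if not results.get(name, False)]
```

Calling `missing_required(check_prerequisites())` returns an empty list when Python and `bash` are usable. `codex` is reported in the results but treated as optional, matching the list above.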

Validate the current modular design -> plan -> implementation stack:

    python -m orchestrator run \
      workflows/examples/design_plan_impl_review_stack_v2_call.yaml \
      --dry-run

Expected result:
- the workflow loads successfully;
- imported subworkflows validate;
- typed inputs and outputs validate;
- no provider command is executed.
This example exercises the current reusable stack:
- top-level workflow: `workflows/examples/design_plan_impl_review_stack_v2_call.yaml`
- design phase: `workflows/library/tracked_design_phase.yaml`
- plan phase: `workflows/library/tracked_plan_phase.yaml`
- implementation phase: `workflows/library/design_plan_impl_implementation_phase.yaml`
- input brief: `workflows/examples/inputs/provider_session_resume_brief.md`

Only run provider workflows after `--dry-run` succeeds and your provider CLI works in the same shell.

    python -m orchestrator run \
      workflows/examples/design_plan_impl_review_stack_v2_call.yaml \
      --stream-output

That run will:
- read the example brief;
- draft and review a design;
- draft and review an execution plan;
- execute implementation work and run the implementation review/fix loop;
- write run state and logs under `.orchestrate/runs/<run_id>/`;
- write workflow artifacts under the paths declared by the workflow.
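The rule stated above (dry-run first, provider run second) can be scripted. This is a sketch under one assumption the source does not state: that `python -m orchestrator run ... --dry-run` exits with a non-zero status when validation fails. The helper names `build_run_command` and `validate_then_run` are hypothetical; only the `run` subcommand and the `--dry-run` and `--stream-output` flags come from the commands shown in this README.

```python
import subprocess
import sys

WORKFLOW = "workflows/examples/design_plan_impl_review_stack_v2_call.yaml"


def build_run_command(workflow, *extra_flags):
    """Build the argv for `python -m orchestrator run <workflow> [flags]`."""
    return [sys.executable, "-m", "orchestrator", "run", workflow, *extra_flags]


def validate_then_run(workflow=WORKFLOW, runner=subprocess.run):
    """Run the dry-run first; start the provider run only if it succeeded.

    ASSUMPTION: a failed dry-run is signalled by a non-zero exit code.
    `runner` is injectable so the gating logic can be exercised without
    invoking the real CLI.
    """
    dry = runner(build_run_command(workflow, "--dry-run"))
    if dry.returncode != 0:
        # Validation failed: do not execute any provider command.
        return dry.returncode
    live = runner(build_run_command(workflow, "--stream-output"))
    return live.returncode
```

Passing a fake `runner` keeps the gate testable without a provider CLI; with the default `subprocess.run`, both commands inherit the current shell environment.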

If the run stops partway through, resume it:

    python -m orchestrator resume <run_id> --stream-output

The first places to inspect after a run are:

- `.orchestrate/runs/<run_id>/state.json`: step status, artifacts, and errors;
- `.orchestrate/runs/<run_id>/logs/<Step>.prompt.txt`: composed provider prompt;
- `.orchestrate/runs/<run_id>/logs/<Step>.stdout` and `.stderr`: command or provider execution traces.
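The locations listed above can be resolved programmatically. The sketch below only builds paths from the documented layout and loads `state.json` as generic JSON; it assumes nothing about the file's schema. The function names are hypothetical and not part of the orchestrator package, and the `.stderr` file is assumed to share the `<Step>` prefix with `.stdout`.

```python
import json
from pathlib import Path


def run_dir(run_id, workspace="."):
    """Directory holding the evidence for one run."""
    return Path(workspace) / ".orchestrate" / "runs" / run_id


def evidence_paths(run_id, step, workspace="."):
    """Map each inspection target for a step to its expected path."""
    base = run_dir(run_id, workspace)
    logs = base / "logs"
    return {
        "state": base / "state.json",
        "prompt": logs / f"{step}.prompt.txt",
        "stdout": logs / f"{step}.stdout",
        # ASSUMPTION: stderr uses the same <Step> prefix as stdout.
        "stderr": logs / f"{step}.stderr",
    }


def load_state(run_id, workspace="."):
    """Parse state.json without assuming anything about its keys."""
    with open(run_dir(run_id, workspace) / "state.json", encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `evidence_paths("<run_id>", "<Step>")["prompt"]` gives the prompt file to open first when a provider step misbehaves; the paths are not checked for existence, so a missing file still has to be handled by the caller.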

Generate a readable run report:

    python -m orchestrator report --run-id <run_id> --format md

Serve the local read-only dashboard:

    python -m orchestrator dashboard --workspace "$(pwd)"

For long workflows, optional summaries can make run review easier:

    python -m orchestrator run \
      workflows/examples/design_plan_impl_review_stack_v2_call.yaml \
      --stream-output \
      --step-summaries \
      --summary-profile phase-performance

Summary files are observability artifacts only. They must not drive workflow routing or recovery decisions.
For headless email alerts when runs complete, fail, crash, or stall across
multiple workspaces, see docs/workflow_monitoring.md.
| Path | Purpose |
|---|---|
| `orchestrator/` | Loader, validator, executor, CLI, dashboard, observability, and experimental Workflow Lisp compiler code. |
| `specs/` | Normative DSL, CLI, state, provider, observability, and acceptance contracts. |
| `docs/` | Informative guides, design notes, runbooks, and implementation plans. |
| `workflows/examples/` | Runnable examples and validation fixtures. |
| `workflows/library/` | Reusable imported subworkflows and bundled prompt assets. |
| `prompts/` | Shared prompt catalog. |
| `tests/` | Unit, runtime, loader, workflow, and fixture tests. |

Important entry points:

- `docs/index.md`: documentation hub and recommended read order;
- `workflows/README.md`: workflow catalog;
- `prompts/README.md`: prompt catalog;
- `tests/README.md`: test and smoke-check guidance;
- `docs/workflow_lisp_mvp_comparison.md`: side-by-side Workflow Lisp MVP vs YAML comparison.

Validate a workflow without executing steps:

    python -m orchestrator run workflows/examples/design_plan_impl_review_stack_v2_call.yaml --dry-run

Run with live provider output:

    python -m orchestrator run workflows/examples/design_plan_impl_review_stack_v2_call.yaml --stream-output

Resume an existing run:

    python -m orchestrator resume <run_id> --stream-output

Render the latest run report:

    python -m orchestrator report --format md

Serve the dashboard:

    python -m orchestrator dashboard --workspace "$(pwd)"

Validate an observability-focused example:

    python -m orchestrator run workflows/examples/observability_runtime_config_demo.yaml --debug --step-summaries

Run the default non-e2e test loop:

    pytest -m "not e2e" -v

The repo contains workflows across multiple DSL versions, including older 1.x examples and newer 2.x structured-control, reusable-call, provider-session, managed-job, and v2.14 materialization/variant examples.
Authoritative versioning details live in:

If agent behavior looks wrong, inspect the composed provider prompt before changing workflow logic:

    less .orchestrate/runs/<run_id>/logs/<Step>.prompt.txt

If routing or artifact lineage looks wrong, inspect `state.json` and the workflow's declared `outputs`, `publishes`, `consumes`, `expected_outputs`, and `output_bundle` contracts before changing prompts.
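When a workflow file is long, a quick way to find the contract declarations named above is a plain text scan. The sketch below is deliberately not a YAML parser and knows nothing about the DSL: it only reports lines whose key is one of the five names from the paragraph above. The function name `find_contract_lines` is hypothetical, and the sample fields used in the usage note are illustrative rather than taken from the DSL spec.

```python
import re

# Key names taken from the troubleshooting paragraph above.
CONTRACT_KEYS = ("outputs", "publishes", "consumes", "expected_outputs", "output_bundle")

# Match an optional list dash, then one of the keys, then a colon.
_KEY_RE = re.compile(r"^\s*(?:-\s+)?(" + "|".join(CONTRACT_KEYS) + r")\s*:")


def find_contract_lines(yaml_text):
    """Return (line_number, key, stripped_line) for each contract key found.

    This is a line-based scan, so keys inside block scalars or comments
    that happen to look like mappings would also be reported.
    """
    hits = []
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        match = _KEY_RE.match(line)
        if match:
            hits.append((lineno, match.group(1), line.strip()))
    return hits
```

A typical use is `find_contract_lines(Path(workflow_file).read_text())` (with `pathlib.Path` imported), then comparing the reported declarations with what `state.json` recorded for the same steps.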