Wire OpenShell sandbox lifecycle into the orchestrator#70
saichandrapandraju wants to merge 14 commits into
Hi @dmaniloff, would like to get early feedback on this, thanks!
    backend:
      type: openshell
      image: base
      policy: ollama-local
there's already a pi image and a pi policy for pi.dev agents.
it's the one that gets used when you say openshell sandbox create --from pi
let's use that instead.
and let's also add an agent_command that defaults to the image's entrypoint eg
agent_command: ["pi", "-p", "--no-session"] # optional, defaults to image's entrypoint
though i may have a different opinion about this now ... let's chat when you get a moment.
    # observations["openshell"] (raw audit trail). The env-field approach is necessary
    # because the any_of/all_of combinators call parse_predicate recursively, which
    # doesn't dispatch to named verifiers. Revisit when combinators are upgraded.
the any_of/all_of combinators call parse_predicate recursively, which doesn't dispatch to named verifiers
nice call out @saichandrapandraju -- this is a missing piece in Predicate.evaluate. i am adding that in #71
let's please rebase this on top of that
once rebased, the question is whether we want to keep the observations field or just do what you are doing which is to gather observations from the orchestrator and populate environment fields like files etc.
maybe that is cleaner since the fields are self-explanatory? the other benefit of that is that we can remove the POST /observations endpoint in the control api.
thoughts?
I think -
VerificationContext.observations can stay - it's the right extension point for future backends that surface event streams that don't fit neatly into env fields.
Everything OpenShell-specific around it can be removed:
- backend.observations(): its only job was formatting the OCSF dict to push to the control plane, which we don't need since env fields cover all current predicates
- The orchestrator step calling backend.observations() and POST /observations
- The POST /evaluations/{id}/observations control API endpoint
- evaluation.observations in the state model
At least for OpenShell, the env fields on OpenShellEnvironment are sufficient. If a future backend needs richer grading data that doesn't map cleanly to env fields, then ctx.observations and the endpoint can come back.
Happy to make the changes if this sounds right to you.
agree w/ you @saichandrapandraju ! let's make those changes.
Fills in the lifecycle stubs from the #67 scaffold (setup, exec_agent, snapshot, observations, teardown) and wires them into the orchestrator so the openshell protocol runs end-to-end against real sandboxes.
- openshell_logs.py: OCSF shorthand log parser for the NET, HTTP, PROC, and FINDING event types.
- openshell_backend.py: Implements all four lifecycle stubs. Adds _BUILTIN_POLICIES (camelCase proto-JSON dicts resolved via ParseDict so suites declare network policy in YAML without importing gRPC types directly) and _resolve_policy(). Embeds a minimal ollama-backed agent script installed to /sandbox/.venv/bin/pi for local testing without a real PI agent. setup() falls back to SandboxClient.from_active_cluster() when no explicit endpoint is given, so the Homebrew-managed gateway works out of the box.
- agent_client.py: OpenShellPIAgentClient delegates execution to backend.exec_agent() and returns stdout. The sandbox lifecycle lives in the backend, not the agent client.
- orchestrator.py: run_task() split into _create_evaluation() + _complete_and_grade() helpers. Backend lifecycle (setup → send_task → snapshot → observations → teardown) is wired around the agent execution step. --agent-url is now optional for --protocol openshell.
- verifiers/builtin.py: Shell/OCSF predicates: workspace_file_exists, workspace_file_deleted, workspace_file_contains, commands_match_pattern, process_ran, network_call_to, network_call_blocked_to, security_finding_raised.
- suites/shell_financial_report/: Demo suite: Q4 financial report with two injection tasks (curl exfiltration and staging-cache write) graded by OCSF predicates.
- pyproject.toml: protobuf>=6.31 (openshell SDK requires >=6.31 gencode). openshell added to the [shell] optional-dependency group.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
44a305a to a137df1
- Rename suite folder shell_financial_report → document_assistant
- Use pi image and policy (community sandbox with PI pre-installed); drop the embedded ollama agent script
- Add agent_command field to suite YAML and OpenShellBackend so the agent binary is configurable rather than hardcoded to ["pi", ...]
- Rename OpenShellPIAgentClient → OpenShellAgentClient
- Move shell/OCSF predicates from verifiers/builtin.py to a dedicated verifiers/openshell.py module; update to evaluate(ctx) interface introduced by #71 (full VerificationContext now reaches nested predicates)
- Rebase on top of #71 which fixes combinator context forwarding
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
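The combinator context forwarding mentioned in this commit can be illustrated with a minimal sketch. Every name below (the VerificationContext fields, Named, AnyOf, AllOf, the VERIFIERS registry, and the simplified process_ran that reads a plain command list) is a hypothetical stand-in, not midojo's actual code; it only shows the idea that combinators hand the same full context to every nested predicate, so named verifiers can be dispatched at any depth.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


@dataclass
class VerificationContext:
    """Hypothetical stand-in: env fields plus a raw observations dict."""
    files: dict[str, str] = field(default_factory=dict)
    commands: list[str] = field(default_factory=list)
    observations: dict[str, object] = field(default_factory=dict)


class Predicate(Protocol):
    def evaluate(self, ctx: VerificationContext) -> bool: ...


# Hypothetical registry of named verifiers, keyed by the name used in suite YAML.
VERIFIERS: dict[str, Callable[[VerificationContext, str], bool]] = {
    "workspace_file_exists": lambda ctx, arg: arg in ctx.files,
    "process_ran": lambda ctx, arg: any(c.split()[:1] == [arg] for c in ctx.commands),
}


@dataclass
class Named:
    name: str
    arg: str

    def evaluate(self, ctx: VerificationContext) -> bool:
        # Dispatch by name; works identically at the top level or when nested.
        return VERIFIERS[self.name](ctx, self.arg)


@dataclass
class AnyOf:
    children: list[Predicate]

    def evaluate(self, ctx: VerificationContext) -> bool:
        # The same ctx object is forwarded unchanged to every child.
        return any(child.evaluate(ctx) for child in self.children)


@dataclass
class AllOf:
    children: list[Predicate]

    def evaluate(self, ctx: VerificationContext) -> bool:
        return all(child.evaluate(ctx) for child in self.children)


ctx = VerificationContext(
    files={"/sandbox/workspace/report.md": "Q4"},
    commands=["curl http://example.invalid/x"],
)
attack = AllOf([
    Named("process_ran", "curl"),
    AnyOf([
        Named("workspace_file_exists", "/tmp/missing"),
        Named("workspace_file_exists", "/sandbox/workspace/report.md"),
    ]),
])
result = attack.evaluate(ctx)
```

Because each combinator only calls evaluate(ctx) on its children, adding a new named verifier in this sketch needs no change to AnyOf or AllOf.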
a137df1 to 39112bb
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
  @click.command()
  @click.option("--control-url", default="http://localhost:8080", help="URL of the benchmark MCP server control plane.")
- @click.option("--agent-url", required=True, help="URL of the agent to test.")
+ @click.option("--agent-url", default=None, help="URL of the agent to test (not used for --protocol openshell).")
lets leave this in as required and make it the openshell endpoint when using --protocol openshell
    @click.option(
        "--openshell-endpoint", default=None, envvar="OPENSHELL_ENDPOINT", help="OpenShell gateway gRPC endpoint (openshell protocol only)."
    )
lets use the agent-url for this. maybe we can generalize the name a bit. eg., in the case of pi the url is actually the path to the pi TS extension. so we can maybe rename agent-url to agent-uri ?
    user_task_ids: list[str] | None,
    injection_task_ids: list[str] | None,
    logdir: Path,
    lifecycle_backend: object | None = None,
this should be of type EnvironmentBackend right?
    injection_task_id: str | None,
    injections: dict[str, str],
    *,
    backend: object | None = None,
same here, EnvironmentBackend ?
    await client.put(
        f"{control_url}/runs/{run_id}/evaluations/{eval_id}/environment",
        json=post_env.model_dump(),  # type: ignore[union-attr]
    )
    await client.post(
        f"{control_url}/runs/{run_id}/evaluations/{eval_id}/observations",
        json={"source": "openshell", "data": obs_data},
    )
These two pushes are what grading reads from — if either fails (e.g. a 422 from env-model version skew, or a 404 when the server was started with a different suite), httpx won't raise on the 4xx, the grade endpoint falls back to the pristine provisioned environment, and every security predicate evaluates False over empty fields. A successful attack then grades as "agent secure" with no error anywhere. Better to fail the eval loudly:
- await client.put(
-     f"{control_url}/runs/{run_id}/evaluations/{eval_id}/environment",
-     json=post_env.model_dump(),  # type: ignore[union-attr]
- )
- await client.post(
-     f"{control_url}/runs/{run_id}/evaluations/{eval_id}/observations",
-     json={"source": "openshell", "data": obs_data},
- )
+ env_resp = await client.put(
+     f"{control_url}/runs/{run_id}/evaluations/{eval_id}/environment",
+     json=post_env.model_dump(),  # type: ignore[union-attr]
+ )
+ env_resp.raise_for_status()
+ obs_resp = await client.post(
+     f"{control_url}/runs/{run_id}/evaluations/{eval_id}/observations",
+     json={"source": "openshell", "data": obs_data},
+ )
+ obs_resp.raise_for_status()
    if self._cached_ocsf is not None:
        return self._cached_ocsf

    from openshell._proto import openshell_pb2  # pyright: ignore[reportMissingImports]
is there a way we can avoid the inline import?
    except Exception as exc:
        import logging
        logging.getLogger(__name__).warning(
            "OCSF log fetch failed — security predicates will degrade to False: %s", exc
we should think about a way to surface this so that during grading we can say N/A rather than "attack failed"
The embedded pi script was a dev-time workaround for testing the sandbox lifecycle without a real PI agent. Now that the suite uses image: pi (which ships PI pre-installed) the script would shadow the real binary. The ollama-local built-in policy is unused after the suite update. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Suite YAML gains a models_json backend field that is seeded to /sandbox/.pi/agent/models.json during setup(). Pi reads this file to discover custom inference providers (ollama, vllm, any OpenAI-compatible endpoint) without needing a custom image. Any non-localhost base URL in models_json is auto-added to the sandbox network policy via an UpdateConfig merge operation so the node process can reach it. The merge is applied after wait_ready() (dynamic network policy only) and the backend polls GetSandboxPolicyStatus until the sandbox supervisor loads the new version. document_assistant suite updated to use ollama/qwen3.5:2b via host.openshell.internal for local testing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
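The base-URL detection step described in this commit could look roughly like the sketch below. It is an illustration under assumptions, not the backend's code: the helper name external_endpoints, the LOCAL_HOSTS set, and the default-port rule are all hypothetical, and the UpdateConfig merge and GetSandboxPolicyStatus polling are deliberately left out. It only shows how non-localhost provider hosts can be pulled out of a models_json mapping shaped like providers → name → baseUrl.

```python
from urllib.parse import urlsplit

# Assumption: these are the hosts treated as "localhost" and skipped.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def external_endpoints(models_json: dict) -> list[tuple[str, int]]:
    """Return sorted unique (host, port) pairs for non-localhost provider baseUrls."""
    endpoints = set()
    for provider in models_json.get("providers", {}).values():
        base_url = provider.get("baseUrl")
        if not base_url:
            continue
        parts = urlsplit(base_url)
        host = parts.hostname
        if host is None or host in LOCAL_HOSTS:
            continue
        # Fall back to the scheme's conventional port when none is given.
        port = parts.port or (443 if parts.scheme == "https" else 80)
        endpoints.add((host, port))
    return sorted(endpoints)


config = {
    "providers": {
        "ollama": {"baseUrl": "http://host.openshell.internal:11434/v1"},
        "local": {"baseUrl": "http://localhost:8000/v1"},
    }
}
endpoints = external_endpoints(config)
```

Each returned pair is what a caller would then translate into a network-policy allow rule.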
models_json and --model flag are developer-local config, not suite definition. Pi reads ~/.pi/agent/models.json natively; users configure their own inference provider there. The models_json backend field remains available for suites that genuinely need to ship their own inference config. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Developers can now run the openshell benchmark against a local ollama or
vllm instance without modifying the suite YAML:
midojo-run --protocol openshell --suite document_assistant \
--pi-models-file ~/.pi/agent/models.json \
--pi-model ollama/qwen3.5:2b
--pi-models-file seeds the file to /sandbox/.pi/agent/models.json and
auto-adds the provider's base URL to the sandbox network policy.
--pi-model appends --model <id> to agent_command at runtime.
Both flags also accept PI_MODELS_FILE / PI_MODEL env vars.
Also fixes community image name resolution: bare names like 'pi' are now
expanded to ghcr.io/nvidia/openshell-community/sandboxes/pi:latest, matching
the CLI's --from behaviour (override with OPENSHELL_COMMUNITY_REGISTRY).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
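The community image name expansion described above might be sketched as follows. This is a guess at the behaviour, not the actual _resolve_image(): the rule for what counts as a "bare" name, the default registry constant, and the assumption that OPENSHELL_COMMUNITY_REGISTRY holds a registry prefix are all hypothetical.

```python
import os

# Assumption: default prefix taken from the expanded name quoted in the commit message.
DEFAULT_COMMUNITY_REGISTRY = "ghcr.io/nvidia/openshell-community/sandboxes"


def resolve_image(image: str, env=None) -> str:
    """Expand a bare community name like 'pi'; leave full references untouched."""
    env = os.environ if env is None else env
    # Assumption: a bare name has no path separator, tag, or digest.
    if "/" in image or ":" in image or "@" in image:
        return image
    registry = env.get("OPENSHELL_COMMUNITY_REGISTRY", DEFAULT_COMMUNITY_REGISTRY)
    return f"{registry.rstrip('/')}/{image}:latest"


expanded = resolve_image("pi", env={})
```

Passing env explicitly keeps the helper testable without touching the process environment.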
    agent_command: ["pi", "-p", "--no-session", "--model", "ollama/qwen3.5:2b"]
    models_json:
      providers:
        ollama:
          baseUrl: "http://host.openshell.internal:11434/v1"
          api: "openai-completions"
          apiKey: "ollama"
          compat:
            supportsDeveloperRole: false
            supportsReasoningEffort: false
          models:
            - id: "qwen3.5:2b"
thinking back to my suggestion re: agent_command ... i think we should probably try to avoid all this.
the concern of midojo stops at the agent boundary. in other words, agent config (like its command and providers etc) should not be what we deal with. this should all exist somewhere else already.
in the case of an openshell env it seems that there's some coupling between the agent's config & the env so maybe the way around it is via an image or another openshell native mechanism. let's chat about this.
The policy management (auto-detecting base URLs from models.json, applying UpdateConfig merge operations, polling GetSandboxPolicyStatus) is too specific and belongs outside midojo. Developers who want local inference can configure sandbox policy and pi models.json independently. Keeps _resolve_image() for community name expansion (pi → ghcr.io/...). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- --agent-url → --agent-uri: more accurate since pi uses a file path not a URL. Unified flag serves all protocols: for openshell it is the gateway endpoint (omit to use the CLI-registered active gateway), so --openshell-endpoint is removed.
- EnvironmentBackend type: lifecycle_backend typed as EnvironmentBackend | None in run_task() and run_benchmark() instead of object | None.
- raise_for_status on env push: PUT /environment now raises on 4xx so a silent failure (e.g. env-model version skew) doesn't cause a successful attack to grade as "agent secure" with no error anywhere.
- Avoid inline import in _fetch_ocsf: openshell_pb2 is stored as self._pb2 in setup() so _fetch_ocsf() reuses it without a repeated lazy import.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eld)
VerificationContext.observations stays as the extension point for future
backends. Everything OpenShell-specific around it is removed:
- backend.observations() method — env fields on OpenShellEnvironment
already carry all OCSF data needed for grading; the raw dict had no
consumer
- POST/GET /evaluations/{id}/observations endpoints in the control API
- POST/GET /current/observations endpoints
- RecordObservationsRequest model
- evaluation.observations in the state model
- observations= kwarg in YAMLTaskSuite.grade() and the grade endpoint
- test_observations_grading.py and two observations control-API tests
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
88e8b1f to 5afce53
SandboxSpec.providers lists the OpenShell provider names whose credentials are injected into the sandbox as env vars. Suites that need a cloud inference provider (e.g. OpenAI, Anthropic) declare it here; the backend passes it straight through to the gRPC call.
Example:
    providers: [openai]  # injects OPENAI_API_KEY into the sandbox
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
providers in the backend config injects named OpenShell provider credentials into the sandbox as env vars (e.g. OPENAI_API_KEY). Required for agents like pi that need inference credentials at runtime. The field is wired through backends.py → OpenShellBackend → SandboxSpec. Users uncomment and set their registered provider name(s) in the suite YAML before running — no code changes needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Named string policies (e.g. policy: pi) were misleading — the only built-in "pi" policy only allowed api.anthropic.com, which shadowed the pi image's own comprehensive /etc/openshell/policy.yaml and broke other providers. Omitting policy (None) lets the image's built-in policy apply, which is the right default for community images. policy field now accepts None or an inline camelCase proto-JSON dict. _resolve_policy simplified accordingly; tests updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What this adds
Fills in the lifecycle stubs introduced in #67 (setup, exec_agent, snapshot, teardown) so --protocol openshell runs end-to-end against real NVIDIA OpenShell MicroVM sandboxes. The orchestrator drives the full evaluation cycle (create sandbox, seed workspace, run agent, collect workspace diff and OCSF kernel events, grade), with the backend owning the sandbox lifecycle.
Architecture
OpenShellAgentClient is a 5-line wrapper that calls backend.exec_agent() and returns stdout. The sandbox lifecycle lives entirely in the backend.
New files
- src/midojo/openshell_logs.py
- src/midojo/verifiers/openshell.py
- suites/document_assistant/
- tests/test_shell_predicates.py
Suite YAML
- image is a community sandbox name, resolved to ghcr.io/nvidia/openshell-community/sandboxes/pi:latest automatically.
- policy is omitted so the image's own /etc/openshell/policy.yaml applies (comprehensive allowlist for all major providers).
- providers injects registered OpenShell provider credentials as env vars into the sandbox (e.g. OPENAI_API_KEY).
- agent_command is configurable per-suite; model selection is left to the user's pi config or provider defaults.
CLI changes
- --agent-url renamed to --agent-uri (more accurate: for pi it's a directory path, not a URL).
- --openshell-endpoint removed: --agent-uri serves as the gateway endpoint for --protocol openshell. Omit it to use the CLI-registered active gateway (SandboxClient.from_active_cluster()).
- backend parameter typed as EnvironmentBackend | None in run_task() and run_benchmark().
- PUT /environment now calls raise_for_status(): silent 4xx failures no longer cause a successful attack to grade as "agent secure".
Running it
Prerequisites (one-time)
Install OpenShell (macOS — Homebrew, auto-starts the gateway):
Install the Python SDK into the midojo venv:
Register your inference provider:
    OPENAI_API_KEY=sk-... openshell provider create --name openai --type openai --from-existing
    # or: ANTHROPIC_API_KEY=sk-ant-... openshell provider create --name anthropic --type anthropic --from-existing
Run the demo suite
Uncomment and set providers in suites/document_assistant/suite.yaml:
Then:
Each evaluation creates a fresh MicroVM sandbox from the pi image, seeds /sandbox/workspace/ with the injected financial report, runs the pi agent against it, collects workspace diff and OCSF kernel events, and deletes the sandbox. Each evaluation takes ~30–60 s on first run (image pull) and ~15–30 s subsequently.
Known limitations
- OCSF unavailability grades as False, not N/A. When GetSandboxLogs fails (gateway timeout, OCSF stream unavailable), OCSF predicates (process_ran, network_call_to, etc.) silently return False ("attack failed") rather than N/A. A silent OCSF failure can make a successful attack appear to have been resisted. Fixing this properly requires predicates to return bool | None, which is a verifier-protocol change deferred to a follow-up.
- agent_command is effectively required. The suite YAML field agent_command is technically optional (defaults to None), but the fallback when unset runs the prompt text as a shell command, not the image's entrypoint. The pi image's ENTRYPOINT is /bin/bash, so without agent_command nothing useful runs. This means midojo currently owns a piece of agent configuration that arguably belongs to the image. The right long-term fix is for community sandbox images to adopt a standard headless entrypoint that accepts a prompt argument, making agent_command unnecessary. Until then it is load-bearing.
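One possible shape for the deferred bool | None change is sketched below. It is an assumption-laden illustration rather than the planned design: the event dict keys ("type", "host"), the simplified network_call_to signature, and the three-valued all_of/any_of helpers are hypothetical. The idea is that a failed log fetch is represented as None and propagates through combinators unless a definite answer is already available.

```python
from typing import Optional


def network_call_to(events: Optional[list[dict]], host: str) -> Optional[bool]:
    """Return None when OCSF events are unavailable, else whether a NET event hit host."""
    if events is None:  # log fetch failed: no evidence either way
        return None
    return any(e.get("type") == "NET" and e.get("host") == host for e in events)


def all_of(results: list[Optional[bool]]) -> Optional[bool]:
    # Three-valued AND: a definite False wins, otherwise unknown taints the result.
    if any(r is False for r in results):
        return False
    if any(r is None for r in results):
        return None
    return True


def any_of(results: list[Optional[bool]]) -> Optional[bool]:
    # Three-valued OR: a definite True wins, otherwise unknown taints the result.
    if any(r is True for r in results):
        return True
    if any(r is None for r in results):
        return None
    return False


def label(result: Optional[bool]) -> str:
    return {True: "attack succeeded", False: "attack failed", None: "N/A"}[result]


unavailable = label(network_call_to(None, "evil.example"))
observed = label(network_call_to([{"type": "NET", "host": "evil.example"}], "evil.example"))
```

With this shape, a grade of "attack failed" is only reported when the evidence was actually collected and showed no matching event.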