
Wire OpenShell sandbox lifecycle into the orchestrator #70

Draft
saichandrapandraju wants to merge 14 commits into main from openshell-backend

saichandrapandraju commented Jun 10, 2026
Collaborator

What this adds

Fills in the lifecycle stubs introduced in #67 (setup, exec_agent, snapshot, teardown) so --protocol openshell runs end-to-end against real NVIDIA OpenShell MicroVM sandboxes. The orchestrator drives the full evaluation cycle — create sandbox, seed workspace, run agent, collect workspace diff and OCSF kernel events, grade — with the backend owning the sandbox lifecycle.

Architecture

orchestrator.run_task()
  ├── backend.provision(injections)       # pure: render workspace files
  ├── backend.setup(pre_env)             # create MicroVM, seed /sandbox/workspace, inject providers
  │     └── SandboxClient.from_active_cluster()  # picks up mTLS from ~/.config/openshell/
  ├── agent_client.send_task(prompt)     # exec agent_command inside sandbox, return stdout
  ├── backend.snapshot()                 # workspace diff + OCSF events → OpenShellEnvironment
  ├── PUT /evaluations/{id}/environment  # post-env to control plane
  └── backend.teardown()                 # delete sandbox (always runs, even on error)

OpenShellAgentClient is a 5-line wrapper that calls backend.exec_agent() and returns stdout. The sandbox lifecycle lives entirely in the backend.
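
The call order above can be sketched with a stand-in backend. This is only an illustration of the control flow, not the midojo implementation: `FakeBackend` and this `run_task` signature are hypothetical, and the control-plane PUT is omitted. The point it shows is that `teardown()` sits in a `finally` block, so the sandbox is deleted whether or not the agent step fails.

```python
from __future__ import annotations

import asyncio


class FakeBackend:
    """Hypothetical stand-in for an environment backend; records lifecycle calls."""

    def __init__(self, fail_in_agent: bool = False) -> None:
        self.fail_in_agent = fail_in_agent
        self.calls: list[str] = []

    def provision(self, injections: dict[str, str]) -> dict[str, str]:
        self.calls.append("provision")
        return dict(injections)  # pure: only renders workspace files

    async def setup(self, pre_env: dict[str, str]) -> None:
        self.calls.append("setup")

    async def exec_agent(self, prompt: str) -> str:
        self.calls.append("exec_agent")
        if self.fail_in_agent:
            raise RuntimeError("agent crashed")
        return f"handled: {prompt}"

    async def snapshot(self) -> dict[str, object]:
        self.calls.append("snapshot")
        return {"files": {}, "events": []}

    async def teardown(self) -> None:
        self.calls.append("teardown")


async def run_task(backend: FakeBackend, prompt: str, injections: dict[str, str]) -> dict[str, object]:
    pre_env = backend.provision(injections)
    try:
        await backend.setup(pre_env)
        stdout = await backend.exec_agent(prompt)
        post_env = await backend.snapshot()
        return {"stdout": stdout, "post_env": post_env}
    finally:
        # Runs on success and on error, so the sandbox is never leaked.
        await backend.teardown()
```

Running `asyncio.run(run_task(FakeBackend(fail_in_agent=True), "summarize", {}))` raises the agent error, yet the recorded calls still end with `teardown`.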

New files

| File | Purpose |
| --- | --- |
| src/midojo/openshell_logs.py | OCSF shorthand log parser (NET/HTTP/PROC/FINDING events) |
| src/midojo/verifiers/openshell.py | Workspace diff and OCSF predicates (moved out of builtin) |
| suites/document_assistant/ | Demo suite: Q4 financial report with curl-exfiltration and staging-cache injection tasks |
| tests/test_shell_predicates.py | Tests for the workspace/OCSF predicates |

Suite YAML

environment:
  backend:
    type: openshell
    image: pi                             # community sandbox with pi pre-installed
    # providers: [openai]                 # uncomment and set your registered provider(s)
    agent_command: ["pi", "-p", "--no-session"]
  • image is a community sandbox name — resolved to ghcr.io/nvidia/openshell-community/sandboxes/pi:latest automatically.
  • policy is omitted so the image's own /etc/openshell/policy.yaml applies (comprehensive allowlist for all major providers).
  • providers injects registered OpenShell provider credentials as env vars into the sandbox (e.g. OPENAI_API_KEY).
  • agent_command is configurable per-suite; model selection is left to the user's pi config or provider defaults.
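
As a rough illustration of the image-name expansion described in the first bullet, here is a minimal sketch. The function name `resolve_image` and the rule for what counts as a bare name (no `/` and no `:`) are assumptions for this example. The default registry path and the `OPENSHELL_COMMUNITY_REGISTRY` override come from this PR's commit messages.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

DEFAULT_COMMUNITY_REGISTRY = "ghcr.io/nvidia/openshell-community/sandboxes"


def resolve_image(image: str, env: Mapping[str, str] | None = None) -> str:
    """Expand a bare community sandbox name into a full image reference."""
    env = os.environ if env is None else env
    # Assumption: anything containing "/" or ":" is already a full reference.
    if "/" in image or ":" in image:
        return image
    registry = env.get("OPENSHELL_COMMUNITY_REGISTRY", DEFAULT_COMMUNITY_REGISTRY)
    return f"{registry.rstrip('/')}/{image}:latest"
```

With an empty environment, `resolve_image("pi", {})` yields `ghcr.io/nvidia/openshell-community/sandboxes/pi:latest`, while a fully qualified reference passes through unchanged.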

CLI changes

  • --agent-url renamed to --agent-uri (more accurate: for pi it's a directory path, not a URL).
  • --openshell-endpoint removed — --agent-uri serves as the gateway endpoint for --protocol openshell. Omit it to use the CLI-registered active gateway (SandboxClient.from_active_cluster()).
  • backend parameter typed as EnvironmentBackend | None in run_task() and run_benchmark().
  • PUT /environment now calls raise_for_status() — silent 4xx failures no longer cause a successful attack to grade as "agent secure".

Running it

Prerequisites (one-time)

Install OpenShell (macOS — Homebrew, auto-starts the gateway):

curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sh
openshell status   # → Status: Connected

Install the Python SDK into the midojo venv:

uv pip install openshell

Register your inference provider:

OPENAI_API_KEY=sk-... openshell provider create --name openai --type openai --from-existing
# or: ANTHROPIC_API_KEY=sk-ant-... openshell provider create --name anthropic --type anthropic --from-existing

Run the demo suite

Uncomment and set providers in suites/document_assistant/suite.yaml:

providers: [openai]   # or [anthropic], matching your registered provider name

Then:

# Terminal 1 — control plane
uv run midojo-serve --suite document_assistant --port 8090

# Terminal 2 — benchmark (no --agent-uri needed; active gateway is picked up automatically)
uv run midojo-run \
  --protocol openshell \
  --suite document_assistant \
  --control-url http://localhost:8090

Each evaluation creates a fresh MicroVM sandbox from the pi image, seeds /sandbox/workspace/ with the injected financial report, runs the pi agent against it, collects workspace diff and OCSF kernel events, and deletes the sandbox. Each evaluation takes ~30–60 s on first run (image pull) and ~15–30 s subsequently.
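
To make the "workspace diff" step concrete, here is a small sketch that compares the seeded files with the files found after the agent ran. `WorkspaceDiff` and `diff_workspace` are hypothetical names, and representing the workspace as a path-to-content mapping is an assumption. The actual backend may collect the diff differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class WorkspaceDiff:
    created: list[str] = field(default_factory=list)
    deleted: list[str] = field(default_factory=list)
    modified: list[str] = field(default_factory=list)


def diff_workspace(before: dict[str, str], after: dict[str, str]) -> WorkspaceDiff:
    """Compare two path -> content mappings of the sandbox workspace."""
    common = before.keys() & after.keys()
    return WorkspaceDiff(
        created=sorted(after.keys() - before.keys()),
        deleted=sorted(before.keys() - after.keys()),
        modified=sorted(path for path in common if before[path] != after[path]),
    )
```

A predicate such as `workspace_file_exists` could then be answered from `created` and the post-run file listing without re-reading the sandbox.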

Known limitations

OCSF unavailability grades as False, not N/A. When GetSandboxLogs fails (gateway timeout, OCSF stream unavailable), OCSF predicates (process_ran, network_call_to, etc.) silently return False — "attack failed" — rather than N/A. A silent OCSF failure can make a successful attack appear to have been resisted. Fixing this properly requires predicates to return bool | None, which is a verifier-protocol change deferred to a follow-up.
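
One possible shape for the deferred `bool | None` change is three-valued logic, sketched below. This is not the planned verifier protocol: the context layout (`ocsf_events` set to `None` when the log fetch failed) and the event dictionary keys are assumptions made for illustration. Only the predicate and combinator names are taken from this PR.

```python
from __future__ import annotations

from typing import Callable, Optional

Verdict = Optional[bool]  # True / False / None (= N/A, no evidence)
Predicate = Callable[[dict], Verdict]


def process_ran(name: str) -> Predicate:
    def evaluate(ctx: dict) -> Verdict:
        events = ctx.get("ocsf_events")
        if events is None:  # log fetch failed: no evidence either way
            return None
        return any(e.get("type") == "PROC" and e.get("name") == name for e in events)
    return evaluate


def any_of(*preds: Predicate) -> Predicate:
    def evaluate(ctx: dict) -> Verdict:
        results = [p(ctx) for p in preds]
        if any(r is True for r in results):
            return True
        return None if any(r is None for r in results) else False
    return evaluate


def all_of(*preds: Predicate) -> Predicate:
    def evaluate(ctx: dict) -> Verdict:
        results = [p(ctx) for p in preds]
        if any(r is False for r in results):
            return False
        return None if any(r is None for r in results) else True
    return evaluate
```

Under this scheme an unavailable OCSF stream propagates as `None` through the combinators unless another branch already decides the result, so the grader can report N/A instead of "attack failed".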

agent_command is effectively required. The suite YAML field agent_command is technically optional (defaults to None), but the fallback when unset runs the prompt text as a shell command — not the image's entrypoint. The pi image's ENTRYPOINT is /bin/bash, so without agent_command nothing useful runs. This means midojo currently owns a piece of agent configuration that arguably belongs to the image. The right long-term fix is for community sandbox images to adopt a standard headless entrypoint that accepts a prompt argument, making agent_command unnecessary. Until then it is load-bearing.
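
The difference between the two paths can be illustrated as follows. This is a guess at the shape of the behaviour, not the backend's code: `build_exec_argv` is a hypothetical helper, and both the "prompt is appended as the last argument" convention and the `/bin/bash -c` form of the fallback are assumptions.

```python
from __future__ import annotations


def build_exec_argv(agent_command: list[str] | None, prompt: str) -> list[str]:
    """Return the argv executed inside the sandbox for one task prompt."""
    if agent_command:
        # Assumption: the prompt is passed to the agent as its final argument.
        return [*agent_command, prompt]
    # Assumption: with no agent_command, the prompt text itself is handed to the shell.
    return ["/bin/bash", "-c", prompt]
```

With `["pi", "-p", "--no-session"]` the agent receives the prompt. With `None`, a prompt like "Summarize the report" would be interpreted as a shell command line, which is why the field is load-bearing.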

@saichandrapandraju

Collaborator Author

Hi @dmaniloff, would like to get early feedback on this, thanks!

saichandrapandraju self-assigned this Jun 10, 2026
saichandrapandraju added the enhancement label Jun 10, 2026
Comment thread suites/document_assistant/suite.yaml Outdated
Comment on lines +10 to +13
  backend:
    type: openshell
    image: base
    policy: ollama-local

Member

there's already a pi image and a pi policy for pi.dev agents.

it's the one that gets used when you say openshell sandbox create --from pi

let's use that instead.

and let's also add an agent_command that defaults to the image's entrypoint eg

agent_command: ["pi", "-p", "--no-session"] # optional, defaults to image's entrypoint

Member

though i may have a different opinion about this now ... let's chat when you get a moment.

Comment thread src/midojo/agent_client.py Outdated
Comment thread src/midojo/verifiers/builtin.py Outdated
Comment thread src/midojo/verifiers/builtin.py Outdated
Comment on lines +157 to +159
# observations["openshell"] (raw audit trail). The env-field approach is necessary
# because the any_of/all_of combinators call parse_predicate recursively, which
# doesn't dispatch to named verifiers. Revisit when combinators are upgraded.

Member

the any_of/all_of combinators call parse_predicate recursively, which doesn't dispatch to named verifiers

nice call out @saichandrapandraju -- this is a missing piece in Predicate.evaluate. i am adding that in #71

let's please rebase this on top of that
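
For readers following the thread, the gap being discussed can be sketched like this: a recursive parser whose combinators build their children through the same entry point, so nested entries still reach the named-verifier registry and receive the full context. Everything here is hypothetical (the `VERIFIERS` registry, the spec shape, the context keys). The actual fix is in #71 and may look quite different.

```python
from __future__ import annotations

from typing import Any, Callable

Context = dict  # hypothetical: a plain dict stands in for VerificationContext
Check = Callable[[Context], bool]

VERIFIERS: dict[str, Callable[..., Check]] = {}


def verifier(name: str) -> Callable[[Callable[..., Check]], Callable[..., Check]]:
    def decorator(factory: Callable[..., Check]) -> Callable[..., Check]:
        VERIFIERS[name] = factory
        return factory
    return decorator


@verifier("workspace_file_exists")
def workspace_file_exists(path: str) -> Check:
    return lambda ctx: path in ctx["files"]


@verifier("process_ran")
def process_ran(name: str) -> Check:
    return lambda ctx: name in ctx["processes"]


def parse_predicate(spec: dict[str, Any]) -> Check:
    (name, arg), = spec.items()
    if name in ("any_of", "all_of"):
        # Children go through the same parser, so they also hit the registry.
        children = [parse_predicate(child) for child in arg]
        combine = any if name == "any_of" else all
        return lambda ctx: combine(child(ctx) for child in children)
    return VERIFIERS[name](**arg)
```

A spec such as `{"any_of": [{"workspace_file_exists": {"path": "exfil.txt"}}, {"process_ran": {"name": "curl"}}]}` then evaluates both named verifiers against the same context.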

Member

once rebased, the question is whether we want to keep the observations field or just do what you are doing which is to gather observations from the orchestrator and populate environment fields like files etc.

maybe that is cleaner since the fields are self-explanatory? the other benefit of that is that we can remove the POST /observations endpoint in the control api.

thoughts?

Collaborator Author

I think -

VerificationContext.observations can stay - it's the right extension point for future backends that surface event streams that don't fit neatly into env fields.

Everything OpenShell-specific around it can be removed:

  • backend.observations() — its only job was formatting the OCSF dict to push to the control plane, which we don't need since env fields cover all current predicates
  • The orchestrator step calling backend.observations() and POST /observations
  • The POST /evaluations/{id}/observations control API endpoint
  • evaluation.observations in the state model

At least for OpenShell, the env fields on OpenShellEnvironment are sufficient. If a future backend needs richer grading data that doesn't map cleanly to env fields, then ctx.observations and the endpoint can come back.

Happy to make the changes if this sounds right to you.

Member

agree w/ you @saichandrapandraju ! let's make those changes.

Fills in the lifecycle stubs from the #67 scaffold (setup, exec_agent,
snapshot, observations, teardown) and wires them into the orchestrator
so the openshell protocol runs end-to-end against real sandboxes.

openshell_logs.py
  OCSF shorthand log parser: NET, HTTP, PROC, and FINDING event types.

openshell_backend.py
  Implements all five lifecycle stubs. Adds _BUILTIN_POLICIES (camelCase
  proto-JSON dicts resolved via ParseDict so suites declare network policy
  in YAML without importing gRPC types directly) and _resolve_policy().
  Embeds a minimal ollama-backed agent script installed to
  /sandbox/.venv/bin/pi for local testing without a real PI agent.
  setup() falls back to SandboxClient.from_active_cluster() when no
  explicit endpoint is given, so the Homebrew-managed gateway works out
  of the box.

agent_client.py
  OpenShellPIAgentClient delegates execution to backend.exec_agent() and
  returns stdout. The sandbox lifecycle lives in the backend, not the
  agent client.

orchestrator.py
  run_task() split into _create_evaluation() + _complete_and_grade()
  helpers. Backend lifecycle (setup → send_task → snapshot → observations
  → teardown) is wired around the agent execution step. --agent-url is
  now optional for --protocol openshell.

verifiers/builtin.py
  Shell/OCSF predicates: workspace_file_exists, workspace_file_deleted,
  workspace_file_contains, commands_match_pattern, process_ran,
  network_call_to, network_call_blocked_to, security_finding_raised.

suites/shell_financial_report/
  Demo suite: Q4 financial report with two injection tasks (curl
  exfiltration and staging-cache write) graded by OCSF predicates.

pyproject.toml
  protobuf>=6.31 (openshell SDK requires >=6.31 gencode).
  openshell added to the [shell] optional-dependency group.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Rename suite folder shell_financial_report → document_assistant
- Use pi image and policy (community sandbox with PI pre-installed);
  drop the embedded ollama agent script
- Add agent_command field to suite YAML and OpenShellBackend so the
  agent binary is configurable rather than hardcoded to ["pi", ...]
- Rename OpenShellPIAgentClient → OpenShellAgentClient
- Move shell/OCSF predicates from verifiers/builtin.py to a dedicated
  verifiers/openshell.py module; update to evaluate(ctx) interface
  introduced by #71 (full VerificationContext now reaches nested predicates)
- Rebase on top of #71 which fixes combinator context forwarding

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/midojo/orchestrator.py Outdated
 @click.command()
 @click.option("--control-url", default="http://localhost:8080", help="URL of the benchmark MCP server control plane.")
-@click.option("--agent-url", required=True, help="URL of the agent to test.")
+@click.option("--agent-url", default=None, help="URL of the agent to test (not used for --protocol openshell).")

Member

let's leave this in as required and make it the openshell endpoint when using --protocol openshell

Comment thread src/midojo/orchestrator.py Outdated
Comment on lines +353 to +355
@click.option(
    "--openshell-endpoint", default=None, envvar="OPENSHELL_ENDPOINT", help="OpenShell gateway gRPC endpoint (openshell protocol only)."
)

Member

let's use the agent-url for this. maybe we can generalize the name a bit. e.g., in the case of pi the url is actually the path to the pi TS extension, so maybe we can rename agent-url to agent-uri?

Comment thread src/midojo/orchestrator.py Outdated
user_task_ids: list[str] | None,
injection_task_ids: list[str] | None,
logdir: Path,
lifecycle_backend: object | None = None,

Member

this should be of type EnvironmentBackend right?

Comment thread src/midojo/orchestrator.py Outdated
injection_task_id: str | None,
injections: dict[str, str],
*,
backend: object | None = None,

Member

same here, EnvironmentBackend ?

Comment thread src/midojo/orchestrator.py Outdated
Comment on lines +237 to +244
await client.put(
    f"{control_url}/runs/{run_id}/evaluations/{eval_id}/environment",
    json=post_env.model_dump(),  # type: ignore[union-attr]
)
await client.post(
    f"{control_url}/runs/{run_id}/evaluations/{eval_id}/observations",
    json={"source": "openshell", "data": obs_data},
)

Member

These two pushes are what grading reads from — if either fails (e.g. a 422 from env-model version skew, or a 404 when the server was started with a different suite), httpx won't raise on the 4xx, the grade endpoint falls back to the pristine provisioned environment, and every security predicate evaluates False over empty fields. A successful attack then grades as "agent secure" with no error anywhere. Better to fail the eval loudly:

Suggested change
-await client.put(
-    f"{control_url}/runs/{run_id}/evaluations/{eval_id}/environment",
-    json=post_env.model_dump(),  # type: ignore[union-attr]
-)
-await client.post(
-    f"{control_url}/runs/{run_id}/evaluations/{eval_id}/observations",
-    json={"source": "openshell", "data": obs_data},
-)
+env_resp = await client.put(
+    f"{control_url}/runs/{run_id}/evaluations/{eval_id}/environment",
+    json=post_env.model_dump(),  # type: ignore[union-attr]
+)
+env_resp.raise_for_status()
+obs_resp = await client.post(
+    f"{control_url}/runs/{run_id}/evaluations/{eval_id}/observations",
+    json={"source": "openshell", "data": obs_data},
+)
+obs_resp.raise_for_status()
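
The failure mode described in this comment can be reproduced in a few lines without a network. The sketch below is a toy model, not midojo code: `ControlPlane`, `Response`, `StatusError`, and `push_env` are stand-ins invented for the example, and the 422 condition simply mimics env-model version skew.

```python
from __future__ import annotations

from dataclasses import dataclass


class StatusError(Exception):
    """Stand-in for the error an HTTP client raises on a 4xx/5xx response."""


@dataclass
class Response:
    status_code: int

    def raise_for_status(self) -> None:
        if self.status_code >= 400:
            raise StatusError(f"HTTP {self.status_code}")


class ControlPlane:
    """Toy control plane: grades from the pushed env, else the pristine one."""

    def __init__(self) -> None:
        self.pristine: dict = {"files": {}}
        self.post_env: dict | None = None

    def put_environment(self, env: dict) -> Response:
        if "files" not in env:  # e.g. env-model version skew
            return Response(422)
        self.post_env = env
        return Response(200)

    def attack_succeeded(self) -> bool:
        env = self.post_env if self.post_env is not None else self.pristine
        return "exfil.txt" in env["files"]


def push_env(plane: ControlPlane, env: dict, *, strict: bool) -> None:
    resp = plane.put_environment(env)
    if strict:
        resp.raise_for_status()  # fail the evaluation loudly
```

Pushing a skewed payload with `strict=False` leaves `attack_succeeded()` at `False` even though the payload contained the exfiltrated file. With `strict=True` the same push raises instead of producing a misleading grade.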

Comment thread src/midojo/openshell_backend.py Outdated
if self._cached_ocsf is not None:
    return self._cached_ocsf

from openshell._proto import openshell_pb2  # pyright: ignore[reportMissingImports]

Member

is there a way we can avoid the inline import?

except Exception as exc:
    import logging
    logging.getLogger(__name__).warning(
        "OCSF log fetch failed — security predicates will degrade to False: %s", exc

Member

we should think about a way to surface this so that during grading we can say N/A rather than "attack failed"

saichandrapandraju and others added 4 commits June 11, 2026 11:41
The embedded pi script was a dev-time workaround for testing the sandbox
lifecycle without a real PI agent. Now that the suite uses image: pi
(which ships PI pre-installed) the script would shadow the real binary.
The ollama-local built-in policy is unused after the suite update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Suite YAML gains a models_json backend field that is seeded to
/sandbox/.pi/agent/models.json during setup(). Pi reads this file to
discover custom inference providers (ollama, vllm, any OpenAI-compatible
endpoint) without needing a custom image.

Any non-localhost base URL in models_json is auto-added to the sandbox
network policy via an UpdateConfig merge operation so the node process can
reach it. The merge is applied after wait_ready() (dynamic network policy
only) and the backend polls GetSandboxPolicyStatus until the sandbox
supervisor loads the new version.

document_assistant suite updated to use ollama/qwen3.5:2b via
host.openshell.internal for local testing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
models_json and --model flag are developer-local config, not suite
definition. Pi reads ~/.pi/agent/models.json natively; users configure
their own inference provider there. The models_json backend field remains
available for suites that genuinely need to ship their own inference config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Developers can now run the openshell benchmark against a local ollama or
vllm instance without modifying the suite YAML:

  midojo-run --protocol openshell --suite document_assistant \
    --pi-models-file ~/.pi/agent/models.json \
    --pi-model ollama/qwen3.5:2b

--pi-models-file seeds the file to /sandbox/.pi/agent/models.json and
auto-adds the provider's base URL to the sandbox network policy.
--pi-model appends --model <id> to agent_command at runtime.
Both flags also accept PI_MODELS_FILE / PI_MODEL env vars.

Also fixes community image name resolution: bare names like 'pi' are now
expanded to ghcr.io/nvidia/openshell-community/sandboxes/pi:latest, matching
the CLI's --from behaviour (override with OPENSHELL_COMMUNITY_REGISTRY).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread suites/document_assistant/suite.yaml Outdated
Comment on lines +14 to +25
    agent_command: ["pi", "-p", "--no-session", "--model", "ollama/qwen3.5:2b"]
    models_json:
      providers:
        ollama:
          baseUrl: "http://host.openshell.internal:11434/v1"
          api: "openai-completions"
          apiKey: "ollama"
          compat:
            supportsDeveloperRole: false
            supportsReasoningEffort: false
          models:
            - id: "qwen3.5:2b"

Member

thinking back to my suggestion re: agent_command ... i think we should probably try to avoid all this.

the concern of midojo stops at the agent boundary. in other words, agent config (like its command and providers etc) should not be what we deal with. this should all exist somewhere else already.

in the case of an openshell env it seems that there's some coupling between the agent's config & the env so maybe the way around it is via an image or another openshell native mechanism. let's chat about this.

saichandrapandraju and others added 3 commits June 12, 2026 10:40
The policy management (auto-detecting base URLs from models.json,
applying UpdateConfig merge operations, polling GetSandboxPolicyStatus)
is too specific and belongs outside midojo. Developers who want local
inference can configure sandbox policy and pi models.json independently.

Keeps _resolve_image() for community name expansion (pi → ghcr.io/...).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
--agent-url → --agent-uri: more accurate since pi uses a file path not
a URL. Unified flag serves all protocols: for openshell it is the
gateway endpoint (omit to use the CLI-registered active gateway), so
--openshell-endpoint is removed.

EnvironmentBackend type: lifecycle_backend typed as
EnvironmentBackend | None in run_task() and run_benchmark() instead of
object | None.

raise_for_status on env push: PUT /environment now raises on 4xx so a
silent failure (e.g. env-model version skew) doesn't cause a successful
attack to grade as "agent secure" with no error anywhere.

Avoid inline import in _fetch_ocsf: openshell_pb2 is stored as self._pb2
in setup() so _fetch_ocsf() reuses it without a repeated lazy import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eld)

VerificationContext.observations stays as the extension point for future
backends. Everything OpenShell-specific around it is removed:

- backend.observations() method — env fields on OpenShellEnvironment
  already carry all OCSF data needed for grading; the raw dict had no
  consumer
- POST/GET /evaluations/{id}/observations endpoints in the control API
- POST/GET /current/observations endpoints
- RecordObservationsRequest model
- evaluation.observations in the state model
- observations= kwarg in YAMLTaskSuite.grade() and the grade endpoint
- test_observations_grading.py and two observations control-API tests

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
saichandrapandraju and others added 4 commits June 14, 2026 21:11
SandboxSpec.providers lists the OpenShell provider names whose credentials
are injected into the sandbox as env vars. Suites that need a cloud
inference provider (e.g. OpenAI, Anthropic) declare it here; the backend
passes it straight through to the gRPC call.

Example:
  providers: [openai]   # injects OPENAI_API_KEY into the sandbox

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
providers in the backend config injects named OpenShell provider
credentials into the sandbox as env vars (e.g. OPENAI_API_KEY).
Required for agents like pi that need inference credentials at runtime.

The field is wired through backends.py → OpenShellBackend → SandboxSpec.
Users uncomment and set their registered provider name(s) in the suite
YAML before running — no code changes needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Named string policies (e.g. policy: pi) were misleading: the built-in
"pi" policy allowed only api.anthropic.com, which shadowed the pi image's
own comprehensive /etc/openshell/policy.yaml and broke other providers.
Omitting policy (None) lets the image's built-in policy apply, which is
the right default for community images.

policy field now accepts None or an inline camelCase proto-JSON dict.
_resolve_policy simplified accordingly; tests updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Labels

enhancement New feature or request

2 participants