feat(runtime)!: collapse to node/python + fix nsjail invocation on hardened hosts #10

Merged
Harsh-2002 merged 2 commits into main from dev on Jun 4, 2026

@Harsh-2002

Owner

Summary — BREAKING runtime change

Collapses Orva's four runtime ids (node22/node24/python313/python314) to two generic, latest-stable-only ids: node (Node.js 24) and python (Python 3.14). Version is now a display-only label. Node 22 / Python 3.13 are removed completely (rootfs builds, CI matrix, docs). Plus two pre-existing nsjail invocation bugs found during live testing.

Runtime collapse (strict cutover)

  • sandbox.Language → node/python; validRuntimes, MCP validRuntimesSet, and GET /runtimes → 2 entries (+ display version). Legacy ids rejected on input (REST/CLI/MCP).
  • In-place DB migration rewrites stored values (node20|node22|node24→node, python312|python313|python314→python) so existing functions keep loading — TestCollapseRuntimes added.
  • backend/runtimes/ collapsed to node/+python/; Makefile, Dockerfile (2 rootfs stages), scripts/{build-rootfs,entrypoint,install}.sh, release.yml matrix build only node/python. UI build toolchain bumped node 22→24.
  • CLI help/examples, frontend (Editor/Docs/templates/aiPrompts), AI prompt, all docs/* updated; make docs-embed synced. e2e updated + a legacy-id-rejected case.
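
The two halves of the strict cutover (rewrite stored values, reject legacy ids on input) can be sketched as below. This is a minimal illustration, not the PR's code: `legacyToGeneric`, `collapseRuntime`, and `validateRuntime` are hypothetical names, and only the id mapping itself comes from the description above.

```go
package main

import "fmt"

// legacyToGeneric mirrors the mapping listed in the PR description.
// The table name is hypothetical; the real migration may be plain SQL.
var legacyToGeneric = map[string]string{
	"node20": "node", "node22": "node", "node24": "node",
	"python312": "python", "python313": "python", "python314": "python",
}

// validRuntimes holds the only ids accepted on input after the cutover.
var validRuntimes = map[string]bool{"node": true, "python": true}

// collapseRuntime is what a migration would apply to each stored value:
// legacy ids are rewritten, anything else is left untouched.
func collapseRuntime(stored string) string {
	if generic, ok := legacyToGeneric[stored]; ok {
		return generic
	}
	return stored
}

// validateRuntime is the input-side check: legacy ids are rejected with a
// hint, not silently rewritten.
func validateRuntime(id string) error {
	if validRuntimes[id] {
		return nil
	}
	if generic, ok := legacyToGeneric[id]; ok {
		return fmt.Errorf("runtime %q was removed; use %q", id, generic)
	}
	return fmt.Errorf("unknown runtime %q", id)
}

func main() {
	for _, id := range []string{"node22", "python313", "node"} {
		fmt.Printf("stored %-9s -> %s\n", id, collapseRuntime(id))
	}
	fmt.Println("input node24:", validateRuntime("node24"))
	fmt.Println("input python:", validateRuntime("python"))
}
```

Keeping the migration table separate from the input allowlist is what makes the cutover strict: existing rows keep loading, while new requests must already use the generic ids.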

nsjail invocation fixes (pre-existing; functions crashed on the OLD binary too on affected hosts)

  1. cgroup false-positive (sandbox.go): cgroupv2Delegate enabled nsjail cgroup limits whenever a child cgroup could be created, even when the controllers weren't delegated (cgroup.subtree_control empty under the cgroup-v2 "no internal processes" rule). nsjail's memory.max write then failed and every worker crashed. The check now verifies that the probe child exposes memory/pids/cpu and otherwise falls back to rlimit-only instead of crashing.
  2. /proc overmount (install.sh unit): ProtectKernelTunables=true overmounts /proc/sys, which blocks nsjail's procfs mount inside its userns (Failed to mount mandatory point: /proc). The directive is dropped from the unit; nsjail still isolates via userns + seccomp + chroot.
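
A rough sketch of the controller probe described in item 1, assuming "exposes memory/pids/cpu" is checked by reading the probe child's `cgroup.controllers` file. The function names and the `orva-probe` directory name are hypothetical; the actual check in sandbox.go may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// requiredControllers are the controllers the PR says the probe child must expose.
var requiredControllers = []string{"memory", "pids", "cpu"}

// hasRequiredControllers reports whether a space-separated controller list
// (the contents of cgroup.controllers) contains every required controller.
// Matching is exact, so "cpuset" does not satisfy "cpu".
func hasRequiredControllers(controllers string) bool {
	have := make(map[string]bool)
	for _, c := range strings.Fields(controllers) {
		have[c] = true
	}
	for _, want := range requiredControllers {
		if !have[want] {
			return false
		}
	}
	return true
}

// probeDelegate creates a temporary child cgroup under parent and checks
// which controllers it exposes. Being able to mkdir alone is not enough.
func probeDelegate(parent string) bool {
	child := filepath.Join(parent, "orva-probe") // hypothetical probe name
	if err := os.Mkdir(child, 0o755); err != nil {
		return false
	}
	defer os.Remove(child)
	data, err := os.ReadFile(filepath.Join(child, "cgroup.controllers"))
	if err != nil {
		return false
	}
	return hasRequiredControllers(string(data))
}

func main() {
	fmt.Println(hasRequiredControllers("cpuset cpu io memory pids"))
	fmt.Println(hasRequiredControllers("")) // empty subtree_control in the parent
	if len(os.Args) > 1 {
		// Optionally probe a real cgroup directory passed on the command line.
		fmt.Println("delegate usable:", probeDelegate(os.Args[1]))
	}
}
```

When the probe returns false, the caller would skip nsjail's cgroup flags and rely on rlimits alone, which matches the fallback described above.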

Validation

  • go build / go vet / go test -race ./... green (incl. migration test, 4→2 runtimes-count assertion).
  • Isolated e2e (fresh image, node24/python3.14 rootfs only): all 25 modules green; test_functions includes legacy-rejected checks.
  • Live systemd: migrated (greeting→node, greeting-py→python) + fresh node/python deploys invoke 200 (Node 24 / Python 3.14); legacy node24 rejected; /runtimes = 2.

Notes

  • Breaking: clients/scripts passing node24/python314 must switch to node/python.
  • No release in this PR — merge to main only, keep ready.

…nvocation on hardened/constrained hosts

BREAKING: the four versioned runtime ids (node22/node24/python313/python314) are
replaced by two generic, latest-stable-only ids — `node` (Node.js 24) and
`python` (Python 3.14). The version is now an implementation detail surfaced only
as a display label.

Runtime collapse
- sandbox.Language: Node="node", Python="python"; IsNode/IsPython retargeted.
- validRuntimes + MCP validRuntimesSet + GET /runtimes catalog → {node, python};
  /runtimes gains a display-only `version` field. Legacy ids are REJECTED on input
  (strict cutover) across REST/CLI/MCP.
- DB migration collapses stored values in place (node20|node22|node24→node,
  python312|python313|python314→python) so existing functions keep loading; added
  TestCollapseRuntimes. builder pythonVersionFor → 3.14.
- runtimes/ dirs collapsed to node/ + python/; Makefile adapters-embed, Dockerfile
  (two rootfs stages), scripts/{build-rootfs,entrypoint,install}.sh, and
  release.yml rootfs matrix now build only node/python. UI build toolchain bumped
  node 22→24 (ci.yml/release.yml/Dockerfile).
- CLI --runtime help/examples, frontend (Editor/Docs/templates/aiPrompts), AI
  system prompt, and all docs/* updated; make docs-embed synced. e2e updated +
  a legacy-id-rejected case added (test_functions).
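
For illustration, a two-entry catalog with a display-only `version` field might serialize as in the sketch below. Only the `version` field name and the two ids come from the commit message; the `runtimeInfo` type, the `id` JSON key, and the exact version strings are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runtimeInfo is a hypothetical shape for one GET /runtimes entry.
type runtimeInfo struct {
	ID      string `json:"id"`      // generic id accepted on input
	Version string `json:"version"` // display-only label, never accepted as input
}

// runtimeCatalog returns the collapsed catalog: exactly two entries.
func runtimeCatalog() []runtimeInfo {
	return []runtimeInfo{
		{ID: "node", Version: "24"},
		{ID: "python", Version: "3.14"},
	}
}

func main() {
	out, err := json.MarshalIndent(runtimeCatalog(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```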

nsjail invocation fixes (pre-existing, surfaced during live testing — function
invocation crashed on the old binary too on affected hosts)
- sandbox cgroup false-positive: cgroupv2Delegate enabled nsjail cgroup limits
  whenever it could mkdir a child cgroup, even when the controllers weren't
  delegated (cgroup.subtree_control empty under the cgroup-v2 "no internal
  processes" rule). nsjail's memory.max write then failed and every worker
  crashed. Now verify a probe child exposes memory/pids/cpu; otherwise fall back
  to rlimit-only (functions run without per-sandbox cgroup caps) instead of
  crashing.
- systemd unit /proc overmount: ProtectKernelTunables=true overmounts /proc/sys,
  which blocks nsjail's procfs mount inside its user namespace ("Failed to mount
  mandatory point: /proc"). Dropped from the install.sh systemd unit; nsjail still
  isolates via userns + seccomp + chroot.

Validated: go build/vet/test -race; isolated e2e (fresh image, node24/python3.14
rootfs only) all modules green; live systemd — migrated + fresh node/python
functions invoke 200 (Node 24 / Python 3.14), legacy ids rejected.
…oup rlimit fallback

Address PR #10 review:
- test/atscale.sh: the two node loops produced duplicate names (ascale-node-$i
  twice → 5 dup create 409s). Now 10 node + 10 python with distinct prefixes.
- docs/API.md: fix garbled runtime comment (node|node|python|python → node | python).
- docs/CAPACITY.md: refresh stale py313/py314 + duplicate node snapshot rows.
- sandbox.go: log a one-time WARN when no cgroup-v2 delegate is usable so
  operators know per-sandbox memory/pid/cpu caps fell back to rlimit-only.
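
The one-time WARN can be expressed with `sync.Once`, as in this sketch. The function name, the counter, and the message wording are hypothetical; the commit only states that a single warning is logged when no delegate is usable.

```go
package main

import (
	"log"
	"sync"
)

var (
	noDelegateWarn sync.Once
	warnCount      int // only here so the example can show the warning fires once
)

// warnNoCgroupDelegate logs the fallback notice the first time it is called
// and is a no-op afterwards, so per-worker startup paths can call it freely.
func warnNoCgroupDelegate() {
	noDelegateWarn.Do(func() {
		warnCount++
		log.Println("WARN: no usable cgroup-v2 delegate; per-sandbox memory/pid/cpu caps fall back to rlimit-only")
	})
}

func main() {
	for i := 0; i < 3; i++ {
		warnNoCgroupDelegate()
	}
	log.Printf("warnings emitted: %d", warnCount)
}
```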
@Harsh-2002 Harsh-2002 merged commit 784f9eb into main Jun 4, 2026
14 of 20 checks passed
