feat(gateway): OpenShell gateway microVM with libkrun #100
Draft
All contributors have signed the DCO ✍️ ✅
Add a new navigator-vm library crate that boots k3s inside a libkrun microVM, accessible from the host via gvproxy port forwarding.

Key components:
- FFI bindings to libkrun C API (krun_create_ctx, krun_add_net_unixgram, etc.)
- VmConfig with gateway() preset for k3s and custom exec mode
- gvproxy integration: virtio-net via unixgram, DHCP, native HTTP port forwarding
- gateway-init.sh: PID 1 init script with DHCP via udhcpc, mounts, k3s exec
- build-rootfs.sh: builds Ubuntu 22.04 arm64 rootfs with k3s + busybox-static
- Kubeconfig auto-extraction to ~/.kube/gateway.yaml
- CLI integration as 'ncl gateway' with --exec, --port, --net flags
- macOS codesigning and DYLD_FALLBACK_LIBRARY_PATH in ncl wrapper
Enable full NemoClaw control plane deployment inside the libkrun microVM so e2e tests can run against the VM instead of Docker.

Build-time (build-rootfs.sh):
- Package helm chart and inject into k3s static charts directory
- Copy HelmChart CR and agent-sandbox manifests into rootfs
- Pull and save arm64 container images as tarballs for airgap boot

Boot-time (gateway-init.sh):
- Enable flannel CNI (remove --flannel-backend=none and related flags)
- Deploy bundled manifests to k3s auto-deploy directory
- Patch HelmChart CR for VM context (pullPolicy, SSH placeholders)
- Ensure DNS fallback when DHCP doesn't configure resolv.conf

Post-boot (lib.rs):
- Wait for navigator namespace created by Helm controller
- Generate PKI and apply TLS secrets via host kubectl
- Store cluster metadata and mTLS creds for CLI/SDK access
- Set 'gateway' as active cluster for e2e test discovery

Also bump VM to 8GB RAM / 4 vCPUs, add port 30051 forwarding, fix nemoclaw wrapper fingerprint to include navigator-vm crate, and add test:e2e:vm mise task.
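For readers unfamiliar with the preset pattern, here is a minimal sketch of what a `VmConfig` with a `gateway()` preset and a custom exec mode could look like. The field names, the `with_exec` builder, and the guest-side port are assumptions made for illustration; only the type name, the `gateway()` preset, the 8GB / 4 vCPU sizing, and host port 30051 come from the commit messages.

```rust
/// Hypothetical shape of the VM configuration. The real `VmConfig`
/// fields are not shown in this PR description.
#[derive(Debug, Clone, PartialEq)]
pub struct VmConfig {
    /// Number of virtual CPUs given to the guest.
    pub vcpus: u8,
    /// Guest memory in MiB.
    pub mem_mib: u32,
    /// (host_port, guest_port) pairs to forward through gvproxy.
    pub port_map: Vec<(u16, u16)>,
    /// Command to run in the guest; `None` means boot k3s via the init script.
    pub exec: Option<Vec<String>>,
}

impl VmConfig {
    /// Preset for the k3s gateway: 4 vCPUs, 8 GiB RAM, port 30051 forwarded.
    /// The guest-side port used here is illustrative.
    pub fn gateway() -> Self {
        Self {
            vcpus: 4,
            mem_mib: 8 * 1024,
            port_map: vec![(30051, 30051)],
            exec: None,
        }
    }

    /// Switch the config to custom exec mode (hypothetical builder method).
    pub fn with_exec<I, S>(mut self, argv: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        self.exec = Some(argv.into_iter().map(Into::into).collect());
        self
    }
}
```

Under these assumptions, `VmConfig::gateway().with_exec(vec!["/bin/true"])` would describe a VM that runs a single guest command instead of booting k3s.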
Stop deleting meta.db in gateway-init.sh and include the native snapshotter, content store, and metadata DB in the rootfs built by build-rootfs.sh. Without meta.db, containerd re-extracts all image layers on every boot (~2 min for navigator/server on virtio-fs), causing kubelet CreateContainer timeouts. Also replace the etcd-snapshot approach with direct SQLite cleanup of the kine DB to remove stale pod/event/lease records.
Move the gateway VM launching out of `nemoclaw gateway` into its own `gateway` binary built from the navigator-vm crate. The nemoclaw CLI no longer links against libkrun or requires macOS hypervisor codesigning. Add scripts/bin/gateway wrapper (build + codesign + exec) and clean up scripts/bin/nemoclaw to remove navigator-vm artifacts.
Two #[ignore] tests that require libkrun + pre-built rootfs:
- gateway_boots_and_service_becomes_reachable: starts the full gateway and verifies the gRPC service on port 30051
- gateway_exec_runs_guest_command: runs /bin/true inside the VM via --exec and checks the exit code
Move orphaned integration test from crates/navigator-vm/ to crates/openshell-vm/tests/ and update all navigator_bootstrap references to openshell_bootstrap, including renamed types (ClusterMetadata -> GatewayMetadata) and functions.
openshell-vm links against libkrun which is only available on macOS with Homebrew. Exclude it from cargo check, clippy, and test workspace commands so CI passes on Linux runners.
…support

Enable Kubernetes-compatible networking in the gateway microVM by building a custom libkrunfw kernel with CONFIG_BRIDGE, CONFIG_NETFILTER, CONFIG_NF_CONNTRACK, CONFIG_IP_NF_IPTABLES, and CONFIG_VETH compiled in.

Key changes:
- Docker-based kernel build pipeline for macOS (build-custom-libkrunfw.sh)
- Kernel config fragment enabling bridge/netfilter/conntrack/NAT/IPVS
- Feature-flagged bridge CNI with auto-detection fallback to legacy ptp
- Runtime provenance tracking (SHA-256, build metadata, manifest validation)
- VM capability checker and host-side verification matrix scripts
- Mise tasks: vm:build-custom-runtime, vm:verify, vm:check-capabilities
- Architecture and operator documentation
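To illustrate what "compiled in" means for a capability check, the sketch below scans kernel `.config` text for options set to `=y` and reports any that are missing or only built as modules. The function name and the idea of checking plain config text are assumptions; the PR's actual capability checker is a script whose contents are not shown here.

```rust
/// Kernel options the commit message says must be compiled in.
pub const REQUIRED_OPTIONS: &[&str] = &[
    "CONFIG_BRIDGE",
    "CONFIG_NETFILTER",
    "CONFIG_NF_CONNTRACK",
    "CONFIG_IP_NF_IPTABLES",
    "CONFIG_VETH",
];

/// Return the required options that are NOT set to `=y` in `config_text`.
/// Options set to `=m` (module) or commented out as "is not set" count
/// as missing, because the check is for built-in support.
/// (Hypothetical helper, not the PR's checker.)
pub fn missing_builtin_options<'a>(config_text: &str, required: &[&'a str]) -> Vec<&'a str> {
    required
        .iter()
        .copied()
        .filter(|opt| {
            let wanted = format!("{}=y", opt);
            !config_text.lines().any(|line| line.trim() == wanted)
        })
        .collect()
}
```

A fail-fast boot path could refuse to start when the returned list is non-empty and print the missing option names.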
…orking

Switch kube-proxy to nftables mode and add missing kernel config options (NFT_NUMGEN, NFT_FIB_IPV4/6, NFT_LIMIT, NFT_REDIR, NFT_TPROXY) plus xtables match modules required by CNI bridge masquerade.

Add stale CNI state cleanup on boot (cni0 bridge, veth pairs, IPAM allocations, pod network namespaces, sandbox controller shim) to prevent 'route already exists' errors from persistent rootfs.

Remove dual bridge/legacy-vm-net profile system in favor of bridge-only with fail-fast kernel validation.

Drop host-mapped port 6443 (kube-apiserver) since it is not needed for normal gateway operation.

Update bundle script to fall back to Homebrew for libkrun.dylib (VMM) while still requiring custom libkrunfw (kernel).
…aligned pre-bake

Two issues caused the gateway service readiness check to time out:

1. Port mapping mismatch: gvproxy mapped host:30051 → VM:8080, but with bridge CNI the pod listens on 8080 inside its network namespace, not on the VM's root namespace. Changed to 30051:30051 so traffic flows through the NodePort service (kube-proxy nftables → pod:8080).

2. Pod cycling from helm upgrade: build-rootfs.sh pre-baked with hostNetwork=true and automountServiceAccountToken=false, but gateway-init.sh changed these at boot, triggering a HelmChart reconcile that killed the pre-baked pod ~90s in. Aligned pre-bake values (hostNetwork=false, automountServiceAccountToken=true) to match runtime, eliminating the manifest delta.
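The `30051:30051` notation reads as HOST:GUEST. As a rough sketch of how such a mapping string could be parsed and validated, assuming a hypothetical `parse_port_mapping` helper that is not part of the PR:

```rust
/// Parse a "HOST:GUEST" port mapping such as "30051:30051".
/// Hypothetical helper for illustration; the PR's own port_map
/// representation is not shown in the description.
pub fn parse_port_mapping(s: &str) -> Result<(u16, u16), String> {
    // Split on the first ':' into host and guest parts.
    let (host_str, guest_str) = s
        .split_once(':')
        .ok_or_else(|| format!("expected HOST:GUEST, got {:?}", s))?;

    // Parsing as u16 rejects non-numeric input and values above 65535.
    let host: u16 = host_str
        .trim()
        .parse()
        .map_err(|e| format!("bad host port {:?}: {}", host_str, e))?;
    let guest: u16 = guest_str
        .trim()
        .parse()
        .map_err(|e| format!("bad guest port {:?}: {}", guest_str, e))?;

    Ok((host, guest))
}
```

With this reading, the old mapping corresponds to `(30051, 8080)` and the fixed one to `(30051, 30051)`, which targets the NodePort in the VM's root namespace rather than the pod's container port.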
The previous commit (070bcca) dropped port 6443 from the gvproxy port_map, breaking all host-side kubectl commands including the readiness check and stale pod recovery. k3s runs the API server with host networking so VM:6443 is directly reachable — restore the 6443:6443 mapping alongside the 30051:30051 NodePort mapping.
Remove all kubectl calls from the host-side boot sequence, eliminating the need to forward port 6443 (kube-apiserver) outside the VM.

Changes:
- wait_for_gateway_service: TCP probe only (30051), no kubectl pod check
- bootstrap_gateway: cold boot writes TLS secret manifests via virtio-fs into k3s auto-deploy dir instead of kubectl apply
- bootstrap_gateway: warm boot skips namespace wait (TCP probe suffices)
- recover_stale_pods: removed entirely (gateway-init.sh already cleans containerd runtime/sandbox state, CNI state, and network namespaces)
- Kubeconfig copy moved to best-effort post-readiness (for debugging)
- Port 6443 removed from gvproxy port_map

Removed functions: recover_stale_pods, wait_for_namespace, apply_tls_secrets, kubectl_apply.

Net: -362 lines, +147 lines. No kubectl binary required on host.
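A TCP-only readiness probe of the kind described for wait_for_gateway_service could be sketched as below, using only the standard library. The function name `wait_for_tcp`, its parameters, and the retry policy are assumptions; the actual implementation in lib.rs is not shown in this PR description.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `addr` until a TCP connection succeeds or `deadline` elapses.
/// Returns `true` if the port accepted a connection in time.
/// (Hypothetical helper; a successful connect only shows that something
/// is accepting connections, not that the service behind it is healthy.)
pub fn wait_for_tcp(addr: SocketAddr, deadline: Duration, interval: Duration) -> bool {
    let start = Instant::now();
    // connect_timeout rejects a zero timeout, so clamp to at least 1 ms.
    let attempt_timeout = interval.max(Duration::from_millis(1));
    loop {
        if TcpStream::connect_timeout(&addr, attempt_timeout).is_ok() {
            return true;
        }
        if start.elapsed() >= deadline {
            return false;
        }
        // Back off briefly before the next attempt.
        sleep(interval);
    }
}
```

For the gateway, a caller might probe `127.0.0.1:30051` with a deadline of a few minutes; the specific timeout values are a tuning choice that the commit message does not state.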
Summary
- `openshell-vm` library crate that boots k3s inside a libkrun microVM on macOS ARM64
- `gateway` binary with gvproxy networking, DHCP, and port forwarding
- Kubeconfig at `~/.kube/gateway.yaml` for immediate `kubectl` access
Details
Boots a full k3s Kubernetes cluster inside an Apple Hypervisor.framework microVM via libkrun. Uses gvproxy for user-mode networking (virtio-net) with DHCP, bypassing TSI which is incompatible with k3s loopback connections. Preserves containerd metadata across boots for fast startup.
New files
- `crates/openshell-vm/`: library crate with libkrun FFI bindings, `VmConfig`, `launch()`
- `crates/openshell-vm/src/main.rs`: standalone `gateway` binary
- `crates/openshell-vm/scripts/`: rootfs build script, init script, debug helpers
- `scripts/bin/openshell`: updated with codesigning and `DYLD_FALLBACK_LIBRARY_PATH`
Usage
Prerequisites
- `brew tap slp/krun && brew install libkrun`
- gvproxy (`/opt/podman/bin/gvproxy`)