feat: add systemd mode for full VM experience #50
Replace shell-based init script with Go binary that supports two modes:

## Exec Mode (existing behavior)
- Go init runs as PID 1
- Starts guest-agent in background
- Runs container entrypoint as child process
- Used for standard Docker images (nginx, python, etc.)

## Systemd Mode (new)
- Auto-detected when image CMD is /sbin/init or /lib/systemd/systemd
- Go init sets up rootfs, then chroots and execs systemd
- Systemd becomes PID 1 and manages the full system
- guest-agent runs as a systemd service (hypeman-agent.service)
- Enables EC2-like experience: ssh, systemctl, journalctl all work

## Key changes:
- lib/system/init/: New Go-based init binary with modular boot phases
- lib/images/systemd.go: IsSystemdImage() auto-detection from CMD
- lib/instances/configdisk.go: Passes INIT_MODE to guest
- lib/system/init/init.sh: Shell wrapper to mount /proc /sys /dev before Go runtime (Go requires these during initialization)
- integration/systemd_test.go: Full E2E test verifying:
  - systemd is PID 1
  - hypeman-agent.service is active
  - journalctl works for viewing logs

## Boot flow:
1. Kernel loads initrd with busybox + Go init + guest-agent
2. init.sh mounts /proc, /sys, /dev (Go runtime needs these)
3. init.sh execs Go init binary
4. Go init mounts overlay rootfs, configures network, copies agent
5. Based on INIT_MODE: exec mode (run entrypoint) or systemd mode (chroot + exec /sbin/init)
By default, waits up to 30 seconds for the guest agent to become ready. This prevents immediate failures when the VM is still booting. Use --wait-for-agent=0 to fail immediately (old behavior). Depends on: onkernel/hypeman#50
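The waiting behavior described in this commit can be pictured as a deadline-bounded retry loop. The following is only a sketch under stated assumptions, not the actual client code: `waitForAgent`, the `probe` callback, `errNotReady`, and the poll interval are hypothetical names chosen for illustration. A zero timeout makes a single attempt and fails immediately, mirroring `--wait-for-agent=0`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error a probe returns while the VM is booting.
var errNotReady = errors.New("guest agent not ready")

// waitForAgent calls probe until it succeeds or the timeout elapses.
// A timeout of 0 means one attempt and no retries.
// (Illustrative helper; the real implementation may differ.)
func waitForAgent(probe func() error, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := probe()
		if err == nil {
			return nil
		}
		// Give up if retries are disabled or the next attempt would pass the deadline.
		if timeout == 0 || !time.Now().Add(interval).Before(deadline) {
			return fmt.Errorf("gave up after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// Simulated agent that becomes reachable on the third probe.
	probe := func() error {
		attempts++
		if attempts < 3 {
			return errNotReady
		}
		return nil
	}
	err := waitForAgent(probe, 30*time.Second, 10*time.Millisecond)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Wrapping the last probe error with `%w` keeps the underlying cause available to callers that want to distinguish a dial failure from other errors.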
    @@ -0,0 +1,418 @@
will remove before merge
    }

    // dropToShell drops to an interactive shell for debugging when boot fails
    func dropToShell() {
not sure if this is actually useful... opus took some liberties
Maybe should log something, just so it's clear what's going on from hypeman logs ...
sjmiller609
left a comment
Looks great. The main simplification I think would be nice: serialize / deserialize the config as JSON, with a single shared type definition used by both host and guest code (imported in the guest code).
lib/images/systemd.go
Outdated
    // - /sbin/init
    // - /lib/systemd/systemd
    // - /usr/lib/systemd/systemd
    // - Any path ending in /init
Maybe any path ending in /init is too broad to decide it's a systemd image. I think it's not that uncommon to name an entrypoint script like that
lib/system/init/config.go
Outdated
    )

    // Config holds the parsed configuration from the config disk.
    type Config struct {
should this be imported from somewhere? seems like this type is already defined wherever we serialize it into the file.
lib/system/init/config.go
Outdated
    // parseConfigFile parses a shell-style configuration file.
    // It handles simple KEY=VALUE and KEY="VALUE" assignments.
    func parseConfigFile(path string) (*Config, error) {
would it be simpler to just change to writing JSON to the file instead of an env list? Maybe it's only an env list as a leftover artifact from being used with "source", but now that it's Go writing and Go reading, it might be cleaner to define the config struct in one place and use normal JSON serialization / deserialization.
lib/system/init/config.go
Outdated
    // parseVolumeMounts parses the VOLUME_MOUNTS string.
    // Format: "device:path:mode[:overlay_device] device:path:mode ..."
    func parseVolumeMounts(s string) []VolumeMount {
Yeah, this could be way simplified I would think if we go with json serialization, define types more cleanly assuming they are being json serialized instead of kinda jamming it into env list then pulling back out from that.
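To make the suggestion concrete, here is a minimal sketch of what a single shared config type with plain JSON encoding might look like. The field names and JSON tags are assumptions derived from the `device:path:mode[:overlay_device]` format quoted above, not the actual definitions in the repository; `encodeConfig` and `decodeConfig` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VolumeMount mirrors the "device:path:mode[:overlay_device]" tuple.
// Field names are illustrative assumptions.
type VolumeMount struct {
	Device        string `json:"device"`
	Path          string `json:"path"`
	Mode          string `json:"mode"`
	OverlayDevice string `json:"overlay_device,omitempty"`
}

// Config is a hypothetical shared type imported by both host and guest.
type Config struct {
	InitMode     string        `json:"init_mode"`
	VolumeMounts []VolumeMount `json:"volume_mounts,omitempty"`
}

// encodeConfig is what the host side would call when writing the config disk.
func encodeConfig(c Config) ([]byte, error) {
	return json.MarshalIndent(c, "", "  ")
}

// decodeConfig is what the guest init would call after reading the file.
func decodeConfig(data []byte) (Config, error) {
	var c Config
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	data, err := encodeConfig(Config{
		InitMode: "systemd",
		VolumeMounts: []VolumeMount{
			{Device: "/dev/vdb", Path: "/data", Mode: "rw"},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))

	cfg, err := decodeConfig(data)
	fmt.Println(cfg.InitMode, len(cfg.VolumeMounts), err)
}
```

With this shape, the hand-written parsers for the env-list format become unnecessary, since the nested volume list is expressed directly in the type.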
    # Minimal init wrapper that sets up environment before running Go init
    # The Go runtime needs /proc and /dev to exist during initialization
👍
lib/system/init/mode_systemd.go
Outdated
    // Exec systemd - this replaces the current process
    log.Info("systemd", "exec /sbin/init")

    // syscall.Exec replaces the current process with the new one
    // /sbin/init is typically a symlink to /lib/systemd/systemd
    err := syscall.Exec("/sbin/init", []string{"/sbin/init"}, os.Environ())
is it always /sbin/init? shouldn't we run the user's CMD anyways instead of override it to /sbin/init?
Defines Config and VolumeMount types in a shared package that both the host (configdisk.go) and guest init binary can import, eliminating duplication.
- configdisk.go now uses vmconfig.Config instead of local GuestConfig
- init binary now imports vmconfig instead of duplicating types
- Also adds logging to dropToShell() for better debugging
- mode_systemd.go now runs user's CMD instead of hardcoding /sbin/init
Per review feedback, matching any path ending in /init is too aggressive since many entrypoint scripts are named 'init'. Now only matches explicit systemd paths: /sbin/init, /lib/systemd/systemd, /usr/lib/systemd/systemd
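As an illustration of the narrowed matching, a detection check over explicit paths could look roughly like the sketch below. `isSystemdCmd` and `systemdPaths` are hypothetical names, and inspecting only the first CMD element is an assumption; the real `IsSystemdImage()` may have a different signature and may consider more of the image config.

```go
package main

import "fmt"

// systemdPaths holds the explicit init paths named in the commit message.
var systemdPaths = map[string]bool{
	"/sbin/init":               true,
	"/lib/systemd/systemd":     true,
	"/usr/lib/systemd/systemd": true,
}

// isSystemdCmd reports whether the first CMD element is a known systemd path.
// Suffix matching on "/init" is deliberately not done, so an entrypoint
// script such as /app/init stays in exec mode.
func isSystemdCmd(cmd []string) bool {
	return len(cmd) > 0 && systemdPaths[cmd[0]]
}

func main() {
	for _, cmd := range [][]string{
		{"/sbin/init"},
		{"/usr/lib/systemd/systemd"},
		{"/app/init"},
		{"nginx", "-g", "daemon off;"},
		{},
	} {
		fmt.Printf("%q -> %v\n", cmd, isSystemdCmd(cmd))
	}
}
```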
- buildEnv: user's env vars now take precedence over defaults (PATH, HOME)
- systemd mode: pass user's env vars via buildEnv instead of os.Environ()
- volumes: use device name for overlay mount points to avoid basename collisions
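The precedence rule in the first bullet can be sketched as a simple map merge in which user values are applied last. This is a hedged illustration only: the signature of `buildEnv` here, the default `PATH` and `HOME` values, and the sorted output are assumptions, not the code from this commit.

```go
package main

import (
	"fmt"
	"sort"
)

// buildEnv merges default variables with user-supplied ones.
// User values are applied last, so they override the defaults.
// (Default values below are placeholders chosen for the example.)
func buildEnv(user map[string]string) []string {
	merged := map[string]string{
		"PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
		"HOME": "/root",
	}
	for k, v := range user {
		merged[k] = v
	}
	env := make([]string, 0, len(merged))
	for k, v := range merged {
		env = append(env, k+"="+v)
	}
	// Sort for deterministic output; exec itself does not require an order.
	sort.Strings(env)
	return env
}

func main() {
	for _, kv := range buildEnv(map[string]string{"HOME": "/home/app", "FOO": "bar"}) {
		fmt.Println(kv)
	}
}
```

Building the slice from a merged map, rather than appending user variables after the defaults, avoids passing duplicate keys to the child process.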
Summary
This PR adds support for running systemd-based OCI images, enabling a VM experience where systemd is PID 1 and manages the full system.
Motivation
Previously, hypeman only supported "exec mode" where the Go init binary runs as PID 1 and executes the container entrypoint directly. This works great for Docker-style single-process containers, but doesn't support images designed to run a full init system.
With this change, you can now run images like `jrei/systemd-ubuntu:22.04` and get a full Linux system experience:
- `systemctl` works
- `journalctl` works

How it works
Auto-detection
The mode is auto-detected from the image's CMD:
- `/sbin/init`, `/lib/systemd/systemd`, or similar → systemd mode

Boot Flow
Guest Agent
In systemd mode, the guest-agent is installed as a systemd service (`hypeman-agent.service`) that starts automatically. This enables `hypeman exec`, `hypeman cp`, and other remote operations.

Key Changes
- `lib/system/init/*.go`
- `lib/system/init/init.sh`
- `lib/images/systemd.go`: `IsSystemdImage()` auto-detection from CMD
- `lib/instances/configdisk.go`: passes `INIT_MODE` to guest via config disk
- `integration/systemd_test.go`

Testing
The test:
- Uses the `jrei/systemd-ubuntu:22.04` image
- `IsSystemdImage()` detects it correctly
- `systemd` is PID 1
- `/opt/hypeman/guest-agent` exists
- `hypeman-agent.service` is active
- `journalctl -u hypeman-agent` works

Demo
Note
Enables full systemd-based VMs and modernizes guest boot/init flow.
- New Go init (`lib/system/init/*`) with structured logging and dual modes: exec (default) and systemd (auto-detected via `images.IsSystemdImage`) with service injection for `hypeman-agent`
- … (`InitBinary`, wrapper `init.sh`); initrd build updated to include these and NVIDIA modules; staleness hash logic adjusted
- … (`lib/vmconfig/`); config disk now writes `config.json` and includes network, volumes, GPU, and `init_mode`
- … `wait_for_agent` and guest client adds retryable vsock dialing (`AgentVSockDialError`) and Unavailable handling
- … `guest-agent` and new `init`; build/dev/test/release-prep depend on these; clean targets updated; .gitignore includes built init
- … `integration/systemd_test.go`; unit tests for `IsSystemdImage`; exec tests use `WaitForAgent` and updated log assertions
- `lib/system/README.md` updated to describe Go init, storage/versioning, and workflows

Written by Cursor Bugbot for commit 0dd3098. This will update automatically on new commits.