@rgarcia rgarcia commented Dec 24, 2025

Summary

This PR adds support for running systemd-based OCI images, enabling a VM experience where systemd is PID 1 and manages the full system.

Motivation

Previously, hypeman only supported "exec mode" where the Go init binary runs as PID 1 and executes the container entrypoint directly. This works great for Docker-style single-process containers, but doesn't support images designed to run a full init system.

With this change, you can now run images like jrei/systemd-ubuntu:22.04 and get a full Linux system experience:

  • systemctl works
  • journalctl works
  • SSH and other system services can run
  • The VM feels like an EC2 instance

How it works

Auto-detection

The mode is auto-detected from the image's CMD:

  • If CMD is /sbin/init, /lib/systemd/systemd, or similar → systemd mode
  • Otherwise → exec mode (existing behavior)
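A minimal sketch of how CMD-based detection could look, assuming the check only inspects the first element of CMD. The three paths are the ones named later in this thread; the helper name isSystemdCmd and the lookup table are hypothetical and are not the code in lib/images/systemd.go.

```go
package main

import (
	"fmt"
	"path"
)

// systemdPaths lists the CMD paths treated as "this image boots systemd".
// The entries come from the PR discussion; the table itself is illustrative.
var systemdPaths = map[string]bool{
	"/sbin/init":               true,
	"/lib/systemd/systemd":     true,
	"/usr/lib/systemd/systemd": true,
}

// isSystemdCmd (hypothetical name) reports whether the first element of an
// image's CMD is one of the known systemd paths. An empty CMD is exec mode.
func isSystemdCmd(cmd []string) bool {
	if len(cmd) == 0 {
		return false
	}
	// path.Clean normalizes variants such as "/sbin//init".
	return systemdPaths[path.Clean(cmd[0])]
}

func main() {
	fmt.Println(isSystemdCmd([]string{"/sbin/init"}))            // true
	fmt.Println(isSystemdCmd([]string{"/docker-entrypoint.sh"})) // false
}
```

Matching an explicit list rather than a suffix keeps entrypoint scripts that happen to be named init in exec mode.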

Boot Flow

Kernel → init.sh (mount /proc /sys /dev) → Go init binary
                                              ↓
                                    ┌─────────┴─────────┐
                                    ↓                   ↓
                              Exec Mode           Systemd Mode
                                    ↓                   ↓
                            Run entrypoint      chroot + exec /sbin/init
                            as child process    (systemd becomes PID 1)
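The branch at the bottom of the diagram can be sketched as a small dispatch on the configured mode. This is only an illustration: the type initMode, the string values "exec" and "systemd", and both function names are assumptions, not identifiers taken from lib/system/init.

```go
package main

import "fmt"

// initMode is a hypothetical type for the two boot modes in the diagram.
type initMode string

const (
	modeExec    initMode = "exec"    // assumed config value
	modeSystemd initMode = "systemd" // assumed config value
)

// parseInitMode maps a raw config value to a mode. Anything other than
// "systemd" falls back to exec mode, which is the existing behavior.
func parseInitMode(raw string) initMode {
	if raw == string(modeSystemd) {
		return modeSystemd
	}
	return modeExec
}

// nextStep describes what init does once the shared boot phases are done.
func nextStep(m initMode) string {
	switch m {
	case modeSystemd:
		return "chroot + exec /sbin/init (systemd becomes PID 1)"
	default:
		return "run entrypoint as child process"
	}
}

func main() {
	for _, raw := range []string{"systemd", "exec", ""} {
		m := parseInitMode(raw)
		fmt.Printf("%q -> %s: %s\n", raw, m, nextStep(m))
	}
}
```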

Guest Agent

In systemd mode, the guest-agent is installed as a systemd service (hypeman-agent.service) that starts automatically. This enables hypeman exec, hypeman cp, and other remote operations.

Key Changes

| File | Description |
| --- | --- |
| lib/system/init/*.go | New Go-based init binary with modular boot phases |
| lib/system/init/init.sh | Shell wrapper (Go runtime needs /proc /sys /dev before main()) |
| lib/images/systemd.go | IsSystemdImage() auto-detection from CMD |
| lib/instances/configdisk.go | Passes INIT_MODE to guest via config disk |
| integration/systemd_test.go | Full E2E test |

Testing

# Run the systemd integration test
make test TEST=TestSystemdMode

The test:

  1. Pulls jrei/systemd-ubuntu:22.04
  2. Verifies IsSystemdImage() detects it correctly
  3. Boots a VM
  4. Waits for guest-agent to be ready
  5. Verifies:
    • PID 1 is systemd
    • /opt/hypeman/guest-agent exists
    • hypeman-agent.service is active
    • journalctl -u hypeman-agent works

Demo

# Run a systemd VM
hypeman run --name demo jrei/systemd-ubuntu:22.04

# Wait a few seconds for boot, then:
hypeman exec demo cat /proc/1/comm
# → systemd

hypeman exec demo systemctl status hypeman-agent
# → active (running)

hypeman exec demo journalctl -u hypeman-agent --no-pager -n 5
# → shows agent logs

Note

Enables full systemd-based VMs and modernizes guest boot/init flow.

  • New Go init (lib/system/init/*) with structured logging and dual modes: exec (default) and systemd (auto-detected via images.IsSystemdImage) with service injection for hypeman-agent
  • Replaces shell init script with embedded binaries (InitBinary, wrapper init.sh); initrd build updated to include these and NVIDIA modules; staleness hash logic adjusted
  • Host→guest configuration switched to JSON (lib/vmconfig/); config disk now writes config.json and includes network, volumes, GPU, and init_mode
  • Exec API adds wait_for_agent and guest client adds retryable vsock dialing (AgentVSockDialError) and Unavailable handling
  • Makefile builds embedded guest-agent and new init; build/dev/test/release-prep depend on these; clean targets updated; .gitignore includes built init
  • Tests: new integration/systemd_test.go; unit tests for IsSystemdImage; exec tests use WaitForAgent and updated log assertions
  • Docs: lib/system/README.md updated to describe Go init, storage/versioning, and workflows

Written by Cursor Bugbot for commit 0dd3098. This will update automatically on new commits.

Replace shell-based init script with Go binary that supports two modes:

## Exec Mode (existing behavior)
- Go init runs as PID 1
- Starts guest-agent in background
- Runs container entrypoint as child process
- Used for standard Docker images (nginx, python, etc.)

## Systemd Mode (new)
- Auto-detected when image CMD is /sbin/init or /lib/systemd/systemd
- Go init sets up rootfs, then chroots and execs systemd
- Systemd becomes PID 1 and manages the full system
- guest-agent runs as a systemd service (hypeman-agent.service)
- Enables EC2-like experience: ssh, systemctl, journalctl all work

## Key changes:
- lib/system/init/: New Go-based init binary with modular boot phases
- lib/images/systemd.go: IsSystemdImage() auto-detection from CMD
- lib/instances/configdisk.go: Passes INIT_MODE to guest
- lib/system/init/init.sh: Shell wrapper to mount /proc /sys /dev
  before Go runtime (Go requires these during initialization)
- integration/systemd_test.go: Full E2E test verifying:
  - systemd is PID 1
  - hypeman-agent.service is active
  - journalctl works for viewing logs

## Boot flow:
1. Kernel loads initrd with busybox + Go init + guest-agent
2. init.sh mounts /proc, /sys, /dev (Go runtime needs these)
3. init.sh execs Go init binary
4. Go init mounts overlay rootfs, configures network, copies agent
5. Based on INIT_MODE: exec mode (run entrypoint) or systemd mode (chroot + exec /sbin/init)
@rgarcia rgarcia changed the title from "feat: add systemd mode for EC2-like VMs" to "feat: add systemd mode for full VM experience" Dec 24, 2025
rgarcia added a commit to onkernel/hypeman-cli that referenced this pull request Dec 24, 2025
By default, waits up to 30 seconds for the guest agent to become ready.
This prevents immediate failures when the VM is still booting.

Use --wait-for-agent=0 to fail immediately (old behavior).

Depends on: onkernel/hypeman#50
@@ -0,0 +1,418 @@
---
Contributor Author

will remove before merge

}

// dropToShell drops to an interactive shell for debugging when boot fails
func dropToShell() {
Contributor Author

not sure if this is actually useful... Opus took some liberties

Collaborator

Maybe it should log something, just so it's clear what's going on from the hypeman logs ...

@rgarcia rgarcia requested a review from sjmiller609 December 24, 2025 14:08
Collaborator

@sjmiller609 sjmiller609 left a comment

Looks great. My main simplification suggestion: serialize and deserialize the config as JSON, with a single shared type definition used by both host and guest code (imported in the guest code).

// - /sbin/init
// - /lib/systemd/systemd
// - /usr/lib/systemd/systemd
// - Any path ending in /init
Collaborator

Maybe "any path ending in /init" is too broad a test for deciding it's a systemd image. I think it's not that uncommon to name an entrypoint script like that.

)

// Config holds the parsed configuration from the config disk.
type Config struct {
Collaborator

should this be imported from somewhere? seems like this type is already defined wherever we serialize it into the file.


// parseConfigFile parses a shell-style configuration file.
// It handles simple KEY=VALUE and KEY="VALUE" assignments.
func parseConfigFile(path string) (*Config, error) {
Collaborator

Would it be simpler to write JSON to the file instead of an env list? It's probably only an env list as a leftover from when the file was consumed via "source". Now that Go both writes and reads it, it might be cleaner to define the config struct in one place and use normal JSON serialization / deserialization.

Comment on lines 145 to 147
// parseVolumeMounts parses the VOLUME_MOUNTS string.
// Format: "device:path:mode[:overlay_device] device:path:mode ..."
func parseVolumeMounts(s string) []VolumeMount {
Collaborator

Yeah, I think this could be simplified a lot if we go with JSON serialization: define the types cleanly on the assumption that they are JSON-serialized, instead of jamming them into an env list and then pulling them back out.
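A rough sketch of what the JSON suggestion could look like with one shared type definition. The type names Config and VolumeMount match the ones mentioned later in this thread, but the fields, JSON tags, and the encodeConfig / decodeConfig helpers are assumptions for illustration (the volume fields mirror the device:path:mode[:overlay_device] format in the comment above).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VolumeMount describes one volume. Field names and tags are assumed.
type VolumeMount struct {
	Device        string `json:"device"`
	Path          string `json:"path"`
	Mode          string `json:"mode"`
	OverlayDevice string `json:"overlay_device,omitempty"`
}

// Config is the single definition shared by host (writer) and guest (reader).
type Config struct {
	InitMode string        `json:"init_mode"`
	Volumes  []VolumeMount `json:"volumes,omitempty"`
}

// encodeConfig is what the host side would call before writing the file.
func encodeConfig(c Config) ([]byte, error) {
	return json.Marshal(c)
}

// decodeConfig is what the guest init would call after reading the file.
func decodeConfig(data []byte) (Config, error) {
	var c Config
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	in := Config{
		InitMode: "systemd",
		Volumes:  []VolumeMount{{Device: "/dev/vdb", Path: "/data", Mode: "rw"}},
	}
	data, err := encodeConfig(in)
	if err != nil {
		panic(err)
	}
	out, err := decodeConfig(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
	fmt.Println(out.InitMode, out.Volumes[0].Path)
}
```

With this shape there is no hand-written parser for KEY=VALUE lines or for the space-separated volume string; both sides rely on the same struct tags.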

Comment on lines +2 to +3
# Minimal init wrapper that sets up environment before running Go init
# The Go runtime needs /proc and /dev to exist during initialization
Collaborator

👍

Comment on lines 37 to 42
// Exec systemd - this replaces the current process
log.Info("systemd", "exec /sbin/init")

// syscall.Exec replaces the current process with the new one
// /sbin/init is typically a symlink to /lib/systemd/systemd
err := syscall.Exec("/sbin/init", []string{"/sbin/init"}, os.Environ())
Collaborator

Is it always /sbin/init? Shouldn't we run the user's CMD anyway instead of overriding it with /sbin/init?

Defines Config and VolumeMount types in a shared package that both
the host (configdisk.go) and guest init binary can import, eliminating
duplication.
- configdisk.go now uses vmconfig.Config instead of local GuestConfig
- init binary now imports vmconfig instead of duplicating types
- Also adds logging to dropToShell() for better debugging
- mode_systemd.go now runs user's CMD instead of hardcoding /sbin/init

Per review feedback, matching any path ending in /init is too aggressive
since many entrypoint scripts are named 'init'. Now only matches explicit
systemd paths: /sbin/init, /lib/systemd/systemd, /usr/lib/systemd/systemd

- buildEnv: user's env vars now take precedence over defaults (PATH, HOME)
- systemd mode: pass user's env vars via buildEnv instead of os.Environ()
- volumes: use device name for overlay mount points to avoid basename collisions
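The env precedence rule in the first bullet can be illustrated with a small merge function. This is a sketch, not the PR's buildEnv: the signature, the default PATH and HOME values, and the sorted output are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// buildEnv merges defaults with the user's env vars. User values win when
// both define the same key. Default values here are assumed for the sketch.
func buildEnv(user map[string]string) []string {
	merged := map[string]string{
		"PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
		"HOME": "/root",
	}
	for k, v := range user {
		merged[k] = v // user's value overrides any default
	}
	env := make([]string, 0, len(merged))
	for k, v := range merged {
		env = append(env, k+"="+v)
	}
	sort.Strings(env) // deterministic order for logging and tests
	return env
}

func main() {
	for _, kv := range buildEnv(map[string]string{"HOME": "/home/app", "FOO": "bar"}) {
		fmt.Println(kv)
	}
}
```

The resulting slice has the KEY=VALUE shape that syscall.Exec expects for its environment argument, so it can replace os.Environ() when handing off to the user's CMD.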
@rgarcia rgarcia merged commit a08c2c8 into main Dec 26, 2025
4 checks passed
@rgarcia rgarcia deleted the full-vm branch December 26, 2025 21:03