lifecycle-go

lifecycle-go is the Go implementation of pleme-io's service-lifecycle model and the counterpart to the Rust service-lifecycle crate. Both share one model, so every Go service and tool supervises itself the same way: one signal-aware context, one ordered graceful shutdown, one health surface, one run-loop.

No ad-hoc signal.Notify boilerplate, no hand-rolled shutdown ordering, no per-service /healthz handler. Wire one App once; every binary behaves identically under SIGINT/SIGTERM, a failing dependency, and a Kubernetes probe.

The single external dependency is golang.org/x/sync/errgroup (Go-team owned, de-facto stdlib) for the ctx-aware goroutine group. Everything else is stdlib (context, os/signal, log/slog, errors, net/http).

`App` — the composed owner (BOREALIS §2.5)

lifecycle.App is the one owner that runs the work, flips readiness, drains, and tears down in order. Construct it the canonical way and call Run once:

app, err := lifecycle.New(cfg.Lifecycle, lifecycle.WithLogger(log))
// or: lifecycle.FromConfig(cfg.Lifecycle)   // consumes the shikumi sub-struct
if err != nil {
	return err
}

app.Go("reconcile", reconcileLoop).                       // errgroup (ctx-aware)
    Actor("http",  srv.serve, func(error){ srv.Close() }). // oklog/run shape
    Supervise("poller", poll, lifecycle.DefaultBackoff()). // suture-style restart
    Probe("db", lifecycle.ProbeFunc(db.PingContext)).      // readiness dependency
    OnShutdown("db", func(context.Context) error { return db.Close() })

return app.Run(ctx)   // encodes the k8s shutdown choreography

Run choreography, in order:

1. Derive a signal-aware run context.
2. Mount the health planes (plus the deferred /metrics seam) and start any periodic probe loops.
3. Run the work group. The first fatal error or a delivered signal begins teardown.
4. Tear down in a fixed order: flip readiness DOWN first, then drain (sleep DrainInterval and, concurrently, run every OnDrain remote-session Drainable inside that one window), then cancel the group, wait for it, and run the LIFO Shutdown stack under ShutdownGrace (kept below the pod's terminationGracePeriodSeconds).

This ordering eliminates rolling-deploy 502s: the pod leaves rotation before it stops serving.

Draining remote sessions (`OnDrain` / `Drainable`)

The readiness-down sleep only tells external load balancers to stop sending new traffic — it does nothing about sessions the process is already holding on a remote peer (SRA SSH/web sessions, a SOCKS tunnel, an event-forwarding channel, a long-poll subscription). A local LIFO OnShutdown close stack cannot reach those: the sessions live on the peer, not in this process. OnDrain registers a Drainable that runs during the drain window to release them:

app.OnDrain("sra-sessions", lifecycle.DrainFunc(func(ctx context.Context) error {
    sra.StopAcceptingSessions()           // refuse new remote sessions
    return sra.WaitForActiveSessions(ctx) // let live ones finish within ctx
}))

All registered drainers run concurrently (different peers, independent waits) under the single DrainInterval budget. A Drainable that ignores its ctx deadline is abandoned (and reported as an error) when the window closes, so it never blocks past the budget. Panics are isolated. A drainer registered after Run has started is ignored. With no drainers registered, the drain is exactly the plain DrainInterval sleep it was before OnDrain existed. Use OnShutdown for local resources, OnDrain for remote sessions.

Three goroutine shapes

Verb	Shape	Use when
`app.Go(name, fn)`	x/sync/errgroup (ctx-aware)	the work watches a context
`app.Actor(name, execute, interrupt)`	oklog/run pair (in-package)	the work blocks and can't watch ctx (`Accept` loops)
`app.Supervise(name, fn, backoff)`	suture-style restart (in-package)	the work should restart with backoff on crash

Supervise honours ErrDoNotRestart (stop cleanly) and ErrTerminate (stop and propagate). Every spawned unit and shutdown hook recovers panics into errors.

The four leaf primitives

App composes four primitives that are also usable directly:

SignalContext: a context.Context that cancels when the process is signalled (SIGINT/SIGTERM by default). The root of every run.
Shutdown: named hooks run in LIFO order under a single deadline, with errors aggregated (errors.Join). The observable analog of a defer stack.
Registry / Probe: liveness/readiness/startup aggregation, tri-state (up/down/unknown), optional per-probe WithCache/WithPeriodic, transition listeners, plus a stdlib http.Handler exposing /livez, /healthz, /readyz, and /startupz. lifecycle-go is the single fleet owner of the health planes.
RunLoop: a ticking work loop that stops on context cancellation, with optional exponential backoff on error.

Usage

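package main

import (
	"context"
	"errors"
	"log/slog"
	"net/http"
	"time"

	lifecycle "github.com/pleme-io/lifecycle-go"
)

// db (for example a *sql.DB) and reconcile are the service's own and are
// assumed to be defined elsewhere in this package.

func main() {
	// 1. Root context cancels on SIGINT/SIGTERM.
	ctx, stop := lifecycle.SignalContext(context.Background())
	defer stop()

	// 2. Ordered, bounded teardown. Hooks run LIFO, so register in
	//    acquisition order: db first, http-server last. The server then
	//    stops before the pool it depends on is closed.
	srv := &http.Server{Addr: ":8080"}
	sd := lifecycle.NewShutdown(slog.Default())
	sd.Add("db", func(context.Context) error { return db.Close() })
	sd.Add("http-server", srv.Shutdown)

	// 3. Health surface for Kubernetes probes.
	reg := lifecycle.NewRegistry()
	reg.RegisterLiveness("self", lifecycle.ProbeFunc(func(context.Context) error { return nil }))
	reg.RegisterReadiness("db", lifecycle.ProbeFunc(db.PingContext))
	srv.Handler = reg.Handler() // serves /livez, /healthz, /readyz, /startupz

	go func() {
		// ErrServerClosed is the normal result of srv.Shutdown, not a failure.
		if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
			slog.Error("http server stopped", "err", err)
		}
	}()

	// 4. Background work loop, with backoff on error.
	go lifecycle.RunLoop(ctx, 30*time.Second, reconcile,
		lifecycle.WithLoopLogger(slog.Default()),
		lifecycle.WithBackoff(5*time.Minute),
	)

	<-ctx.Done() // a signal arrived
	if err := sd.Run(context.Background(), 30*time.Second); err != nil {
		slog.Error("shutdown finished with errors", "err", err)
	}
}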

Health endpoints

Path	Plane	Question	Failure action (k8s)
`/healthz`, `/livez`	liveness	"is the process wedged?"	restart the pod
`/readyz`	readiness	"can it serve traffic now?"	pull from rotation
`/startupz`	startup	"has it finished booting?"	gate liveness during boot

Each returns 200 when its plane is OK and 503 otherwise, with a small JSON body — {"status":"ok"|"fail","checks":{<name>:"ok"|<error>}} — for humans and log scrapers. Keep liveness probes dependency-free so a flaky downstream never triggers restarts.

Shutdown ordering

Hooks run last-in-first-out: the resource registered last (typically acquired last) is released first. The HTTP server stops accepting before the DB pool closes, the pool closes before the metrics flusher, and so on. Errors are aggregated with errors.Join, never short-circuited — one failing close does not skip the rest. Once the per-shutdown deadline passes, remaining hooks are skipped and reported.

Run-loop options

WithImmediateTick() — fire once on entry before the first interval.
WithStopOnError() — a tick error terminates the loop (becomes the return).
WithBackoff(max) — double the inter-tick delay on consecutive errors up to max, resetting on the first success.
WithLoopLogger(log) — log tick errors and backoff decisions.

Build & test

go build ./...
go test ./...
