Correct the copyright holder to Glyndor #13
Closed
Jaro-c wants to merge 135 commits into
v0.8.4's diag showed walkDescendants was returning the right PIDs
and signalling them with SIGTERM, yet the child still survived.
Either the kill(2) syscall is failing (EPERM across a uid/pid
namespace boundary in the runner container?) or it is succeeding
and the child is ignoring/masking the signal.
This revision:
- signalTree logs syscall.Kill's return value for every
descendant and the pgroup send when LYNX_DEBUG_STOP is on, so
the next lynxd.log excerpt answers the "was the signal even
delivered" question directly.
- debian-tests smoke now dumps `ps -ef` unfiltered (the earlier
awk filter was printing nothing) plus the child's
/proc/<pid>/status excerpt — Uid, PPid, State, SigIgn, SigCgt.
A child that ignores or catches SIGTERM would show signal 15's bit
(mask 0x4000) set in SigIgn or SigCgt; a uid mismatch shows up on
the Uid line.
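A minimal sketch of how the SigIgn / SigCgt check above could be automated, assuming the standard Linux /proc/<pid>/status layout where each mask is printed as hex and signal N occupies bit N-1. The names sigBit and termDisposition are hypothetical and are not taken from the Lynx source.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
	"syscall"
)

// sigBit reports whether sig is set in a hex mask as printed on the
// SigIgn / SigCgt lines of /proc/<pid>/status. Signal N maps to bit N-1,
// so SIGTERM (15) corresponds to mask 0x4000.
func sigBit(hexMask string, sig syscall.Signal) (bool, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(hexMask), 16, 64)
	if err != nil {
		return false, err
	}
	return mask&(uint64(1)<<(uint(sig)-1)) != 0, nil
}

// termDisposition scans the text of a status file and reports whether
// SIGTERM is ignored (SigIgn) or caught by a handler (SigCgt).
func termDisposition(status string) (ignored, caught bool, err error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		switch key {
		case "SigIgn":
			ignored, err = sigBit(val, syscall.SIGTERM)
		case "SigCgt":
			caught, err = sigBit(val, syscall.SIGTERM)
		}
		if err != nil {
			return false, false, err
		}
	}
	return ignored, caught, sc.Err()
}

func main() {
	// Hand-written sample, not output captured from a real run.
	sample := "Name:\tnode\nSigIgn:\t0000000000004000\nSigCgt:\t0000000000000000\n"
	ignored, caught, err := termDisposition(sample)
	fmt.Println(ignored, caught, err)
}
```

In a smoke script the same function would be fed the contents of the child's status file instead of the hand-written sample.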
.claude/scheduled_tasks.lock slipped into the v0.8.5 diag commit via `git add -A`. It's Claude Code's per-project session state, not source. Remove from the tree and ignore the whole .claude/ directory so future `git add -A` invocations can't re-include it.
The v0.8.5 diag run confirmed the v0.8.3 descendant-walk fix IS working end-to-end on the install matrix: walkDescendants finds every child, signalTree delivers SIGTERM with no errno, the processes terminate.

The test still reported a failure because kill(pid, 0) also returns success for zombie entries, and the debian / ubuntu smoke containers start without an init-style reaper, so every backgrounded child that outlives its bash parent during the SIGTERM fan-out lingers as State: Z in /proc until the daemon itself exits. A zombie holds no open fd, no listening socket, and no memory; it cannot cause the EADDRINUSE the real user bug was about.

The Go test helper and the debian smoke's post-stop check now both parse /proc/<pid>/status and classify State: Z as dead, matching what the supervisor actually delivers. Also drops the pre-stop ps/stat/status dumps the diag releases (0.8.4, 0.8.5) added to the smoke now that the root cause is pinned down. LYNX_DEBUG_STOP stays in the daemon as an opt-in trace for future "Stop looked fine but the child kept running" reports.
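The zombie-aware liveness predicate described above might look roughly like the following. This is a sketch under assumptions, not the project's actual test helper: the names stateFromStatus, countsAsDead and isDead are hypothetical, and treating state X as dead alongside Z is an extra assumption of this example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// stateFromStatus extracts the one-letter state code from the text of
// /proc/<pid>/status, e.g. "Z" from "State:\tZ (zombie)".
func stateFromStatus(status string) string {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "State:") {
			continue
		}
		if f := strings.Fields(strings.TrimPrefix(line, "State:")); len(f) > 0 {
			return f[0]
		}
	}
	return ""
}

// countsAsDead treats zombies (Z) as dead: they still answer kill(pid, 0)
// but hold no sockets, fds, or memory. X (dead) is included as an assumption.
func countsAsDead(state string) bool {
	return state == "Z" || state == "X"
}

// isDead reports whether pid is gone or only lingering as a zombie.
// Linux-only: it relies on /proc being mounted.
func isDead(pid int) bool {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/status", pid))
	if err != nil {
		// A missing entry means the process was fully reaped.
		return os.IsNotExist(err)
	}
	return countsAsDead(stateFromStatus(string(data)))
}

func main() {
	fmt.Println("self dead:", isDead(os.Getpid()))
}
```

Unlike a bare kill(pid, 0) probe, this predicate does not report an unreaped zombie as a surviving child.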
Test-semantics patch. No behavioural change to the binaries vs v0.8.5 — the v0.8.3 /proc descendant walk that kills children on Stop is still the load-bearing fix. This release closes out the test-predicate false-positive that was making the install matrix look like the real bug hadn't been resolved.
scripts/ now contains only sign.go, which release.yml calls to
sign the release binaries. The three removed shell scripts were
dev-convenience wrappers that no CI workflow ever invoked:
- build_cli.sh: duplicates `go build ./cmd/lynxpm` / `./cmd/lynxd`
with ldflags; the Release workflow builds inline.
- build_deb.sh: wraps `dpkg-buildpackage`; debian-tests.yml
calls dpkg-buildpackage directly. Nothing else referenced it.
- test_all.sh: wraps `go test -v ./...`, already exposed as
`make test-v`.
Removing them cuts maintenance surface without changing any CI
behaviour. scripts/sign.go stays.
Introduces a catalog of minimal sample apps plus a reusable smoke
script that drives the installed lynxpm + lynxd through the full
lifecycle surface, end-to-end, against the binary the .deb ships.
The previous debian-tests smoke inlined a single bespoke bash
wrapper for the process-group regression and nothing else — every
other CLI command (restart, reload, reset, flush), every runtime
other than /bin/sleep, and every advanced selector (--namespace,
ns:*) went unexercised against a packaged binary. This change
closes those gaps.
testdata/apps/:
  shell-forkstorm/: bash wrapper that spawns 10 long-running sleep
    children; regression test for gracefulKill's /proc descendant
    walk (v0.8.3).
  node-http/: HTTP listener with graceful SIGTERM shutdown; exercises
    node runtime + listener-style apps.
  node-ignores-term/: masks SIGTERM; forces the SIGKILL fallback path
    when --stop-timeout expires.
  python-worker/: long-running worker with clean SIGTERM exit;
    exercises python3 runtime.
  python-crashloop/: exits 1 after 1s; regression test for
    --max-restarts cap enforcement.
  go-compiled/: ctx-based graceful shutdown compiled with
    CGO_ENABLED=0; verifies statically-linked binary supervision.
testdata/smoke.sh covers:
1. Vanilla lifecycle (start/list/show/stop/delete) on /bin/sleep.
2. Forkstorm regression — 10 children must die on stop.
3. restart / reset / flush with --json batch shape assertions.
4. --max-restarts 2 cap must transition the crash-looping app to
State: failed.
5. --namespace bulk stop + ns:* glob delete on two apps.
6. Real node HTTP listener start+stop (skipped if node absent).
debian-tests install-matrix now checks out the repo alongside the
.deb download, installs nodejs + python3, and delegates to
testdata/smoke.sh under testuser. Every scenario runs on every
distro in the matrix (debian:bookworm, debian:trixie, ubuntu:22.04,
ubuntu:24.04). Local dev gets the same coverage via
`bash testdata/smoke.sh`
with lynxd already running.
Minor release for the testdata/apps/ + smoke.sh end-to-end coverage. No behavioural change to the daemon or CLI — every addition lives under testdata/ or inside the CI workflow. Kept on the 0.x track because the IPC/spec surface is unchanged and the project has not yet committed to backward-compat guarantees.
The node: prefix in require('node:http') / require('node:https')
needs Node 14.18+ or 16+. ubuntu:22.04 installs the distro's default
nodejs package, which still ships Node 12, and the v0.9.0 smoke
regressed on exactly that one matrix entry because server.js failed
to load. Plain require('http') is the portable spelling and works on
every node version the install matrix sees.
Patch the node:-prefix compat issue in the sample apps so the ubuntu:22.04 install matrix turns green again. No behavioural change to the binaries versus v0.9.0.
Expands the testdata/apps catalog with two more interpreted
runtimes, wires the existing Go binary through CI, and adds two
scenarios the smoke previously left uncovered.
New sample apps:
- php-worker/worker.php — long-running worker that handles
SIGTERM / SIGINT via pcntl_async_signals; mirrors the
python-worker shape so any PHP-specific regression surfaces as
a single failing scenario rather than contaminating the
lifecycle coverage.
- ruby-worker/worker.rb — Ruby counterpart using Signal.trap;
$stdout.sync enabled so `lynxpm logs --follow` matches the
ordering the tests assert on.
go-compiled now ships end-to-end. The build-deb job cross-compiles
the binary (CGO off, matching the release binaries' zero-shlib-deps
contract) and uploads it as a separate artifact; install-matrix
jobs download it into testdata/apps/go-compiled/ before running
the smoke. No Go toolchain needed inside the matrix containers.
New smoke scenarios:
- SIGKILL fallback: spawns node-ignores-term with
`--stop-timeout 2000`, issues Stop, asserts the call returns in
the 2-4s window. Below 2s means the app didn't actually ignore
SIGTERM (the setup is wrong); above 4s means the grace period
elapsed but the supervisor never escalated to SIGKILL.
- scale up / down: `lynxpm start --scale 3`, asserts 3 instances
appear in list, then scales to 1, asserts 1 survives, then
scales to 2, asserts 2 running. Exercises the scale spec-index
bookkeeping that nothing else in the smoke touches.
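The escalation the SIGKILL-fallback scenario exercises (SIGTERM, wait out the grace period while polling, then SIGKILL) can be sketched as below. This illustrates the general pattern only and is not lynxd's gracefulKill: the target interface, stopWithFallback and the stubborn test double are hypothetical names, with the process abstracted so the example runs without spawning anything.

```go
package main

import (
	"fmt"
	"syscall"
	"time"
)

// target abstracts the two operations the sketch needs, so it can be
// exercised without real processes.
type target interface {
	Signal(sig syscall.Signal) error
	Alive() bool
}

// stopWithFallback sends SIGTERM, polls every `poll` until `grace` has
// elapsed, and escalates to SIGKILL if the target is still alive.
// It reports whether escalation was needed.
func stopWithFallback(t target, grace, poll time.Duration) (escalated bool, err error) {
	if err := t.Signal(syscall.SIGTERM); err != nil {
		return false, err
	}
	deadline := time.Now().Add(grace)
	for time.Now().Before(deadline) {
		if !t.Alive() {
			return false, nil // exited within the grace period
		}
		time.Sleep(poll)
	}
	if !t.Alive() {
		return false, nil
	}
	return true, t.Signal(syscall.SIGKILL)
}

// stubborn mimics an app that ignores SIGTERM and dies only on SIGKILL.
type stubborn struct{ dead bool }

func (s *stubborn) Signal(sig syscall.Signal) error {
	if sig == syscall.SIGKILL {
		s.dead = true
	}
	return nil
}

func (s *stubborn) Alive() bool { return !s.dead }

func main() {
	s := &stubborn{}
	start := time.Now()
	escalated, err := stopWithFallback(s, 200*time.Millisecond, 50*time.Millisecond)
	fmt.Println(escalated, err, time.Since(start) >= 200*time.Millisecond)
}
```

With a target that ignores SIGTERM, the call cannot return before the grace period ends, which is the property the lower bound of the 2-4s window checks.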
ci: install-matrix apt-installs php-cli and ruby alongside
nodejs + python3, and the new download-artifact step restores the
compiled Go binary into place before invoking smoke.sh.
Test-suite patch: adds PHP + Ruby coverage, wires the compiled Go sample through the matrix, and extends smoke.sh with SIGKILL fallback + scale scenarios. No behavioural change to the binaries.
Adding ruby to the smoke step's apt install chain in v0.9.2 pulled tzdata in on ubuntu:22.04. Without DEBIAN_FRONTEND=noninteractive, tzdata's postinst opened an interactive "select geographic area" prompt on the step's pseudo-tty; GH Actions has no way to answer it, so the whole install-matrix entry hung, visible as a 14-minute wall-clock gap in the run list. Setting the frontend at the step env level plus TZ=Etc/UTC makes the tzdata configure path silent on every distro in the matrix. The earlier prepare-container step already exported the same env inline; this just extends that default to the step that actually pulls the tzdata-dependent packages.
Patch: unblocks the v0.9.2 install-matrix hang by silencing tzdata's interactive postinst. No binary change vs v0.9.2.
…lpers
Aggregates four post-release-cleanup nits that /simplify surfaced:
1. Single /proc/<pid>/stat parser. manager/process.go had its own readPPID that byte-for-byte duplicated metrics/proctree_linux.go's getPpid. Export it as metrics.GetPPID and delete the duplicate; both callers now share one parser (and will share any future fix to it).
2. walkDescendants as a downward DFS. The old implementation built an upward child→parent map and walked each PID's ancestor chain in a bounded loop (depth cap 1024 "to protect against corrupt /proc"). A forward ppid→children map plus a DFS from root does the same work without the cap and halves the inner-loop code. Output order (deepest-first) is preserved via post-order append.
3. gracefulKill poll 200ms → 50ms. Supervised apps that exit on their own in <50ms paid up to a full 200ms idle wait before the poller noticed; that landed as visible restart latency in the Stop→Start chain. 50ms covers the 95th-percentile fast exit without raising syscall load in any meaningful way (bounded at ~200 kill(pid,0) calls per 10s timeout, all cache-hot permission checks).
4. smoke.sh: run_worker_scenario + wait_count helpers. Collapses the PHP / Ruby / Go / scale scenarios' copy-pasted `start → sleep 1 → grep log → stop → delete` blocks onto two reusable functions. wait_count replaces the `sleep 1; assert` flakiness with a 2s poll against the same condition, so slow CI runners don't race the scale/delete fan-out.
Also strips PR-description prose from the comments added during the v0.8.1-v0.8.6 investigations (signal-order invariants stay in; "reported in the field for next-server / bun / gunicorn" moves to where it belongs: commit messages / changelog).
No behavioural change. All Go tests + local smoke still green.
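The downward DFS with post-order append can be sketched as below. It assumes the pid→ppid pairs have already been read from /proc; the function name descendants, the sorted child order and the seen-set guard are choices made for this example and are not claimed to match walkDescendants itself.

```go
package main

import (
	"fmt"
	"sort"
)

// descendants returns every descendant of root, deepest-first, given a
// pid→ppid snapshot. It builds a forward ppid→children map and appends
// each pid only after its own subtree (post-order).
func descendants(ppidOf map[int]int, root int) []int {
	children := make(map[int][]int, len(ppidOf))
	for pid, ppid := range ppidOf {
		children[ppid] = append(children[ppid], pid)
	}
	for _, c := range children {
		sort.Ints(c) // deterministic order; not required for correctness
	}

	// seen guards against a racy snapshot in which root appears as its own
	// descendant (an assumption of this sketch, e.g. after PID reuse).
	seen := map[int]bool{root: true}
	var out []int
	var visit func(pid int)
	visit = func(pid int) {
		for _, c := range children[pid] {
			if seen[c] {
				continue
			}
			seen[c] = true
			visit(c)             // the child's subtree first...
			out = append(out, c) // ...then the child itself
		}
	}
	visit(root)
	return out
}

func main() {
	// Hypothetical tree: 1 → {2, 5}, 2 → {3, 4}; 9 belongs to another tree.
	ppidOf := map[int]int{2: 1, 3: 2, 4: 2, 5: 1, 9: 8}
	fmt.Println(descendants(ppidOf, 1))
}
```

Signalling in the returned order reaches leaves before their parents, which is the deepest-first property the post-order append preserves.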
Cleanup release: consolidates the /proc parser, downward-DFS walkDescendants, tightens the gracefulKill poll, and factors smoke.sh scenarios onto shared helpers. No behavioural change.
Before, a spec the operator stopped via `lynxpm stop` (which writes
Disabled=true to disk) vanished from `lynxpm list` after any daemon
restart. The JSON stayed on disk but the CLI had no way to see or
interact with it — the operator had to either edit the JSON by hand
to flip the Disabled flag, or delete + recreate the spec with the
same start command they'd lost along the way.
That conflated two intents: "don't auto-spawn this on boot" (right)
and "pretend this spec doesn't exist" (wrong). Stop should persist
intent, not hide state.
Restore now:
- Loads every spec into the manager's process map.
- Spawns only the ones that are not Disabled.
- For Disabled specs, constructs the Process in State=stopped with
noAutoRestart / stoppedByUser set, so list/show report the
correct state and no failure path can resurrect it implicitly.
Manager.Restart additionally resets Disabled to false on disk after a
successful manual restart, so `lynxpm restart <name>` on a
loaded-but-stopped spec brings it back and keeps it coming back across
daemon reboots.
Tests: the existing "App B (disabled) should NOT be in manager"
assertion in TestRestoreAndPersistence is inverted to the new
contract — the spec IS loaded, but State != Running.
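A minimal sketch of the restore contract described above: every spec is loaded, only enabled ones are spawned, and disabled ones are registered as stopped. The types and the restore function are simplified, hypothetical stand-ins for the manager's real ones, and spawn is injected so the example runs without starting processes.

```go
package main

import "fmt"

// Spec and Process are simplified stand-ins for the manager's types.
type Spec struct {
	Name     string
	Disabled bool
}

type Process struct {
	Spec          Spec
	State         string
	NoAutoRestart bool
	StoppedByUser bool
}

// restore loads every spec into the process map. Enabled specs are spawned;
// disabled specs are registered as stopped so list/show can still see them.
func restore(specs []Spec, spawn func(Spec) error) (map[string]*Process, error) {
	procs := make(map[string]*Process, len(specs))
	for _, s := range specs {
		if _, dup := procs[s.Name]; dup {
			continue // duplicate entries are ignored in this sketch
		}
		if s.Disabled {
			procs[s.Name] = &Process{
				Spec:          s,
				State:         "stopped",
				NoAutoRestart: true,
				StoppedByUser: true,
			}
			continue
		}
		if err := spawn(s); err != nil {
			return procs, fmt.Errorf("spawn %s: %w", s.Name, err)
		}
		procs[s.Name] = &Process{Spec: s, State: "running"}
	}
	return procs, nil
}

func main() {
	specs := []Spec{{Name: "app-a"}, {Name: "app-b", Disabled: true}}
	procs, err := restore(specs, func(Spec) error { return nil })
	fmt.Println(len(procs), procs["app-b"].State, err)
}
```

This mirrors the inverted test assertion: the disabled spec is present in the map, but its state is not running.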
UX fix release: disabled specs remain visible in `lynxpm list` across daemon restarts, and `lynxpm restart` re-enables them. No ABI / IPC / spec-format change.
v0.9.5's addStoppedSpec copy-pasted ~20 lines of uniqueness-check machinery from StartWithSpec. Factor it out onto registerLocked, called from both paths. StartWithSpec keeps its own ID-collision error since it is user-initiated; addStoppedSpec treats duplicates as a benign no-op matching Restore's idempotent contract.

Manager.Restart previously read proc.spec.Disabled outside the mutex and wrote it inside, a data race under -race even though the window was small. Read + flip + snapshot now all happen under proc.mu; the SaveSpec call stays outside to avoid holding the lock across disk I/O. The -race pass is now green.

addStoppedSpec drops the internal proc.mu.Lock/Unlock: the Process isn't published into m.processes until the final assignment, so nothing else can observe it and the lock was pure ceremony. noAutoRestart + stoppedByUser stay (noAutoRestart is load-bearing for cron-scheduled disabled specs).
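The locking shape described for Manager.Restart (read, flip and snapshot under the mutex, persist outside it) can be sketched as follows. The types, the enable method and the injected save function are hypothetical simplifications, not the project's code.

```go
package main

import (
	"fmt"
	"sync"
)

type Spec struct {
	Name     string
	Disabled bool
}

type Process struct {
	mu   sync.Mutex
	spec Spec
}

// enable reads the flag, flips it, and takes a snapshot, all under the
// mutex, so no reader can observe a half-updated spec.
func (p *Process) enable() (snapshot Spec, changed bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	changed = p.spec.Disabled
	p.spec.Disabled = false
	return p.spec, changed
}

// restart persists the snapshot outside the lock, so disk I/O never runs
// while p.mu is held. save stands in for a SaveSpec-style call.
func restart(p *Process, save func(Spec) error) error {
	snapshot, changed := p.enable()
	if !changed {
		return nil // nothing to persist
	}
	return save(snapshot)
}

func main() {
	p := &Process{spec: Spec{Name: "web", Disabled: true}}
	err := restart(p, func(s Spec) error {
		fmt.Println("saving", s.Name, "disabled =", s.Disabled)
		return nil
	})
	fmt.Println(err)
}
```

Because save receives a value copy taken under the lock, a later mutation of the live spec cannot change what is written.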
Refactor + race fix on the v0.9.5 Restore path. No behavioural change to the external API; disabled specs still load as stopped and `lynxpm restart` still re-enables them.
pm2-style follow-up render: after a successful start, stop, or restart,
print the full process table with a ▸ marker on the rows the action just
touched. Lets the user confirm in one shot what changed, without typing
`lynxpm list` as a second command.
Mechanics:
- internal/cli/commands/list: renderTable promoted to an exported
Render(procs, RenderOptions{ShowLong, Highlight}). Highlight is a
set matching process id OR name, so callers can pass whichever
is cheaper.
- start/stop/restart each fetch the list after the primary IPC call
succeeds and render it with the touched IDs highlighted. Skipped
when --json, --quiet, or the new --no-list flag is set, and when
nothing was touched (e.g. all calls failed).
- Id column padded with 2 spaces on non-highlighted rows so the
marker doesn't misalign the table. Max width bumped by 2 to
accommodate the marker.
Tests: new Render highlight cases (by-id, by-name, none). Existing
call-count assertions in start/stop/restart tests now pass --no-list
so the extra IPC round-trip doesn't throw off the count.
Follow-up to v0.9.7. Three cleanups landed together since they all flow from the same review pass:
- list.FetchAndRender replaces three byte-identical printPostActionList copies in start/stop/restart. All call sites already imported the list package for Render; collapsing to one helper also drops the types import each action command was carrying only for this.
- The id-column width bump (+2) and the per-row "▸ " / "  " prefix now fire only when a Highlight set is passed. Plain `lynxpm list` rendered two chars wider than before 0.9.7 and prepended a blank padding to every row for alignment, which is pointless work in the no-highlight path, the common case. With the gate, the plain list is byte-identical to pre-0.9.7 and post-action lists still align correctly.
- Touched-set maps in stop/restart now get a len(ids) capacity hint, and a leftover `showLong := opts.ShowLong` alias in Render went away (opts.ShowLong inline is fine, it's read twice).
No behavior change for --json, --quiet, --no-list paths. Tests and live smoke unchanged.
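A rough sketch of the gated marker logic: the two-character prefix is added only when a highlight set is passed, and a row matches by id or by name. Proc, the trimmed-down RenderOptions, renderRows and the column widths are invented for this example and do not reproduce the real Render.

```go
package main

import "fmt"

// Proc and RenderOptions are simplified stand-ins; the real RenderOptions
// also carries ShowLong.
type Proc struct{ ID, Name, State string }

type RenderOptions struct {
	Highlight map[string]struct{} // matches process id OR name
}

// renderRows formats one line per process. The "▸ " / "  " prefix is
// emitted only when a highlight set is present, so the plain list keeps
// its original width.
func renderRows(procs []Proc, opts RenderOptions) []string {
	marked := len(opts.Highlight) > 0
	rows := make([]string, 0, len(procs))
	for _, p := range procs {
		prefix := ""
		if marked {
			prefix = "  " // padding keeps untouched rows aligned
			_, byID := opts.Highlight[p.ID]
			_, byName := opts.Highlight[p.Name]
			if byID || byName {
				prefix = "▸ "
			}
		}
		rows = append(rows, fmt.Sprintf("%s%-4s %-10s %s", prefix, p.ID, p.Name, p.State))
	}
	return rows
}

func main() {
	procs := []Proc{{"1", "web", "online"}, {"2", "worker", "stopped"}}
	touched := map[string]struct{}{"web": {}}
	for _, row := range renderRows(procs, RenderOptions{Highlight: touched}) {
		fmt.Println(row)
	}
}
```

Calling renderRows with an empty RenderOptions returns the rows without any prefix, which is the byte-identical plain-list behaviour the gate restores.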
Ships a docs site at https://jaro-c.github.io/Lynx/ built with Astro and Starlight. The source markdown stays where it is (docs/commands/, docs/RUNTIMES.md, docs/TUTORIALS.md, docs/FAQ.md, ARCHITECTURE.md, SECURITY.md); the site/ tree holds Starlight-specific frontmatter wrappers, a landing page (index.mdx), and the theme config.
What landed:
- site/: Astro + Starlight scaffold. Landing page with hero, feature grid, install tabs, a head-to-head table vs PM2/Supervisor, and two CTAs. Custom CSS sets a lion-green accent; dark mode is default.
- Four new "Getting started" pages written for the site: introduction, install, quickstart, access-model. Everything else is sourced from the existing repo markdown with a thin frontmatter shim so edits to the originals still flow through.
- Twenty command reference pages migrated from docs/commands/ with auto-extracted description frontmatter.
- astro.config.mjs: site URL + base set for the project-page URL shape (jaro-c.github.io/Lynx), sitemap + Pagefind search enabled by default, og:image + twitter:card meta, schema.org SoftwareApplication JSON-LD for SEO, edit-on-GitHub link.
- .github/workflows/pages.yml: build + deploy on push to main for changes under site/, docs/, and the top-level markdown files. Uses the native GitHub Pages Actions deployment (no gh-pages branch). SHA-pinned to match the repo's existing action-pinning convention.
- README.md: surfaces the docs-site URL at the top of the Documentation section so people find the rendered version first.
The user still needs to flip Settings → Pages → Source = "GitHub Actions" once; after that the workflow takes over.
The default Starlight splash read generic. This pass adds a distinctive landing:
- Hero: gradient headline with an accent-highlighted phrase, a live-looking terminal preview card on the right that shows an actual `lynxpm start` flow with the new ▸ highlight on the touched row (so the feature we just shipped in 0.9.7/0.9.8 is the first thing a visitor sees). Stats row with base-RAM / startup / binary-count anchors the claim.
- Feature grid: six pillar cards (RAM, systemd, sandboxing, namespaces, CLI-or-Lynxfile, signed releases). Inline SVG icons so there's no request waterfall and the grid stays crawlable.
- Comparison table: Lynx column gets a left/right accent border + tinted background so the skimmer catches the delta against PM2 / Supervisor without reading all 18 cells.
- Final CTA banner: one-line install command + quickstart button, closes the page on a concrete next step.
Redesigned the logo + favicon to a geometric "L with arrow flow" mark with trailing process-dots, dark background, green gradient stroke. The old plain L+dot didn't differentiate.
The landing hides Starlight's default `.hero` via a `:has()` rule so our custom section owns the viewport without fighting the theme. All animations respect `prefers-reduced-motion`.
Built clean (31 pages, Pagefind search + sitemap intact, CSS bundle ~78 KB). No JS shipped by the new components; everything is static HTML + CSS.
- min-width:0 on hero grid children + CTA right so pre/code no longer force the column past 375px on mobile (hero title was clipping).
- Hide Starlight's auto-injected #_top H1 on the splash so the hero owns the only <h1> on the page.
- Replace duplicated "Lynx | Lynx" title with descriptive page title; override og:type to "website" and add theme-color meta.
- Add real 1200x630 og.png (was 404) and robots.txt with sitemap ref; unignore site/public/robots.txt from the root *.txt rule.
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…1.0 (#11) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… (#32) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… responsiveness, and optimize reveal logic and terminal animations
…gation fixes, and new guides on systemd-native and lightweight process management.
…up, formatting, and daemon sandbox implementation
…aints, and optimized test and lint configurations
…nd update supported security versions
…g, and refine linter configuration
- Move //nolint:noctx to preceding line (golines length fix)
- Wrap long os.WriteFile call in version_detect_test.go
- Drop unused //nolint:errcheck from logs/legacy.go and merge_extra_test.go
- fmt.Errorf static strings → errors.New in start/cmd.go and manager.go
- Rename stat_t → statT (ST1003 naming convention)
- Drop //nolint:errcheck from manager.go (×2) and process.go (×1)
- fmt.Errorf static string → errors.New in socket_unix.go
Purge last //nolint:errcheck from merge_extra_test.go (×5) and process.go (×2); nolintlint flags these as unused since errcheck is satisfied by _ = / _, _ = prefixes alone.
- docker/setup-buildx-action v4 → SHA
- docker/build-push-action v7 → SHA
- actions/dependency-review-action v4 → SHA
Adds .pre-commit-config.yaml with hooks for:
- end-of-file-fixer and trailing-whitespace (pre-commit-hooks v6.0.0)
- gitleaks v8.30.1 for secret detection
- golangci-lint v2.12.1 with config verification
- shellcheck v0.11.0.1 for shell script linting
Update version string, site package, install docs, and debian changelog for the v0.13.0 release.
The Apache copyright line still named the pre-org `Jaro-c` handle and the dead `Lynx` brand (now `helmly`). This repo belongs to the Glyndor org, so I changed it to `Copyright 2025-2026 Glyndor` to match the rest of the stack.