Merged
2 changes: 1 addition & 1 deletion .github/workflows/build-test.yml
@@ -8,7 +8,7 @@ jobs:

strategy:
matrix:
-node-version: [22.x, 24.x]
+node-version: [24.x, 26.x]

steps:
- uses: actions/checkout@v4
2 changes: 1 addition & 1 deletion .github/workflows/ui-tests.yml
@@ -8,7 +8,7 @@ jobs:

strategy:
matrix:
-node-version: [22.x, 24.x]
+node-version: [24.x, 26.x]

services:
postgres:
67 changes: 67 additions & 0 deletions .yarn/changelogs/service.9a1a533d.md
@@ -0,0 +1,67 @@
<!-- version-type: minor -->
# service

<!--
FORMATTING GUIDE:

### Detailed Entry (appears first when merging)

Use h3 (###) and below for detailed entries with paragraphs, code examples, and lists.

### Simple List Items

- Simple changes can be added as list items
- They are collected together at the bottom of each section

TIP: When multiple changelog drafts are merged, heading-based entries
appear before simple list items within each section.
-->

## 🗑️ Deprecated
<!-- PLACEHOLDER: Describe deprecated features. Double-check if they are annotated with a `@deprecated` jsdoc tag. -->

## ✨ Features

### Supervised service processes survive abrupt parent termination

Managed services launched by `ProcessRunner.spawnCommand()` now run inside a small Node supervisor (the `process-supervisor` module) instead of a bare shell. The supervisor watches the parent stack-craft process and, when the parent disappears without a graceful stop (`kill -9`, IDE force-stop, abrupt WSL exit, or a crash), tears down the entire child process tree.

- Parent death is detected via EOF on the supervisor's stdin pipe (the parent holds the write end and never writes to it). This fires the instant the parent dies and is immune to PID reuse, unlike polling `process.kill(parentPid, 0)`.
- On POSIX, the supervisor is the process-group leader and escalates SIGTERM → SIGKILL across the group, mirroring `ServiceLifecycleManager.shutdownAll()`.
- On Windows, it uses `taskkill /T` (then `/F`) to walk and kill the descendant tree, since POSIX-style process-group signalling is unavailable there.
- SIGTERM/SIGINT/SIGHUP from the parent are forwarded into the same kill cascade, so `killProcessGroup()` keeps working unchanged.
- The grace period before the SIGKILL escalation is configurable via the `WATCHDOG_GRACE_MS` environment variable (default: 5000 ms).
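
The escalation described in the list above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `process-supervisor` code: `KillDeps`, `resolveGraceMs`, and `killTree` are hypothetical names, the OS calls are injected so the logic can be exercised without killing anything, and the fallback for unparsable `WATCHDOG_GRACE_MS` values is an assumption.

```typescript
// Hypothetical sketch; the real process-supervisor module may be structured differently.
type Signal = 'SIGTERM' | 'SIGKILL'

interface KillDeps {
  platform: string
  // POSIX: signal a whole process group, e.g. process.kill(-pgid, signal).
  killGroup: (pgid: number, signal: Signal) => void
  // Windows: run taskkill with the given arguments.
  taskkill: (args: string[]) => void
  // Defer the forced step until the grace period has elapsed, e.g. setTimeout.
  schedule: (fn: () => void, ms: number) => void
}

// Reads WATCHDOG_GRACE_MS with the documented 5000 ms default.
// Falling back on unparsable or negative values is an assumption of this sketch.
export const resolveGraceMs = (env: Record<string, string | undefined>): number => {
  const parsed = Number.parseInt(env['WATCHDOG_GRACE_MS'] ?? '', 10)
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : 5000
}

// Polite stop first, forced stop once the grace period is over.
export const killTree = (pid: number, graceMs: number, deps: KillDeps): void => {
  if (deps.platform === 'win32') {
    // /T walks the descendant tree; /F is added only for the forced pass.
    deps.taskkill(['/PID', String(pid), '/T'])
    deps.schedule(() => deps.taskkill(['/PID', String(pid), '/T', '/F']), graceMs)
    return
  }
  // The supervisor leads the group, so its own pid doubles as the group id.
  deps.killGroup(pid, 'SIGTERM')
  deps.schedule(() => deps.killGroup(pid, 'SIGKILL'), graceMs)
}
```

In a real wiring, `killGroup` would likely wrap `process.kill(-pgid, signal)` and swallow the error raised when the group is already gone, `taskkill` would shell out to the Windows `taskkill` command, and `schedule` would be `setTimeout`.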

The supervisor is a first-class TypeScript module (type-checked and linted), resolved next to `process-runner` with the same extension — `.js` under `dist/` in production and the source `.ts` in dev/test (Node strips types natively).

This prevents orphaned service processes from lingering when stack-craft itself goes away unexpectedly.

## 🐛 Bug Fixes
<!-- PLACEHOLDER: Describe the nasty little bugs that have been eradicated (fix:) -->

## 📚 Documentation
<!-- PLACEHOLDER: Describe documentation changes (docs:) -->

## ⚡ Performance
<!-- PLACEHOLDER: Describe performance improvements (perf:) -->

## ♻️ Refactoring
<!-- PLACEHOLDER: Describe code refactoring (refactor:) -->

## 🧪 Tests

- Updated `process-runner` spec coverage to assert that user commands are wrapped in the supervisor module (resolved path + stdin/pipe wiring) on both POSIX and Windows.
- Added `process-supervisor` unit specs covering the POSIX and Windows kill-tree branches in-process.
- Added a POSIX integration spec that spawns a real supervised process tree, drops the parent (closes stdin), and asserts the descendant is reaped.

## 📦 Build
<!-- PLACEHOLDER: Describe build system changes (build:) -->

## 👷 CI
<!-- PLACEHOLDER: Describe CI configuration changes (ci:) -->

## ⬆️ Dependencies
<!-- PLACEHOLDER: Describe dependency updates (deps:) -->

## 🔧 Chores
<!-- PLACEHOLDER: Describe other changes (chore:) -->
51 changes: 51 additions & 0 deletions .yarn/changelogs/stack-craft.9a1a533d.md
@@ -0,0 +1,51 @@
<!-- version-type: patch -->
# stack-craft

<!--
FORMATTING GUIDE:

### Detailed Entry (appears first when merging)

Use h3 (###) and below for detailed entries with paragraphs, code examples, and lists.

### Simple List Items

- Simple changes can be added as list items
- They are collected together at the bottom of each section

TIP: When multiple changelog drafts are merged, heading-based entries
appear before simple list items within each section.
-->

## ✨ Features
<!-- PLACEHOLDER: Describe your shiny new features (feat:) -->

## 🐛 Bug Fixes

- Managed services are now reliably terminated when stack-craft exits abruptly (`kill -9`, IDE force-stop, crash, or WSL shutdown), preventing orphaned service processes from lingering after the app is gone.

## 📚 Documentation
<!-- PLACEHOLDER: Describe documentation changes (docs:) -->

## ⚡ Performance
<!-- PLACEHOLDER: Describe performance improvements (perf:) -->

## ♻️ Refactoring
<!-- PLACEHOLDER: Describe code refactoring (refactor:) -->

## 🧪 Tests
<!-- PLACEHOLDER: Describe test changes (test:) -->

## 📦 Build

- Raise the minimum Node version to `>=24.0.0` so the process-supervisor module can be spawned directly (native TypeScript type-stripping in dev/test).

## 👷 CI

- Move the CI Node matrix to 24.x/26.x and pin the Azure pipeline to Node 24.x to match the new engines floor.

## ⬆️ Dependencies
<!-- PLACEHOLDER: Describe dependency updates (deps:) -->

## 🔧 Chores
<!-- PLACEHOLDER: Describe other changes (chore:) -->
3 changes: 3 additions & 0 deletions .yarn/versions/9a1a533d.yml
@@ -0,0 +1,3 @@
releases:
service: minor
stack-craft: patch
2 changes: 1 addition & 1 deletion azure-pipelines.yml
@@ -17,7 +17,7 @@ pool:
steps:
- task: NodeTool@0
inputs:
-versionSpec: '20.x'
+versionSpec: '24.x'
displayName: 'Install Node.js'
- script: yarn install
displayName: 'Yarn install'
2 changes: 1 addition & 1 deletion package.json
@@ -62,7 +62,7 @@
"format:check": "prettier --check ."
},
"engines": {
-"node": ">=22.0.0"
+"node": ">=24.0.0"
},
"packageManager": "yarn@4.15.0"
}
43 changes: 31 additions & 12 deletions service/src/services/process-runner.spec.ts
@@ -94,6 +94,19 @@ describe('ProcessRunner', () => {
process.env = original
}))

it('should forward WATCHDOG_GRACE_MS so supervisor overrides take effect', () =>
withRunnerContext(async () => {
const original = process.env
process.env = { WATCHDOG_GRACE_MS: '250', SECRET_KEY: 'nope' }

const env = ProcessRunnerImpl.getSafeEnv()

expect(env.WATCHDOG_GRACE_MS).toBe('250')
expect(env).not.toHaveProperty('SECRET_KEY')

process.env = original
}))

it('should handle case-insensitive matching for safe keys', () =>
withRunnerContext(async () => {
const original = process.env
@@ -304,24 +317,28 @@ describe('ProcessRunner', () => {
})

describe('spawnCommand', () => {
-it('should call spawn with correct shell arguments', () =>
+it('should wrap the user command in the node supervisor on POSIX', () =>
withRunnerContext(async ({ runner }) => {
const originalPlatform = process.platform
Object.defineProperty(process, 'platform', { value: 'linux', writable: true })

const { spawn } = await import('child_process')
vi.mocked(spawn).mockClear()
const mockChild = { pid: 1, stdout: null, stderr: null } as unknown as ChildProcess
vi.mocked(spawn).mockReturnValue(mockChild)

const result = runner.spawnCommand('echo hello', '/tmp')

expect(result).toBe(mockChild)
-expect(spawn).toHaveBeenCalledWith('/bin/sh', ['-c', 'echo hello'], {
-cwd: '/tmp',
-stdio: ['ignore', 'pipe', 'pipe'],
-env: expect.objectContaining({}),
-detached: true,
-})
+expect(spawn).toHaveBeenCalledWith(
+process.execPath,
+[expect.stringMatching(/process-supervisor\.(ts|js)$/), '/bin/sh', '-c', 'echo hello'],
+expect.objectContaining({
+cwd: '/tmp',
+stdio: ['pipe', 'pipe', 'pipe'],
+detached: true,
+}),
+)

Object.defineProperty(process, 'platform', { value: originalPlatform, writable: true })
}))
@@ -332,14 +349,15 @@ describe('ProcessRunner', () => {
Object.defineProperty(process, 'platform', { value: 'linux', writable: true })

const { spawn } = await import('child_process')
vi.mocked(spawn).mockClear()
const mockChild = { pid: 1, stdout: null, stderr: null } as unknown as ChildProcess
vi.mocked(spawn).mockReturnValue(mockChild)

runner.spawnCommand('echo hello', '/tmp', { MY_VAR: 'test' })

expect(spawn).toHaveBeenCalledWith(
-'/bin/sh',
-['-c', 'echo hello'],
+process.execPath,
+[expect.stringMatching(/process-supervisor\.(ts|js)$/), '/bin/sh', '-c', 'echo hello'],
expect.objectContaining({
env: expect.objectContaining({ MY_VAR: 'test' }),
}),
@@ -348,20 +366,21 @@ describe('ProcessRunner', () => {
Object.defineProperty(process, 'platform', { value: originalPlatform, writable: true })
}))

-it('should use cmd.exe on Windows', () =>
+it('should hand the supervisor cmd.exe on Windows', () =>
withRunnerContext(async ({ runner }) => {
const originalPlatform = process.platform
Object.defineProperty(process, 'platform', { value: 'win32', writable: true })

const { spawn } = await import('child_process')
vi.mocked(spawn).mockClear()
const mockChild = { pid: 1, stdout: null, stderr: null } as unknown as ChildProcess
vi.mocked(spawn).mockReturnValue(mockChild)

runner.spawnCommand('echo hello', 'C:\\temp')

expect(spawn).toHaveBeenCalledWith(
-'cmd.exe',
-['/c', 'echo hello'],
+process.execPath,
+[expect.stringMatching(/process-supervisor\.(ts|js)$/), 'cmd.exe', '/c', 'echo hello'],
expect.objectContaining({ cwd: 'C:\\temp' }),
)

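
The `getSafeEnv()` specs above imply an allowlist filter: `WATCHDOG_GRACE_MS` passes through, `SECRET_KEY` is dropped, and safe keys match case-insensitively. A minimal sketch of such a filter follows, assuming an uppercase comparison. `pickSafeEnv` is a hypothetical name, and the key set here only lists the entries visible in this diff, so the real `SAFE_ENV_KEYS` is larger.

```typescript
// Only the allowlist entries visible in this diff; the real set contains more.
const VISIBLE_SAFE_KEYS: ReadonlySet<string> = new Set([
  'TMPDIR',
  'TMP',
  'TEMP',
  'WATCHDOG_GRACE_MS',
  'USERPROFILE',
  'APPDATA',
])

// Hypothetical helper: copies only allowlisted variables into a fresh object.
// Comparing the uppercased key is an assumption about how the
// case-insensitive matching is implemented.
export const pickSafeEnv = (
  source: Record<string, string | undefined>,
  safeKeys: ReadonlySet<string> = VISIBLE_SAFE_KEYS,
): Record<string, string> => {
  const env: Record<string, string> = {}
  for (const key of Object.keys(source)) {
    const value = source[key]
    if (value !== undefined && safeKeys.has(key.toUpperCase())) {
      env[key] = value // keep the caller's original key casing
    }
  }
  return env
}
```

Filtering by allowlist rather than denylist means a newly introduced secret is excluded by default, which is why `WATCHDOG_GRACE_MS` has to be added explicitly before the supervisor can see it.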
33 changes: 30 additions & 3 deletions service/src/services/process-runner.ts
@@ -1,7 +1,10 @@
-import { defineService, type Token } from '@furystack/inject'
 import { type ChildProcess, spawn, spawnSync } from 'child_process'
+import { dirname, extname, join } from 'node:path'
+import { fileURLToPath } from 'node:url'
+import { defineService, type Token } from '@furystack/inject'

import { LogStorageService } from './log-storage-service.js'
import type { killProcessTree } from './process-supervisor.js'
import type { ServiceLifecycleManager } from './service-lifecycle-manager.js'

export type ManagedProcess = {
Expand Down Expand Up @@ -46,6 +49,8 @@ const SAFE_ENV_KEYS = new Set([
'TMPDIR',
'TMP',
'TEMP',
// Supervisor tuning: forwarded so WATCHDOG_GRACE_MS overrides reach the supervisor process.
'WATCHDOG_GRACE_MS',
// Windows-specific
'USERPROFILE',
'APPDATA',
@@ -98,17 +103,39 @@ export class ProcessRunnerImpl {
return env
}

/**
* Spawns the user command inside the {@link killProcessTree process-supervisor}
* module, which watches the parent stack-craft process via its stdin pipe and
* tears the child tree down if stack-craft dies before it can issue a graceful
* stop — handles `kill -9`, IDE force-stop, abrupt WSL exits, and any other
* path that bypasses {@link ServiceLifecycleManager.shutdownAll}.
*
* The supervisor sibling is resolved with the same extension as this module so
* it works both under `dist/` (`.js`) in production and under the source `.ts`
* in dev/test (Node strips types natively, hence the `engines.node >= 24`).
*/
public spawnCommand(command: string, cwd: string, extraEnv?: Record<string, string>): ChildProcess {
const isWindows = process.platform === 'win32'
const shell = isWindows ? 'cmd.exe' : '/bin/sh'
const shellFlag = isWindows ? '/c' : '-c'

-return spawn(shell, [shellFlag, command], {
+const here = fileURLToPath(import.meta.url)
+const supervisorPath = join(dirname(here), `process-supervisor${extname(here)}`)
+
+const child = spawn(process.execPath, [supervisorPath, shell, shellFlag, command], {
 cwd,
-stdio: ['ignore', 'pipe', 'pipe'],
+// stdin is a pipe the supervisor watches for EOF (parent-death signal);
+// stdout/stderr carry the child's output back for log capture.
+stdio: ['pipe', 'pipe', 'pipe'],
env: { ...ProcessRunnerImpl.getSafeEnv(), ...extraEnv },
detached: true,
})

// The supervisor watches this stdin pipe for EOF as its parent-death signal,
// so leave the write end open. Swallow EPIPE so it can't crash us at teardown.
child.stdin?.on('error', () => {})

return child
}

/**
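
To make the argument layout in `spawnCommand` easier to see, here is a small dependency-free sketch of how the supervisor path and the shell arguments end up in one `argv`. Both helper names are hypothetical, and `siblingModulePath` is a simplified string-based stand-in for the `dirname`/`extname`/`join` calls used in the real code, shown for illustration only.

```typescript
// Hypothetical stand-in for join(dirname(here), name + extname(here)).
// Handles both '/' and '\' separators so it can be reasoned about on any host.
export const siblingModulePath = (here: string, name: string): string => {
  const slash = Math.max(here.lastIndexOf('/'), here.lastIndexOf('\\'))
  const dot = here.lastIndexOf('.')
  const ext = dot > slash ? here.slice(dot) : ''
  return here.slice(0, slash + 1) + name + ext
}

// Builds the arguments passed to `node`: supervisor module first, then the
// shell invocation the supervisor is expected to launch.
export const supervisorArgs = (platform: string, here: string, command: string): string[] => {
  const isWindows = platform === 'win32'
  const shell = isWindows ? 'cmd.exe' : '/bin/sh'
  const shellFlag = isWindows ? '/c' : '-c'
  return [siblingModulePath(here, 'process-supervisor'), shell, shellFlag, command]
}
```

Because the extension is copied from the calling module, the same code resolves `process-supervisor.js` under `dist/` and `process-supervisor.ts` when run from source, which is the behaviour the specs assert with `/process-supervisor\.(ts|js)$/`.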
74 changes: 74 additions & 0 deletions service/src/services/process-runner.watchdog.spec.ts
@@ -0,0 +1,74 @@
import { usingAsync } from '@furystack/utils'
import { describe, expect, it } from 'vitest'

import type { LogStorageService } from './log-storage-service.js'
import { ProcessRunnerImpl } from './process-runner.js'
import '../test-shims.js'

const isAlive = (pid: number): boolean => {
try {
process.kill(pid, 0)
return true
} catch {
return false
}
}

const waitFor = async (predicate: () => boolean, timeoutMs: number): Promise<boolean> => {
const deadline = Date.now() + timeoutMs
while (Date.now() < deadline) {
if (predicate()) return true
await new Promise((resolve) => setTimeout(resolve, 25))
}
return predicate()
}

const noopLogStorage = { addEntry: async () => undefined } as unknown as LogStorageService

// The supervisor is spawned as a `.ts` file under vitest, so the real-spawn path
// needs a Node that strips types natively (>= 24, the project's engines floor).
const hasNativeTypeScript = Boolean((process.features as { typescript?: unknown }).typescript)

// POSIX-only: the watchdog reaps via process-group signals. The Windows path
// uses taskkill and cannot be exercised on the Linux CI runner.
describe.skipIf(process.platform === 'win32' || !hasNativeTypeScript)('ProcessRunner watchdog (integration)', () => {
it('reaps the child tree when the parent (stdin) goes away', async () => {
const original = process.env.WATCHDOG_GRACE_MS
process.env.WATCHDOG_GRACE_MS = '200'

try {
await usingAsync(new ProcessRunnerImpl(noopLogStorage), async (runner) => {
// Background a long sleep and print its pid; the supervisor's group includes it.
const child = runner.spawnCommand('sleep 30 & echo "GRANDCHILD:$!"; wait', process.cwd())

let output = ''
child.stdout?.on('data', (data: Buffer) => {
output += data.toString()
})

const sawPid = await waitFor(() => /GRANDCHILD:\d+/.test(output), 3000)
expect(sawPid).toBe(true)

const grandchildPid = Number.parseInt(/GRANDCHILD:(\d+)/.exec(output)?.[1] ?? '', 10)
expect(Number.isFinite(grandchildPid)).toBe(true)
expect(isAlive(grandchildPid)).toBe(true)

// Simulate abrupt parent death: closing the stdin write end is exactly what
// the OS does to the supervisor's stdin when the real parent process dies.
child.stdin?.end()

const grandchildReaped = await waitFor(() => !isAlive(grandchildPid), 3000)
const supervisorReaped = await waitFor(() => child.exitCode !== null || child.killed, 3000)

expect(grandchildReaped).toBe(true)
expect(supervisorReaped).toBe(true)
})
} finally {
if (original === undefined) {
delete process.env.WATCHDOG_GRACE_MS
} else {
process.env.WATCHDOG_GRACE_MS = original
}
}
})
})
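
The integration spec above simulates parent death by ending the supervisor's stdin. On the supervisor side, the matching detection might look roughly like the sketch below. It is an assumption-laden illustration, not the module's actual code: `EofSource` and `watchParent` are hypothetical names, and `EofSource` is a minimal structural type that `process.stdin` would satisfy.

```typescript
// Hypothetical minimal shape of a readable stream; process.stdin fits it.
interface EofSource {
  resume(): unknown
  once(event: 'end' | 'close', listener: () => void): unknown
}

// Calls onParentGone at most once when the stdin pipe hits EOF or closes.
export const watchParent = (stdin: EofSource, onParentGone: () => void): void => {
  let fired = false
  const fire = (): void => {
    if (fired) return
    fired = true
    onParentGone()
  }
  stdin.once('end', fire)
  stdin.once('close', fire)
  // Switch the stream to flowing mode so EOF is actually observed;
  // the parent never writes data, so nothing needs to be consumed.
  stdin.resume()
}
```

A supervisor entry point could then call `watchParent(process.stdin, ...)` with a callback that starts the kill cascade. Guarding with `fired` matters because a stream may emit both `end` and `close`, and the teardown should only start once.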