Skip to content

fix: shutdown timeout, NPM body close, and checked fetch helper#81

Merged
Will-Luck merged 1 commit into
mainfrom
fix/concurrency-sweep
Apr 10, 2026
Merged

fix: shutdown timeout, NPM body close, and checked fetch helper#81
Will-Luck merged 1 commit into
mainfrom
fix/concurrency-sweep

Conversation

@Will-Luck

Copy link
Copy Markdown
Owner

Summary

Three reliability fixes from the concurrency/reliability sweep (the third and final PR in the security+reliability series after #79 and #80).

F. Graceful shutdown timeout

srv.Shutdown(context.Background()) replaced with context.WithTimeout(..., 10*time.Second). Previously, if any SSE client was connected when SIGTERM arrived, Shutdown waited indefinitely because http.Server.Shutdown waits for all active connections to drain but does not cancel request contexts. The 10-second bound ensures SIGTERM always completes within a predictable window.

O. NPM client body close

defer resp.Body.Close() inside the retry loop replaced with an immediate Close() before return. The defer was safe today but would silently leak response bodies if a retry continue were ever added.

K+L. Frontend checked fetch helper

New authFetchJSON() helper in auth.js wraps authFetch with r.ok check + r.json(). SSE updateContainerRow (highest-frequency dashboard fetch) migrated. Non-2xx responses now reject the promise, triggering the existing .catch(scheduleReload) fallback instead of silently corrupting DOM state.

Findings verified and downgraded

Finding Original Verdict
E (pending-map channel race) Critical Refuted
G (self-update context.Background) High Mitigated by design
M (notifier flush timer race) Medium Safe
N (scheduler timer leak) Medium Negligible
P (NPM reader reuse) Medium Latent

Test plan

  • go build ./... clean
  • go test ./... - 1547 tests in 28 packages
  • make frontend clean
  • Gitea CI green
  • GitHub Actions CI green

Three reliability fixes from issue #69:

* Graceful shutdown timeout (finding F): srv.Shutdown now uses a
  10-second context timeout instead of context.Background(). The
  unbounded Shutdown could hang indefinitely if any SSE client was
  connected, because http.Server.Shutdown waits for all active
  connections to drain but does not cancel request contexts. SSE
  handlers loop on select{} until the client disconnects or the
  request context is cancelled. With the 10-second bound, SIGTERM
  always completes within a predictable window.

* NPM client body close (finding O): Replaced defer resp.Body.Close()
  inside the retry loop with an immediate Close() before return.
  The defer was safe today (all error paths return immediately) but
  would leak response bodies if a retry continue were ever added.

* Frontend authFetchJSON helper (findings K+L): Added authFetchJSON()
  to auth.js which wraps authFetch with an r.ok check and r.json()
  parse. Updated the SSE updateContainerRow function (the highest-
  frequency fetch in the dashboard) to use authFetchJSON instead of
  raw fetch().then(r => r.json()). Previously, a 401 or 500 response
  would be silently parsed as JSON, corrupting the DOM state with
  error-body data. Now non-2xx responses reject the promise, which
  triggers the existing .catch(scheduleReload) fallback.

Findings verified and downgraded from original PR 3 scope:
  E (pending-map race): REFUTED
  G (self-update context): MITIGATED by design
  M (notifier flush timer): SAFE
  N (scheduler timer leak): NEGLIGIBLE
  P (NPM reader reuse): LATENT

All 1547 tests pass across 28 packages.

Refs: #69
@Will-Luck Will-Luck merged commit 29ca660 into main Apr 10, 2026
2 checks passed
@Will-Luck Will-Luck deleted the fix/concurrency-sweep branch April 10, 2026 00:19
Will-Luck added a commit that referenced this pull request Apr 17, 2026
Three reliability fixes from issue #69:

* Graceful shutdown timeout (finding F): srv.Shutdown now uses a
  10-second context timeout instead of context.Background(). The
  unbounded Shutdown could hang indefinitely if any SSE client was
  connected, because http.Server.Shutdown waits for all active
  connections to drain but does not cancel request contexts. SSE
  handlers loop on select{} until the client disconnects or the
  request context is cancelled. With the 10-second bound, SIGTERM
  always completes within a predictable window.

* NPM client body close (finding O): Replaced defer resp.Body.Close()
  inside the retry loop with an immediate Close() before return.
  The defer was safe today (all error paths return immediately) but
  would leak response bodies if a retry continue were ever added.

* Frontend authFetchJSON helper (findings K+L): Added authFetchJSON()
  to auth.js which wraps authFetch with an r.ok check and r.json()
  parse. Updated the SSE updateContainerRow function (the highest-
  frequency fetch in the dashboard) to use authFetchJSON instead of
  raw fetch().then(r => r.json()). Previously, a 401 or 500 response
  would be silently parsed as JSON, corrupting the DOM state with
  error-body data. Now non-2xx responses reject the promise, which
  triggers the existing .catch(scheduleReload) fallback.

Findings verified and downgraded from original PR 3 scope:
  E (pending-map race): REFUTED
  G (self-update context): MITIGATED by design
  M (notifier flush timer): SAFE
  N (scheduler timer leak): NEGLIGIBLE
  P (NPM reader reuse): LATENT

All 1547 tests pass across 28 packages.

Refs: #69

Co-authored-by: Will Luck <noreply@github.com>
Will-Luck added a commit that referenced this pull request Jun 1, 2026
Transitive otel bump to clear the CVE-2026-29181 Dependabot alert. See #81.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants