feat(env): source a fresh login environment when launching sessions#246

Merged
mgabor3141 merged 2 commits into main from fresh-login-env
May 30, 2026

@mgabor3141

Problem

Sessions launched by gmuxd inherit the daemon's environment, which is frozen at daemon startup (typically when gmux first auto-started it). The result:

  • Editing ~/.zshrc or ~/.profile and then clicking Restart session never picks up the change: the restarted runner reuses the same stale env.
  • Even gmuxd restart only helps when run from a pristine login shell. Run from a terminal inside a gmux session (the common case), it re-freezes the already-stale env.

While investigating I also found two env leaks. First, the gmux → gmuxd auto-start path stripped nothing, so a daemon auto-started from inside a session inherited GMUX_SESSION_ID/GMUX_SOCKET/GMUX_ADAPTER and stamped them onto every future session. Second, the bare GMUX=1 marker slipped through the existing GMUX_-prefix filter everywhere.

Changes

Two commits:

1. fix(env): stop leaking session identity into spawned daemons
New packages/sessionenv.Strip removes bare GMUX + all GMUX_* identity vars, while preserving GMUX_SOCKET_DIR (config — daemon and runner must agree on the socket dir) and GMUXD_*. Applied at all three spawn sites (launchGmux, startBackground, startGmuxd).

2. feat(env): source a fresh login environment when launching sessions
launchGmux now sources a fresh interactive-login environment — $SHELL -l -i, run in the session's cwd — via a hidden gmux --dump-env probe that writes os.Environ() NUL-delimited to fd 3 (keeps the payload clean of rc-file banner noise; NUL handles bash function exports with embedded newlines).

  • Merge semantics: the probe inherits the daemon env and layers dotfiles on top, so DISPLAY/SSH_AUTH_SOCK/XDG_RUNTIME_DIR survive while .zshrc changes apply. Still run through sessionenv.Strip.
  • Robust: synchronous, 5s timeout, process-group kill on expiry (via cmd.Cancel). Any failure ($SHELL unset → headless daemons, probe error, timeout) falls back to today's behavior — never a worse env, never blocks forever.
  • Scope: daemon-initiated launch/resume/restart only. Terminal-initiated gmux <cmd> is unchanged (already fresh).

Design rationale and alternatives are captured in ADR 0006.

Verification

  • Unit tests: sessionenv.Strip; capture/fallback/timeout/NUL-parse/shell-quote.
  • e2e (TestFreshLoginEnvOnLaunch): proves a dotfile edit reaches a restarted session with no daemon restart.
  • Manually reproduced before/after.

Notes

  • Drive-by: TestEndToEnd invoked gmuxd without the run subcommand (pre-existing breakage) — fixed.
  • Changelog is auto-generated from conventional commits, so it's left to the release workflow.

The daemon captures its environment once at startup and stamps it onto
every session it launches/restarts. Three spawn sites filtered env
inconsistently:

- gmux auto-starting gmuxd (startGmuxd) stripped nothing, so a daemon
  auto-started from inside a session inherited GMUX_SESSION_ID/SOCKET/
  ADAPTER and forwarded that stale identity to every future session.
- launchGmux / startBackground stripped the GMUX_ prefix but missed the
  bare GMUX=1 marker, which leaked everywhere.

Centralize the filter in packages/sessionenv.Strip: drop bare GMUX and
all GMUX_* session-identity vars, while preserving GMUX_SOCKET_DIR
(config, so daemon and runner agree on the socket directory) and
GMUXD_* daemon config. Apply it at all three spawn sites.

This is fix #2 of the stale-env investigation; refreshing env on
restart (#1) is follow-up work.
@github-actions

github-actions Bot commented May 29, 2026

Try this PR

curl -sSfL https://gmux.app/install-pr.sh | sh -s -- 246

Built from 8706e6e — feat(env): source a fresh login environment when launching sessions
Requires GitHub CLI with auth. Artifacts expire after 7 days.

@greptile-apps

greptile-apps Bot commented May 29, 2026

Greptile Summary

This PR fixes two stale-environment problems in gmuxd: an env-leak fix (sessionenv.Strip) and a fresh-login-env feature that sources $SHELL -l -i before each daemon-initiated launch/resume/restart. The previous reviewer comment about <-readCh blocking past the deadline has been addressed with a select + explicit pr.Close() in the timeout branch.

  • packages/sessionenv: new Strip replaces the old filterEnvPrefix("GMUX_"), adding the bare GMUX marker and preserving GMUX_SOCKET_DIR (config, not identity).
  • services/gmuxd/cmd/gmuxd/loginenv.go: captureLoginEnv forks $SHELL -l -i -c '<gmux> --dump-env' with a 5s process-group-kill timeout; falls back to os.Environ() on any failure, ensuring no regression for headless deployments.
  • cli/gmux/cmd/gmux/dumpenv.go: daemon-internal --dump-env mode writes os.Environ() NUL-delimited to fd 3, keeping shell rc-file noise out of the payload.

Confidence Score: 5/5

Safe to merge; all failure paths fall back to the daemon's current env, so this change cannot produce a worse session environment than today.

The fresh-env capture is entirely opt-in at runtime (gated by $SHELL being set) and fails gracefully on any error — timeout, non-zero exit, empty dump, or unresolved binary. The previously-flagged blocking issue with <-readCh has been correctly resolved with a select + pr.Close(). sessionenv.Strip is well-tested and replaces a narrower filter that missed the bare GMUX marker. The only note is a robustness nit in the post-select ctx.Err() check.

loginenv.go is the most complex new file; the timeout/cleanup path is correct but warrants a read-through on the ctx.Err() guard at line 150.

Important Files Changed

| Filename | Overview |
| --- | --- |
| services/gmuxd/cmd/gmuxd/loginenv.go | Core env-capture logic; timeout/fallback/cleanup path is well-designed. One minor: the post-select ctx.Err() guard checks only DeadlineExceeded rather than the more defensive != nil. |
| packages/sessionenv/sessionenv.go | Correctly strips bare GMUX and GMUX_* identity vars while preserving GMUX_SOCKET_DIR and leaving GMUXD_* untouched; well-tested. |
| cli/gmux/cmd/gmux/dumpenv.go | Simple, correct fd-3 env dumper; returns non-zero on write failure so a partial dump triggers the gmuxd fallback path. |
| tests/e2e/loginenv_test.go | End-to-end proof of the ADR-0006 behavior; builds both binaries into the same dir so resolveGmux finds the sibling, and correctly simulates dotfile edits between launch and restart. |
| services/gmuxd/cmd/gmuxd/loginenv_test.go | Comprehensive unit tests cover all fallback paths: SHELL unset, empty gmuxBin, non-zero exit, empty dump, timeout, and background process holding fd 3. |
| services/gmuxd/cmd/gmuxd/main.go | Replaces filterEnvPrefix call-sites with sessionenv.Strip and hooks in captureLoginEnv; startBackground now correctly preserves GMUX_SOCKET_DIR via the new Strip semantics. |
| cli/gmux/cmd/gmux/daemon.go | Applies sessionenv.Strip to the auto-start gmuxd path, fixing the env leak when gmux is invoked from inside a running session. |
| tests/e2e/e2e_test.go | Drive-by fix: adds the missing run subcommand to gmuxd invocation that was breaking the existing TestEndToEnd test. |

Sequence Diagram

sequenceDiagram
    participant D as gmuxd launchGmux
    participant S as SHELL login shell
    participant P as gmux --dump-env
    participant R as gmux runner

    D->>D: os.Pipe creates pr and pw
    D->>S: exec SHELL -l -i -c gmux --dump-env with ExtraFiles pw as fd3 and Setpgid
    D->>D: pw.Close parent copy
    D->>D: go io.ReadAll pr into readCh
    D->>S: cmd.Wait blocks
    S->>S: source dotfiles zshrc bashrc profile
    S->>P: exec gmux --dump-env
    P->>P: os.NewFile fd3 then writeNulEnv os.Environ
    P-->>D: NUL-delimited env written to pipe
    P->>P: exit 0
    S-->>D: exit 0 cmd.Wait returns
    D->>D: select readCh or ctx.Done timeout
    D->>D: sessionenv.Strip on captured env
    D->>R: cmd.Env set to stripped fresh env
    R->>R: session starts with updated dotfile env
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
services/gmuxd/cmd/gmuxd/loginenv.go:150-152
This guard handles the race where both `readCh` and `ctx.Done()` become ready simultaneously and the `readCh` case wins the select. Using `== context.DeadlineExceeded` is equivalent today because `runLoginEnvProbe` always creates its own `context.WithTimeout` — but if this function is ever refactored to accept a caller-supplied context (e.g., the HTTP handler's request context), a cancellation would silently slip through and the function would proceed to use potentially-incomplete data. `ctx.Err() != nil` covers both `DeadlineExceeded` and `Canceled` for the same cost.

```suggestion
	if ctx.Err() != nil {
		return nil, fmt.Errorf("timed out after %s", timeout)
	}
```

Reviews (2): Last reviewed commit: "feat(env): source a fresh login environm..."

Comment thread services/gmuxd/cmd/gmuxd/loginenv.go Outdated
The daemon froze its environment at startup and stamped that frozen copy
onto every session it launched, resumed, or restarted. Editing a dotfile
and clicking "Restart session" never picked up the change, and
"gmuxd restart" only helped when run from a pristine login shell (not
from inside a gmux session, the common case).

launchGmux now sources a fresh interactive-login environment
($SHELL -l -i, run in the session's cwd) via a hidden `gmux --dump-env`
probe that writes os.Environ() NUL-delimited to fd 3 — keeping the
payload clean of rc-file banner noise. The captured env merges onto the
daemon env (preserving DISPLAY/SSH_AUTH_SOCK/XDG_RUNTIME_DIR) and is
still run through sessionenv.Strip.

Robustness: synchronous with a 5s timeout and process-group kill on
expiry; falls back to the daemon's own env when $SHELL is unset
(headless daemons), the probe fails, or it times out — never producing a
worse env than before, never blocking forever. Terminal-initiated
launches are unchanged (already fresh).

Includes ADR 0006, an environment.md note, unit tests (capture/fallback/
timeout/parse) and an e2e proving a dotfile edit reaches a restarted
session without a daemon restart.

Drive-by: e2e TestEndToEnd invoked gmuxd without the `run` subcommand.
@mgabor3141

Addressed both Greptile findings (folded into the feature commit, force-pushed):

1. <-readCh could block past the 5s deadline (P1). Correct — once cmd.Wait() returns, Go's context watcher is gone and cmd.Cancel won't fire again, so a background process spawned by an rc file (&) holding fd 3 open would wedge io.ReadAll indefinitely. Replaced the unconditional rr := <-readCh with a select on ctx.Done(): on timeout we pr.Close() to unblock the reader and kill(-pgid) the lingering group, then return the timeout error (→ fallback). Added TestCaptureLoginEnv_BackgroundHoldsFD3 (shell writes env, backgrounds sleep 30 & holding fd 3, exits 0) which returns at the 300ms timeout instead of hanging 30s.

2. Stale doubled doc header on launchGmux. Removed the duplicate lead-in lines.

@mgabor3141 mgabor3141 merged commit 38b2574 into main May 30, 2026
8 checks passed
@mgabor3141 mgabor3141 deleted the fresh-login-env branch May 30, 2026 09:23
@github-actions github-actions Bot mentioned this pull request May 30, 2026