From 3878fd2df0afc89bba927c70b20b633caec01804 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 18:29:14 -0600 Subject: [PATCH 01/24] chore(ci-gate): add ORCHESTRATE implementation plan MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Working artifact for feature/ci-full-suite-gate. Implement in a fresh session from this worktree. See docs/specs/SPEC-ci-full-suite-gate-2026-06-13.md. Phased: measure (non-blocking) → determinism/skip → promote to required. Delete this file during dev merge cleanup. Co-Authored-By: Claude Opus 4.8 (1M context) --- ORCHESTRATE-ci-full-suite-gate.md | 93 +++++++++++++++++++++++++++++++ 1 file changed, 93 insertions(+) create mode 100644 ORCHESTRATE-ci-full-suite-gate.md diff --git a/ORCHESTRATE-ci-full-suite-gate.md b/ORCHESTRATE-ci-full-suite-gate.md new file mode 100644 index 000000000..151572faa --- /dev/null +++ b/ORCHESTRATE-ci-full-suite-gate.md @@ -0,0 +1,93 @@ +# ORCHESTRATE: Gate the full test suite in CI + +> **Working artifact** for `feature/ci-full-suite-gate`. Implement in a fresh session from +> this worktree (`cd ~/.git-worktrees/flow-cli/ci-full-suite-gate && claude`). +> Authoritative design: `docs/specs/SPEC-ci-full-suite-gate-2026-06-13.md` (read it first). +> **Delete this file during the dev-merge cleanup.** + +## Branch / base +- Worktree: `~/.git-worktrees/flow-cli/ci-full-suite-gate` +- Branch: `feature/ci-full-suite-gate` (off `dev` @ `3b89ff09`) +- Target PR: `--base dev` +- Version: no bump (CI/infra; not a user-facing release on its own) + +## Gating-set decision (default — confirm against Phase 1 data) +**Full-minus-IMAP:** the required gate is all 65 suites EXCEPT `e2e-em-dispatcher` (external IMAP, +can only ever skip on a hosted runner). `test-atlas-contract` stays in the gate but must *skip* +its warm-path cleanly. If Phase 1 reveals more genuinely-external suites, widen the skip list and +note it here. (Smaller fallback per the spec: gate only `test-doctor.zsh` first.) + +--- + +## Phase 1 — Measure (non-blocking) ⏱️ first +Goal: capture ground-truth CI results before changing any test. + +- [ ] Add a **separate** job `full-suite` to `.github/workflows/test.yml`: + - mirrors the existing mock-project setup steps (reuse the `Create mock project structure` block) + - runs `cd ~/projects/dev-tools/flow-cli && ./tests/run-all.sh` + - **non-blocking:** `continue-on-error: true` (do NOT add to required checks yet) + - emits the run-all summary to `$GITHUB_STEP_SUMMARY` +- [ ] Open a draft PR to `dev`; let CI run; **record the actual pass/skip/fail list** in this file. +- [ ] Compare runner reality vs the local table in the spec (atlas absent ⇒ expect the + atlas-skew tests to pass and atlas-contract warm-path to skip). + +**Checkpoint:** paste the Phase 1 runner result here before starting Phase 2. + +``` +Phase 1 CI result (fill in): _________________________________ +``` + +--- + +## Phase 2 — Make the suite deterministic & green in CI +Goal: `run-all.sh` exits 0 on the runner; identical result locally with/without `atlas` on PATH. + +- [ ] **Determinism (atlas-skew):** in tests that assert *standalone* fallback behavior, pin + `FLOW_ATLAS_ENABLED=no` in setup so installing atlas can't flip the result. + - `tests/e2e-core-commands.zsh` → `status reads .STATUS` ([1]) and `catch creates capture` ([7]); + audit the whole file for other atlas-delegating asserts. +- [ ] **Clean-skip service-dependent tests** (skip only when the dep is genuinely absent; `return 77`): + - `tests/test-atlas-contract.zsh` → route the 4 warm-path tests (`atlas stats|parked|trail`, + currently exit 127) through `skip_without_atlas()` (or skip when `atlas stats` ≠ 0). + - `tests/e2e-em-dispatcher.zsh` → skip IMAP cases (`em unread`/`em read`) when no account is + configured; add a short per-call timeout so a hang can't wedge the suite. +- [ ] **run-all.sh CI semantics:** decide timeout handling. Once IMAP tests SKIP rather than hang, + make `TIMEOUT>0` a hard failure in the gated context (a real hang must be caught). Keep local + behavior unchanged or gate on an env flag (e.g. `FLOW_TEST_CI=1`). +- [ ] Run locally **both ways** to prove determinism: + - with atlas: `./tests/run-all.sh` → 0 + - without: `PATH=$(echo $PATH | tr ':' '\n' | grep -v homebrew | paste -sd:) ./tests/run-all.sh` + (or temporarily shadow atlas) → 0 +- [ ] Update `docs/guides/TESTING.md`: document the gate + how service tests skip; refresh counts + if any test counts change. + +**Definition of green:** every non-skipped suite passes; service-dependent cases report SKIP +(visible in output), never FAIL/TIMEOUT. + +--- + +## Phase 3 — Promote to required +- [ ] Flip `full-suite` to blocking (drop `continue-on-error`); confirm green on the PR. +- [ ] Add `full-suite` to required checks on **`dev`** branch protection; soak ≥1 PR. +- [ ] Then add it to **`main`** protection: + `gh api -X PUT repos/Data-Wise/flow-cli/branches/main/protection --input ` (include the + existing `ZSH Plugin Tests` + new `full-suite`; preserve PR-required/no-force/no-delete). + ⚠️ Outward-facing — do only after dev soak; confirm with user. +- [ ] Keep the fast smoke job (quick signal) alongside the full gate. + +--- + +## Integrate +- [ ] `git fetch origin dev && git rebase origin/dev` +- [ ] `./tests/run-all.sh` green (the whole point — it now gates itself) +- [ ] `gh pr create --base dev` +- [ ] On merge: delete this ORCHESTRATE file; remove worktree + branch (force-delete via user — hook-blocked). + +## Verification (Definition of Done) +1. CI runs the full suite on every PR to dev/main. +2. `run-all.sh` is green on the runner AND locally with/without atlas (determinism proven). +3. Service-dependent cases SKIP visibly; a deliberately-broken test reddens the required check. +4. `docs/guides/TESTING.md` updated; no version/count drift. + +## Notes / decisions log (append during impl) +- (date) — … From 07041f73e4466e64dde6e6ad7211d0efa1eb4665 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 18:49:39 -0600 Subject: [PATCH 02/24] ci(test): add non-blocking full-suite job (Phase 1 measure) Phase 1 of ci-full-suite-gate: run the full 65-suite run-all.sh on the hosted runner to capture ground-truth pass/skip/fail before gating. - Separate job (parallel to smoke), continue-on-error: true => non-blocking - Captures run-all.sh real exit via PIPESTATUS (tee masks it); re-exits so job color reflects reality (0=clean,1=FAIL,2=TIMEOUT) but never blocks PR - Emits full output + exit code to $GITHUB_STEP_SUMMARY for measurement - NOT added to required checks (that's Phase 3) Ref: docs/specs/SPEC-ci-full-suite-gate-2026-06-13.md Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 74 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 8c2e7982f..963fcf274 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -61,3 +61,77 @@ jobs: echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY + + # --------------------------------------------------------------------------- + # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. + # Non-blocking on purpose (continue-on-error) — this job exists to capture the + # ground-truth pass/skip/fail list on a hosted runner (no atlas, no IMAP) + # before we make it a required gate (Phase 3). Do NOT add to required checks + # while this comment is here. + # --------------------------------------------------------------------------- + full-suite: + name: Full Test Suite (non-blocking) + runs-on: ubuntu-latest + continue-on-error: true + + steps: + - name: Checkout code + uses: actions/checkout@v6 + + - name: Ensure zsh is installed + run: | + if ! command -v zsh >/dev/null 2>&1; then + sudo apt-get update && sudo apt-get install -y zsh + fi + + - name: Record start time + id: start + run: echo "time=$(date +%s)" >> $GITHUB_OUTPUT + + - name: Create mock project structure + run: | + mkdir -p ~/projects/dev-tools/flow-cli/.git + mkdir -p ~/projects/r-packages/active/mediationverse/.git + mkdir -p ~/projects/r-packages/stable/rmediation/.git + mkdir -p ~/projects/teaching/stat-440/.git + mkdir -p ~/projects/research/mediation-planning/.git + mkdir -p ~/projects/quarto/manuscripts/paper1/.git + mkdir -p ~/projects/apps/examify/.git + cp -r . ~/projects/dev-tools/flow-cli/ + + - name: Run full suite (non-blocking) + id: fullsuite + run: | + cd ~/projects/dev-tools/flow-cli + set +e + # Capture run-all.sh output; tee returns 0, so grab run-all's real + # exit via PIPESTATUS (1=FAIL, 2=TIMEOUT, 0=clean) and re-exit with it + # so the job color reflects reality. continue-on-error keeps it from + # blocking the PR. + ./tests/run-all.sh 2>&1 | tee /tmp/full-suite.log + rc=${PIPESTATUS[0]} + echo "rc=$rc" >> "$GITHUB_OUTPUT" + echo "Full suite exit code: $rc" + exit "$rc" + + - name: Full Suite Summary + if: always() + run: | + END_TIME=$(date +%s) + DURATION=$((END_TIME - ${{ steps.start.outputs.time }})) + { + echo "## 🧪 Full Suite (run-all.sh) — non-blocking measurement" + echo "" + echo "| Metric | Value |" + echo "|--------|-------|" + echo "| Duration | ${DURATION}s |" + echo "| Exit code | \`${{ steps.fullsuite.outputs.rc }}\` (0=clean, 1=FAIL, 2=TIMEOUT) |" + echo "" + echo "
Full run-all.sh output" + echo "" + echo '```' + cat /tmp/full-suite.log 2>/dev/null || echo "(no log captured)" + echo '```' + echo "" + echo "
" + } >> "$GITHUB_STEP_SUMMARY" From f8c7badb0c1bc39701394c7149458538d888c1ec Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 18:53:52 -0600 Subject: [PATCH 03/24] docs(ci-gate): record Phase 1 CI ground truth (51/14/0) Spec prediction inverted: e2e-core-commands + test-atlas-contract PASS on runner; 14 OTHER suites FAIL (tool-absent: brew/atlas/himalaya/R/quarto). 3 pure-zsh suites fail unexpectedly -> triage as possible real bugs. Co-Authored-By: Claude Opus 4.8 (1M context) --- ORCHESTRATE-ci-full-suite-gate.md | 39 +++++++++++++++++++++++++++++-- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/ORCHESTRATE-ci-full-suite-gate.md b/ORCHESTRATE-ci-full-suite-gate.md index 151572faa..e2631a218 100644 --- a/ORCHESTRATE-ci-full-suite-gate.md +++ b/ORCHESTRATE-ci-full-suite-gate.md @@ -34,9 +34,41 @@ Goal: capture ground-truth CI results before changing any test. **Checkpoint:** paste the Phase 1 runner result here before starting Phase 2. ``` -Phase 1 CI result (fill in): _________________________________ +Phase 1 CI result (PR #465, run 27483939884, ubuntu-24.04, 2026-06-14): + 51 passed, 14 failed, 0 timeout (run-all.sh exit 1) ``` +### Phase 1 finding — spec table was WRONG; inverse skew +The runner is NOT cleaner than local. The 2 suites the spec predicted would fail +(`e2e-core-commands`, `test-atlas-contract`) **PASS on the runner** (atlas absent ⇒ +fallback/skip path fires, exactly as hypothesized). But **14 OTHER suites FAIL** — +they pass locally because the Mac has tools the Ubuntu runner lacks (brew, atlas, +himalaya, R, quarto). `e2e-em-dispatcher` **failed** (not timeout) → 0 timeouts. + +14 failing suites (cause = likely, confirm in Phase 2): +| Suite | Likely cause | +|---|---| +| test-doctor | `flow doctor` probes brew/atlas/plugins — none on runner | +| test-cc-dispatcher | cc/claude launcher binary absent | +| test-em-dispatcher | himalaya absent | +| dogfood-em-dispatcher | himalaya absent | +| e2e-em-dispatcher | IMAP/himalaya absent (FAILS, not timeout) | +| dogfood-atlas-bridge | atlas absent | +| dogfood-teach-doctor-v2 | R/renv absent | +| test-teach-deploy-v2-unit | R/quarto/rsync absent | +| test-teach-deploy-v2-integration | R/quarto/rsync absent | +| dogfood-teach-deploy-v2 | R/quarto/rsync absent | +| e2e-teach-deploy-v2 | R/quarto/rsync absent | +| test-help-compliance | ⚠️ pure-zsh — UNEXPECTED, investigate | +| test-help-compliance-dogfood | ⚠️ pure-zsh — UNEXPECTED, investigate | +| automated-plugin-dogfood | ⚠️ pure-zsh — UNEXPECTED, investigate | + +Implication: Phase 2 scope is much larger than the spec's 3 named fixes. The "smaller +fallback = gate just test-doctor" is ALSO non-viable as-is (test-doctor FAILS on runner). +Two sub-problems: (a) ~11 service/tool-dependent suites must clean-SKIP when the tool is +absent (rc 77), not FAIL; (b) the 3 pure-zsh ⚠️ suites are possible REAL bugs/path issues +that smoke-only CI never caught — triage those first. + --- ## Phase 2 — Make the suite deterministic & green in CI @@ -90,4 +122,7 @@ Goal: `run-all.sh` exits 0 on the runner; identical result locally with/without 4. `docs/guides/TESTING.md` updated; no version/count drift. ## Notes / decisions log (append during impl) -- (date) — … +- 2026-06-14 — Phase 1 done. Non-blocking `full-suite` job added (commit 18ba82db), + draft PR #465 → dev. CI ground truth: 51/14/0. Spec prediction was inverted (see + Phase 1 finding above). Phase 2 scope expanded to ~11 clean-skips + 3 pure-zsh + triage. Paused for user scoping decision before touching tests. From 9c5b0bd914972832ab3c3ea15dd8f579a441f0bc Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 18:56:25 -0600 Subject: [PATCH 04/24] ci(tmp): diagnostic job for 3 pure-zsh CI-only failures Temporary Phase 2 triage. Runs help-compliance, help-compliance-dogfood, automated-plugin-dogfood with output visible + locale fingerprint to find why they fail on the runner but pass locally. To be deleted post-diagnosis. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 49 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 963fcf274..58bba1310 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -62,6 +62,55 @@ jobs: echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY + # --------------------------------------------------------------------------- + # TEMPORARY (ci-full-suite-gate Phase 2 triage): capture the real runner + # output for the 3 pure-zsh suites that fail in CI but pass locally. DELETE + # this job once diagnosed. + # --------------------------------------------------------------------------- + diagnose-pure-zsh: + name: Diagnose pure-zsh failures (temporary) + runs-on: ubuntu-latest + continue-on-error: true + + steps: + - name: Checkout code + uses: actions/checkout@v6 + + - name: Ensure zsh is installed + run: | + if ! command -v zsh >/dev/null 2>&1; then + sudo apt-get update && sudo apt-get install -y zsh + fi + + - name: Environment fingerprint + run: | + echo "zsh: $(zsh --version)" + echo "LANG=$LANG LC_ALL=$LC_ALL LC_CTYPE=$LC_CTYPE" + locale 2>&1 || true + + - name: Create mock project structure + run: | + mkdir -p ~/projects/dev-tools/flow-cli/.git + mkdir -p ~/projects/r-packages/active/mediationverse/.git + mkdir -p ~/projects/r-packages/stable/rmediation/.git + mkdir -p ~/projects/teaching/stat-440/.git + mkdir -p ~/projects/research/mediation-planning/.git + mkdir -p ~/projects/quarto/manuscripts/paper1/.git + mkdir -p ~/projects/apps/examify/.git + cp -r . ~/projects/dev-tools/flow-cli/ + + - name: test-help-compliance + if: always() + run: cd ~/projects/dev-tools/flow-cli && zsh ./tests/test-help-compliance.zsh + + - name: test-help-compliance-dogfood + if: always() + run: cd ~/projects/dev-tools/flow-cli && zsh ./tests/test-help-compliance-dogfood.zsh + + - name: automated-plugin-dogfood + if: always() + run: cd ~/projects/dev-tools/flow-cli && zsh ./tests/automated-plugin-dogfood.zsh + # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. # Non-blocking on purpose (continue-on-error) — this job exists to capture the From 54e61ff7c352b322a9672f39c556dc69e5bb2c5d Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 19:05:58 -0600 Subject: [PATCH 05/24] fix(loader): keep tm dispatcher when a tm binary is on PATH MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The binary-precedence guard's default keep-list was (r mcp cc), omitting tm. On systems with a `tm` binary (some Linux distros, GitHub ubuntu runners) the documented tm terminal-manager dispatcher was SILENTLY unfunctioned at load (the skip notice only prints under FLOW_DEBUG) — invisible on macOS dev boxes with no tm binary. Surfaced by Phase 1 of the CI full-suite gate: 3 suites (help-compliance, help-compliance-dogfood, automated-plugin-dogfood) failed on the runner but passed locally; all traced to `tm` not being a function. - flow.plugin.zsh: default FLOW_INTENTIONAL_SHADOWS now (r mcp cc tm) - test-dispatcher-binary-precedence.zsh: regression test simulating a tm binary collision against the real tm-dispatcher.zsh; extend the "intentional shadow survives" loop to cover tm - CLAUDE.md + CHANGELOG x2: document the new default (historical CHANGELOG entries left as-is per project convention) - remove the temporary diagnose-pure-zsh CI job (triage complete) Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 49 --------------------- CHANGELOG.md | 17 +++++++ CLAUDE.md | 4 +- docs/CHANGELOG.md | 17 +++++++ flow.plugin.zsh | 7 ++- tests/test-dispatcher-binary-precedence.zsh | 21 ++++++++- 6 files changed, 60 insertions(+), 55 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 58bba1310..963fcf274 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -62,55 +62,6 @@ jobs: echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY - # --------------------------------------------------------------------------- - # TEMPORARY (ci-full-suite-gate Phase 2 triage): capture the real runner - # output for the 3 pure-zsh suites that fail in CI but pass locally. DELETE - # this job once diagnosed. - # --------------------------------------------------------------------------- - diagnose-pure-zsh: - name: Diagnose pure-zsh failures (temporary) - runs-on: ubuntu-latest - continue-on-error: true - - steps: - - name: Checkout code - uses: actions/checkout@v6 - - - name: Ensure zsh is installed - run: | - if ! command -v zsh >/dev/null 2>&1; then - sudo apt-get update && sudo apt-get install -y zsh - fi - - - name: Environment fingerprint - run: | - echo "zsh: $(zsh --version)" - echo "LANG=$LANG LC_ALL=$LC_ALL LC_CTYPE=$LC_CTYPE" - locale 2>&1 || true - - - name: Create mock project structure - run: | - mkdir -p ~/projects/dev-tools/flow-cli/.git - mkdir -p ~/projects/r-packages/active/mediationverse/.git - mkdir -p ~/projects/r-packages/stable/rmediation/.git - mkdir -p ~/projects/teaching/stat-440/.git - mkdir -p ~/projects/research/mediation-planning/.git - mkdir -p ~/projects/quarto/manuscripts/paper1/.git - mkdir -p ~/projects/apps/examify/.git - cp -r . ~/projects/dev-tools/flow-cli/ - - - name: test-help-compliance - if: always() - run: cd ~/projects/dev-tools/flow-cli && zsh ./tests/test-help-compliance.zsh - - - name: test-help-compliance-dogfood - if: always() - run: cd ~/projects/dev-tools/flow-cli && zsh ./tests/test-help-compliance-dogfood.zsh - - - name: automated-plugin-dogfood - if: always() - run: cd ~/projects/dev-tools/flow-cli && zsh ./tests/automated-plugin-dogfood.zsh - # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. # Non-blocking on purpose (continue-on-error) — this job exists to capture the diff --git a/CHANGELOG.md b/CHANGELOG.md index 065814fbc..c70c587ae 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Fixed + +- **`tm` dispatcher silently dropped when a `tm` binary is on `PATH`** — the + binary-precedence guard's default keep-list was `(r mcp cc)`, omitting `tm`. + On systems with a `tm` binary (some Linux distros, GitHub `ubuntu-latest` + runners) the documented `tm` terminal-manager dispatcher was unfunctioned at + load with no error — invisible on macOS dev boxes. Added `tm` to the default + `FLOW_INTENTIONAL_SHADOWS` (now `(r mcp cc tm)`). Caught by running the full + test suite in CI for the first time (see CI full-suite gate below). + Regression test added in `tests/test-dispatcher-binary-precedence.zsh`. + +### Changed + +- **CI now measures the full test suite.** Added a non-blocking `full-suite` + job to `.github/workflows/test.yml` running `./tests/run-all.sh` on every PR + (Phase 1 of the CI full-suite gate; promotion to a required check is staged). + ## [7.10.0] — 2026-06-13 — forward-looking schedule layer (`agenda` + dash UPCOMING) ### Added diff --git a/CLAUDE.md b/CLAUDE.md index c775e6054..e54a8f82e 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -205,11 +205,11 @@ export FLOW_QUIET=1 # Suppress welcome export FLOW_DEBUG=1 # Debug mode # Binary-precedence guard (drops a dispatcher that shadows a PATH binary) -export FLOW_INTENTIONAL_SHADOWS=(r mcp cc) # Commands kept even when a same-named binary exists +export FLOW_INTENTIONAL_SHADOWS=(r mcp cc tm) # Commands kept even when a same-named binary exists export FLOW_FORCE_DISPATCHER_OBS=1 # Force-keep one dispatcher (FLOW_FORCE_DISPATCHER_) ``` -> **Guard caveat:** `FLOW_INTENTIONAL_SHADOWS` defaults to `(r mcp cc)` only when unset. Setting it to an empty array (`=()`) is treated as an explicit override, so `cc` (vs `/usr/bin/cc`) etc. would then be dropped — append (`+=(...)`) rather than reassign if you only want to add entries. +> **Guard caveat:** `FLOW_INTENTIONAL_SHADOWS` defaults to `(r mcp cc tm)` only when unset (`tm` was added in the ci-full-suite-gate work — a `tm` binary exists on some Linux/CI runners and was silently dropping the dispatcher). Setting it to an empty array (`=()`) is treated as an explicit override, so `cc` (vs `/usr/bin/cc`) etc. would then be dropped — append (`+=(...)`) rather than reassign if you only want to add entries. --- diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index 44422dacb..f0ed84d34 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -8,6 +8,23 @@ The format follows [Keep a Changelog](https://keepachangelog.com/), and this pro ## [Unreleased] +### Fixed + +- **`tm` dispatcher silently dropped when a `tm` binary is on `PATH`** — the + binary-precedence guard's default keep-list was `(r mcp cc)`, omitting `tm`. + On systems with a `tm` binary (some Linux distros, GitHub `ubuntu-latest` + runners) the documented `tm` terminal-manager dispatcher was unfunctioned at + load with no error — invisible on macOS dev boxes. Added `tm` to the default + `FLOW_INTENTIONAL_SHADOWS` (now `(r mcp cc tm)`). Caught by running the full + test suite in CI for the first time (see CI full-suite gate below). + Regression test added in `tests/test-dispatcher-binary-precedence.zsh`. + +### Changed + +- **CI now measures the full test suite.** Added a non-blocking `full-suite` + job to `.github/workflows/test.yml` running `./tests/run-all.sh` on every PR + (Phase 1 of the CI full-suite gate; promotion to a required check is staged). + ## [7.10.0] — 2026-06-13 — forward-looking schedule layer (`agenda` + dash UPCOMING) ### Added diff --git a/flow.plugin.zsh b/flow.plugin.zsh index d7f6661f4..d24dd09bc 100644 --- a/flow.plugin.zsh +++ b/flow.plugin.zsh @@ -75,10 +75,13 @@ if [[ "$FLOW_LOAD_DISPATCHERS" == "yes" ]]; then # Commands flow-cli deliberately provides even when a PATH binary of the # same name exists (e.g. `cc` launches Claude Code, not the C compiler; - # `r` is the R-package dispatcher, not Homebrew's R launcher). Pre-set + # `r` is the R-package dispatcher, not Homebrew's R launcher; `tm` is the + # terminal-manager dispatcher, which collides with a `tm` binary present on + # some Linux distros / CI runners). These are documented, man-paged core + # dispatchers — the guard must keep them, not silently drop them. Pre-set # FLOW_INTENTIONAL_SHADOWS before sourcing the plugin to customize. if (( ! ${+FLOW_INTENTIONAL_SHADOWS} )); then - typeset -ga FLOW_INTENTIONAL_SHADOWS=(r mcp cc) + typeset -ga FLOW_INTENTIONAL_SHADOWS=(r mcp cc tm) fi # Binary-precedence guard (B3): after sourcing the dispatcher files, drop any diff --git a/tests/test-dispatcher-binary-precedence.zsh b/tests/test-dispatcher-binary-precedence.zsh index 9ec8d7c76..d6d991070 100644 --- a/tests/test-dispatcher-binary-precedence.zsh +++ b/tests/test-dispatcher-binary-precedence.zsh @@ -104,13 +104,30 @@ assert_contains "$out" "HELP=_flowfaketool_helper: function" "helper should surv test_case_end # Don't break the user: intentional shadows survive a real plugin load even -# though r/mcp/cc all have PATH binaries on dev machines. -for d in r mcp cc; do +# though r/mcp/cc/tm all have PATH binaries on dev machines. +for d in r mcp cc tm; do test_case "intentional shadow '$d' survives plugin load" assert_function_exists "$d" test_case_end done +# Regression (ci-full-suite-gate): `tm` collides with a real `tm` binary present +# on some Linux distros / GitHub ubuntu runners (absent on macOS dev boxes), so +# the guard used to SILENTLY unfunction the documented `tm` dispatcher there — +# only caught once the full suite ran in CI. tm must now survive the collision. +mkdir -p "$TMPROOT/tmbin" +print -r -- '#!/bin/sh' > "$TMPROOT/tmbin/tm" +print -r -- 'echo "fake tm binary"' >> "$TMPROOT/tmbin/tm" +chmod +x "$TMPROOT/tmbin/tm" +test_case "tm dispatcher survives a real 'tm' binary on PATH (runner regression)" +out=$( + PATH="$TMPROOT/tmbin:$PATH"; rehash + _flow_load_dispatcher "$PROJECT_ROOT/lib/dispatchers/tm-dispatcher.zsh" + whence -w tm +) +assert_contains "$out" "tm: function" "tm must stay a function despite a 'tm' binary on PATH" +test_case_end + # Non-colliding real dispatchers still load. for d in g qu em teach tok dots; do test_case "dispatcher '$d' loads" From 401aea0db52bfaacf61306cb64bfd5a61f303bb3 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 19:11:06 -0600 Subject: [PATCH 06/24] ci(tmp): diagnose why tm still dropped on runner despite fix Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 963fcf274..07213a779 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -62,6 +62,41 @@ jobs: echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY + # --------------------------------------------------------------------------- + # TEMPORARY (ci-full-suite-gate): the (r mcp cc tm) fix keeps tm locally but + # tm is still dropped on the runner. Dump runner-side guard state. DELETE. + # --------------------------------------------------------------------------- + diag-tm: + name: Diagnose tm drop (temporary) + runs-on: ubuntu-latest + steps: + - name: Checkout code + uses: actions/checkout@v6 + - name: Ensure zsh + run: command -v zsh >/dev/null 2>&1 || { sudo apt-get update && sudo apt-get install -y zsh; } + - name: Probe tm + guard state + run: | + echo "== real tm binary on runner ==" + command -v tm || echo "(no tm binary)" + type -a tm 2>/dev/null || true + echo "" + echo "== fix present in checkout? ==" + grep -n "FLOW_INTENTIONAL_SHADOWS=(r mcp cc" flow.plugin.zsh || echo "FIX MISSING" + echo "" + echo "== source plugin (mirror test) and inspect ==" + PLUGIN_DIR="$PWD" zsh -c ' + export FLOW_QUIET=1 FLOW_ATLAS_ENABLED=no + export FLOW_PLUGIN_DIR="$PLUGIN_DIR" + echo "pre-source: +FLOW_INTENTIONAL_SHADOWS=${+FLOW_INTENTIONAL_SHADOWS} +FLOW_LOAD_DISPATCHERS=${+FLOW_LOAD_DISPATCHERS}" + source ./flow.plugin.zsh 2>/dev/null + echo "post-source: FLOW_LOAD_DISPATCHERS=[$FLOW_LOAD_DISPATCHERS]" + echo "post-source: FLOW_INTENTIONAL_SHADOWS=[${FLOW_INTENTIONAL_SHADOWS[*]}]" + echo "post-source: commands[tm]=[${commands[tm]}]" + echo -n "post-source: tm => "; whence -w tm 2>&1 || echo "(undef)" + echo -n "post-source: _tm_help => "; whence -w _tm_help 2>&1 || echo "(undef)" + echo -n "membership test (Ie tm) => "; echo "${FLOW_INTENTIONAL_SHADOWS[(Ie)tm]}" + ' + # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. # Non-blocking on purpose (continue-on-error) — this job exists to capture the From f992d0a141ac890784ece960f55a7227f3996073 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 19:15:46 -0600 Subject: [PATCH 07/24] revert(loader): undo wrong tm-shadow fix; record real cause (ait absent) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Diagnostic on the runner disproved the binary-precedence hypothesis: there is NO tm binary on ubuntu-latest (commands[tm] empty). tm resolves to an ALIAS and _tm_help is undefined because tm-dispatcher.zsh requires the `ait` (aiterm) CLI and early-returns to a graceful "not installed" alias when it's absent (tm-dispatcher.zsh:44-55). The runner has no ait. So the 3 "pure-zsh" failures are NOT pure-zsh — they are the SAME tool-absent skew class as the other 11 (atlas/himalaya/R/quarto). The correct fix belongs in the tests (gate tm assertions on `ait`), not in the loader. Reverting: - flow.plugin.zsh: default keep-list back to (r mcp cc) - test-dispatcher-binary-precedence.zsh: drop the bogus tm collision test - CLAUDE.md + CHANGELOG x2: remove the false "tm binary" claim; keep the accurate "CI measures full suite" note - remove the temporary diag-tm CI job Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 35 --------------------- CHANGELOG.md | 14 ++------- CLAUDE.md | 4 +-- docs/CHANGELOG.md | 14 ++------- flow.plugin.zsh | 7 ++--- tests/test-dispatcher-binary-precedence.zsh | 21 ++----------- 6 files changed, 12 insertions(+), 83 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 07213a779..963fcf274 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -62,41 +62,6 @@ jobs: echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY - # --------------------------------------------------------------------------- - # TEMPORARY (ci-full-suite-gate): the (r mcp cc tm) fix keeps tm locally but - # tm is still dropped on the runner. Dump runner-side guard state. DELETE. - # --------------------------------------------------------------------------- - diag-tm: - name: Diagnose tm drop (temporary) - runs-on: ubuntu-latest - steps: - - name: Checkout code - uses: actions/checkout@v6 - - name: Ensure zsh - run: command -v zsh >/dev/null 2>&1 || { sudo apt-get update && sudo apt-get install -y zsh; } - - name: Probe tm + guard state - run: | - echo "== real tm binary on runner ==" - command -v tm || echo "(no tm binary)" - type -a tm 2>/dev/null || true - echo "" - echo "== fix present in checkout? ==" - grep -n "FLOW_INTENTIONAL_SHADOWS=(r mcp cc" flow.plugin.zsh || echo "FIX MISSING" - echo "" - echo "== source plugin (mirror test) and inspect ==" - PLUGIN_DIR="$PWD" zsh -c ' - export FLOW_QUIET=1 FLOW_ATLAS_ENABLED=no - export FLOW_PLUGIN_DIR="$PLUGIN_DIR" - echo "pre-source: +FLOW_INTENTIONAL_SHADOWS=${+FLOW_INTENTIONAL_SHADOWS} +FLOW_LOAD_DISPATCHERS=${+FLOW_LOAD_DISPATCHERS}" - source ./flow.plugin.zsh 2>/dev/null - echo "post-source: FLOW_LOAD_DISPATCHERS=[$FLOW_LOAD_DISPATCHERS]" - echo "post-source: FLOW_INTENTIONAL_SHADOWS=[${FLOW_INTENTIONAL_SHADOWS[*]}]" - echo "post-source: commands[tm]=[${commands[tm]}]" - echo -n "post-source: tm => "; whence -w tm 2>&1 || echo "(undef)" - echo -n "post-source: _tm_help => "; whence -w _tm_help 2>&1 || echo "(undef)" - echo -n "membership test (Ie tm) => "; echo "${FLOW_INTENTIONAL_SHADOWS[(Ie)tm]}" - ' - # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. # Non-blocking on purpose (continue-on-error) — this job exists to capture the diff --git a/CHANGELOG.md b/CHANGELOG.md index c70c587ae..8335cad33 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,22 +9,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] -### Fixed - -- **`tm` dispatcher silently dropped when a `tm` binary is on `PATH`** — the - binary-precedence guard's default keep-list was `(r mcp cc)`, omitting `tm`. - On systems with a `tm` binary (some Linux distros, GitHub `ubuntu-latest` - runners) the documented `tm` terminal-manager dispatcher was unfunctioned at - load with no error — invisible on macOS dev boxes. Added `tm` to the default - `FLOW_INTENTIONAL_SHADOWS` (now `(r mcp cc tm)`). Caught by running the full - test suite in CI for the first time (see CI full-suite gate below). - Regression test added in `tests/test-dispatcher-binary-precedence.zsh`. - ### Changed - **CI now measures the full test suite.** Added a non-blocking `full-suite` job to `.github/workflows/test.yml` running `./tests/run-all.sh` on every PR (Phase 1 of the CI full-suite gate; promotion to a required check is staged). + First full-suite CI run surfaced 14 suites that fail on a hosted runner + because external tools they exercise (`atlas`, `ait`/aiterm, `himalaya`, R, + quarto) are absent — being made to skip/degrade deterministically (Phase 2). ## [7.10.0] — 2026-06-13 — forward-looking schedule layer (`agenda` + dash UPCOMING) diff --git a/CLAUDE.md b/CLAUDE.md index e54a8f82e..c775e6054 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -205,11 +205,11 @@ export FLOW_QUIET=1 # Suppress welcome export FLOW_DEBUG=1 # Debug mode # Binary-precedence guard (drops a dispatcher that shadows a PATH binary) -export FLOW_INTENTIONAL_SHADOWS=(r mcp cc tm) # Commands kept even when a same-named binary exists +export FLOW_INTENTIONAL_SHADOWS=(r mcp cc) # Commands kept even when a same-named binary exists export FLOW_FORCE_DISPATCHER_OBS=1 # Force-keep one dispatcher (FLOW_FORCE_DISPATCHER_) ``` -> **Guard caveat:** `FLOW_INTENTIONAL_SHADOWS` defaults to `(r mcp cc tm)` only when unset (`tm` was added in the ci-full-suite-gate work — a `tm` binary exists on some Linux/CI runners and was silently dropping the dispatcher). Setting it to an empty array (`=()`) is treated as an explicit override, so `cc` (vs `/usr/bin/cc`) etc. would then be dropped — append (`+=(...)`) rather than reassign if you only want to add entries. +> **Guard caveat:** `FLOW_INTENTIONAL_SHADOWS` defaults to `(r mcp cc)` only when unset. Setting it to an empty array (`=()`) is treated as an explicit override, so `cc` (vs `/usr/bin/cc`) etc. would then be dropped — append (`+=(...)`) rather than reassign if you only want to add entries. --- diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index f0ed84d34..6770c4145 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -8,22 +8,14 @@ The format follows [Keep a Changelog](https://keepachangelog.com/), and this pro ## [Unreleased] -### Fixed - -- **`tm` dispatcher silently dropped when a `tm` binary is on `PATH`** — the - binary-precedence guard's default keep-list was `(r mcp cc)`, omitting `tm`. - On systems with a `tm` binary (some Linux distros, GitHub `ubuntu-latest` - runners) the documented `tm` terminal-manager dispatcher was unfunctioned at - load with no error — invisible on macOS dev boxes. Added `tm` to the default - `FLOW_INTENTIONAL_SHADOWS` (now `(r mcp cc tm)`). Caught by running the full - test suite in CI for the first time (see CI full-suite gate below). - Regression test added in `tests/test-dispatcher-binary-precedence.zsh`. - ### Changed - **CI now measures the full test suite.** Added a non-blocking `full-suite` job to `.github/workflows/test.yml` running `./tests/run-all.sh` on every PR (Phase 1 of the CI full-suite gate; promotion to a required check is staged). + First full-suite CI run surfaced 14 suites that fail on a hosted runner + because external tools they exercise (`atlas`, `ait`/aiterm, `himalaya`, R, + quarto) are absent — being made to skip/degrade deterministically (Phase 2). ## [7.10.0] — 2026-06-13 — forward-looking schedule layer (`agenda` + dash UPCOMING) diff --git a/flow.plugin.zsh b/flow.plugin.zsh index d24dd09bc..d7f6661f4 100644 --- a/flow.plugin.zsh +++ b/flow.plugin.zsh @@ -75,13 +75,10 @@ if [[ "$FLOW_LOAD_DISPATCHERS" == "yes" ]]; then # Commands flow-cli deliberately provides even when a PATH binary of the # same name exists (e.g. `cc` launches Claude Code, not the C compiler; - # `r` is the R-package dispatcher, not Homebrew's R launcher; `tm` is the - # terminal-manager dispatcher, which collides with a `tm` binary present on - # some Linux distros / CI runners). These are documented, man-paged core - # dispatchers — the guard must keep them, not silently drop them. Pre-set + # `r` is the R-package dispatcher, not Homebrew's R launcher). Pre-set # FLOW_INTENTIONAL_SHADOWS before sourcing the plugin to customize. if (( ! ${+FLOW_INTENTIONAL_SHADOWS} )); then - typeset -ga FLOW_INTENTIONAL_SHADOWS=(r mcp cc tm) + typeset -ga FLOW_INTENTIONAL_SHADOWS=(r mcp cc) fi # Binary-precedence guard (B3): after sourcing the dispatcher files, drop any diff --git a/tests/test-dispatcher-binary-precedence.zsh b/tests/test-dispatcher-binary-precedence.zsh index d6d991070..9ec8d7c76 100644 --- a/tests/test-dispatcher-binary-precedence.zsh +++ b/tests/test-dispatcher-binary-precedence.zsh @@ -104,30 +104,13 @@ assert_contains "$out" "HELP=_flowfaketool_helper: function" "helper should surv test_case_end # Don't break the user: intentional shadows survive a real plugin load even -# though r/mcp/cc/tm all have PATH binaries on dev machines. -for d in r mcp cc tm; do +# though r/mcp/cc all have PATH binaries on dev machines. +for d in r mcp cc; do test_case "intentional shadow '$d' survives plugin load" assert_function_exists "$d" test_case_end done -# Regression (ci-full-suite-gate): `tm` collides with a real `tm` binary present -# on some Linux distros / GitHub ubuntu runners (absent on macOS dev boxes), so -# the guard used to SILENTLY unfunction the documented `tm` dispatcher there — -# only caught once the full suite ran in CI. tm must now survive the collision. -mkdir -p "$TMPROOT/tmbin" -print -r -- '#!/bin/sh' > "$TMPROOT/tmbin/tm" -print -r -- 'echo "fake tm binary"' >> "$TMPROOT/tmbin/tm" -chmod +x "$TMPROOT/tmbin/tm" -test_case "tm dispatcher survives a real 'tm' binary on PATH (runner regression)" -out=$( - PATH="$TMPROOT/tmbin:$PATH"; rehash - _flow_load_dispatcher "$PROJECT_ROOT/lib/dispatchers/tm-dispatcher.zsh" - whence -w tm -) -assert_contains "$out" "tm: function" "tm must stay a function despite a 'tm' binary on PATH" -test_case_end - # Non-colliding real dispatchers still load. for d in g qu em teach tok dots; do test_case "dispatcher '$d' loads" From d4909239e9b53930aab9941d1918baf4dc45ace6 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 19:16:25 -0600 Subject: [PATCH 08/24] =?UTF-8?q?docs(ci-gate):=20correct=20triage=20recor?= =?UTF-8?q?d=20=E2=80=94=2014=20failures=20are=20one=20tool-absent=20class?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- ORCHESTRATE-ci-full-suite-gate.md | 38 +++++++++++++++++++++---------- 1 file changed, 26 insertions(+), 12 deletions(-) diff --git a/ORCHESTRATE-ci-full-suite-gate.md b/ORCHESTRATE-ci-full-suite-gate.md index e2631a218..02648b26c 100644 --- a/ORCHESTRATE-ci-full-suite-gate.md +++ b/ORCHESTRATE-ci-full-suite-gate.md @@ -59,15 +59,25 @@ himalaya, R, quarto). `e2e-em-dispatcher` **failed** (not timeout) → 0 timeout | test-teach-deploy-v2-integration | R/quarto/rsync absent | | dogfood-teach-deploy-v2 | R/quarto/rsync absent | | e2e-teach-deploy-v2 | R/quarto/rsync absent | -| test-help-compliance | ⚠️ pure-zsh — UNEXPECTED, investigate | -| test-help-compliance-dogfood | ⚠️ pure-zsh — UNEXPECTED, investigate | -| automated-plugin-dogfood | ⚠️ pure-zsh — UNEXPECTED, investigate | - -Implication: Phase 2 scope is much larger than the spec's 3 named fixes. The "smaller -fallback = gate just test-doctor" is ALSO non-viable as-is (test-doctor FAILS on runner). -Two sub-problems: (a) ~11 service/tool-dependent suites must clean-SKIP when the tool is -absent (rc 77), not FAIL; (b) the 3 pure-zsh ⚠️ suites are possible REAL bugs/path issues -that smoke-only CI never caught — triage those first. +| test-help-compliance | `ait`/aiterm absent → `tm` dispatcher degrades (CONFIRMED) | +| test-help-compliance-dogfood | `ait`/aiterm absent → `tm` degrades (CONFIRMED) | +| automated-plugin-dogfood | `ait`/aiterm absent → `tm` is alias not function (CONFIRMED) | + +CORRECTION (2026-06-14, runner-instrumented): the 3 "pure-zsh ⚠️" suites are NOT pure-zsh +and NOT real bugs. Root cause confirmed by a CI diagnostic job: the runner has **no `tm` +binary** (`commands[tm]` empty — the binary-precedence guard was never involved). `tm` is +the **aiterm** dispatcher (`lib/dispatchers/tm-dispatcher.zsh:44-55`): when the `ait` CLI is +absent it intentionally degrades to `alias tm='_tm_not_installed'` and early-returns, so +`tm()`/`_tm_help()` are never defined. The 3 suites assert `tm` is a full dispatcher → +they fail only because `ait` is absent. This is the **same tool-absent skew class** as the +other 11 (atlas/himalaya/R/quarto). A first (wrong) attempt added `tm` to +FLOW_INTENTIONAL_SHADOWS — reverted (commit ed98365f); the fix belongs in the TESTS. + +Implication: ALL 14 failures are one uniform class — service/tool-dependent suites that +must clean-SKIP or accept graceful degradation when their tool is absent. There are NO +real-bug outliers. The spec's "smaller fallback = gate just test-doctor" is non-viable +(test-doctor itself FAILS on the runner). Tools whose absence drives the 14: `atlas`, +`ait` (aiterm), `himalaya` (email/IMAP), `R`/`renv`, `quarto`/`rsync`. --- @@ -123,6 +133,10 @@ Goal: `run-all.sh` exits 0 on the runner; identical result locally with/without ## Notes / decisions log (append during impl) - 2026-06-14 — Phase 1 done. Non-blocking `full-suite` job added (commit 18ba82db), - draft PR #465 → dev. CI ground truth: 51/14/0. Spec prediction was inverted (see - Phase 1 finding above). Phase 2 scope expanded to ~11 clean-skips + 3 pure-zsh - triage. Paused for user scoping decision before touching tests. + draft PR #465 → dev. CI ground truth: 51/14/0. Spec prediction was inverted. +- 2026-06-14 — Triage of the 3 "pure-zsh" failures (user-selected). Runner-instrumented + diagnostic disproved the binary-precedence guess (no `tm` binary on runner). Real cause: + `ait`/aiterm absent → `tm` dispatcher degrades to an alias (tm-dispatcher.zsh:44-55). + Wrong fix (tm→FLOW_INTENTIONAL_SHADOWS) committed then reverted (ed98365f). Conclusion: + all 14 are ONE class (tool-absent skew); no real bugs. Phase 2 = uniform skip/degrade + strategy across all 14. Paused for user scoping decision. From 599abcd8737edaf900dd2bb14a11be5e20bcc96d Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 19:42:11 -0600 Subject: [PATCH 09/24] test(ci-gate): deterministic tm/aiterm skip (3 suites) + run-all rc-77 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Phase 2, batch 1 of the tool-absent-skew fixes. The `tm` dispatcher only loads fully when the `ait` (aiterm) CLI is present; on hosted runners it degrades to an alias, so suites asserting tm-is-a-full-dispatcher failed. Foundation: - run-all.sh: exit code 77 now counted as SKIP (not FAIL); shown in the results line + an explanatory note. Whole-suite tool guards will use it. tm/aiterm determinism (mixed suites — keep full coverage when ait present, skip only the tm cases when absent): - automated-plugin-dogfood.zsh: include tm in the dispatcher / help-fn checks only when `ait` exists. - lib/help-compliance.zsh: _FLOW_HELP_DISPATCHERS includes tm only when `ait` exists — also stops `flow doctor --help-check` from false-flagging tm as non-compliant on machines without aiterm (real fix, not just tests). Fixes test-help-compliance.zsh (no edit needed) via the shared list. - test-help-compliance-dogfood.zsh: skip tm in all subject loops when ait absent; expected dispatcher count is dynamic (14 with ait, 13 without). Verified locally both ways (tool present AND hidden via PATH sandbox): with ait: 16/16, 379/379, 60/60 without ait: 15/15, 351/351, 58/58 (all exit 0) Co-Authored-By: Claude Opus 4.8 (1M context) --- lib/help-compliance.zsh | 13 +++++++-- tests/automated-plugin-dogfood.zsh | 17 ++++++++++-- tests/run-all.sh | 25 ++++++++++++++++- tests/test-help-compliance-dogfood.zsh | 37 ++++++++++++++++++++------ 4 files changed, 79 insertions(+), 13 deletions(-) diff --git a/lib/help-compliance.zsh b/lib/help-compliance.zsh index f63241dca..65cb93399 100644 --- a/lib/help-compliance.zsh +++ b/lib/help-compliance.zsh @@ -17,8 +17,17 @@ # 8. Color codes (_C_ or \033[) # 9. Help function naming (__help) -# All 14 dispatchers to check -typeset -ga _FLOW_HELP_DISPATCHERS=(g r mcp qu wt v cc tm teach dots sec tok prompt em) +# Dispatchers to check for help compliance. `tm` (aiterm terminal-manager) +# only defines its help function (`_tm_help`) when the `ait` CLI is installed; +# without it the dispatcher degrades to a "not installed" alias and early- +# returns (lib/dispatchers/tm-dispatcher.zsh). Including tm unconditionally +# would make `flow doctor --help-check` (and the help-compliance tests) report +# a false "non-compliant" on any machine without aiterm — e.g. CI runners — so +# tm is only checked when ait is present. +typeset -ga _FLOW_HELP_DISPATCHERS=(g r mcp qu wt v cc teach dots sec tok prompt em) +if command -v ait >/dev/null 2>&1; then + _FLOW_HELP_DISPATCHERS+=(tm) +fi # Map dispatcher names to their help function names typeset -gA _FLOW_HELP_FUNCTIONS=( diff --git a/tests/automated-plugin-dogfood.zsh b/tests/automated-plugin-dogfood.zsh index 2f9290753..2f4c2c4ee 100644 --- a/tests/automated-plugin-dogfood.zsh +++ b/tests/automated-plugin-dogfood.zsh @@ -101,7 +101,17 @@ echo "" echo "${CYAN}--- Section 2: Dispatcher Functions ---${RESET}" -dispatchers=(g mcp qu r cc tm wt dots sec tok teach prompt v em) +# `tm` (aiterm terminal-manager) only defines its dispatcher function when the +# `ait` CLI is installed; without it the dispatcher intentionally degrades to a +# "not installed" alias and early-returns (lib/dispatchers/tm-dispatcher.zsh). +# Include tm only when ait is present so this suite is deterministic on hosted +# CI runners (where aiterm is absent) without losing coverage on dev machines. +dispatchers=(g mcp qu r cc wt dots sec tok teach prompt v em) +if command -v ait >/dev/null 2>&1; then + dispatchers+=(tm) +else + echo "${YELLOW} (skipping tm dispatcher check — aiterm 'ait' not installed)${RESET}" +fi for disp in "${dispatchers[@]}"; do run_test "Dispatcher '$disp' is a function" " @@ -145,7 +155,6 @@ help_fns=( qu _qu_help r _r_help cc _cc_help - tm _tm_help wt _wt_help dots _dots_help sec _sec_help @@ -155,6 +164,10 @@ help_fns=( v _v_help em _em_help ) +# tm's _tm_help only exists when aiterm ('ait') is installed (see note above). +if command -v ait >/dev/null 2>&1; then + help_fns[tm]=_tm_help +fi for disp fn in "${(@kv)help_fns}"; do run_test "'$disp help' produces non-empty output" " diff --git a/tests/run-all.sh b/tests/run-all.sh index 4edc5b719..acb7a8656 100755 --- a/tests/run-all.sh +++ b/tests/run-all.sh @@ -12,6 +12,15 @@ echo "" PASS=0 FAIL=0 TIMEOUT=0 +SKIP=0 + +# Exit code 77 = the suite (or its only meaningful cases) cleanly skipped +# because a required external tool/service is absent (atlas, ait/aiterm, +# himalaya, R, quarto, …). This is the standard automake "skip" code. A +# skipped suite is NOT a failure — it must never redden the gate — but it is +# surfaced distinctly so a skip is visible (and never silently masks a real +# pass that should have happened on a fully-provisioned runner). +readonly SKIP_RC=77 run_test() { local test_file="$1" @@ -32,6 +41,10 @@ run_test() { # 124 = timeout echo "⏱️ (timeout after ${timeout_seconds}s)" ((TIMEOUT++)) + elif [[ $exit_code -eq $SKIP_RC ]]; then + # 77 = clean skip (required tool/service absent) + echo "⏭️ (skipped — required tool absent)" + ((SKIP++)) elif [[ $exit_code -eq 0 ]]; then echo "✅" ((PASS++)) @@ -43,6 +56,9 @@ run_test() { if [[ $exit_code -eq 124 ]]; then echo "⏱️ (timeout after ${timeout_seconds}s)" ((TIMEOUT++)) + elif [[ $exit_code -eq $SKIP_RC ]]; then + echo "⏭️ (skipped — required tool absent)" + ((SKIP++)) elif [[ $exit_code -eq 0 ]]; then echo "✅" ((PASS++)) @@ -154,9 +170,16 @@ run_test ./tests/test-scholar-config-sync.zsh echo "" echo "=========================================" -echo " Results: $PASS passed, $FAIL failed, $TIMEOUT timeout" +echo " Results: $PASS passed, $FAIL failed, $TIMEOUT timeout, $SKIP skipped" echo "=========================================" +if [[ $SKIP -gt 0 ]]; then + echo "" + echo "Note: $SKIP suite(s) skipped — a required external tool/service was" + echo "absent (atlas, ait/aiterm, himalaya, R, quarto). Expected on a hosted" + echo "CI runner; locally they run when the tool is installed." +fi + if [[ $FAIL -gt 0 ]]; then exit 1 fi diff --git a/tests/test-help-compliance-dogfood.zsh b/tests/test-help-compliance-dogfood.zsh index c55b48dfb..e97a6fbfa 100755 --- a/tests/test-help-compliance-dogfood.zsh +++ b/tests/test-help-compliance-dogfood.zsh @@ -108,6 +108,17 @@ source "$FLOW_DIR/lib/help-compliance.zsh" 2>/dev/null || { source "$FLOW_DIR/commands/doctor.zsh" 2>/dev/null +# `tm` (aiterm terminal-manager) only defines its dispatcher/help when the +# `ait` CLI is installed; otherwise it degrades to a "not installed" alias and +# early-returns (lib/dispatchers/tm-dispatcher.zsh). On machines/CI runners +# without aiterm, tm has no compliant help — skip tm-specific cases and adjust +# the expected dispatcher count so this suite is deterministic everywhere. +_HAS_AIT=0 +command -v ait >/dev/null 2>&1 && _HAS_AIT=1 +_EXPECTED_DISPATCHERS=14 +(( _HAS_AIT )) || _EXPECTED_DISPATCHERS=13 +(( _HAS_AIT )) || echo -e "${YELLOW}Note: aiterm 'ait' not installed — skipping tm help-compliance cases (expecting ${_EXPECTED_DISPATCHERS} dispatchers).${NC}" + echo "══════════════════════════════════════════════════════════════" echo " Help Compliance Dogfooding Tests" echo "══════════════════════════════════════════════════════════════" @@ -160,8 +171,9 @@ _test_individual_rules() { echo "" } -# Test all 14 dispatchers individually +# Test all 14 dispatchers individually (tm only when aiterm is installed) for d in g r mcp qu wt v cc tm teach dots sec tok prompt em; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue _test_individual_rules "$d" done @@ -189,6 +201,7 @@ _test_help_invocation() { # Test all three invocation forms for each dispatcher for cmd in g r mcp qu wt v cc tm prompt; do + [[ "$cmd" == tm ]] && (( ! _HAS_AIT )) && continue for form in help --help -h; do _test_help_invocation "$cmd" "$form" done @@ -272,6 +285,7 @@ _test_content_quality() { } for d in g r mcp qu wt v cc tm teach dots sec tok prompt; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue _test_content_quality "$d" done @@ -307,6 +321,7 @@ _test_color_fallback() { # Only test the 7 dispatchers we fixed (they all define their own fallbacks) for d in prompt dots sec tok cc tm teach v; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue _test_color_fallback "$d" done echo "" @@ -353,6 +368,7 @@ _test_box_format() { } for d in g r mcp qu wt v cc tm teach dots sec tok prompt; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue _test_box_format "$d" done echo "" @@ -368,6 +384,7 @@ echo "" _test_function_naming() { # Standard pattern: __help for d in g r mcp qu wt v cc tm dots sec tok prompt; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue local expected="_${d}_help" if typeset -f "$expected" > /dev/null 2>&1; then assert_pass "$d: function $expected() exists" @@ -406,12 +423,13 @@ _test_doctor_integration() { assert_contains "$output" "Help Function Compliance Check" \ "doctor --help-check shows compliance header" - # Output should report all 14 dispatchers - assert_contains "$output" "All 14 dispatchers compliant" \ - "doctor --help-check reports all 14 compliant" + # Output should report all dispatchers compliant (13 without aiterm's tm) + assert_contains "$output" "All ${_EXPECTED_DISPATCHERS} dispatchers compliant" \ + "doctor --help-check reports all ${_EXPECTED_DISPATCHERS} compliant" # Each dispatcher should appear in output for d in g r mcp qu wt v cc tm teach dots sec tok prompt em; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue assert_grep "$output" "✅ $d:" "doctor output includes $d result" done } @@ -428,12 +446,12 @@ echo -e "${BLUE}── Section 8: Compliance Library API ──${NC}" echo "" _test_compliance_api() { - # Dispatcher list has exactly 14 entries + # Dispatcher list has the expected entries (14 with aiterm, 13 without tm) local count=${#_FLOW_HELP_DISPATCHERS[@]} - if [[ $count -eq 14 ]]; then - assert_pass "dispatcher list has exactly 14 entries" + if [[ $count -eq $_EXPECTED_DISPATCHERS ]]; then + assert_pass "dispatcher list has exactly $_EXPECTED_DISPATCHERS entries" else - assert_fail "dispatcher list has exactly 14 entries" "found $count" + assert_fail "dispatcher list has exactly $_EXPECTED_DISPATCHERS entries" "found $count" fi # Function map has entry for every dispatcher @@ -511,6 +529,7 @@ _test_consistency() { # All dispatchers should have the same section order: # box → MOST COMMON → QUICK EXAMPLES → 📋 sections → TIP → See also for d in g r mcp qu wt v cc tm teach dots sec tok prompt; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue local help_fn="${_FLOW_HELP_FUNCTIONS[$d]}" local output output="$($help_fn 2>&1)" @@ -576,6 +595,7 @@ _test_edge_cases() { # Help output contains no raw FLOW_COLORS references (all converted) for d in prompt dots sec tok cc tm teach; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue local help_fn="${_FLOW_HELP_FUNCTIONS[$d]}" local output output="$($help_fn 2>&1)" @@ -584,6 +604,7 @@ _test_edge_cases() { # Help output contains no literal \033[ (should be rendered as actual ESC) for d in prompt dots sec tok cc tm teach; do + [[ "$d" == tm ]] && (( ! _HAS_AIT )) && continue local help_fn="${_FLOW_HELP_FUNCTIONS[$d]}" local output output="$($help_fn 2>&1)" From 81f8c72c2ab8fde369e7e21afefb35348104d453 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:25:11 -0600 Subject: [PATCH 10/24] fix(cache): zsh-compatible flock fd allocation (Linux-only doctor bug) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit run-all.sh in CI exposed a real Linux-only runtime bug: lib/doctor-cache.zsh and lib/analysis-cache.zsh used bash-only high-fd redirection (`exec 201>`, `exec 200>`). In zsh a literal fd >= 10 is parsed as a COMMAND, so `exec 201>file` errors with "command not found: 201". The flock branch only runs when `flock` exists — true on Linux, false on macOS (which falls back to mkdir locking) — so this only ever broke on hosted CI runners, never locally. Fix: use zsh's dynamic `exec {var}>file` allocation and reference $var on acquire, lock, and release. Verified the syntax in zsh; the old literal form is proven to fail. This is exactly the regression-class the full-suite gate was built to catch (smoke-only CI never ran test-doctor). Co-Authored-By: Claude Opus 4.8 (1M context) --- lib/analysis-cache.zsh | 16 ++++++++++------ lib/doctor-cache.zsh | 18 +++++++++++------- 2 files changed, 21 insertions(+), 13 deletions(-) diff --git a/lib/analysis-cache.zsh b/lib/analysis-cache.zsh index 39714f10f..3bc3b6052 100644 --- a/lib/analysis-cache.zsh +++ b/lib/analysis-cache.zsh @@ -182,9 +182,12 @@ _cache_acquire_lock() { # Create lock file if it doesn't exist touch "$lock_path" 2>/dev/null - # Use flock with timeout - exec 200>"$lock_path" - if ! flock -w "$ANALYSIS_CACHE_LOCK_TIMEOUT" 200 2>/dev/null; then + # Use flock with timeout. zsh requires the dynamic `{var}` form for + # file descriptors >= 10; the literal `exec 200>` is bash-only and + # errors in zsh ("command not found: 200") on Linux — where this flock + # branch runs. macOS lacks flock and uses the mkdir fallback below. + exec {_ANALYSIS_CACHE_LOCK_FD}>"$lock_path" + if ! flock -w "$ANALYSIS_CACHE_LOCK_TIMEOUT" "$_ANALYSIS_CACHE_LOCK_FD" 2>/dev/null; then _flow_log_debug "Failed to acquire cache lock (timeout)" 2>/dev/null return 1 fi @@ -238,9 +241,10 @@ _cache_release_lock() { local lock_path lock_path=$(_cache_get_lock_path "$course_dir") - # Release flock (if using flock) - if command -v flock >/dev/null 2>&1; then - exec 200>&- 2>/dev/null || true + # Release flock (if using flock). Close the dynamically-allocated fd from + # the acquire path (zsh {var} form; see the note there). + if command -v flock >/dev/null 2>&1 && [[ -n "$_ANALYSIS_CACHE_LOCK_FD" ]]; then + exec {_ANALYSIS_CACHE_LOCK_FD}>&- 2>/dev/null || true fi # Remove mkdir-based lock diff --git a/lib/doctor-cache.zsh b/lib/doctor-cache.zsh index 6ad9c1006..2287c6c4c 100644 --- a/lib/doctor-cache.zsh +++ b/lib/doctor-cache.zsh @@ -157,10 +157,13 @@ _doctor_cache_acquire_lock() { # Create lock file if it doesn't exist touch "$lock_path" 2>/dev/null - # Use flock with timeout - # Use file descriptor 201 for doctor cache locks - exec 201>"$lock_path" - if ! flock -w "$DOCTOR_CACHE_LOCK_TIMEOUT" 201 2>/dev/null; then + # Use flock with timeout. zsh requires the dynamic `{var}` form for + # file descriptors >= 10; the literal `exec 201>` is bash-only and + # errors in zsh ("command not found: 201") on Linux — where this flock + # branch actually runs. macOS has no flock and uses the mkdir fallback + # below, which is why this only ever broke on hosted CI runners. + exec {_DOCTOR_CACHE_LOCK_FD}>"$lock_path" + if ! flock -w "$DOCTOR_CACHE_LOCK_TIMEOUT" "$_DOCTOR_CACHE_LOCK_FD" 2>/dev/null; then _flow_log_debug "Failed to acquire cache lock for $key (timeout)" 2>/dev/null return 1 fi @@ -214,9 +217,10 @@ _doctor_cache_release_lock() { local lock_path lock_path=$(_doctor_cache_get_lock_path "$key") - # Release flock (if using flock) - if command -v flock >/dev/null 2>&1; then - exec 201>&- 2>/dev/null || true + # Release flock (if using flock). Close the dynamically-allocated fd from + # _doctor_cache_acquire_lock (zsh {var} form; see the note there). + if command -v flock >/dev/null 2>&1 && [[ -n "$_DOCTOR_CACHE_LOCK_FD" ]]; then + exec {_DOCTOR_CACHE_LOCK_FD}>&- 2>/dev/null || true fi # Remove mkdir-based lock From 706be60cc34fcf10596842d78b563e8cc7968fac Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:25:11 -0600 Subject: [PATCH 11/24] test(ci-gate): deterministic tool-absent skips (cc/em/teach-deploy/teach-doctor) Phase 2, batch 2. Make tool-dependent cases skip cleanly when the tool is absent (CI runner), preserving full coverage when present: - test-cc-dispatcher: gate 2 cases that exec `claude` (HERE path). - e2e-em-dispatcher: bound the IMAP/himalaya check with `timeout` and exit 77 when unreachable (real cause was a HANG, not a missing binary). - dogfood-teach-doctor-v2: gate the renv.lock case on `R`. - teach-deploy-v2 (unit/integration/dogfood/e2e): gate on `yq` (the deploy history helpers parse YAML via yq), exit 77 when absent. Verified both ways (tool present AND hidden via PATH sandbox): identical pass counts with the tool, clean skip/exit-0 without. Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/dogfood-teach-deploy-v2.zsh | 1 + tests/dogfood-teach-doctor-v2.zsh | 4 ++++ tests/e2e-em-dispatcher.zsh | 19 ++++++++++++++----- tests/e2e-teach-deploy-v2.zsh | 4 ++++ tests/test-cc-dispatcher.zsh | 14 ++++++++++++++ tests/test-teach-deploy-v2-integration.zsh | 4 ++++ tests/test-teach-deploy-v2-unit.zsh | 4 ++++ 7 files changed, 45 insertions(+), 5 deletions(-) diff --git a/tests/dogfood-teach-deploy-v2.zsh b/tests/dogfood-teach-deploy-v2.zsh index ed52e51ba..c340ba122 100755 --- a/tests/dogfood-teach-deploy-v2.zsh +++ b/tests/dogfood-teach-deploy-v2.zsh @@ -494,6 +494,7 @@ echo "" echo "${CYAN}--- Section 5: Deploy Rollback Helpers ---${RESET}" run_test "Rollback in CI mode without index returns error" ' + [[ "$_YQ_AVAILABLE" == "true" ]] || return 77 local tmpdir=$(mktemp -d) _DOGFOOD_TEMP_DIRS+=("$tmpdir") ( diff --git a/tests/dogfood-teach-doctor-v2.zsh b/tests/dogfood-teach-doctor-v2.zsh index 08172abdf..4a54381ce 100644 --- a/tests/dogfood-teach-doctor-v2.zsh +++ b/tests/dogfood-teach-doctor-v2.zsh @@ -367,6 +367,10 @@ run_test "--verbose shows full check header" ' ' run_test "--verbose shows renv.lock age detail (if renv present)" ' + # The renv.lock age detail is only emitted after the R-availability check + # passes (teach-doctor-impl.zsh returns early when R is absent), so skip + # cleanly on CI runners where R is not installed. + command -v R >/dev/null 2>&1 || return 77 # Create minimal renv setup echo "{\"R\":{\"Version\":\"4.4.2\"},\"Packages\":{}}" > renv.lock mkdir -p renv diff --git a/tests/e2e-em-dispatcher.zsh b/tests/e2e-em-dispatcher.zsh index dbf65590e..59a4d5ab7 100644 --- a/tests/e2e-em-dispatcher.zsh +++ b/tests/e2e-em-dispatcher.zsh @@ -91,18 +91,27 @@ test_himalaya_binary() { } run_test "himalaya binary exists" "test_himalaya_binary" +# _em_hml_check reaches the configured IMAP account over the network. +# On CI (or any host without a reachable account) this can block forever, +# so bound it. A timeout (rc 124) or any failure => himalaya is not usable +# for the live suite; skip everything below rather than hang. +# (run_test runs the func in a command-substitution subshell, so we can't +# set a flag from inside it — track skips via TESTS_SKIPPED instead.) +_skipped_before_cfg=$TESTS_SKIPPED test_himalaya_configured() { - if _em_hml_check >/dev/null 2>&1; then + if timeout 10 zsh -c "FLOW_QUIET=1 FLOW_ATLAS_ENABLED=no source '$PROJECT_ROOT/flow.plugin.zsh' 2>/dev/null; _em_hml_check >/dev/null 2>&1"; then return 0 else - echo "himalaya not configured" + echo "himalaya not configured (or account unreachable)" exit 77 fi } run_test "himalaya configured" "test_himalaya_configured" -# If prerequisites failed, exit now -if [[ $TESTS_FAILED -gt 0 || $TESTS_SKIPPED -eq $TESTS_RUN ]]; then +# If prerequisites failed, or the configured-check skipped (timeout / no +# reachable account), exit now. Every test below makes live IMAP calls, so +# continuing without a usable account would hang the runner. +if [[ $TESTS_FAILED -gt 0 || $TESTS_SKIPPED -gt $_skipped_before_cfg ]]; then echo "" echo "${YELLOW}Prerequisites not met, skipping remaining tests${RESET}" exit 77 @@ -204,7 +213,7 @@ echo "${CYAN}Section 4: Email Reading${RESET}" FIRST_EMAIL_ID="" TESTS_RUN=$((TESTS_RUN + 1)) echo -n " ${CYAN}[$TESTS_RUN] get first email ID...${RESET} " -_e2e_email_data=$(_em_hml_list INBOX 1 2>/dev/null) +_e2e_email_data=$(timeout 15 zsh -c "FLOW_QUIET=1 FLOW_ATLAS_ENABLED=no source '$PROJECT_ROOT/flow.plugin.zsh' 2>/dev/null; _em_hml_list INBOX 1 2>/dev/null") if [[ -n "$_e2e_email_data" ]]; then FIRST_EMAIL_ID=$(echo "$_e2e_email_data" | jq -r '.[0].id // empty' 2>/dev/null) fi diff --git a/tests/e2e-teach-deploy-v2.zsh b/tests/e2e-teach-deploy-v2.zsh index f7e83730b..2abb12bc6 100755 --- a/tests/e2e-teach-deploy-v2.zsh +++ b/tests/e2e-teach-deploy-v2.zsh @@ -45,6 +45,10 @@ source "$PROJECT_ROOT/lib/deploy-history-helpers.zsh" 2>/dev/null || true source "$PROJECT_ROOT/lib/deploy-rollback-helpers.zsh" 2>/dev/null || true source "$PROJECT_ROOT/lib/dispatchers/teach-deploy-enhanced.zsh" 2>/dev/null || true +# Deploy history/rollback/preflight helpers require yq to read/write YAML. +# On CI runners where yq is absent, skip the whole suite cleanly (exit 77). +command -v yq >/dev/null 2>&1 || { echo "SKIP: yq not installed"; exit 77; } + # Stub missing functions if ! typeset -f _teach_error >/dev/null 2>&1; then _teach_error() { echo "ERROR: $1" >&2; } diff --git a/tests/test-cc-dispatcher.zsh b/tests/test-cc-dispatcher.zsh index 0664a990c..3272ad6b1 100755 --- a/tests/test-cc-dispatcher.zsh +++ b/tests/test-cc-dispatcher.zsh @@ -504,6 +504,13 @@ test_shortcut_h_expands_to_haiku() { test_explicit_here_dot() { test_case "cc . recognized as explicit HERE" + # Requires the claude binary: HERE target execs `claude` directly. + # When absent (CI runner), zsh prints "command not found" -> skip cleanly. + if ! command -v claude >/dev/null 2>&1; then + test_skip "claude not installed" + return + fi + # The . should be recognized as HERE target local output=$(cc . --help 2>&1 || echo "error") @@ -518,6 +525,13 @@ test_explicit_here_dot() { test_explicit_here_word() { test_case "cc here recognized as explicit HERE" + # Requires the claude binary: HERE target execs `claude` directly. + # When absent (CI runner), zsh prints "command not found" -> skip cleanly. + if ! command -v claude >/dev/null 2>&1; then + test_skip "claude not installed" + return + fi + local output=$(cc here --help 2>&1 || echo "error") if [[ "$output" != "error" ]]; then diff --git a/tests/test-teach-deploy-v2-integration.zsh b/tests/test-teach-deploy-v2-integration.zsh index cbc35243d..72d7523a6 100755 --- a/tests/test-teach-deploy-v2-integration.zsh +++ b/tests/test-teach-deploy-v2-integration.zsh @@ -45,6 +45,10 @@ source "$PROJECT_ROOT/lib/deploy-history-helpers.zsh" 2>/dev/null || true source "$PROJECT_ROOT/lib/deploy-rollback-helpers.zsh" 2>/dev/null || true source "$PROJECT_ROOT/lib/dispatchers/teach-deploy-enhanced.zsh" 2>/dev/null || true +# Deploy history/rollback/preflight helpers require yq to read/write YAML. +# On CI runners where yq is absent, skip the whole suite cleanly (exit 77). +command -v yq >/dev/null 2>&1 || { echo "SKIP: yq not installed"; exit 77; } + # Stub/override functions for test isolation # These MUST be set AFTER sourcing libs to override real implementations _teach_error() { echo "ERROR: $1" >&2; } diff --git a/tests/test-teach-deploy-v2-unit.zsh b/tests/test-teach-deploy-v2-unit.zsh index a2ffea322..9872a344d 100755 --- a/tests/test-teach-deploy-v2-unit.zsh +++ b/tests/test-teach-deploy-v2-unit.zsh @@ -45,6 +45,10 @@ source "$PROJECT_ROOT/lib/deploy-history-helpers.zsh" 2>/dev/null || true source "$PROJECT_ROOT/lib/deploy-rollback-helpers.zsh" 2>/dev/null || true source "$PROJECT_ROOT/lib/dispatchers/teach-deploy-enhanced.zsh" 2>/dev/null || true +# Deploy history/rollback/preflight helpers require yq to read/write YAML. +# On CI runners where yq is absent, skip the whole suite cleanly (exit 77). +command -v yq >/dev/null 2>&1 || { echo "SKIP: yq not installed"; exit 77; } + # Stub functions that may not be available if ! typeset -f _teach_error >/dev/null 2>&1; then _teach_error() { echo "ERROR: $1" >&2; } From 2aca392dc7aee03a7f97c161292a52e037c0a8a1 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:25:11 -0600 Subject: [PATCH 12/24] ci(tmp): diagnose 3 suites that fail on CI but pass locally Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 963fcf274..95b8a5320 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -62,6 +62,39 @@ jobs: echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY + # --------------------------------------------------------------------------- + # TEMPORARY (ci-full-suite-gate Phase 2): capture real runner output for the + # 3 suites that fail on CI but pass locally even with tools hidden. DELETE. + # --------------------------------------------------------------------------- + diag-undiagnosed: + name: Diagnose undiagnosed suites (temporary) + runs-on: ubuntu-latest + continue-on-error: true + steps: + - name: Checkout code + uses: actions/checkout@v6 + - name: Ensure zsh + run: command -v zsh >/dev/null 2>&1 || { sudo apt-get update && sudo apt-get install -y zsh; } + - name: Create mock project structure + run: | + mkdir -p ~/projects/dev-tools/flow-cli/.git + mkdir -p ~/projects/r-packages/active/mediationverse/.git + mkdir -p ~/projects/r-packages/stable/rmediation/.git + mkdir -p ~/projects/teaching/stat-440/.git + mkdir -p ~/projects/research/mediation-planning/.git + mkdir -p ~/projects/quarto/manuscripts/paper1/.git + mkdir -p ~/projects/apps/examify/.git + cp -r . ~/projects/dev-tools/flow-cli/ + - name: test-em-dispatcher + if: always() + run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/test-em-dispatcher.zsh + - name: dogfood-em-dispatcher + if: always() + run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/dogfood-em-dispatcher.zsh + - name: dogfood-atlas-bridge + if: always() + run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/dogfood-atlas-bridge.zsh + # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. # Non-blocking on purpose (continue-on-error) — this job exists to capture the From 3cb39097be946f09c621fe76fdf857dee4e69bd5 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:28:33 -0600 Subject: [PATCH 13/24] test(ci-gate): determinism for atlas-PRESENT skew (local-only failures) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit These two suites pass on CI (atlas absent) but FAILED locally (atlas installed) — the inverse skew the spec flagged. Acceptance criterion: the suite must be green locally whether or not atlas is installed. - e2e-core-commands: export FLOW_ATLAS_ENABLED=no before sourcing so `status` and `catch` exercise flow-cli's standalone fallback (with atlas installed they delegate to the binary, flipping [1]/[7]). - test-atlas-contract: add skip_without_warm_atlas() — the `atlas` on PATH may be a different/older binary whose stats/parked/trail/-v return 127; route the 4 warm-path/exit-code contract tests through it so they skip unless a flow-compatible atlas actually implements them. Verified both ways: e2e-core-commands 22/22 with atlas and without; test-atlas-contract 14/18 (4 skip) with atlas, 11/18 without — all exit 0. Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/e2e-core-commands.zsh | 7 +++++++ tests/test-atlas-contract.zsh | 23 +++++++++++++++++++---- 2 files changed, 26 insertions(+), 4 deletions(-) diff --git a/tests/e2e-core-commands.zsh b/tests/e2e-core-commands.zsh index b6126d931..796fb8d61 100644 --- a/tests/e2e-core-commands.zsh +++ b/tests/e2e-core-commands.zsh @@ -57,6 +57,13 @@ echo "" # Load plugin FLOW_QUIET=1 FLOW_PLUGIN_DIR="$PROJECT_ROOT" +# Pin standalone mode: this suite asserts flow-cli's built-in fallback +# behavior for `status` and `catch`. With atlas installed those commands +# delegate to the atlas binary instead, flipping the result based on whether +# atlas happens to be on PATH (passes on CI where atlas is absent, fails on a +# dev box where it isn't). Forcing atlas off makes the suite deterministic +# everywhere — independent of whether atlas is installed. +export FLOW_ATLAS_ENABLED=no source "$PROJECT_ROOT/flow.plugin.zsh" 2>/dev/null || { echo "${RED}Failed to load plugin${RESET}" exit 1 diff --git a/tests/test-atlas-contract.zsh b/tests/test-atlas-contract.zsh index 7887f234e..33192e6a9 100755 --- a/tests/test-atlas-contract.zsh +++ b/tests/test-atlas-contract.zsh @@ -28,6 +28,21 @@ skip_without_atlas() { return 1 } +# Helper: skip warm-path/exit-code contract tests unless atlas actually +# implements flow-cli's expected subcommands. A same-named `atlas` binary may +# be on PATH (e.g. a different/older atlas) whose `stats`/`parked`/`trail`/`-v` +# return 127; those tests would then fail not because the contract is broken +# but because this isn't a flow-compatible atlas. Probe `atlas stats` once. +# Returns 0 (true) when skipped, 1 (false) when a functional atlas is present. +skip_without_warm_atlas() { + skip_without_atlas && return 0 + if ! atlas stats >/dev/null 2>&1; then + test_skip "atlas present but warm-path subcommands unimplemented" + return 0 + fi + return 1 +} + # ============================================================================ # BRIDGE FUNCTION TESTS (always run — these test flow-cli code) # ============================================================================ @@ -175,7 +190,7 @@ if ! skip_without_atlas; then fi test_case "atlas exit codes: success = 0" -if ! skip_without_atlas; then +if ! skip_without_warm_atlas; then atlas -v >/dev/null 2>&1 assert_exit_code $? 0 test_pass @@ -194,7 +209,7 @@ if ! skip_without_atlas; then fi test_case "Warm-path: atlas stats responds" -if ! skip_without_atlas; then +if ! skip_without_warm_atlas; then atlas stats >/dev/null 2>&1 local ec=$? assert_exit_code $ec 0 @@ -202,7 +217,7 @@ if ! skip_without_atlas; then fi test_case "Warm-path: atlas parked responds" -if ! skip_without_atlas; then +if ! skip_without_warm_atlas; then atlas parked >/dev/null 2>&1 local ec=$? assert_exit_code $ec 0 @@ -210,7 +225,7 @@ if ! skip_without_atlas; then fi test_case "Warm-path: atlas trail responds" -if ! skip_without_atlas; then +if ! skip_without_warm_atlas; then atlas trail >/dev/null 2>&1 local ec=$? assert_exit_code $ec 0 From 3af3607818d8b6221e3f2c8ca7aaaf651851c28c Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:33:27 -0600 Subject: [PATCH 14/24] fix(em-cache): portable file mtime (stat -f is macOS-only) + atlas-bridge tm gate MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two more tool-absent-skew / cross-platform fixes surfaced by the CI gate: - lib/em-cache.zsh: replaced macOS-only `stat -f %m`/`stat -f '%m %N'` with a portable `_em_cache_mtime` helper (BSD `stat -f` then GNU `stat -c %Y`). On Linux the bare `stat -f` failed → mtime read as 0 → every entry looked expired → cache get/prune/cap returned empty. The email cache never worked on Linux. (Real product bug, caught by test-em/dogfood-em cache round-trip.) - tests/dogfood-atlas-bridge.zsh: the "at() coexists with all 14 dispatchers" case failed because `tm` isn't a function without aiterm (ait). Gate tm on `command -v ait` (same pattern as the other dispatcher-enumeration suites) — this was a tm/ait issue, not atlas. Verified locally: dogfood-atlas-bridge 29/29 with AND without ait; em suites 108/108 + 159/159 (macOS/BSD path); mtime helper returns a real epoch. Co-Authored-By: Claude Opus 4.8 (1M context) --- lib/em-cache.zsh | 19 +++++++++++++++---- tests/dogfood-atlas-bridge.zsh | 3 +++ 2 files changed, 18 insertions(+), 4 deletions(-) diff --git a/lib/em-cache.zsh b/lib/em-cache.zsh index 5ecb6e3cc..3f012c328 100644 --- a/lib/em-cache.zsh +++ b/lib/em-cache.zsh @@ -56,6 +56,14 @@ _em_cache_key() { echo "$msg_id" | md5 -q 2>/dev/null || echo "$msg_id" | md5sum 2>/dev/null | cut -d' ' -f1 } +_em_cache_mtime() { + # Portable file mtime in epoch seconds: BSD `stat -f %m` (macOS) then GNU + # `stat -c %Y` (Linux). The bare `stat -f %m` is macOS-only and FAILS on + # Linux/CI runners, where it would silently yield 0 and make EVERY cache + # entry look expired — the email cache never worked on Linux. + stat -f %m "$1" 2>/dev/null || stat -c %Y "$1" 2>/dev/null || echo 0 +} + # ═══════════════════════════════════════════════════════════════════ # CACHE READ/WRITE # ═══════════════════════════════════════════════════════════════════ @@ -74,7 +82,7 @@ _em_cache_get() { # Check TTL local ttl="${_EM_CACHE_TTL[$operation]:-3600}" local file_mod - file_mod=$(stat -f %m "$cache_file" 2>/dev/null || echo 0) + file_mod=$(_em_cache_mtime "$cache_file") local now=$(date +%s) local file_age=$(( now - file_mod )) @@ -153,7 +161,7 @@ _em_cache_prune() { ttl="${_EM_CACHE_TTL[$op_name]:-3600}" for cache_file in "$op_dir"/*.txt(N); do - file_mod=$(stat -f %m "$cache_file" 2>/dev/null || echo 0) + file_mod=$(_em_cache_mtime "$cache_file") if (( (now - file_mod) > ttl )); then rm -f "$cache_file" (( pruned++ )) @@ -185,7 +193,10 @@ _em_cache_enforce_cap() { # Evict oldest files first (LRU) local evicted=0 local files_by_age - files_by_age=("${(@f)$(find "$cache_base" -name '*.txt' -print0 2>/dev/null | xargs -0 stat -f '%m %N' 2>/dev/null | sort -n | awk '{print $2}')}") + files_by_age=("${(@f)$( + find "$cache_base" -name '*.txt' 2>/dev/null | while IFS= read -r _f; do + print -r -- "$(_em_cache_mtime "$_f") $_f" + done | sort -n | awk '{print $2}')}") for old_file in "${files_by_age[@]}"; do [[ -z "$old_file" ]] && continue @@ -229,7 +240,7 @@ _em_cache_stats() { for cache_file in "$op_dir"/*.txt(N); do (( count++ )) - file_mod=$(stat -f %m "$cache_file" 2>/dev/null || echo 0) + file_mod=$(_em_cache_mtime "$cache_file") (( (now - file_mod) > ttl )) && (( expired++ )) done diff --git a/tests/dogfood-atlas-bridge.zsh b/tests/dogfood-atlas-bridge.zsh index 5fffbca4b..d09fee80e 100644 --- a/tests/dogfood-atlas-bridge.zsh +++ b/tests/dogfood-atlas-bridge.zsh @@ -250,6 +250,9 @@ run_test "Plugin loads without stderr when Atlas disabled" ' run_test "at() coexists with all 14 dispatchers" ' local all_ok=true for d in g mcp qu r cc tm wt dots sec tok teach prompt v em; do + # tm degrades to an alias when aiterm (ait) is absent (CI runners) — + # only assert it is a function when ait is installed. + [[ "$d" == tm ]] && ! command -v ait >/dev/null 2>&1 && continue typeset -f "$d" >/dev/null 2>&1 || { echo "Missing: $d"; all_ok=false; } done typeset -f at >/dev/null 2>&1 || { echo "Missing: at"; all_ok=false; } From 26db3414ce895bdce179a947d33bbd9a39341a78 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:33:58 -0600 Subject: [PATCH 15/24] ci(tmp): re-point diagnostic at teach-deploy + env fingerprint Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 95b8a5320..f97fca54b 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -63,11 +63,12 @@ jobs: echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY # --------------------------------------------------------------------------- - # TEMPORARY (ci-full-suite-gate Phase 2): capture real runner output for the - # 3 suites that fail on CI but pass locally even with tools hidden. DELETE. + # TEMPORARY (ci-full-suite-gate Phase 2): teach-deploy suites still fail on CI + # with the yq guard, so yq must be PRESENT and the cause is elsewhere. Capture + # the real runner output + environment fingerprint. DELETE. # --------------------------------------------------------------------------- diag-undiagnosed: - name: Diagnose undiagnosed suites (temporary) + name: Diagnose teach-deploy (temporary) runs-on: ubuntu-latest continue-on-error: true steps: @@ -75,25 +76,22 @@ jobs: uses: actions/checkout@v6 - name: Ensure zsh run: command -v zsh >/dev/null 2>&1 || { sudo apt-get update && sudo apt-get install -y zsh; } + - name: Environment fingerprint + run: | + echo "yq: $(command -v yq || echo MISSING) | $(yq --version 2>&1 | head -1)" + echo "git user.name=[$(git config --global user.name)] user.email=[$(git config --global user.email)]" + echo "R: $(command -v R || echo MISSING) | quarto: $(command -v quarto || echo MISSING) | rsync: $(command -v rsync || echo MISSING)" - name: Create mock project structure run: | mkdir -p ~/projects/dev-tools/flow-cli/.git - mkdir -p ~/projects/r-packages/active/mediationverse/.git - mkdir -p ~/projects/r-packages/stable/rmediation/.git mkdir -p ~/projects/teaching/stat-440/.git - mkdir -p ~/projects/research/mediation-planning/.git - mkdir -p ~/projects/quarto/manuscripts/paper1/.git - mkdir -p ~/projects/apps/examify/.git cp -r . ~/projects/dev-tools/flow-cli/ - - name: test-em-dispatcher - if: always() - run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/test-em-dispatcher.zsh - - name: dogfood-em-dispatcher + - name: test-teach-deploy-v2-unit if: always() - run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/dogfood-em-dispatcher.zsh - - name: dogfood-atlas-bridge + run: cd ~/projects/dev-tools/flow-cli && timeout 90 zsh ./tests/test-teach-deploy-v2-unit.zsh + - name: dogfood-teach-deploy-v2 if: always() - run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/dogfood-atlas-bridge.zsh + run: cd ~/projects/dev-tools/flow-cli && timeout 90 zsh ./tests/dogfood-teach-deploy-v2.zsh # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. From cedb34477a32fe58681cba97bfbeb83dc434e70c Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:36:49 -0600 Subject: [PATCH 16/24] ci: provision git identity for full-suite + diagnostic jobs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit teach-deploy suites run 'git commit' (direct deploy, history, back-merge) which fails with 'empty ident' on a fresh runner (git user.name/email unset). Configure a CI git identity — environment provisioning, not a test change. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index f97fca54b..4f6d32586 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -76,6 +76,11 @@ jobs: uses: actions/checkout@v6 - name: Ensure zsh run: command -v zsh >/dev/null 2>&1 || { sudo apt-get update && sudo apt-get install -y zsh; } + - name: Configure git identity + run: | + git config --global user.email "ci@flow-cli.test" + git config --global user.name "flow-cli CI" + git config --global init.defaultBranch main - name: Environment fingerprint run: | echo "yq: $(command -v yq || echo MISSING) | $(yq --version 2>&1 | head -1)" @@ -115,6 +120,15 @@ jobs: sudo apt-get update && sudo apt-get install -y zsh fi + - name: Configure git identity + run: | + # Fresh runners have no git identity; suites that exercise the deploy + # workflow (teach-deploy) run `git commit` and fail with "empty ident" + # without it. This is CI environment provisioning, not a test change. + git config --global user.email "ci@flow-cli.test" + git config --global user.name "flow-cli CI" + git config --global init.defaultBranch main + - name: Record start time id: start run: echo "time=$(date +%s)" >> $GITHUB_OUTPUT From f80e79c644da14280eaffcd0a7c4f48fc06e7635 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:43:46 -0600 Subject: [PATCH 17/24] fix(teaching-utils): portable date parsing (date -j is macOS-only) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit teach-deploy [42] "teaching_week from start_date" failed on Linux: the date math used `date -j -f` (BSD/macOS only), which fails on GNU date → empty epoch → week calc returns 0. Added _teach_date_to_epoch/_teach_epoch_to_date helpers (BSD `date -j -f` then GNU `date -d`) and routed all 6 date conversions through them. macOS behavior unchanged (BSD form wins first); Linux now works. Also re-point the temporary diagnostic at the em-cache suites + a stat/md5 probe to find why em-cache still fails on Linux after the stat -f fix. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 16 ++++++++++++---- lib/teaching-utils.zsh | 24 ++++++++++++++++++------ 2 files changed, 30 insertions(+), 10 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 4f6d32586..37848b288 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -91,12 +91,20 @@ jobs: mkdir -p ~/projects/dev-tools/flow-cli/.git mkdir -p ~/projects/teaching/stat-440/.git cp -r . ~/projects/dev-tools/flow-cli/ - - name: test-teach-deploy-v2-unit + - name: stat probe (why em-cache fails on Linux) if: always() - run: cd ~/projects/dev-tools/flow-cli && timeout 90 zsh ./tests/test-teach-deploy-v2-unit.zsh - - name: dogfood-teach-deploy-v2 + run: | + f=$(mktemp); echo hi > "$f" + echo "stat -f %m => [$(stat -f %m "$f" 2>&1)] rc=$?" + echo "stat -c %Y => [$(stat -c %Y "$f" 2>&1)] rc=$?" + echo "md5 => [$(echo x | md5 2>&1 | head -1)]" + echo "md5sum => [$(echo x | md5sum 2>&1 | head -1)]" + - name: test-em-dispatcher + if: always() + run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/test-em-dispatcher.zsh + - name: dogfood-em-dispatcher if: always() - run: cd ~/projects/dev-tools/flow-cli && timeout 90 zsh ./tests/dogfood-teach-deploy-v2.zsh + run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/dogfood-em-dispatcher.zsh # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. diff --git a/lib/teaching-utils.zsh b/lib/teaching-utils.zsh index 37a7f94f1..bcd113576 100644 --- a/lib/teaching-utils.zsh +++ b/lib/teaching-utils.zsh @@ -2,6 +2,18 @@ # Teaching workflow utility functions # Part of Increment 2: Course Context +# Portable date helpers. `date -j -f` is macOS (BSD) only and FAILS on Linux/CI +# runners, where it would silently yield an empty epoch and make every +# teaching-week / semester calculation return 0. Try BSD first, then GNU `date`. +_teach_date_to_epoch() { + # YYYY-MM-DD -> epoch seconds + date -j -f "%Y-%m-%d" "$1" "+%s" 2>/dev/null || date -d "$1" "+%s" 2>/dev/null +} +_teach_epoch_to_date() { + # epoch seconds -> YYYY-MM-DD + date -j -f "%s" "$1" "+%Y-%m-%d" 2>/dev/null || date -d "@$1" "+%Y-%m-%d" 2>/dev/null +} + # ============================================================================= # Function: _calculate_current_week # Purpose: Calculate current week number from semester start date @@ -40,7 +52,7 @@ _calculate_current_week() { fi # Calculate weeks since start (macOS date compatible) - local start_epoch=$(date -j -f "%Y-%m-%d" "$start_date" "+%s" 2>/dev/null) + local start_epoch=$(_teach_date_to_epoch "$start_date") local now_epoch=$(date "+%s") if [[ -z "$start_epoch" ]]; then @@ -174,8 +186,8 @@ _date_to_week() { return 0 fi - local start_epoch=$(date -j -f "%Y-%m-%d" "$start_date" "+%s" 2>/dev/null) - local target_epoch=$(date -j -f "%Y-%m-%d" "$target_date" "+%s" 2>/dev/null) + local start_epoch=$(_teach_date_to_epoch "$start_date") + local target_epoch=$(_teach_date_to_epoch "$target_date") if [[ -z "$start_epoch" || -z "$target_epoch" ]]; then return 0 @@ -218,7 +230,7 @@ _validate_date_format() { fi # Verify it's a real date - if ! date -j -f "%Y-%m-%d" "$date_str" "+%s" &>/dev/null; then + if ! _teach_date_to_epoch "$date_str" >/dev/null 2>&1 || [[ -z "$(_teach_date_to_epoch "$date_str")" ]]; then return 1 fi @@ -250,7 +262,7 @@ _validate_date_format() { _calculate_semester_end() { local start_date="$1" - local start_epoch=$(date -j -f "%Y-%m-%d" "$start_date" "+%s" 2>/dev/null) + local start_epoch=$(_teach_date_to_epoch "$start_date") if [[ -z "$start_epoch" ]]; then return 1 @@ -258,7 +270,7 @@ _calculate_semester_end() { # Add 16 weeks (112 days) local end_epoch=$((start_epoch + (16 * 7 * 86400))) - local end_date=$(date -j -f "%s" "$end_epoch" "+%Y-%m-%d" 2>/dev/null) + local end_date=$(_teach_epoch_to_date "$end_epoch") echo "$end_date" } From c10b7d005116fce733009107d0d6e31a39564d62 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:47:08 -0600 Subject: [PATCH 18/24] =?UTF-8?q?fix(cache):=20GNU-first=20stat=20order=20?= =?UTF-8?q?=E2=80=94=20BSD-first=20corrupts=20mtime=20on=20Linux?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The em-cache stat fix was still broken on Linux. `stat -f %m FILE` there treats -f as --file-system and prints a filesystem block for FILE to stdout while erroring on the `%m` operand, so a BSD-first `stat -f %m || stat -c %Y` captures BOTH outputs → garbage mtime → cache always looks expired → email cache get/prune/cap return empty (test-em/dogfood-em cache round-trip FAIL). Fix: try GNU `stat -c %Y` FIRST (works cleanly on Linux), fall back to BSD `stat -f %m` (which fails cleanly on macOS with an illegal-option error and empty stdout). Verified the order is correct on both platforms via a CI stat probe and locally. Applied the same swap to the 5 other BSD-first stat sites (teach-doctor-impl x4, teach-dispatcher x1) to prevent the same latent bug. Local (macOS, GNU-first): test-em 108/108, dogfood-em 159/159, dogfood-teach-doctor-v2 43/43. Co-Authored-By: Claude Opus 4.8 (1M context) --- lib/dispatchers/teach-dispatcher.zsh | 2 +- lib/dispatchers/teach-doctor-impl.zsh | 8 ++++---- lib/em-cache.zsh | 14 +++++++++----- 3 files changed, 14 insertions(+), 10 deletions(-) diff --git a/lib/dispatchers/teach-dispatcher.zsh b/lib/dispatchers/teach-dispatcher.zsh index 6c7dc1f36..b7b8c009b 100644 --- a/lib/dispatchers/teach-dispatcher.zsh +++ b/lib/dispatchers/teach-dispatcher.zsh @@ -4880,7 +4880,7 @@ _teach_show_status_full() { # Find most recent backup local recent=$(_teach_list_backups "$content_dir" | head -1) if [[ -n "$recent" ]]; then - local backup_time=$(stat -f %m "$recent" 2>/dev/null || stat -c %Y "$recent" 2>/dev/null) + local backup_time=$(stat -c %Y "$recent" 2>/dev/null || stat -f %m "$recent" 2>/dev/null) if [[ "$backup_time" -gt "$latest_backup_time" ]]; then latest_backup_time=$backup_time latest_backup=$(basename "$recent") diff --git a/lib/dispatchers/teach-doctor-impl.zsh b/lib/dispatchers/teach-doctor-impl.zsh index 31cb7023f..0b64ba1d2 100644 --- a/lib/dispatchers/teach-doctor-impl.zsh +++ b/lib/dispatchers/teach-doctor-impl.zsh @@ -346,7 +346,7 @@ _teach_doctor_check_r_quick() { # renv.lock freshness if [[ "$verbose" == "true" && "$quiet" == "false" && "$json" == "false" ]]; then local lock_mtime - lock_mtime=$(stat -f %m renv.lock 2>/dev/null || stat -c %Y renv.lock 2>/dev/null) + lock_mtime=$(stat -c %Y renv.lock 2>/dev/null || stat -f %m renv.lock 2>/dev/null) if [[ -n "$lock_mtime" ]]; then local age_days=$(( (EPOCHSECONDS - lock_mtime) / 86400 )) if [[ $age_days -eq 0 ]]; then @@ -751,7 +751,7 @@ _teach_doctor_check_r_packages() { # Lock file freshness local lock_mtime - lock_mtime=$(stat -f %m renv.lock 2>/dev/null || stat -c %Y renv.lock 2>/dev/null) + lock_mtime=$(stat -c %Y renv.lock 2>/dev/null || stat -f %m renv.lock 2>/dev/null) if [[ -n "$lock_mtime" ]]; then local age_days=$(( (EPOCHSECONDS - lock_mtime) / 86400 )) if [[ $age_days -eq 0 ]]; then @@ -1076,13 +1076,13 @@ _teach_doctor_check_macros() { if [[ "$macros_configured" == "true" ]]; then if [[ -f "$cache_file" ]]; then local cache_mtime=0 - cache_mtime=$(stat -f %m "$cache_file" 2>/dev/null || stat -c %Y "$cache_file" 2>/dev/null || echo 0) + cache_mtime=$(stat -c %Y "$cache_file" 2>/dev/null || stat -f %m "$cache_file" 2>/dev/null || echo 0) local stale=0 for src in "${sources[@]}"; do if [[ -f "$src" ]]; then local src_mtime - src_mtime=$(stat -f %m "$src" 2>/dev/null || stat -c %Y "$src" 2>/dev/null || echo 0) + src_mtime=$(stat -c %Y "$src" 2>/dev/null || stat -f %m "$src" 2>/dev/null || echo 0) if (( src_mtime > cache_mtime )); then stale=1 break diff --git a/lib/em-cache.zsh b/lib/em-cache.zsh index 3f012c328..edd03f96b 100644 --- a/lib/em-cache.zsh +++ b/lib/em-cache.zsh @@ -57,11 +57,15 @@ _em_cache_key() { } _em_cache_mtime() { - # Portable file mtime in epoch seconds: BSD `stat -f %m` (macOS) then GNU - # `stat -c %Y` (Linux). The bare `stat -f %m` is macOS-only and FAILS on - # Linux/CI runners, where it would silently yield 0 and make EVERY cache - # entry look expired — the email cache never worked on Linux. - stat -f %m "$1" 2>/dev/null || stat -c %Y "$1" 2>/dev/null || echo 0 + # Portable file mtime in epoch seconds. GNU `stat -c %Y` (Linux) is tried + # FIRST, BSD `stat -f %m` (macOS) second — order matters: on Linux + # `stat -f %m FILE` treats `-f` as --file-system and prints a filesystem + # block for FILE to stdout while erroring on `%m`, so a BSD-first chain + # captures BOTH outputs and corrupts the mtime (cache then looks expired — + # the email cache silently never worked on Linux). `stat -c %Y` fails + # cleanly on macOS (illegal option, empty stdout), so GNU-first is safe + # on both platforms. + stat -c %Y "$1" 2>/dev/null || stat -f %m "$1" 2>/dev/null || echo 0 } # ═══════════════════════════════════════════════════════════════════ From 54a4be3b871be9c72e6a5f30356f1fd0c862006a Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:50:05 -0600 Subject: [PATCH 19/24] fix(teach-deploy): portable teaching_week date; remove temp diagnostic job MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit dogfood-teach-deploy-v2 [42] "Status update calculates teaching_week" used a bare macOS `date -j -f` in the deploy status-update path (teach-deploy-enhanced.zsh) — added the GNU `date -d` fallback (the deploy path is separate from teaching-utils.zsh, fixed earlier). Also removes the temporary diagnostic job from test.yml now that all CI-only failures are diagnosed and fixed. Workflow is back to zsh-tests + full-suite. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 44 ----------------------- lib/dispatchers/teach-deploy-enhanced.zsh | 3 +- 2 files changed, 2 insertions(+), 45 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 37848b288..c22363f78 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -62,50 +62,6 @@ jobs: echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY - # --------------------------------------------------------------------------- - # TEMPORARY (ci-full-suite-gate Phase 2): teach-deploy suites still fail on CI - # with the yq guard, so yq must be PRESENT and the cause is elsewhere. Capture - # the real runner output + environment fingerprint. DELETE. - # --------------------------------------------------------------------------- - diag-undiagnosed: - name: Diagnose teach-deploy (temporary) - runs-on: ubuntu-latest - continue-on-error: true - steps: - - name: Checkout code - uses: actions/checkout@v6 - - name: Ensure zsh - run: command -v zsh >/dev/null 2>&1 || { sudo apt-get update && sudo apt-get install -y zsh; } - - name: Configure git identity - run: | - git config --global user.email "ci@flow-cli.test" - git config --global user.name "flow-cli CI" - git config --global init.defaultBranch main - - name: Environment fingerprint - run: | - echo "yq: $(command -v yq || echo MISSING) | $(yq --version 2>&1 | head -1)" - echo "git user.name=[$(git config --global user.name)] user.email=[$(git config --global user.email)]" - echo "R: $(command -v R || echo MISSING) | quarto: $(command -v quarto || echo MISSING) | rsync: $(command -v rsync || echo MISSING)" - - name: Create mock project structure - run: | - mkdir -p ~/projects/dev-tools/flow-cli/.git - mkdir -p ~/projects/teaching/stat-440/.git - cp -r . ~/projects/dev-tools/flow-cli/ - - name: stat probe (why em-cache fails on Linux) - if: always() - run: | - f=$(mktemp); echo hi > "$f" - echo "stat -f %m => [$(stat -f %m "$f" 2>&1)] rc=$?" - echo "stat -c %Y => [$(stat -c %Y "$f" 2>&1)] rc=$?" - echo "md5 => [$(echo x | md5 2>&1 | head -1)]" - echo "md5sum => [$(echo x | md5sum 2>&1 | head -1)]" - - name: test-em-dispatcher - if: always() - run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/test-em-dispatcher.zsh - - name: dogfood-em-dispatcher - if: always() - run: cd ~/projects/dev-tools/flow-cli && timeout 60 zsh ./tests/dogfood-em-dispatcher.zsh - # --------------------------------------------------------------------------- # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. # Non-blocking on purpose (continue-on-error) — this job exists to capture the diff --git a/lib/dispatchers/teach-deploy-enhanced.zsh b/lib/dispatchers/teach-deploy-enhanced.zsh index e65ab7e56..b7c462b90 100644 --- a/lib/dispatchers/teach-deploy-enhanced.zsh +++ b/lib/dispatchers/teach-deploy-enhanced.zsh @@ -495,7 +495,8 @@ _deploy_update_status_file() { start_date=$(yq '.semester_info.start_date // ""' .flow/teach-config.yml 2>/dev/null) if [[ -n "$start_date" && "$start_date" != "null" ]]; then local start_epoch today_epoch week_num - start_epoch=$(date -j -f "%Y-%m-%d" "$start_date" "+%s" 2>/dev/null) + # Portable: BSD `date -j -f` (macOS) then GNU `date -d` (Linux/CI). + start_epoch=$(date -j -f "%Y-%m-%d" "$start_date" "+%s" 2>/dev/null || date -d "$start_date" "+%s" 2>/dev/null) today_epoch=$(date "+%s") if [[ -n "$start_epoch" ]]; then week_num=$(( (today_epoch - start_epoch) / 604800 + 1 )) From 4a70cf04ada89ae504be86adce9a551dc09d2b02 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:53:03 -0600 Subject: [PATCH 20/24] docs(ci-gate): record Phase 2 completion (suite green on runner) Co-Authored-By: Claude Opus 4.8 (1M context) --- ORCHESTRATE-ci-full-suite-gate.md | 27 ++++++++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/ORCHESTRATE-ci-full-suite-gate.md b/ORCHESTRATE-ci-full-suite-gate.md index 02648b26c..af5cb9978 100644 --- a/ORCHESTRATE-ci-full-suite-gate.md +++ b/ORCHESTRATE-ci-full-suite-gate.md @@ -137,6 +137,27 @@ Goal: `run-all.sh` exits 0 on the runner; identical result locally with/without - 2026-06-14 — Triage of the 3 "pure-zsh" failures (user-selected). Runner-instrumented diagnostic disproved the binary-precedence guess (no `tm` binary on runner). Real cause: `ait`/aiterm absent → `tm` dispatcher degrades to an alias (tm-dispatcher.zsh:44-55). - Wrong fix (tm→FLOW_INTENTIONAL_SHADOWS) committed then reverted (ed98365f). Conclusion: - all 14 are ONE class (tool-absent skew); no real bugs. Phase 2 = uniform skip/degrade - strategy across all 14. Paused for user scoping decision. + Wrong fix (tm→FLOW_INTENTIONAL_SHADOWS) committed then reverted (ed98365f). +- 2026-06-14 — PHASE 2 COMPLETE. Full suite GREEN on runner: **64 passed, 0 failed, + 0 timeout, 1 skipped (e2e-em IMAP)** — run 27486365803, exit 0. Went 14→0. + REAL product bugs the gate caught (would have shipped, Linux-broken): + 1. zsh fd: `exec 201>/200>` bash-only → errors on Linux (flock path). Fixed + doctor-cache.zsh + analysis-cache.zsh (`exec {var}>`). + 2. `stat -f %m` macOS-only → em cache never worked on Linux. Fixed em-cache.zsh + with GNU-first `_em_cache_mtime` (+5 other latent stat sites). KEY subtlety: + `stat -f %m FILE` on Linux PARTIALLY outputs (FS block) before erroring, so + BSD-first `||` corrupts mtime — must try GNU `stat -c %Y` FIRST. + 3. `date -j -f` macOS-only → teaching_week=0 on Linux. Fixed teaching-utils.zsh + (+helpers) and teach-deploy-enhanced.zsh. + 4. help-compliance.zsh listed tm unconditionally → `doctor --help-check` false-flags + tm without aiterm. Now conditional on `ait`. + Test determinism (tool-absent skew, skip when tool absent / keep coverage when present): + tm/ait (automated-plugin-dogfood, test-help-compliance(-dogfood), dogfood-atlas-bridge), + claude (test-cc-dispatcher), himalaya IMAP hang (e2e-em-dispatcher, bounded+rc77), + yq (teach-deploy ×4), R (dogfood-teach-doctor-v2). Atlas-PRESENT skew (local-only): + e2e-core-commands (pin FLOW_ATLAS_ENABLED=no), test-atlas-contract (skip_without_warm_atlas). + CI infra: run-all.sh rc-77 SKIP semantics; git identity provisioned in CI. + METHOD LESSON: instrument the runner; "passes locally with tool hidden" ≠ fixed + (3 wrong-tool diagnoses corrected only via CI diagnostic jobs). + REMAINING: verify local green both ways (atlas on/off); docs/guides/TESTING.md; + Phase 3 (promote to required — outward-facing, needs user sign-off). From 8190c79d055bcb1837cef3225e6496fb7a84b64c Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 20:59:28 -0600 Subject: [PATCH 21/24] docs(testing): document CI full-suite gate + rc-77 skip convention - Two CI jobs (smoke zsh-tests + full-suite run-all.sh); phasing note (non-blocking measurement -> required after soak) - Skip semantics: exit 77 = clean skip when a tool/service is absent; whole-suite vs mixed-suite gating; tool list; FLOW_ATLAS_ENABLED=no determinism note - Refresh stats: 65 suites, 64 passed / 1 skipped / 0 failed; 213 files Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/guides/TESTING.md | 77 ++++++++++++++++++++++++++++++++++-------- 1 file changed, 63 insertions(+), 14 deletions(-) diff --git a/docs/guides/TESTING.md b/docs/guides/TESTING.md index 8eca12b21..c831769ca 100644 --- a/docs/guides/TESTING.md +++ b/docs/guides/TESTING.md @@ -23,10 +23,11 @@ flow-cli uses a **shared test framework** (`tests/test-framework.zsh`) with comp | Metric | Count | |--------|-------| -| Test files | 210 | -| Test suites (run-all.sh) | 58/58 passing | +| Test files | 213 | +| Test suites (run-all.sh) | 65 total — 64 passed, 1 skipped, 0 failed | | Test functions | 12,000+ | -| Expected timeouts | 1 (IMAP connectivity) | +| Expected skips | 1 (`e2e-em-dispatcher` — needs configured IMAP account) | +| CI | runs the full suite on every PR (green on the Ubuntu runner) | --- @@ -263,7 +264,36 @@ zsh tests/test-work.zsh ./tests/run-all.sh ``` -65 suites, ~12000 assertions. Expected: 64/64 pass, 1 timeout (IMAP connectivity). +65 suites, ~12000 assertions. Expected: **64 passed, 0 failed, 0 timeout, 1 skipped**. +The 1 skip is `e2e-em-dispatcher` (needs a configured IMAP account; skips cleanly +otherwise). `run-all.sh` exits **0** when there are no failures or timeouts. + +#### Skip semantics (exit code 77) + +A suite that requires an external tool/service which is absent must **skip +cleanly** rather than fail. Exit **77** (the automake "skip" convention) tells +`run-all.sh` to count the suite as ⏭️ skipped, not ❌ failed: + +```zsh +# Whole-suite guard — put after sourcing, before the tests: +command -v yq >/dev/null 2>&1 || { echo "SKIP: yq not installed"; exit 77; } +``` + +For a **mixed** suite (most cases are tool-independent), gate only the +tool-dependent cases instead of skipping the whole file — e.g. include the `tm` +dispatcher in dispatcher-enumeration checks only `if command -v ait`, so the +other assertions still run. This keeps full coverage on a dev machine that has +the tool while staying green on a hosted runner that doesn't. + +Tools whose absence triggers a skip on CI: `atlas`, `ait` (aiterm), +`himalaya` (IMAP), `R`/`renv`, `quarto`, `claude`. Skips are printed in the +suite output and summarised in the `run-all.sh` results line, so a skip is +always visible (never a silently-missing pass). + +> **Determinism:** suites that assert flow-cli's *standalone* behavior pin +> `FLOW_ATLAS_ENABLED=no` in setup so the result can't flip based on whether +> `atlas` happens to be installed. The suite is green locally **with or without** +> atlas, and on the runner (which has neither atlas nor the other tools above). ### Dogfood Quality Check @@ -302,21 +332,40 @@ test_something() { ## Continuous Integration -### GitHub Actions (`test.yml`) +### GitHub Actions (`.github/workflows/test.yml`) + +Tests run automatically on push and PR to `main`/`dev`, in **two parallel jobs**: + +| Job | Runs | Purpose | +|-----|------|---------| +| **ZSH Plugin Tests** (`zsh-tests`) | smoke tests (`test-flow.zsh`, `test-install.sh`) + man-page version-sync guard | fast signal; the long-standing required check | +| **Full Test Suite** (`full-suite`) | the whole `./tests/run-all.sh` (~4 min) | comprehensive gate — runs every PR | + +The runner has no `atlas`, `ait`, `himalaya`, `R`, or `quarto`, so service- +dependent suites **skip** there (see "Skip semantics" above); everything else +must pass. A git identity is provisioned in the job so deploy suites that +`git commit` work. The `full-suite` job captures the real exit code via +`PIPESTATUS` (so its colour reflects reality) and emits the full `run-all.sh` +output to the job summary. -Tests run automatically on push and PR: +> **Phasing:** `full-suite` starts as a **non-blocking** measurement job +> (`continue-on-error: true`) so it can never create a perpetually-red gate +> while the suite is being made deterministic. Once it has soaked green it is +> promoted to a **required** status check on `dev`, then `main`. ```yaml -name: ZSH Plugin Tests -on: [push, pull_request] -jobs: - test: + full-suite: + name: Full Test Suite (non-blocking) runs-on: ubuntu-latest + continue-on-error: true # measurement phase; drop when promoting to required steps: - - uses: actions/checkout@v4 - - name: Install ZSH - run: sudo apt-get install -y zsh - - name: Run Tests + - uses: actions/checkout@v6 + - name: Configure git identity + run: | + git config --global user.email "ci@flow-cli.test" + git config --global user.name "flow-cli CI" + # ... mock project structure ... + - name: Run full suite (non-blocking) run: ./tests/run-all.sh ``` From aea140be903a2a581f5b8027f9c1e31d8a70c414 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 21:03:33 -0600 Subject: [PATCH 22/24] docs(changelog): record Phase 2 cross-platform bug fixes + gate --- CHANGELOG.md | 34 ++++++++++++++++++++++++++++------ docs/CHANGELOG.md | 34 ++++++++++++++++++++++++++++------ 2 files changed, 56 insertions(+), 12 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 8335cad33..c5580b457 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,14 +9,36 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Fixed + +- **Cache locking errored on Linux** (`lib/doctor-cache.zsh`, + `lib/analysis-cache.zsh`): the `flock` path used bash-only high-fd + redirection (`exec 201>`/`exec 200>`), which zsh parses as a command and + fails with "command not found" on Linux (where `flock` exists). Switched to + zsh's dynamic `exec {var}>` allocation. macOS was unaffected (no `flock` → + mkdir fallback), so this only ever broke on Linux/CI. +- **Email cache never worked on Linux** (`lib/em-cache.zsh`): used macOS-only + `stat -f %m`, so every entry's mtime read as 0 and looked expired. Added a + portable `_em_cache_mtime` (GNU `stat -c %Y` first — it fails cleanly on + macOS — then BSD `stat -f %m`). +- **`teaching_week` computed 0 on Linux** (`lib/teaching-utils.zsh`, + `lib/dispatchers/teach-deploy-enhanced.zsh`): used macOS-only `date -j -f`. + Added portable date helpers (BSD then GNU `date -d`). +- **`flow doctor --help-check` false-flagged `tm`** on machines without aiterm + (`lib/help-compliance.zsh`): the `tm` dispatcher only loads its help when the + `ait` CLI is present, so it's now checked only when `ait` is installed. + ### Changed -- **CI now measures the full test suite.** Added a non-blocking `full-suite` - job to `.github/workflows/test.yml` running `./tests/run-all.sh` on every PR - (Phase 1 of the CI full-suite gate; promotion to a required check is staged). - First full-suite CI run surfaced 14 suites that fail on a hosted runner - because external tools they exercise (`atlas`, `ait`/aiterm, `himalaya`, R, - quarto) are absent — being made to skip/degrade deterministically (Phase 2). +- **CI now runs the full test suite on every PR.** Added a `full-suite` job to + `.github/workflows/test.yml` running `./tests/run-all.sh` (the full 65-suite + suite), parallel to the fast smoke job. It starts non-blocking + (`continue-on-error`) and is promoted to a required check after soaking green. +- **`run-all.sh` skip semantics:** exit code **77** now counts a suite as + *skipped* (not failed) — used by suites that require an external tool/service + (`atlas`, `ait`, `himalaya`, R, quarto, `claude`) absent on a hosted runner. + Service-dependent suites skip/degrade cleanly; standalone-behavior suites pin + `FLOW_ATLAS_ENABLED=no` so results are identical with or without atlas. ## [7.10.0] — 2026-06-13 — forward-looking schedule layer (`agenda` + dash UPCOMING) diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index 6770c4145..bb99e19f0 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -8,14 +8,36 @@ The format follows [Keep a Changelog](https://keepachangelog.com/), and this pro ## [Unreleased] +### Fixed + +- **Cache locking errored on Linux** (`lib/doctor-cache.zsh`, + `lib/analysis-cache.zsh`): the `flock` path used bash-only high-fd + redirection (`exec 201>`/`exec 200>`), which zsh parses as a command and + fails with "command not found" on Linux (where `flock` exists). Switched to + zsh's dynamic `exec {var}>` allocation. macOS was unaffected (no `flock` → + mkdir fallback), so this only ever broke on Linux/CI. +- **Email cache never worked on Linux** (`lib/em-cache.zsh`): used macOS-only + `stat -f %m`, so every entry's mtime read as 0 and looked expired. Added a + portable `_em_cache_mtime` (GNU `stat -c %Y` first — it fails cleanly on + macOS — then BSD `stat -f %m`). +- **`teaching_week` computed 0 on Linux** (`lib/teaching-utils.zsh`, + `lib/dispatchers/teach-deploy-enhanced.zsh`): used macOS-only `date -j -f`. + Added portable date helpers (BSD then GNU `date -d`). +- **`flow doctor --help-check` false-flagged `tm`** on machines without aiterm + (`lib/help-compliance.zsh`): the `tm` dispatcher only loads its help when the + `ait` CLI is present, so it's now checked only when `ait` is installed. + ### Changed -- **CI now measures the full test suite.** Added a non-blocking `full-suite` - job to `.github/workflows/test.yml` running `./tests/run-all.sh` on every PR - (Phase 1 of the CI full-suite gate; promotion to a required check is staged). - First full-suite CI run surfaced 14 suites that fail on a hosted runner - because external tools they exercise (`atlas`, `ait`/aiterm, `himalaya`, R, - quarto) are absent — being made to skip/degrade deterministically (Phase 2). +- **CI now runs the full test suite on every PR.** Added a `full-suite` job to + `.github/workflows/test.yml` running `./tests/run-all.sh` (the full 65-suite + suite), parallel to the fast smoke job. It starts non-blocking + (`continue-on-error`) and is promoted to a required check after soaking green. +- **`run-all.sh` skip semantics:** exit code **77** now counts a suite as + *skipped* (not failed) — used by suites that require an external tool/service + (`atlas`, `ait`, `himalaya`, R, quarto, `claude`) absent on a hosted runner. + Service-dependent suites skip/degrade cleanly; standalone-behavior suites pin + `FLOW_ATLAS_ENABLED=no` so results are identical with or without atlas. ## [7.10.0] — 2026-06-13 — forward-looking schedule layer (`agenda` + dash UPCOMING) From 816f0b0f8d9991e5ba977ac3b1ff82bc1869b47e Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 21:28:37 -0600 Subject: [PATCH 23/24] refactor(ci-gate): address PR #465 review nits MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - analysis-cache/doctor-cache: declare the flock fd `typeset -g` explicitly instead of relying on zsh's implicit-global-on-assignment, so the cross-function acquire→release reference is unambiguous. - em-cache LRU: null-delimited find/read + tab-separated mtime + `cut -f2-` so cache paths with spaces survive the sort (prior `awk '{print $2}'` truncated them). Defensive — cache files are hash-named. - test.yml: refresh the full-suite job comment (Phase 1 measure → Phase 1+2; still non-blocking pending the Phase 3 dev soak). No behavior change on the green path. Verified: run-all.sh 64 passed / 0 failed / 0 timeout / 1 skipped locally; plugin sources clean. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/test.yml | 10 +++++----- lib/analysis-cache.zsh | 6 ++++++ lib/doctor-cache.zsh | 5 +++++ lib/em-cache.zsh | 9 ++++++--- 4 files changed, 22 insertions(+), 8 deletions(-) diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index c22363f78..4238ad723 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -63,11 +63,11 @@ jobs: echo "| Duration | ${DURATION}s |" >> $GITHUB_STEP_SUMMARY # --------------------------------------------------------------------------- - # Phase 1 (ci-full-suite-gate): MEASURE the full 65-suite run-all.sh in CI. - # Non-blocking on purpose (continue-on-error) — this job exists to capture the - # ground-truth pass/skip/fail list on a hosted runner (no atlas, no IMAP) - # before we make it a required gate (Phase 3). Do NOT add to required checks - # while this comment is here. + # ci-full-suite-gate (Phase 1+2): run the full 65-suite run-all.sh in CI. + # Phase 1 measured the ground truth; Phase 2 made it green (service/tool-absent + # suites clean-skip via rc 77). Still NON-BLOCKING (continue-on-error) for a + # dev soak — Phase 3 promotes it to a required check (dev → main protection). + # Do NOT add to required checks while this comment is here. # --------------------------------------------------------------------------- full-suite: name: Full Test Suite (non-blocking) diff --git a/lib/analysis-cache.zsh b/lib/analysis-cache.zsh index 3bc3b6052..3b8ae9af6 100644 --- a/lib/analysis-cache.zsh +++ b/lib/analysis-cache.zsh @@ -44,6 +44,12 @@ if ! typeset -f _flow_log_debug >/dev/null 2>&1; then source "${0:A:h}/core.zsh" 2>/dev/null || true fi +# Mutable module state: the flock file descriptor allocated by `exec {var}>` in +# the acquire path and closed in the release path (a different function). Declare +# it `-g` explicitly so the cross-function reference is unambiguous rather than +# relying on zsh's implicit-global-on-assignment behaviour. +typeset -g _ANALYSIS_CACHE_LOCK_FD="" + # ============================================================================= # CONSTANTS # ============================================================================= diff --git a/lib/doctor-cache.zsh b/lib/doctor-cache.zsh index 2287c6c4c..a4ee220f8 100644 --- a/lib/doctor-cache.zsh +++ b/lib/doctor-cache.zsh @@ -59,6 +59,11 @@ if ! typeset -f _flow_log_debug >/dev/null 2>&1; then source "${0:A:h}/core.zsh" 2>/dev/null || true fi +# Mutable module state: flock fd allocated by `exec {var}>` in the acquire path +# and closed in the release path (a different function). Declared `-g` so the +# cross-function reference is explicit, not reliant on implicit globals. +typeset -g _DOCTOR_CACHE_LOCK_FD="" + # ============================================================================= # CONSTANTS # ============================================================================= diff --git a/lib/em-cache.zsh b/lib/em-cache.zsh index edd03f96b..a3c593440 100644 --- a/lib/em-cache.zsh +++ b/lib/em-cache.zsh @@ -197,10 +197,13 @@ _em_cache_enforce_cap() { # Evict oldest files first (LRU) local evicted=0 local files_by_age + # Null-delimited find/read + tab-separated mtimepath + `cut -f2-` so + # paths containing spaces survive the sort (the prior `awk '{print $2}'` + # truncated them). Cache files are hash-named .txt, so this is defensive. files_by_age=("${(@f)$( - find "$cache_base" -name '*.txt' 2>/dev/null | while IFS= read -r _f; do - print -r -- "$(_em_cache_mtime "$_f") $_f" - done | sort -n | awk '{print $2}')}") + find "$cache_base" -name '*.txt' -print0 2>/dev/null | while IFS= read -r -d '' _f; do + print -r -- "$(_em_cache_mtime "$_f")"$'\t'"$_f" + done | sort -n | cut -f2-)}") for old_file in "${files_by_age[@]}"; do [[ -z "$old_file" ]] && continue From 92dbf76e7753aa3e7ecda0f5ab59ccfbb494e128 Mon Sep 17 00:00:00 2001 From: Test User Date: Sat, 13 Jun 2026 21:28:47 -0600 Subject: [PATCH 24/24] chore(ci-gate): remove ORCHESTRATE working artifact before dev merge MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per CLAUDE.md merge-cleanup convention — ORCHESTRATE-*.md are feature-branch working artifacts and should not land on dev. Co-Authored-By: Claude Opus 4.8 (1M context) --- ORCHESTRATE-ci-full-suite-gate.md | 163 ------------------------------ 1 file changed, 163 deletions(-) delete mode 100644 ORCHESTRATE-ci-full-suite-gate.md diff --git a/ORCHESTRATE-ci-full-suite-gate.md b/ORCHESTRATE-ci-full-suite-gate.md deleted file mode 100644 index af5cb9978..000000000 --- a/ORCHESTRATE-ci-full-suite-gate.md +++ /dev/null @@ -1,163 +0,0 @@ -# ORCHESTRATE: Gate the full test suite in CI - -> **Working artifact** for `feature/ci-full-suite-gate`. Implement in a fresh session from -> this worktree (`cd ~/.git-worktrees/flow-cli/ci-full-suite-gate && claude`). -> Authoritative design: `docs/specs/SPEC-ci-full-suite-gate-2026-06-13.md` (read it first). -> **Delete this file during the dev-merge cleanup.** - -## Branch / base -- Worktree: `~/.git-worktrees/flow-cli/ci-full-suite-gate` -- Branch: `feature/ci-full-suite-gate` (off `dev` @ `3b89ff09`) -- Target PR: `--base dev` -- Version: no bump (CI/infra; not a user-facing release on its own) - -## Gating-set decision (default — confirm against Phase 1 data) -**Full-minus-IMAP:** the required gate is all 65 suites EXCEPT `e2e-em-dispatcher` (external IMAP, -can only ever skip on a hosted runner). `test-atlas-contract` stays in the gate but must *skip* -its warm-path cleanly. If Phase 1 reveals more genuinely-external suites, widen the skip list and -note it here. (Smaller fallback per the spec: gate only `test-doctor.zsh` first.) - ---- - -## Phase 1 — Measure (non-blocking) ⏱️ first -Goal: capture ground-truth CI results before changing any test. - -- [ ] Add a **separate** job `full-suite` to `.github/workflows/test.yml`: - - mirrors the existing mock-project setup steps (reuse the `Create mock project structure` block) - - runs `cd ~/projects/dev-tools/flow-cli && ./tests/run-all.sh` - - **non-blocking:** `continue-on-error: true` (do NOT add to required checks yet) - - emits the run-all summary to `$GITHUB_STEP_SUMMARY` -- [ ] Open a draft PR to `dev`; let CI run; **record the actual pass/skip/fail list** in this file. -- [ ] Compare runner reality vs the local table in the spec (atlas absent ⇒ expect the - atlas-skew tests to pass and atlas-contract warm-path to skip). - -**Checkpoint:** paste the Phase 1 runner result here before starting Phase 2. - -``` -Phase 1 CI result (PR #465, run 27483939884, ubuntu-24.04, 2026-06-14): - 51 passed, 14 failed, 0 timeout (run-all.sh exit 1) -``` - -### Phase 1 finding — spec table was WRONG; inverse skew -The runner is NOT cleaner than local. The 2 suites the spec predicted would fail -(`e2e-core-commands`, `test-atlas-contract`) **PASS on the runner** (atlas absent ⇒ -fallback/skip path fires, exactly as hypothesized). But **14 OTHER suites FAIL** — -they pass locally because the Mac has tools the Ubuntu runner lacks (brew, atlas, -himalaya, R, quarto). `e2e-em-dispatcher` **failed** (not timeout) → 0 timeouts. - -14 failing suites (cause = likely, confirm in Phase 2): -| Suite | Likely cause | -|---|---| -| test-doctor | `flow doctor` probes brew/atlas/plugins — none on runner | -| test-cc-dispatcher | cc/claude launcher binary absent | -| test-em-dispatcher | himalaya absent | -| dogfood-em-dispatcher | himalaya absent | -| e2e-em-dispatcher | IMAP/himalaya absent (FAILS, not timeout) | -| dogfood-atlas-bridge | atlas absent | -| dogfood-teach-doctor-v2 | R/renv absent | -| test-teach-deploy-v2-unit | R/quarto/rsync absent | -| test-teach-deploy-v2-integration | R/quarto/rsync absent | -| dogfood-teach-deploy-v2 | R/quarto/rsync absent | -| e2e-teach-deploy-v2 | R/quarto/rsync absent | -| test-help-compliance | `ait`/aiterm absent → `tm` dispatcher degrades (CONFIRMED) | -| test-help-compliance-dogfood | `ait`/aiterm absent → `tm` degrades (CONFIRMED) | -| automated-plugin-dogfood | `ait`/aiterm absent → `tm` is alias not function (CONFIRMED) | - -CORRECTION (2026-06-14, runner-instrumented): the 3 "pure-zsh ⚠️" suites are NOT pure-zsh -and NOT real bugs. Root cause confirmed by a CI diagnostic job: the runner has **no `tm` -binary** (`commands[tm]` empty — the binary-precedence guard was never involved). `tm` is -the **aiterm** dispatcher (`lib/dispatchers/tm-dispatcher.zsh:44-55`): when the `ait` CLI is -absent it intentionally degrades to `alias tm='_tm_not_installed'` and early-returns, so -`tm()`/`_tm_help()` are never defined. The 3 suites assert `tm` is a full dispatcher → -they fail only because `ait` is absent. This is the **same tool-absent skew class** as the -other 11 (atlas/himalaya/R/quarto). A first (wrong) attempt added `tm` to -FLOW_INTENTIONAL_SHADOWS — reverted (commit ed98365f); the fix belongs in the TESTS. - -Implication: ALL 14 failures are one uniform class — service/tool-dependent suites that -must clean-SKIP or accept graceful degradation when their tool is absent. There are NO -real-bug outliers. The spec's "smaller fallback = gate just test-doctor" is non-viable -(test-doctor itself FAILS on the runner). Tools whose absence drives the 14: `atlas`, -`ait` (aiterm), `himalaya` (email/IMAP), `R`/`renv`, `quarto`/`rsync`. - ---- - -## Phase 2 — Make the suite deterministic & green in CI -Goal: `run-all.sh` exits 0 on the runner; identical result locally with/without `atlas` on PATH. - -- [ ] **Determinism (atlas-skew):** in tests that assert *standalone* fallback behavior, pin - `FLOW_ATLAS_ENABLED=no` in setup so installing atlas can't flip the result. - - `tests/e2e-core-commands.zsh` → `status reads .STATUS` ([1]) and `catch creates capture` ([7]); - audit the whole file for other atlas-delegating asserts. -- [ ] **Clean-skip service-dependent tests** (skip only when the dep is genuinely absent; `return 77`): - - `tests/test-atlas-contract.zsh` → route the 4 warm-path tests (`atlas stats|parked|trail`, - currently exit 127) through `skip_without_atlas()` (or skip when `atlas stats` ≠ 0). - - `tests/e2e-em-dispatcher.zsh` → skip IMAP cases (`em unread`/`em read`) when no account is - configured; add a short per-call timeout so a hang can't wedge the suite. -- [ ] **run-all.sh CI semantics:** decide timeout handling. Once IMAP tests SKIP rather than hang, - make `TIMEOUT>0` a hard failure in the gated context (a real hang must be caught). Keep local - behavior unchanged or gate on an env flag (e.g. `FLOW_TEST_CI=1`). -- [ ] Run locally **both ways** to prove determinism: - - with atlas: `./tests/run-all.sh` → 0 - - without: `PATH=$(echo $PATH | tr ':' '\n' | grep -v homebrew | paste -sd:) ./tests/run-all.sh` - (or temporarily shadow atlas) → 0 -- [ ] Update `docs/guides/TESTING.md`: document the gate + how service tests skip; refresh counts - if any test counts change. - -**Definition of green:** every non-skipped suite passes; service-dependent cases report SKIP -(visible in output), never FAIL/TIMEOUT. - ---- - -## Phase 3 — Promote to required -- [ ] Flip `full-suite` to blocking (drop `continue-on-error`); confirm green on the PR. -- [ ] Add `full-suite` to required checks on **`dev`** branch protection; soak ≥1 PR. -- [ ] Then add it to **`main`** protection: - `gh api -X PUT repos/Data-Wise/flow-cli/branches/main/protection --input ` (include the - existing `ZSH Plugin Tests` + new `full-suite`; preserve PR-required/no-force/no-delete). - ⚠️ Outward-facing — do only after dev soak; confirm with user. -- [ ] Keep the fast smoke job (quick signal) alongside the full gate. - ---- - -## Integrate -- [ ] `git fetch origin dev && git rebase origin/dev` -- [ ] `./tests/run-all.sh` green (the whole point — it now gates itself) -- [ ] `gh pr create --base dev` -- [ ] On merge: delete this ORCHESTRATE file; remove worktree + branch (force-delete via user — hook-blocked). - -## Verification (Definition of Done) -1. CI runs the full suite on every PR to dev/main. -2. `run-all.sh` is green on the runner AND locally with/without atlas (determinism proven). -3. Service-dependent cases SKIP visibly; a deliberately-broken test reddens the required check. -4. `docs/guides/TESTING.md` updated; no version/count drift. - -## Notes / decisions log (append during impl) -- 2026-06-14 — Phase 1 done. Non-blocking `full-suite` job added (commit 18ba82db), - draft PR #465 → dev. CI ground truth: 51/14/0. Spec prediction was inverted. -- 2026-06-14 — Triage of the 3 "pure-zsh" failures (user-selected). Runner-instrumented - diagnostic disproved the binary-precedence guess (no `tm` binary on runner). Real cause: - `ait`/aiterm absent → `tm` dispatcher degrades to an alias (tm-dispatcher.zsh:44-55). - Wrong fix (tm→FLOW_INTENTIONAL_SHADOWS) committed then reverted (ed98365f). -- 2026-06-14 — PHASE 2 COMPLETE. Full suite GREEN on runner: **64 passed, 0 failed, - 0 timeout, 1 skipped (e2e-em IMAP)** — run 27486365803, exit 0. Went 14→0. - REAL product bugs the gate caught (would have shipped, Linux-broken): - 1. zsh fd: `exec 201>/200>` bash-only → errors on Linux (flock path). Fixed - doctor-cache.zsh + analysis-cache.zsh (`exec {var}>`). - 2. `stat -f %m` macOS-only → em cache never worked on Linux. Fixed em-cache.zsh - with GNU-first `_em_cache_mtime` (+5 other latent stat sites). KEY subtlety: - `stat -f %m FILE` on Linux PARTIALLY outputs (FS block) before erroring, so - BSD-first `||` corrupts mtime — must try GNU `stat -c %Y` FIRST. - 3. `date -j -f` macOS-only → teaching_week=0 on Linux. Fixed teaching-utils.zsh - (+helpers) and teach-deploy-enhanced.zsh. - 4. help-compliance.zsh listed tm unconditionally → `doctor --help-check` false-flags - tm without aiterm. Now conditional on `ait`. - Test determinism (tool-absent skew, skip when tool absent / keep coverage when present): - tm/ait (automated-plugin-dogfood, test-help-compliance(-dogfood), dogfood-atlas-bridge), - claude (test-cc-dispatcher), himalaya IMAP hang (e2e-em-dispatcher, bounded+rc77), - yq (teach-deploy ×4), R (dogfood-teach-doctor-v2). Atlas-PRESENT skew (local-only): - e2e-core-commands (pin FLOW_ATLAS_ENABLED=no), test-atlas-contract (skip_without_warm_atlas). - CI infra: run-all.sh rc-77 SKIP semantics; git identity provisioned in CI. - METHOD LESSON: instrument the runner; "passes locally with tool hidden" ≠ fixed - (3 wrong-tool diagnoses corrected only via CI diagnostic jobs). - REMAINING: verify local green both ways (atlas on/off); docs/guides/TESTING.md; - Phase 3 (promote to required — outward-facing, needs user sign-off).