Conversation
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Spec-only. Documents the smoke-only CI gap and the key finding that run-all.sh is not deterministically green (atlas-present skew in e2e-core-commands; atlas warm-path 127s in test-atlas-contract; IMAP timeout in e2e-em-dispatcher). Proposes a phased, never-red approach: measure (non-blocking job) → make deterministic/skip service tests → promote to required check (dev then main). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Spec + ORCHESTRATE committed; awaiting impl in a fresh session. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
agenda was only in the transient "What's New" box + README; add it to the permanent core-commands listing (next to dash — now vs. soon) so it survives the next release's What's-New rewrite. mkdocs --strict clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v7.10.0 shipped; CI-gate Phase 1 measured (PR #465: 51/14/0, all 14 = tool-absent skew). Phase 2 (clean-skip the 14 suites) is the WIP, resumes in the ci-full-suite-gate worktree session. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Working artifact for feature/ci-full-suite-gate. Implement in a fresh session from this worktree. See docs/specs/SPEC-ci-full-suite-gate-2026-06-13.md. Phased: measure (non-blocking) → determinism/skip → promote to required. Delete this file during dev merge cleanup. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 1 of ci-full-suite-gate: run the full 65-suite run-all.sh on the hosted runner to capture ground-truth pass/skip/fail before gating. - Separate job (parallel to smoke), continue-on-error: true => non-blocking - Captures run-all.sh real exit via PIPESTATUS (tee masks it); re-exits so job color reflects reality (0=clean,1=FAIL,2=TIMEOUT) but never blocks PR - Emits full output + exit code to $GITHUB_STEP_SUMMARY for measurement - NOT added to required checks (that's Phase 3) Ref: docs/specs/SPEC-ci-full-suite-gate-2026-06-13.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Spec prediction inverted: e2e-core-commands + test-atlas-contract PASS on runner; 14 OTHER suites FAIL (tool-absent: brew/atlas/himalaya/R/quarto). 3 pure-zsh suites fail unexpectedly -> triage as possible real bugs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Temporary Phase 2 triage. Runs help-compliance, help-compliance-dogfood, automated-plugin-dogfood with output visible + locale fingerprint to find why they fail on the runner but pass locally. To be deleted post-diagnosis. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The binary-precedence guard's default keep-list was (r mcp cc), omitting tm. On systems with a `tm` binary (some Linux distros, GitHub ubuntu runners) the documented tm terminal-manager dispatcher was SILENTLY unfunctioned at load (the skip notice only prints under FLOW_DEBUG) — invisible on macOS dev boxes with no tm binary. Surfaced by Phase 1 of the CI full-suite gate: 3 suites (help-compliance, help-compliance-dogfood, automated-plugin-dogfood) failed on the runner but passed locally; all traced to `tm` not being a function. - flow.plugin.zsh: default FLOW_INTENTIONAL_SHADOWS now (r mcp cc tm) - test-dispatcher-binary-precedence.zsh: regression test simulating a tm binary collision against the real tm-dispatcher.zsh; extend the "intentional shadow survives" loop to cover tm - CLAUDE.md + CHANGELOG x2: document the new default (historical CHANGELOG entries left as-is per project convention) - remove the temporary diagnose-pure-zsh CI job (triage complete) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Diagnostic on the runner disproved the binary-precedence hypothesis: there is NO tm binary on ubuntu-latest (commands[tm] empty). tm resolves to an ALIAS and _tm_help is undefined because tm-dispatcher.zsh requires the `ait` (aiterm) CLI and early-returns to a graceful "not installed" alias when it's absent (tm-dispatcher.zsh:44-55). The runner has no ait. So the 3 "pure-zsh" failures are NOT pure-zsh — they are the SAME tool-absent skew class as the other 11 (atlas/himalaya/R/quarto). The correct fix belongs in the tests (gate tm assertions on `ait`), not in the loader. Reverting: - flow.plugin.zsh: default keep-list back to (r mcp cc) - test-dispatcher-binary-precedence.zsh: drop the bogus tm collision test - CLAUDE.md + CHANGELOG x2: remove the false "tm binary" claim; keep the accurate "CI measures full suite" note - remove the temporary diag-tm CI job Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t class Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 2, batch 1 of the tool-absent-skew fixes. The `tm` dispatcher only loads fully when the `ait` (aiterm) CLI is present; on hosted runners it degrades to an alias, so suites asserting tm-is-a-full-dispatcher failed. Foundation: - run-all.sh: exit code 77 now counted as SKIP (not FAIL); shown in the results line + an explanatory note. Whole-suite tool guards will use it. tm/aiterm determinism (mixed suites — keep full coverage when ait present, skip only the tm cases when absent): - automated-plugin-dogfood.zsh: include tm in the dispatcher / help-fn checks only when `ait` exists. - lib/help-compliance.zsh: _FLOW_HELP_DISPATCHERS includes tm only when `ait` exists — also stops `flow doctor --help-check` from false-flagging tm as non-compliant on machines without aiterm (real fix, not just tests). Fixes test-help-compliance.zsh (no edit needed) via the shared list. - test-help-compliance-dogfood.zsh: skip tm in all subject loops when ait absent; expected dispatcher count is dynamic (14 with ait, 13 without). Verified locally both ways (tool present AND hidden via PATH sandbox): with ait: 16/16, 379/379, 60/60 without ait: 15/15, 351/351, 58/58 (all exit 0) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
run-all.sh in CI exposed a real Linux-only runtime bug: lib/doctor-cache.zsh
and lib/analysis-cache.zsh used bash-only high-fd redirection (`exec 201>`,
`exec 200>`). In zsh a literal fd >= 10 is parsed as a COMMAND, so
`exec 201>file` errors with "command not found: 201". The flock branch only
runs when `flock` exists — true on Linux, false on macOS (which falls back to
mkdir locking) — so this only ever broke on hosted CI runners, never locally.
Fix: use zsh's dynamic `exec {var}>file` allocation and reference $var on
acquire, lock, and release. Verified the syntax in zsh; the old literal form
is proven to fail. This is exactly the regression-class the full-suite gate
was built to catch (smoke-only CI never ran test-doctor).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ach-doctor) Phase 2, batch 2. Make tool-dependent cases skip cleanly when the tool is absent (CI runner), preserving full coverage when present: - test-cc-dispatcher: gate 2 cases that exec `claude` (HERE path). - e2e-em-dispatcher: bound the IMAP/himalaya check with `timeout` and exit 77 when unreachable (real cause was a HANG, not a missing binary). - dogfood-teach-doctor-v2: gate the renv.lock case on `R`. - teach-deploy-v2 (unit/integration/dogfood/e2e): gate on `yq` (the deploy history helpers parse YAML via yq), exit 77 when absent. Verified both ways (tool present AND hidden via PATH sandbox): identical pass counts with the tool, clean skip/exit-0 without. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
These two suites pass on CI (atlas absent) but FAILED locally (atlas installed) — the inverse skew the spec flagged. Acceptance criterion: the suite must be green locally whether or not atlas is installed. - e2e-core-commands: export FLOW_ATLAS_ENABLED=no before sourcing so `status` and `catch` exercise flow-cli's standalone fallback (with atlas installed they delegate to the binary, flipping [1]/[7]). - test-atlas-contract: add skip_without_warm_atlas() — the `atlas` on PATH may be a different/older binary whose stats/parked/trail/-v return 127; route the 4 warm-path/exit-code contract tests through it so they skip unless a flow-compatible atlas actually implements them. Verified both ways: e2e-core-commands 22/22 with atlas and without; test-atlas-contract 14/18 (4 skip) with atlas, 11/18 without — all exit 0. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…idge tm gate Two more tool-absent-skew / cross-platform fixes surfaced by the CI gate: - lib/em-cache.zsh: replaced macOS-only `stat -f %m`/`stat -f '%m %N'` with a portable `_em_cache_mtime` helper (BSD `stat -f` then GNU `stat -c %Y`). On Linux the bare `stat -f` failed → mtime read as 0 → every entry looked expired → cache get/prune/cap returned empty. The email cache never worked on Linux. (Real product bug, caught by test-em/dogfood-em cache round-trip.) - tests/dogfood-atlas-bridge.zsh: the "at() coexists with all 14 dispatchers" case failed because `tm` isn't a function without aiterm (ait). Gate tm on `command -v ait` (same pattern as the other dispatcher-enumeration suites) — this was a tm/ait issue, not atlas. Verified locally: dogfood-atlas-bridge 29/29 with AND without ait; em suites 108/108 + 159/159 (macOS/BSD path); mtime helper returns a real epoch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
teach-deploy suites run 'git commit' (direct deploy, history, back-merge) which fails with 'empty ident' on a fresh runner (git user.name/email unset). Configure a CI git identity — environment provisioning, not a test change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
teach-deploy [42] "teaching_week from start_date" failed on Linux: the date math used `date -j -f` (BSD/macOS only), which fails on GNU date → empty epoch → week calc returns 0. Added _teach_date_to_epoch/_teach_epoch_to_date helpers (BSD `date -j -f` then GNU `date -d`) and routed all 6 date conversions through them. macOS behavior unchanged (BSD form wins first); Linux now works. Also re-point the temporary diagnostic at the em-cache suites + a stat/md5 probe to find why em-cache still fails on Linux after the stat -f fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The em-cache stat fix was still broken on Linux. `stat -f %m FILE` there treats -f as --file-system and prints a filesystem block for FILE to stdout while erroring on the `%m` operand, so a BSD-first `stat -f %m || stat -c %Y` captures BOTH outputs → garbage mtime → cache always looks expired → email cache get/prune/cap return empty (test-em/dogfood-em cache round-trip FAIL). Fix: try GNU `stat -c %Y` FIRST (works cleanly on Linux), fall back to BSD `stat -f %m` (which fails cleanly on macOS with an illegal-option error and empty stdout). Verified the order is correct on both platforms via a CI stat probe and locally. Applied the same swap to the 5 other BSD-first stat sites (teach-doctor-impl x4, teach-dispatcher x1) to prevent the same latent bug. Local (macOS, GNU-first): test-em 108/108, dogfood-em 159/159, dogfood-teach-doctor-v2 43/43. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…c job dogfood-teach-deploy-v2 [42] "Status update calculates teaching_week" used a bare macOS `date -j -f` in the deploy status-update path (teach-deploy-enhanced.zsh) — added the GNU `date -d` fallback (the deploy path is separate from teaching-utils.zsh, fixed earlier). Also removes the temporary diagnostic job from test.yml now that all CI-only failures are diagnosed and fixed. Workflow is back to zsh-tests + full-suite. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Two CI jobs (smoke zsh-tests + full-suite run-all.sh); phasing note (non-blocking measurement -> required after soak) - Skip semantics: exit 77 = clean skip when a tool/service is absent; whole-suite vs mixed-suite gating; tool list; FLOW_ATLAS_ENABLED=no determinism note - Refresh stats: 65 suites, 64 passed / 1 skipped / 0 failed; 213 files Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- analysis-cache/doctor-cache: declare the flock fd `typeset -g` explicitly
instead of relying on zsh's implicit-global-on-assignment, so the
cross-function acquire→release reference is unambiguous.
- em-cache LRU: null-delimited find/read + tab-separated mtime + `cut -f2-`
so cache paths with spaces survive the sort (prior `awk '{print $2}'`
truncated them). Defensive — cache files are hash-named.
- test.yml: refresh the full-suite job comment (Phase 1 measure → Phase 1+2;
still non-blocking pending the Phase 3 dev soak).
No behavior change on the green path. Verified: run-all.sh 64 passed / 0
failed / 0 timeout / 1 skipped locally; plugin sources clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per CLAUDE.md merge-cleanup convention — ORCHESTRATE-*.md are feature-branch working artifacts and should not land on dev. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ci: gate the full test suite (Phase 1+2 — measure + make deterministic)
Phase 1+2 merged to dev (b57c0d8) — full suite runs in CI (non-blocking), 64 pass/0 fail/1 skip via rc-77; 4 cross-platform bugs fixed. Worktree + branch removed. Remaining: Phase 3 (promote to required after dev soak). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The full-suite job (run-all.sh) has soaked green on dev (PR #465). Drop continue-on-error so a real failure now fails the check, and rename the job "Full Test Suite (non-blocking)" -> "Full Test Suite" so the required-check name is honest. Service/tool-absent suites still clean-skip via rc 77, so a non-zero exit is a genuine regression. Next: add "Full Test Suite" to dev branch protection (this PR's run registers the renamed check); main protection follows after a dev soak. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ci: promote full-suite to a required gate (Phase 3)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…CI gate - release.sh bump: flow.plugin.zsh, package.json, CLAUDE.md, man pages (.TH) - CHANGELOG (root + docs) promoted [Unreleased] to [7.10.1] - README + index.md "What's New" refreshed (was stale at v7.4.0 → agenda + reliability) - 74 doc-footer version stamps swept to v7.10.1 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New tutorial docs/tutorials/48-agenda-schedule.md (ADHD & Productivity nav group) covering the agenda command, ## Schedule: grammar, recurring blocks, filters, and the dash/morning/today/week surfaces — closes the one pre-release doc gap. - Bumped stale '213 test files' → 216 (CLAUDE.md, index.md, TESTING.md). - mkdocs build --strict passes (0 warnings). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Release v7.10.1 — Linux portability fixes + full-suite CI gate
Patch release. No new features or commands — runtime bug fixes (Linux portability + the
tmloader) plus CI/test infrastructure shipped since v7.10.0.Fixed
lib/doctor-cache.zsh,lib/analysis-cache.zsh) — theflockpath used bash-only high-fd redirection (exec 201>), which zsh parses as a command; switched to zsh's dynamicexec {var}>. macOS used the mkdir fallback, so this only ever broke on Linux/CI.lib/em-cache.zsh) — macOS-onlystat -f %mread every entry's mtime as 0; added a portable_em_cache_mtime(GNUstat -c %Yfirst, then BSD).teaching_weekcomputed 0 on Linux (lib/teaching-utils.zsh) — macOS-onlydate -j -f; added portable date helpers (BSD then GNUdate -d).flow doctor --help-checkfalse-flaggedtmon machines without aiterm —tmonly loads its help whenaitis present, so it's now checked only then.Changed
full-suitejob running./tests/run-all.sh, parallel to the fast smoke job, now a required check onmain(a red suite blocks release).run-all.shskip semantics — exit code 77 now counts a suite as skipped (external tool/service absent on a hosted runner); standalone-behavior suites pinFLOW_ATLAS_ENABLED=nofor determinism.Verification
test-doctorre-ran 31/31 isolated).devCI green oncf70510e— bothZSH Plugin TestsandFull Test Suite.🤖 Generated with Claude Code