diff --git a/CLAUDE.md b/CLAUDE.md index d0dd5a3..3a246ba 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -15,10 +15,10 @@ enforced, …) not a single product. Apps are upstream; we don't own their UI. ## Read these in order 1. [`README.md`](./README.md) — env setup, run commands, what's covered -2. [`skills.md`](./skills.md) — the contract organised by openspec module -3. [`docs/spec-coverage.md`](./docs/spec-coverage.md) — traceability matrix (which test pins which requirement) -4. [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md) — gap pile + categories -5. [`TESTS.md`](./TESTS.md) — per-test catalog +2. [`docs/spec-coverage.md`](./docs/spec-coverage.md) — audit table: which test pins which requirement +3. [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md) — gap pile + categories +4. [`skills.md`](./skills.md) — findings (F1–F11), link-coverage rules, per-app admin gating narrative, adding-a-new-app guide, bug-staging workflow +5. [`TESTS.md`](./TESTS.md) — per-test catalog (hand-maintained, may lag — `git log -- tests/` is the truth) When CI is red, [`TRIAGE.md`](./TRIAGE.md) is the 2-minute "what is this and what do I do" runbook — failure pattern → cause → action. diff --git a/README.md b/README.md index 8d6c2fc..b9eadd9 100644 --- a/README.md +++ b/README.md @@ -33,8 +33,9 @@ Adding a test without a requirement → CI fails. Adding a requirement without a test → CI fails. Drift requires explicit human acceptance in both directions. -**Today: 84 requirements, 0 missing** — 52 from the SSO chain openspec, -32 from per-app admin + workspace-isolation + security-hardening skills. +**Today: 88 requirements, 0 missing** — 52 from the SSO chain openspec, +36 from per-app admin + workspace-isolation + security-hardening skills. +Run `make audit` for live counts. When CI is red, [`TRIAGE.md`](./TRIAGE.md) is the 2-minute "failure pattern → cause → action" runbook. diff --git a/docs/archive/README.md b/docs/archive/README.md new file mode 100644 index 0000000..5b82ad8 --- /dev/null +++ b/docs/archive/README.md @@ -0,0 +1,29 @@ +# Archive + +Historical analysis and incident docs that are preserved for reference +but are no longer part of the working documentation set. + +Anything moved here was: + +- **Time-bound** — a snapshot of state at a specific date (test reviews, + RCAs) that doesn't update when the codebase moves on. +- **Decision-shaped** — an analysis written to support a one-time call + (e.g. choosing between two PR approaches) where the decision has + since landed and the analysis is no longer load-bearing. +- **Superseded** — the doc's content has been integrated into the + current working docs and the original is kept only as provenance. + +If you're looking for **current** state of the suite, start with: + +- [`../../README.md`](../../README.md) — what the suite does, how to run it +- [`../../CLAUDE.md`](../../CLAUDE.md) — conventions + per-app gotchas +- [`../spec-coverage.md`](../spec-coverage.md) — current audit table +- [`../spec-coverage-deferred.md`](../spec-coverage-deferred.md) — gap pile +- [`../spec-review-checklist.md`](../spec-review-checklist.md) — PR-time discipline +- [`../../TRIAGE.md`](../../TRIAGE.md) — failure-pattern → action runbook +- [`../../skills.md`](../../skills.md) — findings, link-coverage rules, bug-staging workflow + +Don't edit anything in this directory. If something here is still +load-bearing, promote it back to working docs (and update the working +docs to absorb the content). If nothing depends on it, leave it +archived — `git log` preserves provenance forever. diff --git a/docs/logout-all-contract-evolution.md b/docs/archive/logout-all-contract-evolution.md similarity index 100% rename from docs/logout-all-contract-evolution.md rename to docs/archive/logout-all-contract-evolution.md diff --git a/docs/root-cause-88-2026-05-20.md b/docs/archive/root-cause-88-2026-05-20.md similarity index 100% rename from docs/root-cause-88-2026-05-20.md rename to docs/archive/root-cause-88-2026-05-20.md diff --git a/docs/test-review-2026-05-20.md b/docs/archive/test-review-2026-05-20.md similarity index 100% rename from docs/test-review-2026-05-20.md rename to docs/archive/test-review-2026-05-20.md diff --git a/skills.md b/skills.md index d209786..1ddb16a 100644 --- a/skills.md +++ b/skills.md @@ -1,159 +1,32 @@ -# FOSS App Test Coverage Skill - -This is the test-coverage contract every app on the FOSS platform must satisfy. -Coverage is split between **shared suites** (run once across all 5 apps via -`APPS`) and a **per-app suite** that crawls each app's link surface. Adding a -new app to the platform should require touching one constant and registering -one factory call — nothing more. - -The canonical rule source is the openspec at -[awais786/sso-rules-moneta](https://github.com/awais786/sso-rules-moneta/tree/main/openspec/specs) -— one `spec.md` per capability module (`forwardauth-traefik`, -`oauth2-proxy-gateway`, `proxy-auth-middleware`, `session-lifecycle`, -`logout-flow`, `cognito-claim-mapping`, `workspace-auto-join`). - -§1 below mirrors that contract onto specific test files, organised by -openspec module. §2 lists orthogonal coverage (tests this suite ships -that the openspec doesn't claim — OWASP-aligned defences, edge-layer -hardening, product UX contracts). §3 documents this suite's own -per-app link-coverage contract. - -Live coverage status (regenerated by `scripts/check-spec-coverage.sh`): -**59 ✅ Covered / 25 ⚠️ Deferred / 0 ❌ Missing / 84 requirements** — -see `docs/spec-coverage.md` + `docs/spec-coverage-deferred.md`. +# FOSS E2E — Findings, Link-Coverage, Bug Staging ---- - -## §0 — Fast + reliable run profiles (recommended defaults) - -Use these profiles to keep PR feedback fast while preserving confidence: - -| Profile | When | Command | Notes | -|---|---|---|---| -| PR smoke | Every PR/local iteration | `PW_WORKERS=2 PW_SLOW_MO_MS=0 npm test` | Fast default; no visual slowdown. | -| Security focus | Security rule changes | `PW_WORKERS=2 npm run test:security` | Keeps auth/header regressions isolated and quick. | -| Full confidence | Before release / nightly | `BROWSERS=all PW_WORKERS=2 npm test` | Cross-browser sweep; expect longer runtime. | -| Debug UI | Local troubleshooting only | `PW_DEBUG_VISUAL=1 PW_WORKERS=1 npm test` | Enables 2s visual pacing for human observation. | - -Reliability guardrails for new tests: +Working notes for the e2e suite that don't fit into the audit table. +What lives here: -- Prefer locator state waits (`toBeVisible`, `toHaveURL`, `waitFor`) over fixed `waitForTimeout`. -- Keep assertions host-aware (`isAuthWall`, expected hostname) to catch silent auth regressions. -- Avoid `networkidle` for websocket-heavy SPAs (Twenty); prefer `commit`/`load` + explicit render checks. -- Reuse `cognitoLogin()` and worker storage fixtures; do not duplicate login choreography per spec. -- Keep retries environment-scoped (`CI=1`), and treat local flakiness as a bug to remove. - ---- - -## §1 — openspec coverage - -Tests are tagged at the `test()` callsite with `// @spec #`; -the audit script enforces that every openspec requirement either has such -a tag or appears in `docs/spec-coverage-deferred.md` with a category. - -### `forwardauth-traefik` - -Traefik ForwardAuth integration with oauth2-proxy. - -| Requirement | Coverage | Test | -|---|---|---| -| a single mpass-auth middleware SHALL be defined on the oauth2-proxy service | ✅ | `tests/security/header-spoofing.spec.ts` | -| every protected app router SHALL apply mpass-auth | ✅ | `tests/security/header-spoofing.spec.ts` | -| bypass paths SHALL route via higher-priority routers without mpass-auth | ✅ | `tests/security/bypass-surface.spec.ts`, `tests/security/strip-on-bypass.spec.ts` | -| bypass routes per app SHALL match the documented list | ✅ | `tests/security/bypass-surface.spec.ts` (per-app `APP_BYPASS_EXTRAS` + regression guards for paths moved off bypass), `tests/apps/pm-godmode.spec.ts` | -| header overwrite SHALL be enforced | ✅ | `tests/security/header-spoofing.spec.ts` (partial — see F1 for the strip-vs-replace caveat) | -| backend ports SHALL be bound to 127.0.0.1 only | ⚠️ Deferred — infra-only | -| auth-response headers SHALL include exactly the three required headers | ⚠️ Deferred — infra-only | - -### `oauth2-proxy-gateway` - -The central SSO gateway. - -| Requirement | Coverage | Test | -|---|---|---| -| gateway SHALL run as a single dedicated service | 🟡 Partial | `tests/security/bypass-surface.spec.ts` (every protected path hits the same gateway) | -| cookie domain SHALL be the platform parent domain | ✅ | `tests/auth/session-sharing.spec.ts`, `tests/auth/sso-login.spec.ts` | -| gateway SHALL emit X-Auth-Request-* headers on authenticated responses | ✅ | `tests/auth/identity-consistency.spec.ts` | -| cookie secret SHALL be 32 random bytes, base64-encoded | ✅ (indirect) | `tests/security/cookie-tampering.spec.ts` — proves the HMAC is checked at runtime | -| gateway SHALL use OIDC Discovery against the Cognito issuer | ⚠️ Deferred — infra-only | -| gateway SHALL use a redis-backed session store | ⚠️ Deferred — genuine test gap (cookie-size proxy possible) | -| gateway SHALL pass access token to downstream apps when requested | ⚠️ Deferred — needs infra access | -| gateway SHALL use the configurable identity claim | ⚠️ Deferred — infra-only | -| single shared callback URL | ⚠️ Deferred — infra-only | - -### `proxy-auth-middleware` - -Per-app middleware that turns `X-Auth-Request-*` into a native session. - -| Requirement | Coverage | Test | -|---|---|---| -| Bypass paths SHALL short-circuit before any auth processing | ✅ | `tests/security/bypass-surface.spec.ts`, `tests/apps/pm-godmode.spec.ts` | -| Authenticated sessions with matching or absent proxy identity SHALL short-circuit | 🟡 Partial | `tests/auth/proxy-short-circuit.spec.ts` (Outline observable; other apps lack JS-readable session cookies and self-skip with the contract vacuously satisfied) | -| Identity mismatch SHALL flush the existing session immediately | ✅ | `tests/flows/identity-switch-after-relogin.spec.ts` (Penpot temporarily skipped — cosmetic stale display cache only, mutations are correct) | -| Unauthenticated requests with a valid proxy identity SHALL auto-provision and log in | ✅ | `tests/auth/sso-login.spec.ts` | -| Email normalisation SHALL be applied uniformly | 🟡 Partial | `tests/auth/identity-consistency.spec.ts` (pins final email, not normalisation rules) | -| Concurrent creation races SHALL fall back to read | 🟡 Partial | `tests/auth/concurrent-first-login.spec.ts` (3 parallel browser contexts → assert single identity; behavioural, doesn't count DB rows) | -| email-shape detection SHALL avoid polynomial-backtracking regex | ⚠️ Deferred — covered by per-fork sso-audit.sh static check | - -### `session-lifecycle` - -Two-layer session model and renewal behaviour. - -| Requirement | Coverage | Test | -|---|---|---| -| the system SHALL maintain two distinct session layers | ✅ | `tests/auth/sso-login.spec.ts`, `tests/auth/session-sharing.spec.ts` | -| Layer 2 expiry while Layer 1 is valid SHALL re-establish session from headers | ✅ | `tests/auth/layer2-re-establish.spec.ts` — all 5 apps (clear local cookies/storage, keep SSO, reload → silent re-establish) | -| simultaneous expiry of both layers SHALL redirect to mPass login | 🟡 Partial | `tests/auth/session-lifecycle.spec.ts` (cookie deletion proxies for expiry) | -| Layer-2 session renewal SHALL be guarded against three regression paths | 🟡 Partial | `tests/auth/layer2-renewal-suppressed-on-4xx.spec.ts` (Penpot only — same pattern extends to Outline/Plane) | -| Layer 1 SHALL refresh transparently against OIDC | ⚠️ Deferred — needs time fast-forward | -| Layer 1 expiry while Layer 2 is valid SHALL re-auth transparently | ⚠️ Deferred — needs time fast-forward | -| mPass-side session revocation SHALL be honoured on next refresh | ⚠️ Deferred — Cognito-side | -| per-app session TTLs SHALL be uniformly configurable | ⚠️ Deferred — infra-only | -| bridge state TTL SHALL be 3 minutes | ⚠️ Deferred — needs 3-min wait or TTL mock | - -### `logout-flow` - -Per-app vs portal-level logout semantics. - -| Requirement | Coverage | Test | -|---|---|---| -| per-app "Logout" SHALL be navigation-only | ✅ | `tests/auth/logout-invariants.spec.ts` sub-test 3 — per-app helper in `tests/lib/app-menus.ts` opens each app's user menu; click asserts no `/sign_out` / `/auth/sign-out` / cognito-logout call, SSO cookie untouched | -| portal "logout all" SHALL clear only the _oauth2_proxy cookie | ✅ | `tests/auth/session-lifecycle.spec.ts` + `tests/auth/logout-invariants.spec.ts` sub-test 1 (per-app cookies survive — the "only" half) | -| stale app-native sessions SHALL be reaped on next request, not eagerly | ✅ | `tests/auth/session-lifecycle.spec.ts` ("deleting cookie locks every app"), `tests/flows/cross-tab-logout-propagation.spec.ts` | -| logout SHALL be observable and idempotent | ✅ | `tests/auth/logout-invariants.spec.ts` sub-test 2 (`/oauth2/sign_out` invoked twice → no 5xx) | -| per-app "Logout" SHALL NOT be relied on for security | ⚠️ Deferred — negative policy | -| Cognito SSO teardown is operator-callable but not surfaced as a user action | ⚠️ Deferred — operator-only | -| Cognito allowlist SHALL include the portal main page | ⚠️ Deferred — Cognito-side | - -### `cognito-claim-mapping` - -Cognito ID-token claims → oauth2-proxy headers → per-app user records. - -| Requirement | Coverage | Test | -|---|---|---| -| standard claim → header mapping | ✅ | `tests/auth/identity-consistency.spec.ts` | -| identity claim SHALL be configurable when email is unreliable | ⚠️ Deferred — infra-only (this deployment uses `cognito:username` synthesised via `DEFAULT_EMAIL_DOMAIN`) | -| claim mapping SHALL be the same across cookie / JWT-bearer flow | ⚠️ Deferred — genuine test gap (needs JWT-bearer endpoint exposed) | -| id_token vs access_token audience claim SHALL both be accepted | ⚠️ Deferred — needs access-token exposure | -| display name SHALL be derived without round-trip when possible | ⚠️ Deferred — genuine test gap (network-listener probe on first paint) | - -### `workspace-auto-join` - -How each app onboards a newly-provisioned SSO user. - -| Requirement | Coverage | Test | -|---|---|---| -| auto-join SHALL run on every login, not just on user creation | ⚠️ Deferred — genuine test gap | -| auto-join SHALL skip when no workspace exists yet | ⚠️ Deferred — needs fresh bundle | -| auto-join target SHALL be the oldest workspace | ⚠️ Deferred — needs 2+ workspaces | -| auto-join role SHALL be the app's regular-member role, not Admin or Guest | ⚠️ Deferred — Partial via `tests/apps/*-admin.spec.ts` showing NORMAL_USER is not admin; not yet tagged | -| auto-join SHALL mark onboarding complete on the user profile | ⚠️ Deferred — genuine test gap | -| per-app workspace model SHALL be documented in workspaces.md | ⚠️ Deferred — doc requirement | -| auto-join SHALL NOT leak across apps | ⚠️ Deferred — genuine test gap | +| § | Content | +|---|---| +| §1 | Orthogonal coverage — tests this suite ships that the openspec doesn't claim (OWASP edge-layer, product UX, per-app admin gating) | +| §2 | The per-app link-coverage contract — L1 through L8, enforced by `tests/lib/link-coverage.ts` | +| §3 | Per-app branding assertions (where applicable) | +| §4 | Adding a new app — five-step checklist | +| §5 | Findings + known limitations (F1–F11) — hard-won caveats about what each test catches and what it doesn't | +| §6 | Implementation reference — pointers to the lib files | +| §7 | Bug staging — the `tests/bugs/` lifecycle, including the test-writer agent pipeline | + +For the **openspec coverage table** (which test pins which requirement), +see [`docs/spec-coverage.md`](./docs/spec-coverage.md). For the **gap +pile with reasons**, see [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md). +For **how to walk a spec/test PR**, see +[`docs/spec-review-checklist.md`](./docs/spec-review-checklist.md). + +The canonical SSO rule source is vendored at +[`vendor/openspec/specs/`](./vendor/openspec/specs/) and per-app + +security-hardening rules at [`vendor/openspec/skills/`](./vendor/openspec/skills/). +Run `make audit` for live coverage numbers. --- -## §2 — Orthogonal coverage (not in openspec) +## §1 — Orthogonal coverage (not in openspec) Tests this suite ships that the openspec contract doesn't claim. Each test answers a real production-incident risk that lives outside the @@ -181,8 +54,11 @@ SSO contract's scope. These pin per-app authorization (separate from the SSO contract) and together provide partial evidence that auto-joined NORMAL_USER does -NOT end up with admin access — see `workspace-auto-join#auto-join-role` -above. +NOT end up with admin access. The +`workspace-auto-join#auto-join-role-shall-be-the-app-s-regular-member-role-not-admin-or-guest` +requirement is tracked in [`docs/spec-coverage.md`](./docs/spec-coverage.md) +and its deferred entry in +[`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md). | Test | What it pins | |---|---| @@ -194,7 +70,7 @@ above. --- -## §3 — Per-app link-coverage rules (this skill) +## §2 — Per-app link-coverage rules (this skill) Implemented by `tests/lib/link-coverage.ts` and registered via `registerLinkCoverage({ appName, baseUrl })` from each `tests/apps/.spec.ts`. @@ -246,7 +122,7 @@ all clicks use `force: true` to bypass any remaining overlay. --- -## §4 — Per-app branding (optional) +## §3 — Per-app branding (optional) When an app sets a recognizable ``, assert it once in `tests/apps/<app>.spec.ts`: @@ -260,7 +136,7 @@ When an app sets a recognizable `<title>`, assert it once in --- -## §5 — Adding a new app +## §4 — Adding a new app 1. Add a URL to `APP_URLS` in `constants.ts` (env-overridable via `FOSS_APP_<NAME>`). 2. Create `tests/apps/<app>.spec.ts`: @@ -292,7 +168,7 @@ When an app sets a recognizable `<title>`, assert it once in --- -## §6 — Findings + known limitations +## §5 — Findings + known limitations Tests that catch SOME failure modes but not all, plus deployment-state observations that affect interpretation. @@ -334,7 +210,7 @@ observations that affect interpretation. --- -## §7 — Implementation reference +## §6 — Implementation reference Single source of truth: `tests/lib/link-coverage.ts` @@ -356,7 +232,7 @@ requirement is either covered by a `// @spec` tag or in --- -## §8 — Bug staging (unverified reproductions) +## §7 — Bug staging (unverified reproductions) A fourth class of tests, distinct from §1 (openspec coverage), §2 (orthogonal coverage), and §3 (per-app link coverage): **reproductions