Pressingly · awais786 · Jun 3, 2026 · Jun 3, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -15,10 +15,10 @@ enforced, …) not a single product. Apps are upstream; we don't own their UI.
 ## Read these in order
 
 1. [`README.md`](./README.md) — env setup, run commands, what's covered
-2. [`skills.md`](./skills.md) — the contract organised by openspec module
-3. [`docs/spec-coverage.md`](./docs/spec-coverage.md) — traceability matrix (which test pins which requirement)
-4. [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md) — gap pile + categories
-5. [`TESTS.md`](./TESTS.md) — per-test catalog
+2. [`docs/spec-coverage.md`](./docs/spec-coverage.md) — audit table: which test pins which requirement
+3. [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md) — gap pile + categories
+4. [`skills.md`](./skills.md) — findings (F1–F11), link-coverage rules, per-app admin gating narrative, adding-a-new-app guide, bug-staging workflow
+5. [`TESTS.md`](./TESTS.md) — per-test catalog (hand-maintained, may lag — `git log -- tests/` is the truth)
 
 When CI is red, [`TRIAGE.md`](./TRIAGE.md) is the 2-minute "what is
 this and what do I do" runbook — failure pattern → cause → action.

diff --git a/README.md b/README.md
@@ -33,8 +33,9 @@ Adding a test without a requirement → CI fails. Adding a requirement
 without a test → CI fails. Drift requires explicit human acceptance in
 both directions.
 
-**Today: 84 requirements, 0 missing** — 52 from the SSO chain openspec,
-32 from per-app admin + workspace-isolation + security-hardening skills.
+**Today: 88 requirements, 0 missing** — 52 from the SSO chain openspec,
+36 from per-app admin + workspace-isolation + security-hardening skills.
+Run `make audit` for live counts.
 
 When CI is red, [`TRIAGE.md`](./TRIAGE.md) is the 2-minute "failure
 pattern → cause → action" runbook.
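
Reviewer sketch (not part of the diff): the rule in this hunk, where a test without a requirement fails CI and so does a requirement without a test, amounts to a two-way set comparison. The TypeScript below is a minimal illustration of that check, not the repository's audit script. The `AuditInput` shape, the `audit` function, and the idea that the deferred list counts as "accounted for" in this exact way are assumptions; reading "a test without a requirement" as "a tag that names no known requirement" is also an assumption.

```typescript
// Hypothetical input shape; the real audit may model this differently.
type AuditInput = {
  requirements: string[]; // requirement ids of the form "<module>#<slug>"
  tagged: string[];       // ids collected from `// @spec` tags on tests
  deferred: string[];     // ids listed in the deferred-coverage doc
};

type AuditResult = {
  missing: string[];  // requirement with neither a tag nor a deferred entry
  orphaned: string[]; // tag that points at no known requirement
};

function audit({ requirements, tagged, deferred }: AuditInput): AuditResult {
  const known = new Set(requirements);
  const accounted = new Set([...tagged, ...deferred]);
  return {
    missing: requirements.filter((id) => !accounted.has(id)),
    orphaned: tagged.filter((id) => !known.has(id)),
  };
}

// Drift in either direction should fail the run.
function auditPasses(result: AuditResult): boolean {
  return result.missing.length === 0 && result.orphaned.length === 0;
}
```

Under this reading, "0 missing" in the README corresponds to an empty `missing` list, and a deferred entry is the explicit human acceptance that keeps an untested requirement from failing the check.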

diff --git a/docs/archive/README.md b/docs/archive/README.md
@@ -0,0 +1,29 @@
+# Archive
+
+Historical analysis and incident docs that are preserved for reference
+but are no longer part of the working documentation set.
+
+Anything moved here was:
+
+- **Time-bound** — a snapshot of state at a specific date (test reviews,
+  RCAs) that doesn't update when the codebase moves on.
+- **Decision-shaped** — an analysis written to support a one-time call
+  (e.g. choosing between two PR approaches) where the decision has
+  since landed and the analysis is no longer load-bearing.
+- **Superseded** — the doc's content has been integrated into the
+  current working docs and the original is kept only as provenance.
+
+If you're looking for **current** state of the suite, start with:
+
+- [`../../README.md`](../../README.md) — what the suite does, how to run it
+- [`../../CLAUDE.md`](../../CLAUDE.md) — conventions + per-app gotchas
+- [`../spec-coverage.md`](../spec-coverage.md) — current audit table
+- [`../spec-coverage-deferred.md`](../spec-coverage-deferred.md) — gap pile
+- [`../spec-review-checklist.md`](../spec-review-checklist.md) — PR-time discipline
+- [`../../TRIAGE.md`](../../TRIAGE.md) — failure-pattern → action runbook
+- [`../../skills.md`](../../skills.md) — findings, link-coverage rules, bug-staging workflow
+
+Don't edit anything in this directory. If something here is still
+load-bearing, promote it back to working docs (and update the working
+docs to absorb the content). If nothing depends on it, leave it
+archived — `git log` preserves provenance forever.
diff --git a/docs/logout-all-contract-evolution.md b/.../archive/logout-all-contract-evolution.md
rename from docs/logout-all-contract-evolution.md
rename to .../archive/logout-all-contract-evolution.md
diff --git a/docs/root-cause-88-2026-05-20.md b/docs/archive/root-cause-88-2026-05-20.md
rename from docs/root-cause-88-2026-05-20.md
rename to docs/archive/root-cause-88-2026-05-20.md
diff --git a/docs/test-review-2026-05-20.md b/docs/archive/test-review-2026-05-20.md
rename from docs/test-review-2026-05-20.md
rename to docs/archive/test-review-2026-05-20.md
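
Reviewer sketch (not part of the diff): the `skills.md` change that follows refers to requirements by ids of the form `<module>#<slug>`, carried on tests as `// @spec <module>#<slug>` comments. The TypeScript below shows one slug rule that reproduces the id quoted in that diff from the requirement text in the removed table. The rule itself (lowercase, collapse every run of non-alphanumerics into one hyphen) is inferred from that single example and is an assumption; `slugify`, `specId`, and `extractSpecTags` are hypothetical names, not the repository's helpers.

```typescript
// Assumed rule: lowercase, turn each run of non-alphanumerics into "-",
// then trim hyphens from both ends.
function slugify(requirement: string): string {
  return requirement
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Build the "<module>#<slug>" id used in tags and coverage docs.
function specId(moduleName: string, requirement: string): string {
  return `${moduleName}#${slugify(requirement)}`;
}

// Collect ids from `// @spec <module>#<slug>` comments in a test file's text.
function extractSpecTags(source: string): string[] {
  const pattern = /\/\/\s*@spec\s+([a-z0-9-]+#[a-z0-9-]+)/g;
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    ids.push(match[1]);
  }
  return ids;
}

const id = specId(
  "workspace-auto-join",
  "auto-join role SHALL be the app's regular-member role, not Admin or Guest",
);
```

Here `id` equals the long `workspace-auto-join#auto-join-role-shall-be-…` identifier that replaces the shorter `workspace-auto-join#auto-join-role` reference in the diff, which suggests the new text cites the full generated slug rather than a hand-shortened one.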
diff --git a/skills.md b/skills.md
@@ -1,159 +1,32 @@
-# FOSS App Test Coverage Skill
-
-This is the test-coverage contract every app on the FOSS platform must satisfy.
-Coverage is split between **shared suites** (run once across all 5 apps via
-`APPS`) and a **per-app suite** that crawls each app's link surface. Adding a
-new app to the platform should require touching one constant and registering
-one factory call — nothing more.
-
-The canonical rule source is the openspec at
-[awais786/sso-rules-moneta](https://github.com/awais786/sso-rules-moneta/tree/main/openspec/specs)
-— one `spec.md` per capability module (`forwardauth-traefik`,
-`oauth2-proxy-gateway`, `proxy-auth-middleware`, `session-lifecycle`,
-`logout-flow`, `cognito-claim-mapping`, `workspace-auto-join`).
-
-§1 below mirrors that contract onto specific test files, organised by
-openspec module. §2 lists orthogonal coverage (tests this suite ships
-that the openspec doesn't claim — OWASP-aligned defences, edge-layer
-hardening, product UX contracts). §3 documents this suite's own
-per-app link-coverage contract.
-
-Live coverage status (regenerated by `scripts/check-spec-coverage.sh`):
-**59 ✅ Covered / 25 ⚠️ Deferred / 0 ❌ Missing / 84 requirements** —
-see `docs/spec-coverage.md` + `docs/spec-coverage-deferred.md`.
+# FOSS E2E — Findings, Link-Coverage, Bug Staging
 
----
-
-## §0 — Fast + reliable run profiles (recommended defaults)
-
-Use these profiles to keep PR feedback fast while preserving confidence:
-
-| Profile | When | Command | Notes |
-|---|---|---|---|
-| PR smoke | Every PR/local iteration | `PW_WORKERS=2 PW_SLOW_MO_MS=0 npm test` | Fast default; no visual slowdown. |
-| Security focus | Security rule changes | `PW_WORKERS=2 npm run test:security` | Keeps auth/header regressions isolated and quick. |
-| Full confidence | Before release / nightly | `BROWSERS=all PW_WORKERS=2 npm test` | Cross-browser sweep; expect longer runtime. |
-| Debug UI | Local troubleshooting only | `PW_DEBUG_VISUAL=1 PW_WORKERS=1 npm test` | Enables 2s visual pacing for human observation. |
-
-Reliability guardrails for new tests:
+Working notes for the e2e suite that don't fit into the audit table.
+What lives here:
 
-- Prefer locator state waits (`toBeVisible`, `toHaveURL`, `waitFor`) over fixed `waitForTimeout`.
-- Keep assertions host-aware (`isAuthWall`, expected hostname) to catch silent auth regressions.
-- Avoid `networkidle` for websocket-heavy SPAs (Twenty); prefer `commit`/`load` + explicit render checks.
-- Reuse `cognitoLogin()` and worker storage fixtures; do not duplicate login choreography per spec.
-- Keep retries environment-scoped (`CI=1`), and treat local flakiness as a bug to remove.
-
----
-
-## §1 — openspec coverage
-
-Tests are tagged at the `test()` callsite with `// @spec <module>#<slug>`;
-the audit script enforces that every openspec requirement either has such
-a tag or appears in `docs/spec-coverage-deferred.md` with a category.
-
-### `forwardauth-traefik`
-
-Traefik ForwardAuth integration with oauth2-proxy.
-
-| Requirement | Coverage | Test |
-|---|---|---|
-| a single mpass-auth middleware SHALL be defined on the oauth2-proxy service | ✅ | `tests/security/header-spoofing.spec.ts` |
-| every protected app router SHALL apply mpass-auth | ✅ | `tests/security/header-spoofing.spec.ts` |
-| bypass paths SHALL route via higher-priority routers without mpass-auth | ✅ | `tests/security/bypass-surface.spec.ts`, `tests/security/strip-on-bypass.spec.ts` |
-| bypass routes per app SHALL match the documented list | ✅ | `tests/security/bypass-surface.spec.ts` (per-app `APP_BYPASS_EXTRAS` + regression guards for paths moved off bypass), `tests/apps/pm-godmode.spec.ts` |
-| header overwrite SHALL be enforced | ✅ | `tests/security/header-spoofing.spec.ts` (partial — see F1 for the strip-vs-replace caveat) |
-| backend ports SHALL be bound to 127.0.0.1 only | ⚠️ Deferred — infra-only |
-| auth-response headers SHALL include exactly the three required headers | ⚠️ Deferred — infra-only |
-
-### `oauth2-proxy-gateway`
-
-The central SSO gateway.
-
-| Requirement | Coverage | Test |
-|---|---|---|
-| gateway SHALL run as a single dedicated service | 🟡 Partial | `tests/security/bypass-surface.spec.ts` (every protected path hits the same gateway) |
-| cookie domain SHALL be the platform parent domain | ✅ | `tests/auth/session-sharing.spec.ts`, `tests/auth/sso-login.spec.ts` |
-| gateway SHALL emit X-Auth-Request-* headers on authenticated responses | ✅ | `tests/auth/identity-consistency.spec.ts` |
-| cookie secret SHALL be 32 random bytes, base64-encoded | ✅ (indirect) | `tests/security/cookie-tampering.spec.ts` — proves the HMAC is checked at runtime |
-| gateway SHALL use OIDC Discovery against the Cognito issuer | ⚠️ Deferred — infra-only |
-| gateway SHALL use a redis-backed session store | ⚠️ Deferred — genuine test gap (cookie-size proxy possible) |
-| gateway SHALL pass access token to downstream apps when requested | ⚠️ Deferred — needs infra access |
-| gateway SHALL use the configurable identity claim | ⚠️ Deferred — infra-only |
-| single shared callback URL | ⚠️ Deferred — infra-only |
-
-### `proxy-auth-middleware`
-
-Per-app middleware that turns `X-Auth-Request-*` into a native session.
-
-| Requirement | Coverage | Test |
-|---|---|---|
-| Bypass paths SHALL short-circuit before any auth processing | ✅ | `tests/security/bypass-surface.spec.ts`, `tests/apps/pm-godmode.spec.ts` |
-| Authenticated sessions with matching or absent proxy identity SHALL short-circuit | 🟡 Partial | `tests/auth/proxy-short-circuit.spec.ts` (Outline observable; other apps lack JS-readable session cookies and self-skip with the contract vacuously satisfied) |
-| Identity mismatch SHALL flush the existing session immediately | ✅ | `tests/flows/identity-switch-after-relogin.spec.ts` (Penpot temporarily skipped — cosmetic stale display cache only, mutations are correct) |
-| Unauthenticated requests with a valid proxy identity SHALL auto-provision and log in | ✅ | `tests/auth/sso-login.spec.ts` |
-| Email normalisation SHALL be applied uniformly | 🟡 Partial | `tests/auth/identity-consistency.spec.ts` (pins final email, not normalisation rules) |
-| Concurrent creation races SHALL fall back to read | 🟡 Partial | `tests/auth/concurrent-first-login.spec.ts` (3 parallel browser contexts → assert single identity; behavioural, doesn't count DB rows) |
-| email-shape detection SHALL avoid polynomial-backtracking regex | ⚠️ Deferred — covered by per-fork sso-audit.sh static check |
-
-### `session-lifecycle`
-
-Two-layer session model and renewal behaviour.
-
-| Requirement | Coverage | Test |
-|---|---|---|
-| the system SHALL maintain two distinct session layers | ✅ | `tests/auth/sso-login.spec.ts`, `tests/auth/session-sharing.spec.ts` |
-| Layer 2 expiry while Layer 1 is valid SHALL re-establish session from headers | ✅ | `tests/auth/layer2-re-establish.spec.ts` — all 5 apps (clear local cookies/storage, keep SSO, reload → silent re-establish) |
-| simultaneous expiry of both layers SHALL redirect to mPass login | 🟡 Partial | `tests/auth/session-lifecycle.spec.ts` (cookie deletion proxies for expiry) |
-| Layer-2 session renewal SHALL be guarded against three regression paths | 🟡 Partial | `tests/auth/layer2-renewal-suppressed-on-4xx.spec.ts` (Penpot only — same pattern extends to Outline/Plane) |
-| Layer 1 SHALL refresh transparently against OIDC | ⚠️ Deferred — needs time fast-forward |
-| Layer 1 expiry while Layer 2 is valid SHALL re-auth transparently | ⚠️ Deferred — needs time fast-forward |
-| mPass-side session revocation SHALL be honoured on next refresh | ⚠️ Deferred — Cognito-side |
-| per-app session TTLs SHALL be uniformly configurable | ⚠️ Deferred — infra-only |
-| bridge state TTL SHALL be 3 minutes | ⚠️ Deferred — needs 3-min wait or TTL mock |
-
-### `logout-flow`
-
-Per-app vs portal-level logout semantics.
-
-| Requirement | Coverage | Test |
-|---|---|---|
-| per-app "Logout" SHALL be navigation-only | ✅ | `tests/auth/logout-invariants.spec.ts` sub-test 3 — per-app helper in `tests/lib/app-menus.ts` opens each app's user menu; click asserts no `/sign_out` / `/auth/sign-out` / cognito-logout call, SSO cookie untouched |
-| portal "logout all" SHALL clear only the _oauth2_proxy cookie | ✅ | `tests/auth/session-lifecycle.spec.ts` + `tests/auth/logout-invariants.spec.ts` sub-test 1 (per-app cookies survive — the "only" half) |
-| stale app-native sessions SHALL be reaped on next request, not eagerly | ✅ | `tests/auth/session-lifecycle.spec.ts` ("deleting cookie locks every app"), `tests/flows/cross-tab-logout-propagation.spec.ts` |
-| logout SHALL be observable and idempotent | ✅ | `tests/auth/logout-invariants.spec.ts` sub-test 2 (`/oauth2/sign_out` invoked twice → no 5xx) |
-| per-app "Logout" SHALL NOT be relied on for security | ⚠️ Deferred — negative policy |
-| Cognito SSO teardown is operator-callable but not surfaced as a user action | ⚠️ Deferred — operator-only |
-| Cognito allowlist SHALL include the portal main page | ⚠️ Deferred — Cognito-side |
-
-### `cognito-claim-mapping`
-
-Cognito ID-token claims → oauth2-proxy headers → per-app user records.
-
-| Requirement | Coverage | Test |
-|---|---|---|
-| standard claim → header mapping | ✅ | `tests/auth/identity-consistency.spec.ts` |
-| identity claim SHALL be configurable when email is unreliable | ⚠️ Deferred — infra-only (this deployment uses `cognito:username` synthesised via `DEFAULT_EMAIL_DOMAIN`) |
-| claim mapping SHALL be the same across cookie / JWT-bearer flow | ⚠️ Deferred — genuine test gap (needs JWT-bearer endpoint exposed) |
-| id_token vs access_token audience claim SHALL both be accepted | ⚠️ Deferred — needs access-token exposure |
-| display name SHALL be derived without round-trip when possible | ⚠️ Deferred — genuine test gap (network-listener probe on first paint) |
-
-### `workspace-auto-join`
-
-How each app onboards a newly-provisioned SSO user.
-
-| Requirement | Coverage | Test |
-|---|---|---|
-| auto-join SHALL run on every login, not just on user creation | ⚠️ Deferred — genuine test gap |
-| auto-join SHALL skip when no workspace exists yet | ⚠️ Deferred — needs fresh bundle |
-| auto-join target SHALL be the oldest workspace | ⚠️ Deferred — needs 2+ workspaces |
-| auto-join role SHALL be the app's regular-member role, not Admin or Guest | ⚠️ Deferred — Partial via `tests/apps/*-admin.spec.ts` showing NORMAL_USER is not admin; not yet tagged |
-| auto-join SHALL mark onboarding complete on the user profile | ⚠️ Deferred — genuine test gap |
-| per-app workspace model SHALL be documented in workspaces.md | ⚠️ Deferred — doc requirement |
-| auto-join SHALL NOT leak across apps | ⚠️ Deferred — genuine test gap |
+| § | Content |
+|---|---|
+| §1 | Orthogonal coverage — tests this suite ships that the openspec doesn't claim (OWASP edge-layer, product UX, per-app admin gating) |
+| §2 | The per-app link-coverage contract — L1 through L8, enforced by `tests/lib/link-coverage.ts` |
+| §3 | Per-app branding assertions (where applicable) |
+| §4 | Adding a new app — five-step checklist |
+| §5 | Findings + known limitations (F1–F11) — hard-won caveats about what each test catches and what it doesn't |
+| §6 | Implementation reference — pointers to the lib files |
+| §7 | Bug staging — the `tests/bugs/` lifecycle, including the test-writer agent pipeline |
+
+For the **openspec coverage table** (which test pins which requirement),
+see [`docs/spec-coverage.md`](./docs/spec-coverage.md). For the **gap
+pile with reasons**, see [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md).
+For **how to walk a spec/test PR**, see
+[`docs/spec-review-checklist.md`](./docs/spec-review-checklist.md).
+
+The canonical SSO rule source is vendored at
+[`vendor/openspec/specs/`](./vendor/openspec/specs/); the per-app and
+security-hardening rules live at [`vendor/openspec/skills/`](./vendor/openspec/skills/).
+Run `make audit` for live coverage numbers.
 
 ---
 
-## §2 — Orthogonal coverage (not in openspec)
+## §1 — Orthogonal coverage (not in openspec)
 
 Tests this suite ships that the openspec contract doesn't claim. Each
 test answers a real production-incident risk that lives outside the
@@ -181,8 +54,11 @@ SSO contract's scope.
 
 These pin per-app authorization (separate from the SSO contract) and
 together provide partial evidence that auto-joined NORMAL_USER does
-NOT end up with admin access — see `workspace-auto-join#auto-join-role`
-above.
+NOT end up with admin access. The
+`workspace-auto-join#auto-join-role-shall-be-the-app-s-regular-member-role-not-admin-or-guest`
+requirement is tracked in [`docs/spec-coverage.md`](./docs/spec-coverage.md),
+with its deferred entry in
+[`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md).
 
 | Test | What it pins |
 |---|---|
@@ -194,7 +70,7 @@ above.
 
 ---
 
-## §3 — Per-app link-coverage rules (this skill)
+## §2 — Per-app link-coverage rules (this skill)
 
 Implemented by `tests/lib/link-coverage.ts` and registered via
 `registerLinkCoverage({ appName, baseUrl })` from each `tests/apps/<app>.spec.ts`.
@@ -246,7 +122,7 @@ all clicks use `force: true` to bypass any remaining overlay.
 
 ---
 
-## §4 — Per-app branding (optional)
+## §3 — Per-app branding (optional)
 
 When an app sets a recognizable `<title>`, assert it once in
 `tests/apps/<app>.spec.ts`:
@@ -260,7 +136,7 @@ When an app sets a recognizable `<title>`, assert it once in
 
 ---
 
-## §5 — Adding a new app
+## §4 — Adding a new app
 
 1. Add a URL to `APP_URLS` in `constants.ts` (env-overridable via `FOSS_APP_<NAME>`).
 2. Create `tests/apps/<app>.spec.ts`:
@@ -292,7 +168,7 @@ When an app sets a recognizable `<title>`, assert it once in
 
 ---
 
-## §6 — Findings + known limitations
+## §5 — Findings + known limitations
 
 Tests that catch SOME failure modes but not all, plus deployment-state
 observations that affect interpretation.
@@ -334,7 +210,7 @@ observations that affect interpretation.
 
 ---
 
-## §7 — Implementation reference
+## §6 — Implementation reference
 
 Single source of truth: `tests/lib/link-coverage.ts`
 
@@ -356,7 +232,7 @@ requirement is either covered by a `// @spec` tag or in
 
 ---
 
-## §8 — Bug staging (unverified reproductions)
+## §7 — Bug staging (unverified reproductions)
 
 A fourth class of tests, distinct from §1 (openspec coverage), §2
 (orthogonal coverage), and §3 (per-app link coverage): **reproductions