Merged
8 changes: 4 additions & 4 deletions CLAUDE.md
@@ -15,10 +15,10 @@ enforced, …) not a single product. Apps are upstream; we don't own their UI.
## Read these in order

1. [`README.md`](./README.md) — env setup, run commands, what's covered
2. [`skills.md`](./skills.md) — the contract organised by openspec module
3. [`docs/spec-coverage.md`](./docs/spec-coverage.md) — traceability matrix (which test pins which requirement)
4. [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md) — gap pile + categories
5. [`TESTS.md`](./TESTS.md) — per-test catalog
2. [`docs/spec-coverage.md`](./docs/spec-coverage.md) — audit table: which test pins which requirement
3. [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md) — gap pile + categories
4. [`skills.md`](./skills.md) — findings (F1–F11), link-coverage rules, per-app admin gating narrative, adding-a-new-app guide, bug-staging workflow
5. [`TESTS.md`](./TESTS.md) — per-test catalog (hand-maintained, may lag — `git log -- tests/` is the truth)

When CI is red, [`TRIAGE.md`](./TRIAGE.md) is the 2-minute "what is
this and what do I do" runbook — failure pattern → cause → action.
5 changes: 3 additions & 2 deletions README.md
@@ -33,8 +33,9 @@ Adding a test without a requirement → CI fails. Adding a requirement
without a test → CI fails. Drift requires explicit human acceptance in
both directions.

**Today: 84 requirements, 0 missing** — 52 from the SSO chain openspec,
32 from per-app admin + workspace-isolation + security-hardening skills.
**Today: 88 requirements, 0 missing** — 52 from the SSO chain openspec,
36 from per-app admin + workspace-isolation + security-hardening skills.
Run `make audit` for live counts.
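
The two-direction rule above can be read as a set comparison. The sketch below is only an illustration of that idea, not the suite's audit: `findDrift`, `DriftReport`, the `accepted` set (standing in for explicit human acceptance), and the sample IDs are all assumptions introduced here.

```typescript
// Hypothetical sketch: none of these names come from the repository.
interface DriftReport {
  untestedRequirements: string[]; // requirement with no test and no explicit acceptance
  orphanTags: string[]; // test tag that points at no known requirement
}

function findDrift(
  requirements: ReadonlySet<string>,
  testTags: ReadonlySet<string>,
  accepted: ReadonlySet<string>, // gaps a human has explicitly signed off on
): DriftReport {
  const untestedRequirements = Array.from(requirements)
    .filter((id) => !testTags.has(id) && !accepted.has(id))
    .sort();
  const orphanTags = Array.from(testTags)
    .filter((id) => !requirements.has(id))
    .sort();
  return { untestedRequirements, orphanTags };
}

// CI would go red if either direction shows drift.
function auditPasses(driftReport: DriftReport): boolean {
  return (
    driftReport.untestedRequirements.length === 0 &&
    driftReport.orphanTags.length === 0
  );
}

// Illustrative IDs only.
const sampleDrift = findDrift(
  new Set(["logout-flow#a", "logout-flow#b", "session-lifecycle#c"]),
  new Set(["logout-flow#a", "logout-flow#zzz"]),
  new Set(["session-lifecycle#c"]),
);
```

In this sample, `logout-flow#b` has neither a test nor an acceptance entry, and `logout-flow#zzz` tags a requirement that does not exist, so both directions report drift.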

When CI is red, [`TRIAGE.md`](./TRIAGE.md) is the 2-minute "failure
pattern → cause → action" runbook.
29 changes: 29 additions & 0 deletions docs/archive/README.md
@@ -0,0 +1,29 @@
# Archive

Historical analysis and incident docs that are preserved for reference
but are no longer part of the working documentation set.

Anything moved here was:

- **Time-bound** — a snapshot of state at a specific date (test reviews,
RCAs) that doesn't update when the codebase moves on.
- **Decision-shaped** — an analysis written to support a one-time call
(e.g. choosing between two PR approaches) where the decision has
since landed and the analysis is no longer load-bearing.
- **Superseded** — the doc's content has been integrated into the
current working docs and the original is kept only as provenance.

If you're looking for **current** state of the suite, start with:

- [`../../README.md`](../../README.md) — what the suite does, how to run it
- [`../../CLAUDE.md`](../../CLAUDE.md) — conventions + per-app gotchas
- [`../spec-coverage.md`](../spec-coverage.md) — current audit table
- [`../spec-coverage-deferred.md`](../spec-coverage-deferred.md) — gap pile
- [`../spec-review-checklist.md`](../spec-review-checklist.md) — PR-time discipline
- [`../../TRIAGE.md`](../../TRIAGE.md) — failure-pattern → action runbook
- [`../../skills.md`](../../skills.md) — findings, link-coverage rules, bug-staging workflow

Don't edit anything in this directory. If something here is still
load-bearing, promote it back to working docs (and update the working
docs to absorb the content). If nothing depends on it, leave it
archived — `git log` preserves provenance forever.
File renamed without changes.
194 changes: 35 additions & 159 deletions skills.md
@@ -1,159 +1,32 @@
# FOSS App Test Coverage Skill

This is the test-coverage contract every app on the FOSS platform must satisfy.
Coverage is split between **shared suites** (run once across all 5 apps via
`APPS`) and a **per-app suite** that crawls each app's link surface. Adding a
new app to the platform should require touching one constant and registering
one factory call — nothing more.

The canonical rule source is the openspec at
[awais786/sso-rules-moneta](https://github.com/awais786/sso-rules-moneta/tree/main/openspec/specs)
— one `spec.md` per capability module (`forwardauth-traefik`,
`oauth2-proxy-gateway`, `proxy-auth-middleware`, `session-lifecycle`,
`logout-flow`, `cognito-claim-mapping`, `workspace-auto-join`).

§1 below mirrors that contract onto specific test files, organised by
openspec module. §2 lists orthogonal coverage (tests this suite ships
that the openspec doesn't claim — OWASP-aligned defences, edge-layer
hardening, product UX contracts). §3 documents this suite's own
per-app link-coverage contract.

Live coverage status (regenerated by `scripts/check-spec-coverage.sh`):
**59 ✅ Covered / 25 ⚠️ Deferred / 0 ❌ Missing / 84 requirements** —
see `docs/spec-coverage.md` + `docs/spec-coverage-deferred.md`.
# FOSS E2E — Findings, Link-Coverage, Bug Staging

---

## §0 — Fast + reliable run profiles (recommended defaults)

Use these profiles to keep PR feedback fast while preserving confidence:

| Profile | When | Command | Notes |
|---|---|---|---|
| PR smoke | Every PR/local iteration | `PW_WORKERS=2 PW_SLOW_MO_MS=0 npm test` | Fast default; no visual slowdown. |
| Security focus | Security rule changes | `PW_WORKERS=2 npm run test:security` | Keeps auth/header regressions isolated and quick. |
| Full confidence | Before release / nightly | `BROWSERS=all PW_WORKERS=2 npm test` | Cross-browser sweep; expect longer runtime. |
| Debug UI | Local troubleshooting only | `PW_DEBUG_VISUAL=1 PW_WORKERS=1 npm test` | Enables 2s visual pacing for human observation. |
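
As a rough illustration of how the variables in the table might be interpreted, here is a minimal sketch. It is not the suite's Playwright config: `resolveRunProfile`, the `RunProfile` shape, the default of one worker, the single-browser default, and the rule that an explicit `PW_SLOW_MO_MS` overrides debug pacing are all assumptions. Only the 2s pacing for `PW_DEBUG_VISUAL=1` and the meaning of `BROWSERS=all` as a cross-browser sweep come from the table.

```typescript
// Hypothetical sketch: the real config may resolve these variables differently.
type Env = Record<string, string | undefined>;

interface RunProfile {
  workers: number;
  slowMoMs: number;
  browsers: string[];
}

// Playwright's three bundled browser engines.
const ALL_BROWSERS: readonly string[] = ["chromium", "firefox", "webkit"];

function resolveRunProfile(env: Env): RunProfile {
  const debugVisual = env["PW_DEBUG_VISUAL"] === "1";
  const slowMoRaw = env["PW_SLOW_MO_MS"];
  const workersRaw = env["PW_WORKERS"];

  // Assumption: an explicit PW_SLOW_MO_MS wins; otherwise debug mode paces at 2s.
  const slowMoMs =
    slowMoRaw !== undefined ? Number(slowMoRaw) : debugVisual ? 2000 : 0;
  // Assumption: one worker when PW_WORKERS is unset.
  const workers = workersRaw !== undefined ? Number(workersRaw) : 1;
  // Assumption: a single engine unless BROWSERS=all is requested.
  const browsers = env["BROWSERS"] === "all" ? [...ALL_BROWSERS] : ["chromium"];

  return { workers, slowMoMs, browsers };
}

const prSmoke = resolveRunProfile({ PW_WORKERS: "2", PW_SLOW_MO_MS: "0" });
const debugUi = resolveRunProfile({ PW_DEBUG_VISUAL: "1", PW_WORKERS: "1" });
const fullConfidence = resolveRunProfile({ BROWSERS: "all", PW_WORKERS: "2" });
```

Taking the environment as a plain record rather than reading `process.env` directly keeps the function pure, so each profile row can be checked without launching a browser.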

Reliability guardrails for new tests:
Working notes for the e2e suite that don't fit into the audit table.
What lives here:

- Prefer locator state waits (`toBeVisible`, `toHaveURL`, `waitFor`) over fixed `waitForTimeout`.
- Keep assertions host-aware (`isAuthWall`, expected hostname) to catch silent auth regressions.
- Avoid `networkidle` for websocket-heavy SPAs (Twenty); prefer `commit`/`load` + explicit render checks.
- Reuse `cognitoLogin()` and worker storage fixtures; do not duplicate login choreography per spec.
- Keep retries environment-scoped (`CI=1`), and treat local flakiness as a bug to remove.

---

## §1 — openspec coverage

Tests are tagged at the `test()` callsite with `// @spec <module>#<slug>`;
the audit script enforces that every openspec requirement either has such
a tag or appears in `docs/spec-coverage-deferred.md` with a category.
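
The tagging rule above could be checked along the following lines. This is a sketch under stated assumptions, not a port of `scripts/check-spec-coverage.sh`: the function names, the slug character set `[a-z0-9-]`, and the sample requirement IDs are introduced here for illustration.

```typescript
// Hypothetical sketch: the real audit lives in scripts/check-spec-coverage.sh.
type Coverage = "covered" | "deferred" | "missing";

// Collects every `// @spec <module>#<slug>` tag found in a test file's source text.
function extractSpecTags(source: string): string[] {
  const pattern = /\/\/\s*@spec\s+([a-z0-9-]+)#([a-z0-9-]+)/g; // assumed slug alphabet
  const tags: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    tags.push(`${match[1]}#${match[2]}`);
  }
  return tags;
}

// A requirement is covered by a tag, deferred by an explicit entry, or missing.
function classify(
  id: string,
  tagged: ReadonlySet<string>,
  deferred: ReadonlySet<string>,
): Coverage {
  if (tagged.has(id)) return "covered";
  if (deferred.has(id)) return "deferred";
  return "missing";
}

function summarize(
  requirements: readonly string[],
  tagged: ReadonlySet<string>,
  deferred: ReadonlySet<string>,
): Record<Coverage, number> {
  const counts: Record<Coverage, number> = { covered: 0, deferred: 0, missing: 0 };
  for (const id of requirements) {
    counts[classify(id, tagged, deferred)] += 1;
  }
  return counts;
}

// Illustrative input only.
const sampleSpecSource = [
  "// @spec logout-flow#logout-shall-be-observable-and-idempotent",
  'test("sign_out twice returns no 5xx", async () => {});',
].join("\n");

const sampleTags = extractSpecTags(sampleSpecSource);
const sampleCounts = summarize(
  [
    "logout-flow#logout-shall-be-observable-and-idempotent",
    "session-lifecycle#bridge-state-ttl-shall-be-3-minutes",
    "logout-flow#some-untracked-requirement",
  ],
  new Set(sampleTags),
  new Set(["session-lifecycle#bridge-state-ttl-shall-be-3-minutes"]),
);
```

Any non-zero `missing` count is the condition the audit is described as rejecting.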

### `forwardauth-traefik`

Traefik ForwardAuth integration with oauth2-proxy.

| Requirement | Coverage | Test |
|---|---|---|
| a single mpass-auth middleware SHALL be defined on the oauth2-proxy service | ✅ | `tests/security/header-spoofing.spec.ts` |
| every protected app router SHALL apply mpass-auth | ✅ | `tests/security/header-spoofing.spec.ts` |
| bypass paths SHALL route via higher-priority routers without mpass-auth | ✅ | `tests/security/bypass-surface.spec.ts`, `tests/security/strip-on-bypass.spec.ts` |
| bypass routes per app SHALL match the documented list | ✅ | `tests/security/bypass-surface.spec.ts` (per-app `APP_BYPASS_EXTRAS` + regression guards for paths moved off bypass), `tests/apps/pm-godmode.spec.ts` |
| header overwrite SHALL be enforced | ✅ | `tests/security/header-spoofing.spec.ts` (partial — see F1 for the strip-vs-replace caveat) |
| backend ports SHALL be bound to 127.0.0.1 only | ⚠️ Deferred — infra-only |
| auth-response headers SHALL include exactly the three required headers | ⚠️ Deferred — infra-only |

### `oauth2-proxy-gateway`

The central SSO gateway.

| Requirement | Coverage | Test |
|---|---|---|
| gateway SHALL run as a single dedicated service | 🟡 Partial | `tests/security/bypass-surface.spec.ts` (every protected path hits the same gateway) |
| cookie domain SHALL be the platform parent domain | ✅ | `tests/auth/session-sharing.spec.ts`, `tests/auth/sso-login.spec.ts` |
| gateway SHALL emit X-Auth-Request-* headers on authenticated responses | ✅ | `tests/auth/identity-consistency.spec.ts` |
| cookie secret SHALL be 32 random bytes, base64-encoded | ✅ (indirect) | `tests/security/cookie-tampering.spec.ts` — proves the HMAC is checked at runtime |
| gateway SHALL use OIDC Discovery against the Cognito issuer | ⚠️ Deferred — infra-only |
| gateway SHALL use a redis-backed session store | ⚠️ Deferred — genuine test gap (cookie-size proxy possible) |
| gateway SHALL pass access token to downstream apps when requested | ⚠️ Deferred — needs infra access |
| gateway SHALL use the configurable identity claim | ⚠️ Deferred — infra-only |
| single shared callback URL | ⚠️ Deferred — infra-only |

### `proxy-auth-middleware`

Per-app middleware that turns `X-Auth-Request-*` into a native session.

| Requirement | Coverage | Test |
|---|---|---|
| Bypass paths SHALL short-circuit before any auth processing | ✅ | `tests/security/bypass-surface.spec.ts`, `tests/apps/pm-godmode.spec.ts` |
| Authenticated sessions with matching or absent proxy identity SHALL short-circuit | 🟡 Partial | `tests/auth/proxy-short-circuit.spec.ts` (Outline observable; other apps lack JS-readable session cookies and self-skip with the contract vacuously satisfied) |
| Identity mismatch SHALL flush the existing session immediately | ✅ | `tests/flows/identity-switch-after-relogin.spec.ts` (Penpot temporarily skipped — cosmetic stale display cache only, mutations are correct) |
| Unauthenticated requests with a valid proxy identity SHALL auto-provision and log in | ✅ | `tests/auth/sso-login.spec.ts` |
| Email normalisation SHALL be applied uniformly | 🟡 Partial | `tests/auth/identity-consistency.spec.ts` (pins final email, not normalisation rules) |
| Concurrent creation races SHALL fall back to read | 🟡 Partial | `tests/auth/concurrent-first-login.spec.ts` (3 parallel browser contexts → assert single identity; behavioural, doesn't count DB rows) |
| email-shape detection SHALL avoid polynomial-backtracking regex | ⚠️ Deferred — covered by per-fork sso-audit.sh static check |

### `session-lifecycle`

Two-layer session model and renewal behaviour.

| Requirement | Coverage | Test |
|---|---|---|
| the system SHALL maintain two distinct session layers | ✅ | `tests/auth/sso-login.spec.ts`, `tests/auth/session-sharing.spec.ts` |
| Layer 2 expiry while Layer 1 is valid SHALL re-establish session from headers | ✅ | `tests/auth/layer2-re-establish.spec.ts` — all 5 apps (clear local cookies/storage, keep SSO, reload → silent re-establish) |
| simultaneous expiry of both layers SHALL redirect to mPass login | 🟡 Partial | `tests/auth/session-lifecycle.spec.ts` (cookie deletion proxies for expiry) |
| Layer-2 session renewal SHALL be guarded against three regression paths | 🟡 Partial | `tests/auth/layer2-renewal-suppressed-on-4xx.spec.ts` (Penpot only — same pattern extends to Outline/Plane) |
| Layer 1 SHALL refresh transparently against OIDC | ⚠️ Deferred — needs time fast-forward |
| Layer 1 expiry while Layer 2 is valid SHALL re-auth transparently | ⚠️ Deferred — needs time fast-forward |
| mPass-side session revocation SHALL be honoured on next refresh | ⚠️ Deferred — Cognito-side |
| per-app session TTLs SHALL be uniformly configurable | ⚠️ Deferred — infra-only |
| bridge state TTL SHALL be 3 minutes | ⚠️ Deferred — needs 3-min wait or TTL mock |

### `logout-flow`

Per-app vs portal-level logout semantics.

| Requirement | Coverage | Test |
|---|---|---|
| per-app "Logout" SHALL be navigation-only | ✅ | `tests/auth/logout-invariants.spec.ts` sub-test 3 — per-app helper in `tests/lib/app-menus.ts` opens each app's user menu; click asserts no `/sign_out` / `/auth/sign-out` / cognito-logout call, SSO cookie untouched |
| portal "logout all" SHALL clear only the _oauth2_proxy cookie | ✅ | `tests/auth/session-lifecycle.spec.ts` + `tests/auth/logout-invariants.spec.ts` sub-test 1 (per-app cookies survive — the "only" half) |
| stale app-native sessions SHALL be reaped on next request, not eagerly | ✅ | `tests/auth/session-lifecycle.spec.ts` ("deleting cookie locks every app"), `tests/flows/cross-tab-logout-propagation.spec.ts` |
| logout SHALL be observable and idempotent | ✅ | `tests/auth/logout-invariants.spec.ts` sub-test 2 (`/oauth2/sign_out` invoked twice → no 5xx) |
| per-app "Logout" SHALL NOT be relied on for security | ⚠️ Deferred — negative policy |
| Cognito SSO teardown is operator-callable but not surfaced as a user action | ⚠️ Deferred — operator-only |
| Cognito allowlist SHALL include the portal main page | ⚠️ Deferred — Cognito-side |

### `cognito-claim-mapping`

Cognito ID-token claims → oauth2-proxy headers → per-app user records.

| Requirement | Coverage | Test |
|---|---|---|
| standard claim → header mapping | ✅ | `tests/auth/identity-consistency.spec.ts` |
| identity claim SHALL be configurable when email is unreliable | ⚠️ Deferred — infra-only (this deployment uses `cognito:username` synthesised via `DEFAULT_EMAIL_DOMAIN`) |
| claim mapping SHALL be the same across cookie / JWT-bearer flow | ⚠️ Deferred — genuine test gap (needs JWT-bearer endpoint exposed) |
| id_token vs access_token audience claim SHALL both be accepted | ⚠️ Deferred — needs access-token exposure |
| display name SHALL be derived without round-trip when possible | ⚠️ Deferred — genuine test gap (network-listener probe on first paint) |

### `workspace-auto-join`

How each app onboards a newly-provisioned SSO user.

| Requirement | Coverage | Test |
|---|---|---|
| auto-join SHALL run on every login, not just on user creation | ⚠️ Deferred — genuine test gap |
| auto-join SHALL skip when no workspace exists yet | ⚠️ Deferred — needs fresh bundle |
| auto-join target SHALL be the oldest workspace | ⚠️ Deferred — needs 2+ workspaces |
| auto-join role SHALL be the app's regular-member role, not Admin or Guest | ⚠️ Deferred — Partial via `tests/apps/*-admin.spec.ts` showing NORMAL_USER is not admin; not yet tagged |
| auto-join SHALL mark onboarding complete on the user profile | ⚠️ Deferred — genuine test gap |
| per-app workspace model SHALL be documented in workspaces.md | ⚠️ Deferred — doc requirement |
| auto-join SHALL NOT leak across apps | ⚠️ Deferred — genuine test gap |
| § | Content |
|---|---|
| §1 | Orthogonal coverage — tests this suite ships that the openspec doesn't claim (OWASP edge-layer, product UX, per-app admin gating) |
| §2 | The per-app link-coverage contract — L1 through L8, enforced by `tests/lib/link-coverage.ts` |
| §3 | Per-app branding assertions (where applicable) |
| §4 | Adding a new app — five-step checklist |
| §5 | Findings + known limitations (F1–F11) — hard-won caveats about what each test catches and what it doesn't |
| §6 | Implementation reference — pointers to the lib files |
| §7 | Bug staging — the `tests/bugs/` lifecycle, including the test-writer agent pipeline |

For the **openspec coverage table** (which test pins which requirement),
see [`docs/spec-coverage.md`](./docs/spec-coverage.md). For the **gap
pile with reasons**, see [`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md).
For **how to walk a spec/test PR**, see
[`docs/spec-review-checklist.md`](./docs/spec-review-checklist.md).

The canonical SSO rule source is vendored at
[`vendor/openspec/specs/`](./vendor/openspec/specs/) and per-app +
security-hardening rules at [`vendor/openspec/skills/`](./vendor/openspec/skills/).
Run `make audit` for live coverage numbers.

---

## §2 — Orthogonal coverage (not in openspec)
## §1 — Orthogonal coverage (not in openspec)

Tests this suite ships that the openspec contract doesn't claim. Each
test answers a real production-incident risk that lives outside the
@@ -181,8 +54,11 @@ SSO contract's scope.

These pin per-app authorization (separate from the SSO contract) and
together provide partial evidence that auto-joined NORMAL_USER does
NOT end up with admin access — see `workspace-auto-join#auto-join-role`
above.
NOT end up with admin access. The
`workspace-auto-join#auto-join-role-shall-be-the-app-s-regular-member-role-not-admin-or-guest`
requirement is tracked in [`docs/spec-coverage.md`](./docs/spec-coverage.md)
and its deferred entry in
[`docs/spec-coverage-deferred.md`](./docs/spec-coverage-deferred.md).
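
The long tag above looks like the requirement title lowercased, with every run of non-alphanumeric characters collapsed to a hyphen. The sketch below reproduces that one tag under this assumption; the rule is inferred from the single example, and `requirementSlug` and `specTag` are hypothetical names, not functions from the audit script.

```typescript
// Hypothetical sketch: slug rule inferred from one tag, not read from the audit script.
function requirementSlug(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // each run of non-alphanumerics becomes one hyphen
    .replace(/^-+|-+$/g, ""); // drop hyphens left at either end
}

function specTag(moduleName: string, title: string): string {
  return `${moduleName}#${requirementSlug(title)}`;
}

const autoJoinRoleTag = specTag(
  "workspace-auto-join",
  "auto-join role SHALL be the app's regular-member role, not Admin or Guest",
);
// autoJoinRoleTag ===
//   "workspace-auto-join#auto-join-role-shall-be-the-app-s-regular-member-role-not-admin-or-guest"
```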

| Test | What it pins |
|---|---|
@@ -194,7 +70,7 @@ above.

---

## §3 — Per-app link-coverage rules (this skill)
## §2 — Per-app link-coverage rules (this skill)

Implemented by `tests/lib/link-coverage.ts` and registered via
`registerLinkCoverage({ appName, baseUrl })` from each `tests/apps/<app>.spec.ts`.
@@ -246,7 +122,7 @@ all clicks use `force: true` to bypass any remaining overlay.

---

## §4 — Per-app branding (optional)
## §3 — Per-app branding (optional)

When an app sets a recognizable `<title>`, assert it once in
`tests/apps/<app>.spec.ts`:
@@ -260,7 +136,7 @@ When an app sets a recognizable `<title>`, assert it once in

---

## §5 — Adding a new app
## §4 — Adding a new app

1. Add a URL to `APP_URLS` in `constants.ts` (env-overridable via `FOSS_APP_<NAME>`).
2. Create `tests/apps/<app>.spec.ts`:
@@ -292,7 +168,7 @@ When an app sets a recognizable `<title>`, assert it once in

---

## §6 — Findings + known limitations
## §5 — Findings + known limitations

Tests that catch SOME failure modes but not all, plus deployment-state
observations that affect interpretation.
@@ -334,7 +210,7 @@ observations that affect interpretation.

---

## §7 — Implementation reference
## §6 — Implementation reference

Single source of truth: `tests/lib/link-coverage.ts`

@@ -356,7 +232,7 @@ requirement is either covered by a `// @spec` tag or in

---

## §8 — Bug staging (unverified reproductions)
## §7 — Bug staging (unverified reproductions)

A fourth class of tests, distinct from §1 (openspec coverage), §2
(orthogonal coverage), and §3 (per-app link coverage): **reproductions