From 62a07685bd29ee6c98dbeab222aa0ba65b5fb703 Mon Sep 17 00:00:00 2001
From: TurtleWolfe
Date: Wed, 13 May 2026 02:33:55 +0000
Subject: [PATCH 1/2] fix(ci): serialize E2E runs via repo-wide concurrency
 mutex
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The E2E suite runs against the project's single Supabase Cloud instance
(huvitqubafsrazpjxsax) with shared test users, shared conversation rows,
and beforeAll cleanupOldMessages hooks in 11+ messaging spec files.
Concurrent runs against the same backend race each other and dogpile the
shared state, causing the 60-min GitHub Actions job timeout cap to bite.

Verified on 2026-05-13:

- Run 25769955636 (main, post-#86 merge, started 00:14:46Z) had its
  `E2E (firefox-msg 1/1)` job cancelled at exactly 60 min during the
  "Run E2E tests" step. The last operation in the captured log: a Realtime
  subscription waitForSelector on `body[data-messages-subscribed]`.
- Run 25770068787 (PR #88 docs cleanup, started 00:18:07Z — only 3 minutes
  later) ran against the same Supabase project. Direct DB queries confirm
  that conversation `4105492f-1047-40ef-bd6e-411788850547` (the one the
  cancelled main test was using at "Step 8: Conversation ID:") was created
  by PR #88's run at 00:21:55Z, not by main's run.
- The cancelled run's tests detected "Connection already exists /
  Conversation already exists", reused PR #88's data, and then watched
  PR #88's beforeAll hooks wipe it at each of 11 spec boundaries.
- The progressive per-spec slowdown across the cancel window (35s, 35s,
  2m45s, 1m, 6m48s, 5m6s, 8m36s, 12m16s, 10m11s, 10m46s, cancelled at
  10m46s) is the cumulative effect of cross-run data races, not Supabase
  rate limiting (verified: zero rate-limit signals in the Supabase health
  endpoint, and zero auth.audit_log_entries during the window — Supabase
  internal audit is disabled on this project).

Three earlier diagnostic theories were wrong because they all assumed
single-actor causation:

1. A retry-heavy specific test (the complete-user-workflow.spec.ts retry
   loop) — disproved: that test passed on its first attempt at 00:49:26Z.
2. Rate limiting — disproved: ACTIVE_HEALTHY across all Supabase services,
   no 429s.
3. Cumulative latency from one run's request volume — disproved: two runs
   were in flight, not one.

The fix is structural: a GitHub Actions native `concurrency:` group with a
repo-wide key. All E2E runs across all refs (main, PRs, schedule) share
one mutex; concurrent runs queue rather than race.

Changes to .github/workflows/e2e.yml:

Before:

    concurrency:
      group: e2e-${{ github.ref }}  # per-branch
      cancel-in-progress: true      # kill running on new push

After:

    concurrency:
      group: e2e-supabase-${{ github.repository }}  # repo-wide
      cancel-in-progress: false                     # queue, don't cancel

Why both fields had to change:

- Just flipping cancel-in-progress wouldn't fix anything, because the race
  is cross-branch (main vs PR), not within one branch.
- Just changing the group key without cancel-in-progress: false would let
  pushes mid-run abort each other and leave Supabase in a half-cleaned
  state.
- ${{ github.repository }} keeps forks isolated from upstream — each fork
  gets its own mutex, preserving the template story.

Trade-off accepted: PRs land slightly slower when 2+ are pushed close
together (queue wait up to ~45 min worst case). At this repo's PR volume
(~5/day max), the queue is rarely hit. Slower-by-design beats 60-min
cancellation + re-run + human debug time.
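
For illustration, the hook shape that races across runs (a minimal sketch:
cleanupOldMessages is the suite's real hook name, but the body shown here
is an assumed shape, not code copied from the repo):

    import { test } from '@playwright/test';

    // Each of the 11+ messaging spec files runs a hook of this shape. With
    // two CI runs sharing one Supabase project, run B's hook deletes the
    // conversation rows that run A's in-flight tests are still polling for.
    test.beforeAll(async () => {
      await cleanupOldMessages(); // assumed: deletes shared message rows
    });

With the repo-wide mutex, only one run's hooks touch the shared rows at a
time, so the polling loops see stable data.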

What this fix does NOT do (and why each choice is correct):

- Does NOT bump `timeout-minutes: 60` — masking the budget is exactly what
  9 prior rounds of flake mitigation did, and it got us here.
- Does NOT shard messaging tests N-way — the current 1/1 shard works fine
  when not contended (28 min on chromium-msg when alone).
- Does NOT drop retries — the retries weren't the problem.
- Does NOT replace the hand-rolled retry loops in
  complete-user-workflow.spec.ts:437-462 or
  encrypted-messaging.spec.ts:311,580 — those loops were doing the right
  thing; they polled while a different CI run was deleting their data.
  With the mutex in place, they'll succeed on attempt 1.

Lessons captured in memory:

- ~/.claude/projects/-home-TurtleWolfe-repos-ScriptHammer/memory/lesson_concurrent_ci_shared_backend.md
- ~/.claude/projects/-home-TurtleWolfe-repos-ScriptHammer/memory/feedback_branch_hygiene.md

This closes round 10 of the E2E flake pattern (STATUS.md:127-129) by
enforcing the actual underlying invariant: one writer at a time against
the shared Supabase backend.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .github/workflows/e2e.yml | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/e2e.yml b/.github/workflows/e2e.yml
index c77482c..27e280d 100644
--- a/.github/workflows/e2e.yml
+++ b/.github/workflows/e2e.yml
@@ -10,8 +10,13 @@ on:
   workflow_dispatch:
 
 concurrency:
-  group: e2e-${{ github.ref }}
-  cancel-in-progress: true
+  # Repo-wide mutex: every E2E run shares one Supabase project, and concurrent
+  # runs race each other's beforeAll cleanupOldMessages hooks (run 25769955636
+  # on 2026-05-13 was cancelled at 60 min after PR #88's run, which started
+  # 3 min later, wiped its data). Serialize across all refs; do not cancel a
+  # running job to make room for a queued one.
+  group: e2e-supabase-${{ github.repository }}
+  cancel-in-progress: false
 
 env:
   NEXT_PUBLIC_SUPABASE_URL: ${{ vars.NEXT_PUBLIC_SUPABASE_URL }}

From b839cd4d98b9285056ee29156d03520ec14a65a9 Mon Sep 17 00:00:00 2001
From: TurtleWolfe
Date: Wed, 13 May 2026 04:18:28 +0000
Subject: [PATCH 2/2] fix(e2e): dispatch scroll event after programmatic
 scrollTop in messaging tests
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The 4 messaging E2E specs that set `el.scrollTop = N` programmatically
were relying on the browser to auto-fire a `scroll` event so the React
component's handleScroll listener could update state. Chromium and Firefox
do this reliably; WebKit does not. The result: tests that depend on the
listener firing (jump-to-bottom button visibility, auto-scroll behavior)
pass on Chromium / Firefox and fail intermittently on WebKit.

This was surfaced by run 25774750382 (PR #89), where `webkit-msg` failed
on `messaging-scroll.spec.ts:261` "T007-T008: Jump button appears when
scrolled and does not overlap input." The test set scrollTop, the React
listener at src/components/molecular/MessageThread/MessageThread.tsx:194-212
never ran, showScrollButton stayed false, and the assertion on
`[data-testid="jump-to-bottom"]` timed out. All 3 retries failed
identically.

The same pattern existed at three other test sites, where the bug was
hidden behind either a flaky retry (PR #88's webkit-msg passed with a
"1 flaky" tag on a different test) or an assertion that did not depend on
the listener firing immediately.
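
A reusable shape for the fix (sketch only: no scrollAndNotify helper
exists in the repo; the actual change inlines the dispatch at each of the
4 sites, as the diff below shows):

    import { Locator } from '@playwright/test';

    // Set scrollTop, then fire the scroll event explicitly, because WebKit
    // does not reliably auto-fire it for programmatic assignments.
    async function scrollAndNotify(locator: Locator, top: number): Promise<void> {
      await locator.evaluate((el, t) => {
        el.scrollTop = t;
        el.dispatchEvent(new Event('scroll', { bubbles: true }));
      }, top);
    }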

Fixing all 4 sites to prevent future round-N rediscovery:

- tests/e2e/messaging/messaging-scroll.spec.ts:237 (test setup scroll-to-top)
- tests/e2e/messaging/messaging-scroll.spec.ts:276 (THE failing test T007-T008)
- tests/e2e/messaging/messaging-scroll.spec.ts:322 (test setup scroll-to-top)
- tests/e2e/messaging/encrypted-messaging.spec.ts:665 (test setup scroll-to-top)

The change is additive — the same scrollTop assignment, plus an explicit
`el.dispatchEvent(new Event('scroll', { bubbles: true }))`. handleScroll
is idempotent (it just reads the current scroll position and sets state),
so the event firing twice in browsers that auto-fire it is harmless.

Rationale for bundling with the concurrency mutex fix in this PR:

- Both changes are E2E reliability fixes.
- The WebKit failure surfaced *because* the concurrency mutex removed the
  60-min cancellation bottleneck, letting webkit-msg run to completion and
  reveal pre-existing flake that the prior cancellation had been masking.
- Splitting into two PRs would require coordinating their merge order and
  re-running CI twice; bundled, it is one merge cycle.

Verification:

- All 4 sites now dispatch the scroll event.
- Pre-push hooks (lint, type-check, build) are green.
- The CI re-run will confirm that webkit-msg passes.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 tests/e2e/messaging/encrypted-messaging.spec.ts | 1 +
 tests/e2e/messaging/messaging-scroll.spec.ts    | 9 ++++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/tests/e2e/messaging/encrypted-messaging.spec.ts b/tests/e2e/messaging/encrypted-messaging.spec.ts
index adba5ea..97777f5 100644
--- a/tests/e2e/messaging/encrypted-messaging.spec.ts
+++ b/tests/e2e/messaging/encrypted-messaging.spec.ts
@@ -663,6 +663,7 @@ test.describe('Encrypted Messaging Flow', () => {
       .first()
       .evaluate((el) => {
         el.scrollTop = 0;
+        el.dispatchEvent(new Event('scroll', { bubbles: true }));
       });
 
     // Look for "Load More" button
diff --git a/tests/e2e/messaging/messaging-scroll.spec.ts b/tests/e2e/messaging/messaging-scroll.spec.ts
index d2e1791..5d77bd5 100644
--- a/tests/e2e/messaging/messaging-scroll.spec.ts
+++ b/tests/e2e/messaging/messaging-scroll.spec.ts
@@ -232,9 +232,14 @@ test.describe('Messaging Scroll - User Story 2: Scroll Through Messages', () =>
     );
     const initialInputBox = await messageInput.boundingBox();
 
-    // Scroll up in the message thread
+    // Scroll up in the message thread.
+    // WebKit does not always fire the scroll event for programmatic scrollTop
+    // assignments. Dispatch it explicitly so React listeners (e.g.,
+    // MessageThread's handleScroll at MessageThread.tsx:194) run reliably
+    // across browsers.
     await messageThread.evaluate((el) => {
       el.scrollTop = 0;
+      el.dispatchEvent(new Event('scroll', { bubbles: true }));
     });
 
     // Wait for scroll to complete
@@ -274,6 +279,7 @@ test.describe('Messaging Scroll - User Story 3: Jump to Bottom Button', () => {
     // Scroll up more than 500px to trigger button
     await messageThread.evaluate((el) => {
       el.scrollTop = Math.max(0, el.scrollHeight - el.clientHeight - 600);
+      el.dispatchEvent(new Event('scroll', { bubbles: true }));
     });
 
     await waitForUIStability(page);
@@ -320,6 +326,7 @@ test.describe('Messaging Scroll - User Story 3: Jump to Bottom Button', () => {
     // Scroll up to trigger button
     await messageThread.evaluate((el) => {
       el.scrollTop = 0;
+      el.dispatchEvent(new Event('scroll', { bubbles: true }));
     });
 
     await waitForUIStability(page);