Skip to content

fix: share SSE connections across tabs, preserve login redirect URL (#153, #157)#159

Open
myakove wants to merge 10 commits into
mainfrom
fix/issue-153-157-shared-sse
Open

fix: share SSE connections across tabs, preserve login redirect URL (#153, #157)#159
myakove wants to merge 10 commits into
mainfrom
fix/issue-153-157-shared-sse

Conversation

@myakove

@myakove myakove commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Fix two browser UX issues:

  1. Shared SSE connections (fix: share SSE connections across tabs to avoid browser connection limit #153): Browsers enforce a 6 concurrent HTTP/1.1 connection limit per domain. Each tab opened 2-4 persistent SSE connections, causing the app to hang with 3+ tabs. Now uses a BroadcastChannel-based useSharedSSE hook — one tab owns the EventSource connections and relays events to all other tabs. If the leader tab closes, a follower takes over.

  2. Preserve login redirect URL (Feedback: Clicking a link to a specific analysis opens the general dashboard #157): Deep links (e.g., /results/abc123) were lost during login redirect. ProtectedRoute now passes the original location in state, and RegisterPage navigates back to it after login.

Closes #153
Closes #157

Changes

  • New frontend/src/lib/useSharedSSE.ts — BroadcastChannel leader-election hook with exponential backoff reconnect
  • Replace all 7 direct EventSource usages (Layout, Dashboard, Status, Report, ChatUI, TokenUsage, ServerSettings)
  • Clean up dead code: sessionKey/sseReconnectCount in ChatUI
  • ProtectedRoute.tsx: pass state={{ from: location }} to /login
  • RegisterPage.tsx: navigate to state.from.pathname after login (3 navigation points)
  • All 251 frontend tests pass

@myakove-bot

Copy link
Copy Markdown
Collaborator

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: Disabled for this repository
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message
  • /security-override - Set security check runs to pass (maintainers only)
  • /security-override cancel - Re-run security checks

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest python-module-install - Test Python package installation
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3
  • /cherry-pick-retry <branch> - Retry a failed cherry-pick (merged PRs only)

Branch Management

  • /rebase - Rebase this PR branch onto its base branch

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. Status Checks: All required status checks must pass
  3. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  4. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove

Reviewers:

  • myakove
  • rnetser
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge
AI Features
  • Conventional Title: Mode: fix (claude/claude-opus-4-6-1m)
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6-1m)
Security Checks
  • Suspicious Path Detection: Monitors paths: .claude/, .vscode/, .cursor/, .devcontainer/, .pi/, .github/workflows/, .github/actions/
  • Committer Identity Check: Verifies last committer matches PR author
  • Mandatory: Security checks block merge (use /security-override to bypass — maintainers only)

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Fix shared SSE across tabs and preserve post-login redirect

🐞 Bug fix 🕐 40+ Minutes

Grey Divider

AI Description

• Share SSE streams across tabs to avoid browser per-domain connection limits.
• Replace per-page EventSource usage with a BroadcastChannel-based useSharedSSE hook.
• Preserve deep-link location through /login and redirect back after authentication.
Diagram

graph TD
  ui["Layout/Pages"] --> shared["useSharedSSE"] --> bc["BroadcastChannel"] --> leader["Leader tab EventSource"] --> api[("Backend SSE endpoints")]
  prot["ProtectedRoute"] --> login["/login (state.from)"] --> reg["RegisterPage"] --> back["Navigate returnTo"]
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Switch to WebSockets (single multiplexed connection)
  • ➕ Avoids SSE connection limits entirely with one persistent socket
  • ➕ Bidirectional messaging enables richer realtime features later
  • ➖ Requires backend protocol + auth/session handling changes
  • ➖ Higher implementation and operational complexity than SSE reuse
2. Use a SharedWorker to own EventSource
  • ➕ Designed specifically for cross-tab shared resources
  • ➕ Centralizes ownership without leader-election logic
  • ➖ SharedWorker support/behavior varies and can be tricky with bundlers
  • ➖ More complicated lifecycle/debugging than BroadcastChannel
3. Server-side multiplexing endpoint (one stream, many event types)
  • ➕ Reduces client connections even without cross-tab sharing
  • ➕ Simplifies client subscription surface area
  • ➖ Still opens one stream per tab unless combined with sharing
  • ➖ Requires backend API changes and coordination across all producers

Recommendation: The BroadcastChannel-based approach is a good fit: it fixes the immediate browser connection-limit issue without requiring backend changes, and it degrades cleanly to direct EventSource when BroadcastChannel is unavailable. Keep an eye on edge cases (rapid tab open/close, leader churn) and consider WebSockets only if future features need bidirectional realtime.

Files changed (10) +363 / -150

Enhancement (1) +232 / -0
useSharedSSE.tsAdd BroadcastChannel-based shared SSE hook with leader election +232/-0

Add BroadcastChannel-based shared SSE hook with leader election

• Introduces useSharedSSE to ensure only one tab holds an EventSource per URL, forwarding events to other tabs through BroadcastChannel. Includes leader election, leader handoff on tab close, and exponential backoff reconnect when EventSource reaches CLOSED state, with a direct-EventSource fallback for environments without BroadcastChannel.

frontend/src/lib/useSharedSSE.ts

Bug fix (5) +79 / -87
Layout.tsxUse shared SSE for navbar badge counts +16/-22

Use shared SSE for navbar badge counts

• Replaces a per-tab EventSource to /api/navbar/stream with useSharedSSE. Memoizes handlers for active-count/unread-count events to update badge state without exceeding browser connection limits.

frontend/src/components/layout/Layout.tsx

ChatUI.tsxShare chat SSE stream and remove manual reconnect state +27/-44

Share chat SSE stream and remove manual reconnect state

• Moves chat streaming updates to useSharedSSE, relaying chat-changed events across tabs and using an onOpen callback to resync messages. Removes sessionKey and custom CLOSED-state reconnect/backoff logic now handled by the shared hook.

frontend/src/components/shared/ChatUI.tsx

ProtectedRoute.tsxPreserve original route in login redirect +3/-2

Preserve original route in login redirect

• Captures the current location and passes it via Navigate state when redirecting unauthenticated users to /login. Enables post-login navigation back to deep links.

frontend/src/components/shared/ProtectedRoute.tsx

RegisterPage.tsxRedirect back to deep link after login/register +7/-5

Redirect back to deep link after login/register

• Reads location.state.from.pathname (fallback '/') and navigates to it after auth completes. Applies the same returnTo navigation on initial authenticated load, successful login, and post-registration refresh.

frontend/src/pages/RegisterPage.tsx

StatusPage.tsxUse shared SSE for status updates with explicit disconnect +26/-14

Use shared SSE for status updates with explicit disconnect

• Moves status-changed SSE listening to useSharedSSE and controls connection via an sseUrl state that is cleared on terminal transitions and cleanup. Uses a ref to expose the latest fetchStatus function to the SSE callback without recreating subscriptions.

frontend/src/pages/StatusPage.tsx

Refactor (4) +52 / -63
DashboardPage.tsxReplace dashboard EventSource with shared SSE subscription +9/-13

Replace dashboard EventSource with shared SSE subscription

• Switches the dashboard-changed listener on /api/dashboard/stream to useSharedSSE. Keeps the callback stable via a memoized events map that triggers job list refresh through a ref.

frontend/src/pages/DashboardPage.tsx

ReportPage.tsxShare report SSE streams for comments and status updates +22/-28

Share report SSE streams for comments and status updates

• Converts both comments-changed and status-changed EventSource streams to useSharedSSE, using refs to call the latest fetchComments implementation safely. Maintains existing behavior of deferring comment refresh while drafts exist and best-effort result refresh on status updates.

frontend/src/pages/ReportPage.tsx

ServerSettingsPage.tsxUse shared SSE for admin settings updates +11/-10

Use shared SSE for admin settings updates

• Replaces /api/admin/settings/stream EventSource with useSharedSSE and a memoized settings-changed handler. Preserves existing behavior of refetching settings on server push.

frontend/src/pages/ServerSettingsPage.tsx

TokenUsagePage.tsxReplace token usage EventSource with shared SSE subscription +10/-12

Replace token usage EventSource with shared SSE subscription

• Switches /api/admin/token-usage/stream listening to useSharedSSE and triggers both summary and breakdown refetch on usage-changed. Keeps callbacks stable with a memoized events map.

frontend/src/pages/TokenUsagePage.tsx

@qodo-code-review

qodo-code-review Bot commented Jul 1, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 📜 Skill insights (0)

Context used
✅ Compliance rules (platform): 40 rules

Grey Divider


Action required

1. SSE header injection ✓ Resolved 🐞 Bug ⛨ Security
Description
/api/stream accepts topics from the query string and uses job_id to build the SSE event:
name (results:{job_id}:...) with only a length check. A crafted job_id containing control
characters (e.g. \n/\r) can break SSE framing and inject extra SSE fields/events into the
response stream.
Code

src/rootcoz/main.py[R6520-6527]

+        elif topic.startswith("results:"):
+            job_id = topic[len("results:") :]
+            if not job_id or len(job_id) > 64:
+                continue
+            ev = asyncio.Event()
+            registrations.append(
+                (f"results:{job_id}", ev, None, _job_status_listeners, job_id)
+            )
Relevance

⭐⭐⭐ High

Team previously accepted control-character sanitization to prevent injection/log issues; likely to
sanitize SSE event names too.

PR-#92
PR-#144

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The endpoint builds topic_list directly from the query string, accepts job_id with only a length
check, and later emits prefix inside the SSE event: header without escaping/sanitization.

src/rootcoz/main.py[6493-6497]
src/rootcoz/main.py[6520-6544]
src/rootcoz/main.py[6683-6694]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The multiplexed SSE endpoint interpolates user-controlled topic components (notably `job_id`) directly into the SSE `event:` header. Without character validation, control characters can corrupt the SSE protocol stream.

### Issue Context
Topics are parsed from the `topics` query parameter; for `results:`, `comments:`, and `chat:` the suffix is treated as `job_id` and used to form `prefix`, which is later emitted in `yield f"event: {prefix}:{event_name}\n..."`.

### Fix Focus Areas
- src/rootcoz/main.py[6493-6497]
- src/rootcoz/main.py[6520-6544]

### Implementation notes
- Reject any topic containing `\r` or `\n` early.
- For `job_id`, enforce a strict allowlist regex (e.g. `^[A-Za-z0-9_-]{1,64}$`) or parse/validate as UUID if that’s the intended format.
- Apply the same validation to `results:`, `comments:`, and `chat:` branches.
- Consider returning 400 for invalid topics instead of silently `continue`, to avoid ambiguous client behavior.
- Add a unit/integration test that a `topics` value containing `%0A` is rejected.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. /api/stream missing auth check ✓ Resolved 📘 Rule violation ⛨ Security
Description
The new multiplexed SSE endpoint /api/stream does not require authentication for non-admin topics
(e.g., dashboard, results:{job_id}), allowing unauthenticated clients to subscribe when the
allow-list is empty. This violates the requirement that all non-public API endpoints require
authentication.
Code

src/rootcoz/main.py[R6465-6544]

+@app.get("/api/stream")
+async def stream_multiplexed(
+    request: Request,
+    topics: str = Query("", description="Comma-separated topic subscriptions"),
+) -> StreamingResponse:
+    """Multiplexed SSE endpoint — one connection for all event topics.
+
+    Accepts a comma-separated ``topics`` query parameter.  Each topic
+    registers the connection into the matching listener set so that a
+    single SSE connection can receive events from multiple sources.
+
+    Supported topics:
+
+    - ``navbar`` — active analysis count + unread mention count
+    - ``dashboard`` — job list changes
+    - ``results:{job_id}`` — per-job status changes
+    - ``comments:{job_id}`` — per-job comment changes
+    - ``chat:{job_id}`` — per-job chat message changes
+    - ``token-usage`` — token usage changes (admin only)
+    - ``settings`` — server settings changes (admin only)
+    """
+    _check_allow_list(request)
+    username = getattr(request.state, "username", "")
+    is_admin = getattr(request.state, "is_admin", False)
+
+    topic_list = [t.strip() for t in topics.split(",") if t.strip()]
+    if not topic_list:
+        raise HTTPException(status_code=400, detail="No topics specified")
+
+    # Validate topics and build registration plan
+    # Each entry: (event_prefix, asyncio.Event, listener_set_or_dict, key_or_none)
+    registrations: list[
+        tuple[
+            str,
+            asyncio.Event,
+            set[asyncio.Event] | None,
+            dict[str, set[asyncio.Event]] | None,
+            str,
+        ]
+    ] = []
+    navbar_requested = False
+
+    for topic in topic_list:
+        if topic == "navbar":
+            navbar_requested = True
+            # navbar has custom handling (active + mention events)
+            continue
+        elif topic == "dashboard":
+            ev = asyncio.Event()
+            registrations.append(("dashboard", ev, _dashboard_listeners, None, ""))
+        elif topic.startswith("results:"):
+            job_id = topic[len("results:") :]
+            if not job_id:
+                continue
+            ev = asyncio.Event()
+            registrations.append(
+                (f"results:{job_id}", ev, None, _job_status_listeners, job_id)
+            )
+        elif topic.startswith("comments:"):
+            job_id = topic[len("comments:") :]
+            if not job_id:
+                continue
+            ev = asyncio.Event()
+            registrations.append(
+                (f"comments:{job_id}", ev, None, _comment_listeners, job_id)
+            )
+        elif topic.startswith("chat:"):
+            job_id = topic[len("chat:") :]
+            if not job_id:
+                continue
+            listener_key = f"{job_id}:{username}" if username else job_id
+            ev = asyncio.Event()
+            registrations.append(
+                (f"chat:{job_id}", ev, None, _chat_listeners, listener_key)
+            )
+        elif topic == "token-usage":
+            if not is_admin:
+                continue  # silently skip admin-only topics for non-admins
+            ev = asyncio.Event()
+            registrations.append(("token-usage", ev, _token_usage_listeners, None, ""))
Relevance

⭐⭐⭐ High

Team repeatedly accepted tightening auth/role gates on endpoints (e.g., PR33 auth-required, PR104
require username).

PR-#33
PR-#104

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The handler for /api/stream only calls _check_allow_list(request) and then proceeds to register
non-admin topics without verifying request.state.username (authentication). /api/stream is not
one of the explicitly allowed unauthenticated paths, so it must require auth.

Rule 805015: Require authentication for all non-public API endpoints
src/rootcoz/main.py[6465-6544]
src/rootcoz/main.py[7362-7379]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GET /api/stream` currently only calls `_check_allow_list(request)` and then accepts non-admin topics even when `request.state.username` is empty. Per policy, any endpoint not in the explicit public allowlist must enforce authentication.

## Issue Context
Even if other SSE endpoints exist, this PR introduces a new endpoint and it must comply with the rule for non-public API authentication.

## Fix Focus Areas
- src/rootcoz/main.py[6465-6544]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. currentLeaderId is unused ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
useSharedSSE declares currentLeaderId and assigns to it in multiple branches but never reads it,
leaving dead state in the new shared-SSE hook. With noUnusedLocals enabled this becomes a
build-blocking TypeScript unused-variable error even though runtime behavior is unaffected.
Code

frontend/src/lib/useSharedSSE.ts[112]

+    let currentLeaderId: string | null = null
Relevance

⭐⭐⭐ High

Team often accepts TS/quality fixes; no evidence they keep unused locals with strict TS settings.

PR-#104
PR-#145

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The cited code in frontend/src/lib/useSharedSSE.ts introduces currentLeaderId, writes to it in
several places across the hook’s control flow, but contains no corresponding reads or uses of the
value anywhere in the module, making it an unused artifact. Given the frontend configuration has
noUnusedLocals: true, this pattern is treated as a compile-time unused local error, aligning with
the compliance requirement to remove or meaningfully use unused artifacts.

Rule 804990: Remove or use all unused code artifacts
frontend/src/lib/useSharedSSE.ts[112-112]
frontend/src/lib/useSharedSSE.ts[165-207]
frontend/src/lib/useSharedSSE.ts[111-117]
frontend/src/lib/useSharedSSE.ts[162-168]
frontend/src/lib/useSharedSSE.ts[193-207]
frontend/tsconfig.app.json[19-23]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`currentLeaderId` is declared in `useSharedSSE` and is only written to (assigned in multiple branches) but never read, making it dead state that can trigger TypeScript/ESLint unused-variable rules and fail builds when `noUnusedLocals` is enabled.

## Issue Context
This PR adds a shared SSE hook intended for broad reuse, so keeping unused state increases maintenance burden and can introduce build-breaking lint/TypeScript errors. The variable looks like a leftover from an earlier leader-tracking approach, but the current logic does not consult it.

## Fix Focus Areas
- frontend/src/lib/useSharedSSE.ts[112-207]
- frontend/src/lib/useSharedSSE.ts[112-113]
- frontend/src/lib/useSharedSSE.ts[162-168]
- frontend/src/lib/useSharedSSE.ts[193-207]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (5)
4. SSE manager not provided ✓ Resolved 🐞 Bug ≡ Correctness
Description
SSEProvider creates the SSE manager inside a useEffect and stores it only in a ref, but the
context value is computed during render, so consumers can remain stuck with null and useSSE
never subscribes after login. This breaks real-time updates across the app even though
authentication succeeded.
Code

frontend/src/lib/SSEProvider.tsx[R591-602]

+    if (authenticated) {
+      if (!managerRef.current) {
+        managerRef.current =
+          typeof BroadcastChannel !== 'undefined'
+            ? createSSEManager()
+            : createFallbackManager()
+      }
+    } else {
+      managerRef.current?.destroy()
+      managerRef.current = null
+    }
+
Relevance

⭐⭐ Medium

No historical evidence on React context-from-ref rerender bug; SSEProvider is new file in this PR.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The provider only mutates managerRef.current in an effect and never triggers a rerender, while
useSSE explicitly no-ops when the context manager is null, so subscriptions can be skipped
indefinitely.

frontend/src/lib/SSEProvider.tsx[590-612]
frontend/src/lib/SSEProvider.tsx[657-679]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`SSEProvider` initializes `managerRef.current` inside a `useEffect`, but React does not rerender when a ref changes. Since the context `value` is derived during render, children can see a permanently `null` manager and all `useSSE(...)` subscriptions will be skipped.

### Issue Context
The multiplexed SSE architecture depends on `useSSE` receiving a non-null manager from context to call `manager.subscribe(...)`.

### Fix Focus Areas
- frontend/src/lib/SSEProvider.tsx[586-612]

### Implementation notes
- Replace the ref-only propagation with state, e.g. `const [manager, setManager] = useState<SSEManager | null>(null)`.
- In the `[authenticated]` effect, create the manager and `setManager(createdManager)`; on logout/unmount, call `destroy()` and `setManager(null)`.
- Provide `<SSEContext.Provider value={authenticated ? manager : null}>...` so consumers rerender when the manager becomes available.
- Ensure cleanup doesn’t double-destroy and that state updates are guarded on unmount if needed.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Logout leader stalls SSE ✓ Resolved 🐞 Bug ☼ Reliability
Description
SSEProvider keeps the leader-election manager alive when authenticated becomes false, so a
logged-out/expired tab can remain leader and keep sending heartbeats while failing to open
/api/stream, breaking real-time updates for other tabs. Because followers see the leader
heartbeat, they won’t take over leadership until that tab closes/unmounts.
Code

frontend/src/lib/SSEProvider.tsx[R582-606]

+export function SSEProvider({ children }: { children: ReactNode }) {
+  const { authenticated } = useAuth()
+  const managerRef = useRef<(SSEManager & { destroy(): void }) | null>(null)
+
+  // Create manager once on mount, destroy on unmount
+  if (!managerRef.current) {
+    managerRef.current =
+      typeof BroadcastChannel !== 'undefined'
+        ? createSSEManager()
+        : createFallbackManager()
+  }
+
+  useEffect(() => {
+    return () => {
+      managerRef.current?.destroy()
+      managerRef.current = null
+    }
+  }, [])
+
+  // If not authenticated, don't provide the manager — no SSE connections
+  // will be made. The context value being null means useSSE is a no-op.
+  const value = authenticated ? managerRef.current : null
+
+  return <SSEContext.Provider value={value}>{children}</SSEContext.Provider>
+}
Relevance

⭐⭐ Medium

No historical evidence on SSE manager teardown on auth/logout changes; new file has no prior review
patterns.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Auth can flip authenticated to false (logout/401), but SSEProvider only nulls the context value
instead of destroying the manager; the manager can remain leader, continue heartbeats, and reconnect
attempts. Since /api/stream returns 401 without request.state.username, a logged-out leader will
fail indefinitely while followers won’t attempt takeover because they still receive leader
heartbeats.

frontend/src/lib/SSEProvider.tsx[582-606]
frontend/src/lib/SSEProvider.tsx[271-307]
frontend/src/lib/SSEProvider.tsx[404-409]
frontend/src/lib/auth.tsx[39-45]
frontend/src/lib/auth.tsx[98-109]
src/rootcoz/main.py[6487-6491]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
When `authenticated` becomes false (logout or `/api/auth/me` returns 401), `SSEProvider` stops exposing the manager via context, but the manager instance continues running leader election/heartbeats and can still hold the `navigator.locks` leadership lock. If that tab is the leader, it will keep trying to connect to `/api/stream` without auth (401), while other tabs will not become leader due to ongoing heartbeats.

### Issue Context
- Auth transitions to unauthenticated happen on logout and on 401 from `/api/auth/me`.
- `/api/stream` explicitly requires `request.state.username` and returns 401 if missing.

### Fix Focus Areas
- frontend/src/lib/SSEProvider.tsx[582-606]
- frontend/src/lib/SSEProvider.tsx[271-307]
- frontend/src/lib/SSEProvider.tsx[404-409]

### Suggested fix
1. Add an effect keyed on `authenticated` that:
  - When `authenticated` becomes `false`: call `managerRef.current?.destroy()` and set `managerRef.current = null` to release the lock and stop heartbeats.
  - When `authenticated` becomes `true`: re-create the manager (similar to current creation logic) so the tab re-joins the leader election.
2. Ensure subscriptions cleanly re-register after manager recreation (existing `useSSE` effects should naturally resubscribe when context manager changes from null→manager).
3. (Optional hardening) If feasible, detect repeated `/api/stream` failures and step down/release leadership so another tab can take over.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Heartbeat causes unnecessary reconnects ✓ Resolved 🐞 Bug ☼ Reliability
Description
Follower tabs rebroadcast topics-update on every leader heartbeat, and the leader treats every
topics-update as a reason to reconnect, so the shared EventSource can be torn down/recreated
periodically even when topics didn’t change. This adds churn and increases the chance of missed
events and avoidable load, undermining the “one stable SSE connection” goal.
Code

frontend/src/lib/SSEProvider.tsx[R404-409]

+        case 'topics-update':
+          if (isLeader && m.tabId && m.tabId !== TAB_ID) {
+            tabTopics.set(m.tabId, new Set(m.topics ?? []))
+            tabEventNames.set(m.tabId, new Set(m.eventNames ?? []))
+            scheduleReconnect()
+          }
Relevance

⭐⭐ Medium

No direct history on SSEProvider; team often accepts churn/overwork avoidance fixes (e.g.,
throttling/churn fixes in PR #145).

PR-#145

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The leader sends periodic leader messages as a heartbeat; followers rebroadcast topics on each of
those messages, and the leader always schedules a reconnect on each topics-update, which leads to
reconnect churn even without real subscription changes.

frontend/src/lib/SSEProvider.tsx[271-276]
frontend/src/lib/SSEProvider.tsx[381-392]
frontend/src/lib/SSEProvider.tsx[404-409]
frontend/src/lib/SSEProvider.tsx[233-245]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The leader reconnects the multiplexed `EventSource` on every received `topics-update`, and followers send `topics-update` on every heartbeat (`leader` message). This can force periodic reconnects even when the union of topics/eventNames is unchanged.

### Issue Context
- Leader emits a `leader` heartbeat every `LEADER_HEARTBEAT_MS`.
- Followers respond to each heartbeat by calling `broadcastTopics()`.
- Leader handles `topics-update` by updating maps and calling `scheduleReconnect()`, which (when leader) calls `connectEventSource()`.

### Fix Focus Areas
- frontend/src/lib/SSEProvider.tsx[271-276]
- frontend/src/lib/SSEProvider.tsx[381-392]
- frontend/src/lib/SSEProvider.tsx[404-409]
- frontend/src/lib/SSEProvider.tsx[233-245]

### Recommended fix
Implement one (or both) of the following:
1) **Stop periodic rebroadcasts:** In the follower `case 'leader'` handler, remove `broadcastTopics()` and only reset the watchdog; rely on `topics-request` + subscription-change broadcasts.
2) **Change-detect on leader:** In `case 'topics-update'`, compute a stable signature (e.g., sorted join) of the incoming `topics` and `eventNames` per tab and only call `scheduleReconnect()` when the signature actually changes.

Either approach prevents heartbeat-driven reconnect churn while still keeping the leader’s aggregated subscriptions accurate.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


7. Broadcast uses event names ✓ Resolved 📎 Requirement gap ≡ Correctness
Description
Follower tabs broadcast full topic:eventName strings via the cross-tab topics-update message,
but the leader reuses these values directly as the /api/stream?topics= query, even though the
backend expects bare topic strings like dashboard or results:{job_id}. As a result, the leader
can’t correctly aggregate follower-only subscriptions, so SSE events may not be relayed across tabs
(sometimes masked when the leader already subscribes to the same base topic locally).
Code

frontend/src/lib/SSEProvider.tsx[R279-296]

+  /** Get local full event names (topic:eventName) for broadcasting. */
+  function getLocalEventNames(): string[] {
+    const names: string[] = []
+    for (const sub of subscriptions.values()) {
+      for (const eventName of Object.keys(sub.events)) {
+        names.push(`${sub.topic}:${eventName}`)
+      }
+    }
+    return [...new Set(names)]
+  }
+
+  function broadcastTopics() {
+    channel?.postMessage({
+      type: 'topics-update',
+      tabId: TAB_ID,
+      topics: getLocalEventNames(),
+    })
+  }
Relevance

⭐⭐ Medium

No historical evidence on cross-tab SSE topic/event encoding; SSE work exists (PR26) but not this
leader-election pattern.

PR-#26

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
In SSEProvider, broadcastTopics() sends getLocalEventNames() (full topic:eventName values)
in topics-update, and the leader unions these received values into its backend subscription list
when constructing /api/stream?topics=.... On the backend, stream_multiplexed treats each
topics entry as a topic key (parsing prefixes like results:/chat:/comments: to derive IDs),
so sending values such as dashboard:dashboard-changed or results:<id>:status-changed produces
unknown/malformed topics (including incorrect job_id extraction and double-suffixed event naming),
which the backend ignores or registers under the wrong keys; consequently, follower-tab
subscriptions do not get included in the leader’s SSE connection and their updates are missed.

Implement shared SSE connections across tabs using BroadcastChannel leader election
frontend/src/lib/SSEProvider.tsx[279-296]
src/rootcoz/main.py[6490-6540]
frontend/src/lib/SSEProvider.tsx[88-109]
frontend/src/lib/SSEProvider.tsx[142-160]
frontend/src/lib/SSEProvider.tsx[359-363]
src/rootcoz/main.py[6507-6539]
src/rootcoz/main.py[6673-6691]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Follower tabs currently broadcast full SSE event names (e.g. `topic:eventName` like `results:abc123:status-changed`) in cross-tab `topics-update`, but the leader treats these strings as backend **topics** when building `/api/stream?topics=...`. The backend only understands **bare topic** strings (e.g. `dashboard`, `results:abc123`), so the leader ends up subscribing with malformed topic keys and cannot aggregate follower-only subscriptions for cross-tab relaying.

## Issue Context
This breaks the core requirement that a single leader tab owns the `EventSource` connection and relays SSE events to follower tabs. The leader needs to keep two distinct concepts: (1) backend subscription **topics** used for `/api/stream?topics=...`, and (2) full SSE **event names** used for `EventSource.addEventListener(fullName, ...)` and for relaying, but these are currently conflated via the `tabTopics` set populated from `topics-update`.

## Fix Focus Areas
- frontend/src/lib/SSEProvider.tsx[88-109]
- frontend/src/lib/SSEProvider.tsx[97-109]
- frontend/src/lib/SSEProvider.tsx[142-160]
- frontend/src/lib/SSEProvider.tsx[279-296]
- frontend/src/lib/SSEProvider.tsx[359-363]
- src/rootcoz/main.py[6490-6540]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


8. Split-brain SSE leaders ✓ Resolved 🐞 Bug ≡ Correctness
Description
useSharedSSE can end up with multiple tabs acting as leader because receiving a 'leader' message
only cancels the election timer and never forces an already-leader tab to step down. This can
recreate multiple EventSource connections and reintroduce the browser per-domain connection-limit
hang the PR is intended to prevent.
Code

frontend/src/lib/useSharedSSE.ts[R154-189]

+    function becomeLeader() {
+      if (destroyed || isLeader) return
+      isLeader = true
+      channel.postMessage({ type: 'leader', tabId: TAB_ID })
+      createEventSource()
+    }
+
+    // ---- Election ----
+
+    function startElection() {
+      if (destroyed) return
+      channel.postMessage({ type: 'who-is-leader' })
+      // Random jitter avoids ties when multiple tabs open simultaneously
+      const jitter = Math.random() * 100
+      electionTimer = setTimeout(() => {
+        if (!destroyed) becomeLeader()
+      }, ELECTION_TIMEOUT_MS + jitter)
+    }
+
+    // ---- Channel message handler ----
+
+    channel.onmessage = (msg: MessageEvent<ChannelMessage>) => {
+      const { type, name, data, closed } = msg.data
+
+      switch (type) {
+        case 'who-is-leader':
+          if (isLeader) channel.postMessage({ type: 'leader', tabId: TAB_ID })
+          break
+
+        case 'leader':
+          // Another tab is leader — cancel our election
+          if (electionTimer) {
+            clearTimeout(electionTimer)
+            electionTimer = null
+          }
+          break
Relevance

⭐⭐ Medium

No repo history on useSharedSSE (file missing); no similar accepted/rejected split-brain election
fixes found.

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The hook unconditionally calls becomeLeader() after the timeout and, if it later receives a
leader message, it only clears the election timer and does not reconcile leadership if it is
already leader; this allows multiple leaders to persist concurrently.

frontend/src/lib/useSharedSSE.ts[154-170]
frontend/src/lib/useSharedSSE.ts[175-189]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`useSharedSSE` leader election can result in two leaders (two EventSources) when tabs start at similar times or messages are delayed, because a tab that already became leader never relinquishes leadership upon receiving another tab's `leader` announcement.

## Issue Context
This defeats the primary goal (reducing SSE connections across tabs) and can bring back the browser connection-slot exhaustion.

## Fix Focus Areas
- frontend/src/lib/useSharedSSE.ts[154-206]

## Suggested fix
- Track the current leader’s `tabId` and include it in `leader` messages.
- On receiving `leader`:
 - If we are not leader: cancel election and record leader id.
 - If we are leader and the other `tabId` wins by a deterministic rule (e.g., lexicographically smaller tabId / earlier timestamp), step down: set `isLeader=false`, close `eventSource`, clear reconnect timer, and become follower.
- Prevent overlapping elections by clearing any existing `electionTimer` before starting a new election (e.g., in `startElection()` and on `leader-down`).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

9. Unbounded topics resource risk ✓ Resolved 🐞 Bug ➹ Performance
Description
/api/stream accepts an unbounded topics list and creates per-topic asyncio.Event registrations
and per-loop wait tasks, so an authenticated client can subscribe to many job IDs and drive
CPU/memory usage proportional to topic count. This is new amplification surface compared to
single-topic SSE endpoints.
Code

src/rootcoz/main.py[R6493-6542]

+    topic_list = [t.strip() for t in topics.split(",") if t.strip()]
+    if not topic_list:
+        raise HTTPException(status_code=400, detail="No topics specified")
+
+    # Validate topics and build registration plan
+    # Each entry: (event_prefix, asyncio.Event, listener_set_or_dict, key_or_none)
+    registrations: list[
+        tuple[
+            str,
+            asyncio.Event,
+            set[asyncio.Event] | None,
+            dict[str, set[asyncio.Event]] | None,
+            str,
+        ]
+    ] = []
+    navbar_requested = False
+
+    for topic in topic_list:
+        if topic == "navbar":
+            navbar_requested = True
+            # navbar has custom handling (active + mention events)
+            continue
+        elif topic == "dashboard":
+            ev = asyncio.Event()
+            registrations.append(("dashboard", ev, _dashboard_listeners, None, ""))
+        elif topic.startswith("results:"):
+            job_id = topic[len("results:") :]
+            if not job_id:
+                continue
+            ev = asyncio.Event()
+            registrations.append(
+                (f"results:{job_id}", ev, None, _job_status_listeners, job_id)
+            )
+        elif topic.startswith("comments:"):
+            job_id = topic[len("comments:") :]
+            if not job_id:
+                continue
+            ev = asyncio.Event()
+            registrations.append(
+                (f"comments:{job_id}", ev, None, _comment_listeners, job_id)
+            )
+        elif topic.startswith("chat:"):
+            job_id = topic[len("chat:") :]
+            if not job_id:
+                continue
+            listener_key = f"{job_id}:{username}" if username else job_id
+            ev = asyncio.Event()
+            registrations.append(
+                (f"chat:{job_id}", ev, None, _chat_listeners, listener_key)
+            )
Relevance

⭐⭐⭐ High

Team often accepts adding bounds to prevent unbounded resource usage (e.g., size/output caps).

PR-#122
PR-#144

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The endpoint splits and iterates over all provided topics, allocating an asyncio.Event per valid
topic, and then rebuilds wait_tasks by creating an asyncio.create_task(ev.wait()) for each
registered event every loop iteration; this makes server work scale with the number of subscribed
topics.

src/rootcoz/main.py[6493-6543]
src/rootcoz/main.py[6612-6624]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The multiplexed SSE endpoint allows an arbitrary number of topic subscriptions in a single connection. Each valid topic allocates an `asyncio.Event`, registers it in global listener structures, and each generator loop creates a wait task per registered event.

### Issue Context
Frontend usage is small, but an authenticated client can send a very large `topics` query (e.g., many `results:{job_id}` entries) and cause significant per-connection overhead.

### Fix Focus Areas
- src/rootcoz/main.py[6493-6543]
- src/rootcoz/main.py[6612-6624]

### Suggested fix
1. Deduplicate topics early (e.g., `set(topic_list)`), preserving order if needed.
2. Enforce a maximum topic count (e.g., 20–50) and return HTTP 400/413 when exceeded.
3. Add basic validation/limits for `job_id` substrings (length cap and allowed charset) to avoid pathological keys.
4. Consider a per-user/per-connection rate limit for reconnections if available elsewhere in the codebase.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


10. Status SSE not closed ✓ Resolved 🐞 Bug ☼ Reliability
Description
StatusPage can leave the shared SSE connection open after a user abort because handleAbort() calls a
different fetchStatus() function that can enter a terminal state without clearing sseUrl. If no
further 'status-changed' event arrives, the page keeps an unnecessary SSE connection alive.
Code

frontend/src/pages/StatusPage.tsx[R99-103]

+  const [sseUrl, setSseUrl] = useState<string | null>(null)
+
  useEffect(() => {
    if (!jobId) return
Relevance

⭐⭐⭐ High

PR #26 explicitly closed SSE on terminal states in StatusPage; team likely accepts additional
cleanup to avoid lingering SSE.

PR-#26

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The jobId-scoped fetchStatus inside the effect explicitly calls setSseUrl(null) on terminal
states, but the component also defines another fetchStatus used by handleAbort() which doesn’t
clear sseUrl, so terminal-on-abort can keep SSE enabled until/unless another event occurs.

frontend/src/pages/StatusPage.tsx[99-179]
frontend/src/pages/StatusPage.tsx[195-227]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`StatusPage` now uses `sseUrl` + `useSharedSSE` for the status stream, but there are two `fetchStatus` functions: the one inside the `useEffect([jobId])` clears `sseUrl` on terminal states, while the outer `fetchStatus` called by `handleAbort()` does not. This can leave SSE connected after abort.

## Issue Context
The regression is introduced by the new `sseUrl`-driven SSE lifecycle.

## Fix Focus Areas
- frontend/src/pages/StatusPage.tsx[99-213]
- frontend/src/pages/StatusPage.tsx[215-227]

## Suggested fix
- Remove the duplicated outer `fetchStatus` and reuse the ref-based one (`statusFetchRef.current`) for `handleAbort()`.
 - OR, if keeping both functions, ensure the outer `fetchStatus` clears `sseUrl` when it transitions to terminal (`completed/failed/aborted/403/404`).
- Consider renaming one of the functions to avoid confusion and future regressions.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


11. Login return URL drops query ✓ Resolved 📎 Requirement gap ≡ Correctness
Description
RegisterPage only restores state.from.pathname after login/registration, dropping the original
search and hash components. This breaks deep links that rely on query parameters or fragments
and violates the requirement to preserve and honor the full return URL after an auth redirect.
Code

frontend/src/pages/RegisterPage.tsx[R52-55]

+  const location = useLocation()
  const { login, refreshAuth, authenticated, loading: authLoading } = useAuth()
+  const returnTo = (location.state as { from?: { pathname: string } } | null)?.from?.pathname || '/'
  const [mode, setMode] = useState<Mode>('login')
Relevance

⭐⭐⭐ High

Team has accepted URL/route correctness fixes; auth flow work in PR #33 shows redirect-state
handling is important.

PR-#33

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The redirect flow in ProtectedRoute stores the full location object in navigation state
(state={{ from: location }}), which includes pathname, search, and hash, but the post-auth
return logic in RegisterPage reads only .pathname and navigates to that value. As a result, even
though the original requested location (including query string and fragment) is available in state,
the implemented redirect discards search and hash, failing to honor the full originally
requested URL as required.

Preserve and honor return URL after login/registration redirect (ProtectedRoute and RegisterPage)
frontend/src/components/shared/ProtectedRoute.tsx[19-19]
frontend/src/pages/RegisterPage.tsx[52-55]
frontend/src/components/shared/ProtectedRoute.tsx[11-20]
frontend/src/pages/RegisterPage.tsx[50-73]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`ProtectedRoute` redirects unauthenticated users to `/login` while saving the full `location` in navigation state, but `RegisterPage` reconstructs the return destination using only `state.from.pathname`, which drops the original URL’s `search` and `hash` components.

## Issue Context
The compliance requirement expects the app to preserve and honor the originally requested URL after login/registration, including any query parameters and fragments used by deep links. The current behavior is a partial regression of the “preserve return URL” flow: the full location is stored, but only the pathname is restored.

## Fix Focus Areas
- frontend/src/components/shared/ProtectedRoute.tsx[13-20]
- frontend/src/components/shared/ProtectedRoute.tsx[19-19]
- frontend/src/pages/RegisterPage.tsx[50-73]
- frontend/src/pages/RegisterPage.tsx[52-55]

## Implementation notes
- Restore the return URL using `pathname`, `search`, and `hash` (or the full Location) rather than only `pathname`.
- Options:
 - Keep passing the full `location` object and, in `RegisterPage`, build the destination with `from.pathname + from.search + from.hash` (or navigate with `{ pathname, search, hash }`).
 - Alternatively, pass a string from `ProtectedRoute` like `from: location.pathname + location.search + location.hash` and use it directly in `RegisterPage`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (2)
12. Side effects in setState ✓ Resolved 🐞 Bug ☼ Reliability
Description
SSEProvider calls prev?.destroy() inside a setManager functional updater, which is a
side-effectful state update and can be invoked more than once under React StrictMode/concurrent
replay, leading to unexpected repeated teardown work. This makes the manager lifecycle harder to
reason about and risks duplicate lock/channel cleanup work on auth transitions.
Code

frontend/src/lib/SSEProvider.tsx[600]

+      setManager(prev => { prev?.destroy(); return null })
Relevance

⭐⭐ Medium

No direct precedent on setState side-effects; team accepts cleanup/lifecycle fixes elsewhere
(Sidebar cleanup accepted in #145/#103).

PR-#145
PR-#103

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The unauthenticated branch embeds a destroy() call inside the state updater callback, making the
state update impure and potentially repeatable/replayed by React, while the authenticated branch
already provides a cleanup that destroys the manager instance it created.

frontend/src/lib/SSEProvider.tsx[582-606]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`setManager(prev => { prev?.destroy(); return null })` performs side effects inside a React state updater. Updater functions are expected to be pure and may be re-invoked (e.g., StrictMode), which can cause repeated teardown work and brittle lifecycle semantics.

### Issue Context
The previous authenticated-true effect already returns a cleanup that destroys the created manager (`return () => { m.destroy() }`). On logout/auth-loss, that cleanup runs, so the unauthenticated branch can safely just clear state without embedding side effects into the state update.

### Fix Focus Areas
- frontend/src/lib/SSEProvider.tsx[592-602]

### Suggested fix
Refactor the `useEffect` to avoid side effects in the `setManager` updater, e.g.:
- Remove `prev?.destroy()` from the `setManager` functional update.
- Either rely on the prior effect cleanup for destruction and do `setManager(null)` in the unauthenticated branch, or explicitly call `manager?.destroy()` outside of `setManager()` before setting null (but not inside the updater).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


13. Settings listener cleanup fragile ✓ Resolved 🐞 Bug ☼ Reliability
Description
/api/stream registers settings listeners outside the streaming generator (unlike other topics),
making cleanup depend on the generator running its finally block and adding an inconsistent
special case. This increases the chance of stale listeners remaining registered in abnormal teardown
paths and makes the code harder to reason about.
Code

src/rootcoz/main.py[R6548-6554]

+        elif topic == "settings":
+            if not is_admin:
+                continue
+            ev = asyncio.Event()
+            registrations.append(("settings", ev, None, None, ""))
+            # settings uses a module-level set
+            _settings_listeners.add(ev)
Relevance

⭐⭐ Medium

Cleanup/lifecycle hardening is sometimes accepted, but no precedent for SSE listener registration
consistency in main.py.

PR-#122
PR-#145

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
The settings event is added to _settings_listeners during topic parsing instead of in the
generator’s registration loop, and then cleaned up via a separate special-case loop in finally,
indicating a less robust and less consistent lifecycle than other topics.

src/rootcoz/main.py[6548-6554]
src/rootcoz/main.py[6570-6577]
src/rootcoz/main.py[6699-6714]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
In `stream_multiplexed`, the `settings` topic calls `_settings_listeners.add(ev)` during request setup, outside `event_generator()`’s normal registration loop, and then relies on a separate cleanup loop in `finally`. This special case makes the registration/cleanup path less robust than other topics.

### Issue Context
Other topics add their `asyncio.Event` to listener sets/dicts inside `event_generator()`’s registration block, so registration and cleanup are tightly paired.

### Fix Focus Areas
- src/rootcoz/main.py[6548-6554]
- src/rootcoz/main.py[6570-6577]
- src/rootcoz/main.py[6699-6714]

### Suggested fix
- Treat `settings` like other global topics:
 1. Remove the early `_settings_listeners.add(ev)`.
 2. Add `settings` to `registrations` with `global_set=_settings_listeners` (or handle it in the existing registration loop).
 3. Remove the special-case cleanup loop for settings, letting the normal cleanup handle it.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread frontend/src/pages/RegisterPage.tsx
Comment thread frontend/src/lib/useSharedSSE.ts Outdated
Comment thread frontend/src/pages/StatusPage.tsx Outdated
Comment thread frontend/src/lib/useSharedSSE.ts Outdated
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 19233f4

@rnetser

rnetser commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

#153 - not fixed; check logs or simulate - when opmning multiple browser tabs, some get stuck in loading

Comment thread frontend/src/lib/SSEProvider.tsx
Comment thread src/rootcoz/main.py
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 09d7aad

@myakove

myakove commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

@rnetser The SSE connection issue has been redesigned. Instead of per-URL connection sharing (which didn't help when different tabs view different jobs), we now use a single multiplexed SSE endpoint (/api/stream) that routes all event topics through one connection per tab. This reduces total SSE connections from N×M (tabs × streams) to exactly 1 regardless of how many tabs or pages are open.

The dev server has been redeployed with this fix — please test with multiple browser tabs and let us know if the loading issue is resolved.

@myakove

myakove commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

Investigated the multiple-tabs-stuck-in-loading issue. Found and fixed three root causes in SSEProvider.tsx (commit b55c922):

1. Split-brain leader election → multiple EventSources
The BroadcastChannel timeout-based election (150ms + jitter) was prone to split-brain when multiple tabs opened simultaneously — both could declare themselves leader and each open their own EventSource, exhausting the browser's 6-connection-per-domain limit. Replaced with navigator.locks API which provides guaranteed single-holder semantics with automatic release on tab close (even crash).

2. Browser auto-reconnect creating duplicate connections
The onerror handler only reconnected when readyState === CLOSED, but ignored readyState === CONNECTING (transient errors). The browser's built-in auto-reconnect would fire in parallel with any pending manual reconnect, opening duplicate connections. Now explicitly close the EventSource on any error and reconnect manually with exponential backoff.

3. No leader liveness detection
If the leader tab closed without delivering leader-down via BroadcastChannel (which is unreliable during beforeunload), follower tabs would stay as followers forever — never opening their own EventSource, never receiving SSE events. Added a heartbeat (5s interval) + watchdog (12s timeout) so followers detect a dead leader and re-elect.

All 251 frontend tests pass. Please re-test with 3+ tabs.

Comment thread frontend/src/lib/SSEProvider.tsx
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit b55c922

@myakove

myakove commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

Investigated the issue — the root cause was that the deployed container image was built before the navigator.locks fix (commit b55c922) was merged. The old code used BroadcastChannel-only leader election which had race conditions, and critically, the EventSource.onerror handler didn't close the connection on error — allowing the browser's built-in auto-reconnect to create duplicate connections that exhausted the 6-connection pool.

The fix (already in the codebase) uses navigator.locks for deterministic leader election, adds heartbeat/watchdog for leader liveness, and explicitly closes + nulls the EventSource on error to prevent duplicate connections.

Redeployed the dev server from staging (commit 406e1f5) which includes the full fix. Verified the new JS bundle contains navigator.locks and rootcoz-sse-leader. Please re-test with multiple tabs.

@rnetser

rnetser commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

#153 looks like it is fixed

myakove added 4 commits July 2, 2026 15:02
… exhaustion (#153)

Three fixes for tabs stuck in loading when opening multiple browser tabs:

1. navigator.locks-based leader election: replaces the BroadcastChannel
   timeout-based election which was prone to split-brain. The browser
   guarantees only one tab holds the lock, and auto-releases on close.

2. Explicit EventSource close on error: prevents browser auto-reconnect
   from opening duplicate connections that exhaust the 6-per-domain limit.

3. Leader heartbeat + follower watchdog: detects unresponsive leaders
   and triggers re-election after 12s timeout.

All 251 frontend tests pass.
… connection pool exhaustion (#153)""

This reverts commit c516667.
@myakove myakove force-pushed the fix/issue-153-157-shared-sse branch from c516667 to 3645e95 Compare July 2, 2026 12:06
Comment thread frontend/src/lib/SSEProvider.tsx
Comment thread src/rootcoz/main.py Outdated
Comment thread src/rootcoz/main.py Outdated
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 3645e95

@yossisegev

Copy link
Copy Markdown

I have just tried accessing job 520866e5-b35e-492e-9643-2ed96af9fadd in the dev env, and the issue still happens, as I described it in #157 - when clicking the job link, a new tab is opened (as expected), but not to this job but rather to the parent dev RootCoz dashboard.
In order to open the link you need to copy it, open a new tab/window and paste the link in the address bar.

Comment thread frontend/src/lib/SSEProvider.tsx Outdated
Comment thread src/rootcoz/main.py
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit cdb76af

Comment thread frontend/src/lib/SSEProvider.tsx Outdated
@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 28ffd5c

@qodo-code-review

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit d48f3f2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feedback: Clicking a link to a specific analysis opens the general dashboard fix: share SSE connections across tabs to avoid browser connection limit

4 participants