refactor(api): Clean Architecture — migrate 5 POST routes to SimulationService by showjihyun · Pull Request #10 · showjihyun/Prophet

showjihyun · 2026-04-13T12:30:43Z

Summary

Extends the Round 8-8 service-layer migration (SPEC 20_CLEAN_ARCHITECTURE_SPEC.md §3.1) to five more mutation POST endpoints in backend/app/api/simulations.py. These routes previously depended on the raw SimulationOrchestrator only for a state-fetch + 404 gate, so switching them to _svc_state_or_404 lets us drop the orchestrator dependency entirely in favor of SimulationService.

Migrated routes:

POST /{id}/run-all
POST /{id}/engine-control
POST /{id}/group-chat
POST /{id}/group-chat/{gid}/message
POST /{id}/agents/{aid}/interview

Also expands the _svc_state_or_404 docstring to reflect current coverage and documents why inject-event / replay/{step} remain on _get_state_or_404 (pending corresponding SimulationService methods).

Scope

Intentionally not migrated in this PR:

inject-event, replay/{step} — require new SimulationService.inject_event() / replay_step() methods; deferred to follow-up.
compare, get_group_chat — read-only GETs, per SPEC 20 §3.1 they may stay on _get_state_or_404.

Test plan

Backend full suite: uv run pytest tests/ → 1031 passed, 2 skipped (0 regressions)
Manual review: all remaining _get_state_or_404 callsites verified as either GETs or pending service-method dependencies

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

The three opinions pages used `w-screen` on their root div, but they render inside `SidebarLayout` which provides only `flex-1` width. This caused the 4th stat card ("Active Cascades") to overflow past the viewport edge when the sidebar was open. Switch to `h-full w-full` so pages fill their flex container. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…pshot - Add _svc_state_or_404(): Clean Architecture equivalent of _get_state_or_404 that depends on SimulationService instead of raw orchestrator. For use by mutation endpoints (POST routes) per SPEC 20 §3.1. - Expand _get_state_or_404 docstring: document which modules use it and when to prefer the service-layer variant. - Save weekly retro snapshot (.context/retros/2026-04-12-1.json) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

SPEC: docs/spec/20_CLEAN_ARCHITECTURE_SPEC.md#3.1 Extend the Round 8-8 service-layer migration to the next wave of mutation POST endpoints. These routes only needed orchestrator access for the state-fetch + 404 gate, so switching to _svc_state_or_404 lets us drop the raw orchestrator dependency entirely: - POST /{id}/run-all - POST /{id}/engine-control - POST /{id}/group-chat - POST /{id}/group-chat/{gid}/message - POST /{id}/agents/{aid}/interview Update _svc_state_or_404 docstring to reflect the expanded coverage and record which routes remain on _get_state_or_404 (inject-event and replay pending corresponding service methods; read-only GETs compare/group-chat-read by design). Tests: 1031 passed, 2 skipped (no regression). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

SPEC: docs/spec/24*.md §2.2.5 - Bump --bottom-area-height 240 → 300 so ~6 event rows fit instead of ~3 - Widen panel 280 → 360 via new --emergent-panel-width token - Show from md breakpoint (was lg) to avoid clipping on 13–14" laptops - Row typography: 10–11px → 11–12px, 1-line truncate → 2-line clamp, progress bar height 1px → 1.5px for better affordance Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Integration tests write directly to data/community_templates.json and leave residue entries ("Test Community", "Updated Name"). These accumulated locally — committing to sync working tree with what the test suite actually produces. Follow-up: tests should use a tmp fixture path (or an in-memory repo) so data/ stays pristine. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When adoption_rate approached 1.0, every node rendered as the same green (#22c55e), wiping community colors off the 3D view. Polarization and faction dynamics — arguably the whole point of the simulation — became invisible exactly at the moment they mattered most. Replace the global ADOPTED_GLOW_COLOR with a per-community brightened tint: each community's base hex blended 35% toward white. Community A adopters read as bright-A, B adopters as bright-B, etc. The "converted" signal survives but it no longer overwrites identity. - Add brightenHex(hex, amount) helper alongside hexToRgba - Precompute adoptedColorMap alongside communityColorMap (memoized) so nodeColorFn stays O(1) per node per frame - Strobe (amber active-agent flash) is unchanged — that one IS meant to override community color for its 1.5s glow window Tests: 728 frontend passed, tsc -b clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Existing inject_event tests cover the API gate and event-consumption count, but none proved that the user-typed content string actually reaches agent perception. That is the user-facing claim of the Inject Event modal, so it deserves a direct assertion. New test test_inject_roundtrip_content_reaches_agents drives the documented 4-step flow and asserts each step: 1. POST returns 200 with event_id + effective_step = step + 1 2. state.injected_events has the event queued with the typed marker string preserved in .message, and controversy type mapped to community_discussion 3. Next /step consumes the queue (len → 0) and bumps the total agent exposure_count (proves agents actually perceived it) 4. state.current_step matches the originally returned effective_step Inject suite now 11/11 green (10 existing + 1 new). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The simulation workspace had zero in-page navigation — users had to hit the back button or edit the URL to reach Projects, Communities, etc. Now the shared AppSidebar mounts as a 60px icon rail and slides open to 256px on hamburger click, matching the rest of the app's nav model without stealing horizontal room from the 3D graph. Changes: - AppSidebar: new `defaultCollapsed` prop drives initial state - SidebarLayout: threads the prop through, adds `min-w-0` on <main> so flex children actually shrink to fit - App.tsx: wraps `/simulation/:simulationId` (and legacy plural alias) in a SidebarLayout with `defaultCollapsed` set - SimulationPage: `h-screen w-screen` → `h-full w-full` so the page sizes to its flex slot, not the viewport Bug caught during self-review: - AppSidebar.tsx had `useEffect(() => setCollapsed(isMobile), [isMobile])` which fired on mount and unconditionally stomped `defaultCollapsed` on desktop, silently defeating the whole change. Narrowed the effect to `if (isMobile) setCollapsed(true)` so desktop keeps the prop's initial value. Regression tests (new AppSidebar.test.tsx, 4 cases): - defaultCollapsed=true on desktop stays 60px (guards the P1 above) - no prop → 256px on desktop - mobile viewport auto-collapses - toggle button cycles 60 ↔ 256 Tests: 732 frontend passed (41 files, +1 file, +4 tests), tsc -b clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

365 was a stale "one step per day for a year" leftover. The backend CreateSimulationRequest.max_steps default is 50, and the clone-fallback path in the same hook already used 50 — fresh form was the sole outlier, so users creating a new simulation got silently-slow default runs (7× longer than intended) until they noticed and overrode it. Three-way alignment restored: backend default / clone fallback / fresh form all = 50. Clone path preserved (cloneConfig.max_steps still wins). Tests: CampaignSetupPage 29/29 + FormProgressBanner 6/6 green, tsc clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Two user-reported bugs in the Simulation Complete summary: 1. Key Events timeline was capped at 5. Realistic runs emit dozens of emergent events across steps; everything past the 5th silently vanished. Now renders every event in step order, with a count in the section header (e.g. "Key Events (12)"). 2. Description text was cut to one line via line-clamp-1, hiding the content users came to the report to read. Long descriptions now wrap via break-words. 3. When the event count grew the section pushed the rest of the report down. Wrapped the list in a scrollable container (max-h-44, overflow-y-auto, bordered background) so it stays contained and the modal's other sections stay visible. Regression tests (new SimulationReportModal.test.tsx, 4 cases): - 12 events across 4 steps all render (guards the 5-cap regression) - key-events-list container carries overflow-y-auto + max-h-* - long descriptions have break-words, not line-clamp-* - empty events → section hidden entirely Tests: 736 frontend passed (+4 new), tsc -b clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Combines a prior-session README polish (hero block, badge group, nav menu, expanded use cases) with this pass's gap-fill audit: - "🎮 What you actually do in the UI" — new 6-step workflow section between Features and Use Cases. Walks Projects → Setup → Run → intervene (Inject / Engine Control / Replay) → read (Summary Report + Analytics) → drill (Opinions / Top Influencers / Interview) → Compare. The README marketed the engine well but never said what you actually click. - Features grid expanded 6 → 9 cards with a third row covering the mid-run interaction surfaces (Inject Event, Engine Control, Compare). These exist in the product but had zero README presence. - New use case: "🚨 Stress-test crisis response (mid-run shock injection)" — Inject Event scenario with a step-20 negative PR derail and recovery measurement, matching how the modal is actually used. - "📸 Screenshots" section added before Architecture, with four placeholder tiles (3D / Opinions / Analytics / Influencers) and graceful onerror hide. Capture guide + filename / size rules at docs/assets/screenshots/README.md so the next person knows exactly what to drop in. - Stale numbers synced: tests 1032 BE / 656 FE → 1031 BE / 736 FE (badge + tech stack + "1,658+" → 1,767+); version badge realigned to VERSION file (0.1.3.0 was aspirational; actual is 0.1.1.0). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The 3D graph picked colors via a private resolver in GraphPanel (STATIC_COMMUNITY_COLOR + a hash-based fallback over a 10-color palette), the legend reused that same array (so it matched), but CommunitiesDetailPage hardcoded a separate A–E map and fell back to muted-foreground gray for everything else. Result: a community named "M" or a UUID-keyed template community rendered colored in the graph + legend but gray in the left-menu Communities page, and the user couldn't tell which row matched which node. Extract the resolver to `@/lib/communityColor#resolveCommunityColor` so all three surfaces share one algorithm: 1. Built-in A–E ids → static COMMUNITIES color 2. Anything else → hash(id) % FALLBACK_COMMUNITY_PALETTE — stable across re-fetches, pagination, and reseeding Wiring: - GraphPanel: drop the private STATIC_COMMUNITY_COLOR map + fallbackColorFor + 10-color array, import the resolver instead. Drops the now-unused COMMUNITIES import too. - CommunitiesDetailPage: split COMMUNITY_META into COMMUNITY_NAME (display-name overrides for A–E) and resolveCommunityColor (color for any id). Removes the gray fallback path entirely. - ArchitectureInvariants: add lib/communityColor.ts to the hex baseline allowlist with a follow-up note (lift palette into COMMUNITY_PALETTE proper next). Regression tests (new communityColor.test.ts, 4 cases): - A–E ids return their canonical COMMUNITIES color - Unknown ids fall back to FALLBACK_COMMUNITY_PALETTE - Same id → same color across calls (stability) - 10 distinct ids don't bucket-collapse to 1-2 colors Tests: 744 frontend passed (+8 new, 43 files), tsc -b clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove the SPEC publication ban from .gitignore and CLAUDE.md. All docs/spec/, docs/init/, BUSINESS_REPORT, MARKETING_STRATEGY, and OASIS_vs_Prophet are now tracked and will be pushed publicly. The project owner decided to open-source the full specification layer alongside the implementation — the value is in execution speed, not in secrecy of the design documents. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Three concerns addressed in one pass: 1. Analytics page — HelpTooltip on every chart Add `term` prop to `<Section>` component. Wire 5 glossary entries (analyticsAdoptionChart, analyticsSentimentChart, analyticsCommunityComparison, analyticsCascadeAnalytics, analyticsEventTimeline) so each chart title gets a ? icon with a plain-English explanation of what the chart shows and how to read it. 2. Community Adoption Comparison bar chart was near-invisible Two bugs compounded: raw `<rect fill>` inside `<Bar>` (Recharts silently ignores non-Cell children, so bars got the transparent default fill) AND `var(--community-*)` CSS vars were dim in dark mode. Switched to `<Cell fill={entry.color}>` with hex colors from `resolveCommunityColor(cid)`. Line chart community strokes and the fallback legend swatches use the same resolver now, so every community surface in Analytics matches the 3D graph. 3. WebSocket cleanup (StrictMode zombie socket) Added `didCancel` guard + strip all event listeners in cleanup so closing a CONNECTING socket doesn't trigger an onclose retry that opens a zombie socket racing the next effect mount. Also fixed tooltip-induced test failures across 3 files (AnalyticsPage, GlobalMetrics, ScenarioOpinions) where HelpTooltip rendered the label text in DOM, breaking `getByText` matchers. Switched to `getByRole('heading')` or `getAllByText(...).length`. Fixed glossary `pageOpinions` text missing trailing period. Tests: 805 frontend passed, tsc -b clean. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…n, validation) Two-pass code review found 11 issues across 6 backend files: Critical: - #1 registry._call_adapter: wrap raw str→LLMPrompt before adapter.complete() - #2 persist_step retry: re-insert EmergentEvent rows on rollback retry - #8 deps.py singletons: add threading.Lock + double-checked locking - #9 load_steps: bound EmergentEvent query with step≤max + limit Important: - #3 MC endpoint: asyncio.wait_for(300s) + 504 on timeout - #4 settings PUT: str() coercion on Chinese LLM provider fields - #5 monte_carlo.py: remove fragile iscoroutine guard, plain await - #6 _config_to_dict: dataclasses.asdict for community serialization - #7 UUID parse: _safe_uuid try/except replaces len>8 heuristic - #10 persist_step retry: also re-insert agent_states + propagation_events - #11 settings PUT: str() coercion on Anthropic/OpenAI/Gemini fields too All 57 targeted tests pass (test_29 + test_06 + test_05). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

showjihyun and others added 14 commits April 12, 2026 16:55

showjihyun merged commit e155b26 into main Apr 13, 2026
1 of 2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(api): Clean Architecture — migrate 5 POST routes to SimulationService#10

refactor(api): Clean Architecture — migrate 5 POST routes to SimulationService#10
showjihyun merged 14 commits into
mainfrom
refactor/clean-architecture-service

showjihyun commented Apr 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

showjihyun commented Apr 13, 2026

Summary

Scope

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant