Skip to content

refactor(api): Clean Architecture — migrate 5 POST routes to SimulationService#10

Merged
showjihyun merged 14 commits into
mainfrom
refactor/clean-architecture-service
Apr 13, 2026
Merged

refactor(api): Clean Architecture — migrate 5 POST routes to SimulationService#10
showjihyun merged 14 commits into
mainfrom
refactor/clean-architecture-service

Conversation

@showjihyun

Copy link
Copy Markdown
Owner

Summary

Extends the Round 8-8 service-layer migration (SPEC 20_CLEAN_ARCHITECTURE_SPEC.md §3.1) to five more mutation POST endpoints in backend/app/api/simulations.py. These routes previously depended on the raw SimulationOrchestrator only for a state-fetch + 404 gate, so switching them to _svc_state_or_404 lets us drop the orchestrator dependency entirely in favor of SimulationService.

Migrated routes:

  • POST /{id}/run-all
  • POST /{id}/engine-control
  • POST /{id}/group-chat
  • POST /{id}/group-chat/{gid}/message
  • POST /{id}/agents/{aid}/interview

Also expands the _svc_state_or_404 docstring to reflect current coverage and documents why inject-event / replay/{step} remain on _get_state_or_404 (pending corresponding SimulationService methods).

Scope

Intentionally not migrated in this PR:

  • inject-event, replay/{step} — require new SimulationService.inject_event() / replay_step() methods; deferred to follow-up.
  • compare, get_group_chat — read-only GETs, per SPEC 20 §3.1 they may stay on _get_state_or_404.

Test plan

  • Backend full suite: uv run pytest tests/1031 passed, 2 skipped (0 regressions)
  • Manual review: all remaining _get_state_or_404 callsites verified as either GETs or pending service-method dependencies

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

showjihyun and others added 14 commits April 12, 2026 16:55
The three opinions pages used `w-screen` on their root div, but they
render inside `SidebarLayout` which provides only `flex-1` width.
This caused the 4th stat card ("Active Cascades") to overflow past
the viewport edge when the sidebar was open.

Switch to `h-full w-full` so pages fill their flex container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pshot

- Add _svc_state_or_404(): Clean Architecture equivalent of _get_state_or_404
  that depends on SimulationService instead of raw orchestrator. For use by
  mutation endpoints (POST routes) per SPEC 20 §3.1.
- Expand _get_state_or_404 docstring: document which modules use it and
  when to prefer the service-layer variant.
- Save weekly retro snapshot (.context/retros/2026-04-12-1.json)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SPEC: docs/spec/20_CLEAN_ARCHITECTURE_SPEC.md#3.1

Extend the Round 8-8 service-layer migration to the next wave of
mutation POST endpoints. These routes only needed orchestrator access
for the state-fetch + 404 gate, so switching to _svc_state_or_404
lets us drop the raw orchestrator dependency entirely:

  - POST /{id}/run-all
  - POST /{id}/engine-control
  - POST /{id}/group-chat
  - POST /{id}/group-chat/{gid}/message
  - POST /{id}/agents/{aid}/interview

Update _svc_state_or_404 docstring to reflect the expanded coverage
and record which routes remain on _get_state_or_404 (inject-event
and replay pending corresponding service methods; read-only GETs
compare/group-chat-read by design).

Tests: 1031 passed, 2 skipped (no regression).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SPEC: docs/spec/24*.md §2.2.5

- Bump --bottom-area-height 240 → 300 so ~6 event rows fit instead of ~3
- Widen panel 280 → 360 via new --emergent-panel-width token
- Show from md breakpoint (was lg) to avoid clipping on 13–14" laptops
- Row typography: 10–11px → 11–12px, 1-line truncate → 2-line clamp,
  progress bar height 1px → 1.5px for better affordance

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Integration tests write directly to data/community_templates.json
and leave residue entries ("Test Community", "Updated Name"). These
accumulated locally — committing to sync working tree with what the
test suite actually produces.

Follow-up: tests should use a tmp fixture path (or an in-memory
repo) so data/ stays pristine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When adoption_rate approached 1.0, every node rendered as the same
green (#22c55e), wiping community colors off the 3D view. Polarization
and faction dynamics — arguably the whole point of the simulation —
became invisible exactly at the moment they mattered most.

Replace the global ADOPTED_GLOW_COLOR with a per-community brightened
tint: each community's base hex blended 35% toward white. Community A
adopters read as bright-A, B adopters as bright-B, etc. The "converted"
signal survives but it no longer overwrites identity.

- Add brightenHex(hex, amount) helper alongside hexToRgba
- Precompute adoptedColorMap alongside communityColorMap (memoized)
  so nodeColorFn stays O(1) per node per frame
- Strobe (amber active-agent flash) is unchanged — that one IS meant
  to override community color for its 1.5s glow window

Tests: 728 frontend passed, tsc -b clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Existing inject_event tests cover the API gate and event-consumption
count, but none proved that the user-typed content string actually
reaches agent perception. That is the user-facing claim of the
Inject Event modal, so it deserves a direct assertion.

New test test_inject_roundtrip_content_reaches_agents drives the
documented 4-step flow and asserts each step:

  1. POST returns 200 with event_id + effective_step = step + 1
  2. state.injected_events has the event queued with the typed
     marker string preserved in .message, and controversy type
     mapped to community_discussion
  3. Next /step consumes the queue (len → 0) and bumps the total
     agent exposure_count (proves agents actually perceived it)
  4. state.current_step matches the originally returned
     effective_step

Inject suite now 11/11 green (10 existing + 1 new).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The simulation workspace had zero in-page navigation — users had to
hit the back button or edit the URL to reach Projects, Communities,
etc. Now the shared AppSidebar mounts as a 60px icon rail and slides
open to 256px on hamburger click, matching the rest of the app's
nav model without stealing horizontal room from the 3D graph.

Changes:
- AppSidebar: new `defaultCollapsed` prop drives initial state
- SidebarLayout: threads the prop through, adds `min-w-0` on <main>
  so flex children actually shrink to fit
- App.tsx: wraps `/simulation/:simulationId` (and legacy plural alias)
  in a SidebarLayout with `defaultCollapsed` set
- SimulationPage: `h-screen w-screen` → `h-full w-full` so the page
  sizes to its flex slot, not the viewport

Bug caught during self-review:
- AppSidebar.tsx had `useEffect(() => setCollapsed(isMobile), [isMobile])`
  which fired on mount and unconditionally stomped `defaultCollapsed`
  on desktop, silently defeating the whole change. Narrowed the effect
  to `if (isMobile) setCollapsed(true)` so desktop keeps the prop's
  initial value.

Regression tests (new AppSidebar.test.tsx, 4 cases):
- defaultCollapsed=true on desktop stays 60px (guards the P1 above)
- no prop → 256px on desktop
- mobile viewport auto-collapses
- toggle button cycles 60 ↔ 256

Tests: 732 frontend passed (41 files, +1 file, +4 tests), tsc -b clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
365 was a stale "one step per day for a year" leftover. The backend
CreateSimulationRequest.max_steps default is 50, and the clone-fallback
path in the same hook already used 50 — fresh form was the sole outlier,
so users creating a new simulation got silently-slow default runs
(7× longer than intended) until they noticed and overrode it.

Three-way alignment restored: backend default / clone fallback / fresh
form all = 50. Clone path preserved (cloneConfig.max_steps still wins).

Tests: CampaignSetupPage 29/29 + FormProgressBanner 6/6 green, tsc clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two user-reported bugs in the Simulation Complete summary:

1. Key Events timeline was capped at 5. Realistic runs emit dozens of
   emergent events across steps; everything past the 5th silently
   vanished. Now renders every event in step order, with a count in
   the section header (e.g. "Key Events (12)").

2. Description text was cut to one line via line-clamp-1, hiding the
   content users came to the report to read. Long descriptions now
   wrap via break-words.

3. When the event count grew the section pushed the rest of the
   report down. Wrapped the list in a scrollable container
   (max-h-44, overflow-y-auto, bordered background) so it stays
   contained and the modal's other sections stay visible.

Regression tests (new SimulationReportModal.test.tsx, 4 cases):
- 12 events across 4 steps all render (guards the 5-cap regression)
- key-events-list container carries overflow-y-auto + max-h-*
- long descriptions have break-words, not line-clamp-*
- empty events → section hidden entirely

Tests: 736 frontend passed (+4 new), tsc -b clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Combines a prior-session README polish (hero block, badge group,
nav menu, expanded use cases) with this pass's gap-fill audit:

- "🎮 What you actually do in the UI" — new 6-step workflow section
  between Features and Use Cases. Walks Projects → Setup → Run →
  intervene (Inject / Engine Control / Replay) → read (Summary
  Report + Analytics) → drill (Opinions / Top Influencers /
  Interview) → Compare. The README marketed the engine well but
  never said what you actually click.
- Features grid expanded 6 → 9 cards with a third row covering the
  mid-run interaction surfaces (Inject Event, Engine Control,
  Compare). These exist in the product but had zero README presence.
- New use case: "🚨 Stress-test crisis response (mid-run shock
  injection)" — Inject Event scenario with a step-20 negative PR
  derail and recovery measurement, matching how the modal is
  actually used.
- "📸 Screenshots" section added before Architecture, with four
  placeholder tiles (3D / Opinions / Analytics / Influencers) and
  graceful onerror hide. Capture guide + filename / size rules at
  docs/assets/screenshots/README.md so the next person knows
  exactly what to drop in.
- Stale numbers synced: tests 1032 BE / 656 FE → 1031 BE / 736 FE
  (badge + tech stack + "1,658+" → 1,767+); version badge realigned
  to VERSION file (0.1.3.0 was aspirational; actual is 0.1.1.0).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 3D graph picked colors via a private resolver in GraphPanel
(STATIC_COMMUNITY_COLOR + a hash-based fallback over a 10-color
palette), the legend reused that same array (so it matched), but
CommunitiesDetailPage hardcoded a separate A–E map and fell back
to muted-foreground gray for everything else. Result: a community
named "M" or a UUID-keyed template community rendered colored in
the graph + legend but gray in the left-menu Communities page,
and the user couldn't tell which row matched which node.

Extract the resolver to `@/lib/communityColor#resolveCommunityColor`
so all three surfaces share one algorithm:
  1. Built-in A–E ids → static COMMUNITIES color
  2. Anything else → hash(id) % FALLBACK_COMMUNITY_PALETTE — stable
     across re-fetches, pagination, and reseeding

Wiring:
- GraphPanel: drop the private STATIC_COMMUNITY_COLOR map +
  fallbackColorFor + 10-color array, import the resolver instead.
  Drops the now-unused COMMUNITIES import too.
- CommunitiesDetailPage: split COMMUNITY_META into COMMUNITY_NAME
  (display-name overrides for A–E) and resolveCommunityColor (color
  for any id). Removes the gray fallback path entirely.
- ArchitectureInvariants: add lib/communityColor.ts to the hex
  baseline allowlist with a follow-up note (lift palette into
  COMMUNITY_PALETTE proper next).

Regression tests (new communityColor.test.ts, 4 cases):
- A–E ids return their canonical COMMUNITIES color
- Unknown ids fall back to FALLBACK_COMMUNITY_PALETTE
- Same id → same color across calls (stability)
- 10 distinct ids don't bucket-collapse to 1-2 colors

Tests: 744 frontend passed (+8 new, 43 files), tsc -b clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the SPEC publication ban from .gitignore and CLAUDE.md.
All docs/spec/, docs/init/, BUSINESS_REPORT, MARKETING_STRATEGY,
and OASIS_vs_Prophet are now tracked and will be pushed publicly.

The project owner decided to open-source the full specification
layer alongside the implementation — the value is in execution
speed, not in secrecy of the design documents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three concerns addressed in one pass:

1. Analytics page — HelpTooltip on every chart
   Add `term` prop to `<Section>` component. Wire 5 glossary entries
   (analyticsAdoptionChart, analyticsSentimentChart,
   analyticsCommunityComparison, analyticsCascadeAnalytics,
   analyticsEventTimeline) so each chart title gets a ? icon with
   a plain-English explanation of what the chart shows and how to
   read it.

2. Community Adoption Comparison bar chart was near-invisible
   Two bugs compounded: raw `<rect fill>` inside `<Bar>` (Recharts
   silently ignores non-Cell children, so bars got the transparent
   default fill) AND `var(--community-*)` CSS vars were dim in dark
   mode. Switched to `<Cell fill={entry.color}>` with hex colors
   from `resolveCommunityColor(cid)`. Line chart community strokes
   and the fallback legend swatches use the same resolver now, so
   every community surface in Analytics matches the 3D graph.

3. WebSocket cleanup (StrictMode zombie socket)
   Added `didCancel` guard + strip all event listeners in cleanup
   so closing a CONNECTING socket doesn't trigger an onclose retry
   that opens a zombie socket racing the next effect mount.

Also fixed tooltip-induced test failures across 3 files
(AnalyticsPage, GlobalMetrics, ScenarioOpinions) where HelpTooltip
rendered the label text in DOM, breaking `getByText` matchers.
Switched to `getByRole('heading')` or `getAllByText(...).length`.
Fixed glossary `pageOpinions` text missing trailing period.

Tests: 805 frontend passed, tsc -b clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@showjihyun showjihyun merged commit e155b26 into main Apr 13, 2026
1 of 2 checks passed
showjihyun added a commit that referenced this pull request Apr 13, 2026
…n, validation)

Two-pass code review found 11 issues across 6 backend files:

Critical:
- #1  registry._call_adapter: wrap raw str→LLMPrompt before adapter.complete()
- #2  persist_step retry: re-insert EmergentEvent rows on rollback retry
- #8  deps.py singletons: add threading.Lock + double-checked locking
- #9  load_steps: bound EmergentEvent query with step≤max + limit

Important:
- #3  MC endpoint: asyncio.wait_for(300s) + 504 on timeout
- #4  settings PUT: str() coercion on Chinese LLM provider fields
- #5  monte_carlo.py: remove fragile iscoroutine guard, plain await
- #6  _config_to_dict: dataclasses.asdict for community serialization
- #7  UUID parse: _safe_uuid try/except replaces len>8 heuristic
- #10 persist_step retry: also re-insert agent_states + propagation_events
- #11 settings PUT: str() coercion on Anthropic/OpenAI/Gemini fields too

All 57 targeted tests pass (test_29 + test_06 + test_05).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant