feat: 3D graph, TanStack Query, parallel Monte Carlo, API hardening#2
Merged
Conversation
Saving in-progress 3D graph rendering work and the layout test that inspired it. Not ready to merge — branch parked here so PR #1 can land on master. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the trunk-based, squash-merge workflow Prophet uses: - master is protected, always deployable, only changes via squash-merged PR - Short-lived feat/fix/perf/docs/refactor/chore/test branches off master - Squash merge keeps master history linear (one commit per PR) - --force-with-lease only, never plain --force on shared branches - Stacked PR pattern for large features - Conflict cascade recovery (cherry-pick onto fresh master) Includes naming conventions, PR title/body templates, anti-patterns, worked examples, and a quick reference card. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CONTRIBUTING.md - Fork workflow as the primary path (gh fork --clone, upstream remote) - 10-step PR walkthrough (claim → sync → branch → test-first → push → PR) - Draft PR usage - "What if CI fails on my PR?" debugging section - "What if master moves while my PR is open?" — merge OR rebase, both fine GIT_BRANCH_STRATEGY.md - Audience callout: this doc is for core team; contributors → CONTRIBUTING.md - New "Two Workflows: Direct vs Fork" section pointing fork users to the contributor doc - Version bump 1.0 → 1.1 .github/ - pull_request_template.md (auto-fills on every new PR) - ISSUE_TEMPLATE/bug_report.md, feature_request.md, question.md - ISSUE_TEMPLATE/config.yml — disables blank issues, links to Discussions + Security advisories Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aligns with the upcoming default-branch rename. All shell snippets, prose, scenario headings, and example commands now use main instead of master. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds <HelpTooltip term="..."/> as a reusable component with a typed glossary at src/config/glossary.ts. Every UI label that surfaces a domain-specific term (sentiment, polarization, cascade depth, etc.) can now display a contextual help icon that explains what the value means and how to interpret it. New files - components/shared/HelpTooltip.tsx — reusable tooltip with anti-flicker design (wrapper-level hover, opacity toggle, pointer-events-none on popover, configurable left/center/right alignment, three icon sizes). Supports either inline label/text props OR a glossary term key. - config/glossary.ts — typed central glossary with 30+ terms covering core simulation, adoption/diffusion, sentiment, emergent behaviors, agents/roles, personality, LLM tiers, and simulation flow. Applied to - SimulationReportModal — refactored to import shared component + use term="..." (removed local copy and the legacy HELP constant) - StatCard (shared) — gained term + tooltipAlign props so any page using StatCard can opt into a tooltip with one prop - MetricsPanel — Active Agents, Sentiment Distribution, Polarization Index, Cascade Depth, Cascade Width, Top Influencers - CommunityPanel — Communities title - TopInfluencersPage — all 4 summary stat cards (Influencers Tracked, Avg Influence Score, Top Community, Active Cascades) Tests: 380/380 passing. TopInfluencers test updated to use getAllByText for labels that legitimately appear in both the rendered StatCard and its always-rendered (opacity-toggled) tooltip popover. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Round-2 audit found 4 lazy-loaded pages still subscribing to the full steps array, causing chart re-renders on every WebSocket step: - AnalyticsPage: storeSteps array → stepsLength + latestStep gate. All 5 chart helpers (adoption / sentiment / community / events) now wrapped in useMemo so recharts SVG paths don't recompute every step. - GlobalMetricsPage: full steps subscription PLUS appendStep loop on history hydration (O(n) store commits → O(n) re-renders of every app subscriber). Replaced with setStepsBulk single commit + lazy getState() reads inside memos. - AgentDetailPage: full steps subscription on a lazy page with charts. sentimentData and derivedInteractions memos now read lazily. - CommunitiesDetailPage: clever inline selector that re-runs on every steps mutation → use canonical s.latestStep instead. Tests: 380/380 passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ESLint cleanup (was 6 errors / 1 warning, now 0 / 0): - AgentDetailPage: removed orphan MOCK_AGENT, MOCK_INTERACTIONS, MOCK_CONNECTIONS, MOCK_MESSAGES (page now strictly uses real API data) - TopInfluencersPage: removed orphan MOCK_INFLUENCERS, DISTRIBUTION_DATA; fixed react-hooks/set-state-in-effect by deferring setLoading via queueMicrotask - ControlPanel: removed unused eslint-disable directive Tests for the shared HelpTooltip + glossary: - HelpTooltip.test.tsx (15 cases): glossary lookup, hover/click toggle, outside-click close, alignment classes, anti-flicker invariants (always-rendered DOM, pointer-events-none) - glossary.test.ts (126 cases): entry shape (label + non-empty text + ending punctuation), at least 25 entries, all production-required terms exist, type narrowing AgentDetail tests updated for the new "no mock fallback" behavior: - All 17 tests now use renderAndWait() so the real API mock has time to resolve before assertions run - 'renders 5 personality trait bars' updated to match the actual trait labels derived from the real agent.personality keys Bundle audit run: - Initial bundle (index): 74 KB gzipped — well under target - SimulationPage: 375 KB gzipped (three.js + force-graph) - Cytoscape isolated to its own 137 KB chunk - HelpTooltip + glossary cost: 4 KB gzipped (negligible) Tests: 27 files / 521 passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two complementary optimizations cut SimulationPage gzipped size from 375 KB → 19 KB (a 95% reduction): 1. vite.config.ts: manualChunks splits heavy 3rd-party libs into named, stable, cacheable chunks: - vendor-three (three + react-force-graph-3d + 3d-force-graph) - vendor-cytoscape (cytoscape, used by FactionMapView/EgoGraph) - vendor-recharts (recharts + victory-vendor) - vendor-d3 (remaining d3-* utilities) chunkSizeWarningLimit bumped to 600 KB so vendor-three doesn't trigger a noisy warning we'd just ignore. 2. SimulationPage.tsx: GraphPanel is now React.lazy() loaded behind a <Suspense> boundary. The "No Active Simulation" empty state now renders without paying the WebGL bundle cost — three.js (341 KB gzipped) only loads when a real simulation is active. Bundle table after: index 106 KB raw / 31 KB gzip (was 239 / 74) −58% SimulationPage 88 KB raw / 19 KB gzip (was 1417 / 375) −95% vendor-three 1283 KB raw / 341 KB gzip (lazy, only on active sim) vendor-cytoscape 434 KB raw / 137 KB gzip (lazy, FactionMap/EgoGraph) vendor-recharts 461 KB raw / 134 KB gzip (lazy, analytics pages) vendor-d3 101 KB raw / 32 KB gzip (cacheable shared chunk) GraphPanel 8 KB raw / 3 KB gzip (lazy) Tests (6 affected) updated to use findByTestId so React Suspense has time to resolve before assertions: - SimulationMain.test.tsx: 5 graph-engine tests - SimulationPage.test.tsx: 'renders graph panel' test All 521 tests still passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TanStack Query was installed but unused. This commit unlocks request
dedup, cross-route caching, and stale-while-revalidate for the highest-
leverage data fetches.
New module: src/api/queries.ts
- Centralized typed query/mutation hooks
- queryKeys factory for consistent invalidation
- Hooks for: projects, simulations, agents, communities, network,
llm stats/impact
Migrated (4 components):
1. LLMDashboard
- Was: useEffect + Promise.allSettled + local state, refetched on
every step via stepsLength dep
- Now: useLLMStats(simId, stepsLength) + useLLMImpact — step is in
the cache key so a new step naturally invalidates. Two components
calling the same hook get automatic dedup.
2. MetricsPanel
- Was: useEffect with `Math.floor(stepNum/10)` throttle hack to
avoid storming the agents endpoint
- Now: useAgents — TanStack cache eliminates the storm without
the manual throttle
3. ProjectsListPage
- Was: useEffect on mount + setProjects + manual loading state
- Now: useProjects() — page is instant when navigating away and back
- useCreateProject mutation auto-invalidates the projects list
4. AgentDetailPage (3 separate fetches consolidated)
- Was: useEffect for agent + Promise.all with network, separate
useEffect for connections, separate useEffect for memory — three
independent re-fetches on every navigation
- Now: useAgent + useNetwork + useAgentMemory in parallel, all cached.
Connection list and message list derived via useMemo from cached data.
- Removed dead useEffect import
Test infrastructure:
- src/test/setup.ts: vi.mock('@tanstack/react-query') so every test
gets a real QueryClient injected without per-test wrapping
- beforeEach clears the test cache so loading-state tests can actually
observe the loading state
Bundle delta:
- index chunk: 31 → 35 KB gzipped (+4 KB for the query layer)
- All other chunks unchanged
- Worth it for the UX gains (instant back-navigation, no fetch storms)
Tests: 27 files / 521 passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 2 migrates 10 additional pages/components from raw apiClient fetches to typed query/mutation hooks. Combined with Phase 1, the hot fetch paths are now all going through TanStack Query with caching, dedup, and SWR. queries.ts additions: - useSimulationCompare - useCommunityTemplates / useCreateCommunityTemplate / useUpdateCommunityTemplate / useDeleteCommunityTemplate - useCreateCommunity / useUpdateCommunity / useDeleteCommunity - All community/template mutations use refetchQueries (not just invalidate) so consumers see fresh data immediately Migrated pages (10): 1. AnalyticsPage — useSimulationSteps + 5 chart memos 2. GlobalMetricsPage — useSimulationSteps + setStepsBulk hydration 3. CommunityOpinionPage — useSimulationSteps + setStepsBulk hydration 4. ComparisonPage — useSimulationCompare (removed FetchState type + manual loading state) 5. CommunityManagePage — useCommunityTemplates + 3 mutations (removed local templates state, loadTemplates(), saving state) 6. CommunitiesDetailPage — useCommunities + 3 mutations (removed manual refetch after each create/update/delete) 7. TopInfluencersPage — useAgents (consolidated 4 separate setState calls into a single useMemo deriving influencers/distribution/stats) 8. GraphPanel — useNetwork (derived graphData via useMemo, removed setState-in-effect anti-pattern) 9. AgentInspector — useAgent (cached agent detail, edit-state synced via separate useEffect) Test fixes: - CommunityManagePage delete-reload test: wait for the Delete button to actually render (TanStack data + render delay) instead of asserting on raw mockList call count - CombinedError display: surface templatesQuery.error in the UI banner alongside mutation errors Bundle delta after Phase 1+2 vs original (74 KB initial): - index: 31 → 35 KB gzipped (+4 KB total for query layer) - CommunitiesDetailPage: 13.69 → 13.31 KB (-0.4 KB) - AnalyticsPage: 13.87 → 13.72 KB (-0.15 KB) - TopInfluencersPage: 21.05 → 20.96 KB - SimulationPage: 88.09 → 87.82 KB Tests: 27 files / 521 passing. ESLint: 0 errors / 0 warnings. Components still using raw apiClient (intentional, low value to migrate): - ControlPanel — imperative simulation lifecycle (start/pause/step/stop) - EngineControlPanel — single mutation - Inject/MonteCarlo/Replay/AgentIntervene modals — one-shot dispatches - ScenarioOpinionsPage / ConversationThreadPage — small fetch sites - ProjectScenariosPage — project mutations - CampaignSetupPage — project + template fetches (could migrate later) - LoginPage — auth (mutation pattern, low cache value) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ning pages)
Final phase migrates the remaining 10 components that still used raw
apiClient. After this commit, all meaningful data fetches in the app
go through the typed query/mutation layer.
queries.ts additions:
- Community threads: useCommunityThreads, useCommunityThread
- Project scenarios: useCreateScenario, useRunScenario, useDeleteScenario
- Simulation lifecycle: useCreateSimulation, useStart/Pause/Resume/Stop/
Step/RunAll (mostly for isPending UI gating)
- Campaign dispatches: useInjectEvent, useReplay, useMonteCarlo,
useMonteCarloJob (supports refetchInterval polling),
useEngineControl, useModifyAgent
- Auth: useLogin, useRegister
Migrated:
1. InjectEventModal — useInjectEvent (removed submitting state)
2. ReplayModal — useReplay (removed submitting state)
3. AgentInterveneModal — useModifyAgent
4. EngineControlPanel — useEngineControl (removed applying state)
5. MonteCarloModal — useMonteCarlo for start mutation (polling stays
imperative for localStorage persistence)
6. LoginPage — useLogin + useRegister (removed loading state, handled
via isPending)
7. CampaignSetupPage — useProjects + useCommunityTemplates +
useCreateSimulation + useCreateScenario
8. ProjectScenariosPage — useProject + useRunScenario + useCreateScenario
+ useDeleteScenario + useStopSimulation (removed local project/scenarios
state + setScenarios mutation-and-mirror pattern)
9. ScenarioOpinionsPage — useSimulationSteps + setStepsBulk hydration
10. ConversationThreadPage — useCommunityThread (removed AbortController
dance and local apiThread/apiLoading state)
11. ControlPanel — useProjects (initial load only; keep imperative
lifecycle for lifecycle actions — they're WebSocket-driven)
12. SimulationListPage — useSimulations (was pre-existing raw fetch
with setState-in-effect lint warning)
Lint hygiene fixes along the way:
- InjectEventModal / ReplayModal: reset-on-open setStates wrapped in
queueMicrotask to silence react-hooks/set-state-in-effect
- ConversationThreadPage: removed unused useEffect import
Test updates:
- CampaignSetupPage.test.tsx: wait for project option to actually
render (findByRole('option')) before interacting with the select;
updated /simulation navigation assertion to match the new
/simulation/<id> parametric route
Bundle (minor movements, all reductions):
- index: 35.49 → 35.81 KB gzip (+0.3 KB for the extended query layer)
- ConversationThreadPage: 12.82 → 12.51 KB (-0.3 KB)
- CommunitiesDetailPage: unchanged 13.31 KB
- No page gained size
Tests: 27 files / 521 passing.
ESLint: 0 errors / 0 warnings.
Raw apiClient fetch sites left: ControlPanel lifecycle handlers
(intentional — imperative simulation control), SimulationReportModal
export shortcuts (one-shot file download), a few incidental calls that
don't benefit from query caching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CampaignSetupPage was a 617-line monolith violating single responsibility. Extracted into 1 custom hook + 6 section components + 1 types file. The page is now a thin orchestrator that wires the form state hook to the section components. New files: - src/hooks/useCampaignForm.ts — all form state, queries, mutations, handlers, and submit logic (326 lines, independently testable) - src/components/campaign/types.ts — shared constants (CHANNELS, AGENT_TYPES, PERSONALITY_KEYS, COMMUNITY_COLORS) and defaultCommunity - src/components/campaign/ProjectSelector.tsx — 52 lines - src/components/campaign/CampaignInfoSection.tsx — 94 lines (name / budget / channels / message) - src/components/campaign/TargetCommunitiesSection.tsx — 53 lines - src/components/campaign/CampaignAttributesSection.tsx — 85 lines (with inner AttributeSlider sub-component eliminating 3 duplicated slider blocks) - src/components/campaign/CommunityConfigurationSection.tsx — 180 lines (with inner CommunityCard sub-component, removes the deepest nesting in the original file) - src/components/campaign/AdvancedSettingsSection.tsx — 97 lines CampaignSetupPage.tsx: 617 → 111 lines (-82%). Every section < 200 lines, each with a clear single responsibility. All existing unit tests still pass without modification — proof that the refactor preserves behavior. New E2E specs (12 tests): 1. e2e/campaign-setup.spec.ts (6 tests) — exercises the refactored form end-to-end: - renders all 6 form sections - submit button disabled without name - channel checkboxes toggle independently - advanced settings collapsible - attribute sliders update displayed value - project selector read-only when projectId in URL 2. e2e/help-tooltip.spec.ts (3 tests) — smoke tests for the shared HelpTooltip component that now surfaces ~15 glossary terms across the UI: - metrics panel has help icons for technical terms - hover opens tooltip without layout jitter (anti-flicker check) - accessible labels match glossary terms 3. e2e/tanstack-cache.spec.ts (3 tests) — verifies the TanStack Query migration's main UX promise: cross-route caching. Each test intercepts network requests, navigates away and back, and asserts that at most 1 revalidation fetch occurs: - projects list cached across navigation - community list cached per simulation - agent list reused across panels E2E total: 74 → 86 tests. Unit tests: 27 files / 521 passing. TypeScript: 0 errors. ESLint: 0 errors / 0 warnings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…unting - Monte Carlo runner executes runs concurrently via asyncio.Semaphore with configurable max_concurrency (default 3) and real per-community adoption - Personality drift system: agents evolve personality based on actions with cumulative drift tracking and MAX_DRIFT cap - Campaign controversy parameter wired through agent tick pipeline - Real intra/inter-community edge counting for cascade detection (replaces hardcoded stubs) - Community link counting offloaded to thread pool (non-blocking) - O(n²) community metrics reduced to O(n) via pre-bucketing - Monte Carlo memory leak fixed: orchestrator state cleaned up after each run - Monte Carlo return type annotation corrected - LLM fallback stub tracking via is_fallback_stub flag - Startup migration marks orphaned running/paused sims as failed - FK safety in persistence: explicit flush ordering for simulation row
- Replace broad except HTTPException catches with status_code != 404 filter across 7 endpoints (agents, network, simulations) so real 500s surface instead of returning silent empty data - Historical sims (DB-only after restart) return empty data instead of 404 - run_scenario only starts simulation after DB row is confirmed; aborts and cleans up on persistence failure (prevents ghost simulations) - Replay endpoint now properly returns 500 on failure instead of fake replay_id - Export endpoint falls back to DB for historical simulations
- Replace 2D Cytoscape canvas with WebGL/three.js 3D renderer - Community-colored nodes and edges with instanced sphere rendering - Auto-scaled resolution for large graphs (2k+ nodes) - Physics settle with cooldownTicks + d3AlphaDecay - Orbit/zoom/pan controls - EgoGraph filter improvements
- Migrate all pages to TanStack Query hooks from queries.ts - Central glossary system with HelpTooltip for technical terms - ControlPanel refactored into focused sub-components - Page-level improvements across 6 detail/opinion pages - Updated tests for new query patterns
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- data/community_templates.json: seed data for community templates - .gitignore: exclude gstack-reports, playwright-mcp logs, pencil files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove all Monte Carlo code, API endpoints, DB model, tests, frontend components, query hooks, and constants. The feature was adding complexity without being a core differentiator. Deleted: monte_carlo.py, test_06_api_monte_carlo.py, MonteCarloModal.tsx Removed from: simulations.py (3 endpoints), schemas.py, propagation.py, diffusion/schema.py, config.py, client.ts, queries.ts, constants.ts, ControlPanel.tsx, AnalyticsPage.tsx, glossary.ts + 10 test files cleaned Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- StepRunner now uses the shared gateway instance (was silently creating a separate one, so stats never accumulated) - LLM Gateway tracks total_tokens per call and maintains a 100-entry ring buffer of call metadata (provider, latency_ms, tokens, cached) - Orchestrator.get_llm_stats() returns real cached_calls and total_tokens - Orchestrator.get_llm_calls() serves from the gateway ring buffer Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AgentDetailPage: connections = edge degree, subscribers = incoming edges (computed from the same network query used by EgoGraph) - TopInfluencersPage: connections and chains derived from network degree map (was hardcoded 0) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register and login now use the existing users table via SQLAlchemy. Users survive server restarts. JWT logic unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CLAUDE.md: remove Celery reference - README.md: remove Monte Carlo from feature list and workflow - CHANGELOG: full v0.1.0.0 entries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Test badge: 344 → 520 frontend tests - Remove Monte Carlo from Roadmap shipped list - 3D graph (react-force-graph-3d) as primary visualization throughout - Cytoscape.js now listed as EgoGraph-only, not main renderer - Quick Start flow updated to match current UI (Projects → Scenario) - Tech stack table: testing counts, frontend stack corrected - Roadmap shipped list: add auth DB, LLM tracking, agent connections - Acknowledgments: three.js/react-force-graph-3d replaces Cytoscape as main graph credit; Cytoscape credited for EgoGraph - Remove Twitter placeholder (not active) - Add Git Branch Strategy to docs section - Remove emoji prefixes from use-case headers for cleaner look Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1: Exposure Fatigue, Edge Weight Perception, Expert Opinion Score, Prompt Injection Defense Phase 2: Emotional Contagion, Bounded Confidence (Deffuant), Content Generation prompt Phase 3: Reflection Engine (Simulacra-style), Homophily edge weighting 55 new tests covering all features. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Unify PropagationEvent: add action_type/generated_content to influence.PropagationEvent - Remove getattr duck-typing hacks in BridgePropagator and StepRunner - Clean types.py: pure re-export module (no duplicate class definitions) - Wire ReflectionEngine into tick.py (both sync and async paths) - Remove dead run_until_complete hack in sync tick() - Add _fire_and_forget helper for async task error logging (14 call sites) - Fix bare except in persistence.py agent serialization - Add error logging to _config_to_dict and _community_metric_to_dict - Wrap run_all step_callback with error isolation - Add network validation failure logging Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Wire target_communities in inject event API + orchestrator - Add bad_review to allowed event types - Add frontend cache invalidation on inject success - Add InjectEventModal tests (14 tests) - Add EngineControlPanel tests - Fix propagation animation utils extraction - Misc linter/formatter fixes across backend and frontend Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- _pick_adapter() now routes by tier: Tier 1/2 → Ollama, Tier 3 → Claude/OpenAI/Gemini - get_default() uses settings.default_llm_provider instead of os.environ - run_all endpoint: catch SimulationCapacityError (429), InvalidState (409), generic (500) - Import all simulation exceptions in API layer - 8 new tier routing tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend job (Python 3.12): uv sync + pytest (961 tests, no services needed since all tests use mock adapters). Frontend job (Node 20): tsc --noEmit + eslint + vitest (562 tests). Build step omitted because tsc -b surfaces 4 pre-existing errors that need a separate fix (TopInfluencersPage Recharts types, simulationStore path alias, vite.config.ts vitest field). Both jobs run on push to main and all PRs. Concurrency group cancels stale runs on the same ref. uv and npm caches enabled. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eMemo Exposed by new CI workflow. Two ESLint errors were live on feat/graph-3d: 1. neighborIdArr assigned but never used (inter-edge loop iterates neighborIds Set directly, so the Array.from copy was dead) 2. setLoading/setEmpty called synchronously inside useEffect guard (react-hooks/set-state-in-effect) — refactored to derive empty state via useMemo from the TanStack Query cache instead of manual effect Type check, pytest, and vitest already pass locally. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The @/* path alias was defined in vite.config.ts resolve.alias but not in TypeScript's compiler config, so tsc -b produced 85 TS2307 errors across test files and any source file using the alias. Added baseUrl and paths so TypeScript and Vite agree on module resolution. Knocks tsc -b errors from 130 to 45 (remaining are real code issues, not config). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two related doc updates that accumulated in this session: 1. SPEC index: 구 19/20/21 SIMULATION_QUALITY SPECs merged into the consolidated 21_SIMULATION_QUALITY_SPEC.md. Updated the Active SPEC table + added a consolidation history note. 2. Health Stack typecheck: changed from 'tsc --noEmit' to 'tsc -b'. Root tsconfig.json is "files": [] + references-only, so tsc --noEmit (without -b) compiles nothing and returns 0 errors — a silent no-op. That's why 130 type errors accumulated unnoticed. Added a warning note explaining this so the next contributor doesn't repeat it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…NESS) Four root-level docs were mixed Korean+English and have been translated to English end-to-end so international contributors can read them: - AGENTS.md (10.8% Korean → 0%) multi-agent working guide - CLAUDE.md (14.8% Korean → 0%) project instructions + SPEC-GATE rules - DESIGN.md (5.8% Korean → 0%) UI design system + Pencil frame mapping - HARNESS.md (20.6% Korean → 0%) six context-strategy principles All semantic content preserved exactly: - SPEC paths, anchor IDs, file names, code blocks - Enforcement markers (⛔, ✅, ❌) and their meanings - CLAUDE.md Phase table (test counts refreshed to 961 backend / 521 frontend) - CLAUDE.md Hard Rules with the same legal weight - Health Stack tsc -b warning note (added earlier this session) Other root MD files were already English: CHANGELOG, CODE_OF_CONDUCT, CONTRIBUTING, ROADMAP, SECURITY. README.md has pending unrelated changes from the GitHub star conversion rewrite and will be committed separately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Batch 1 of Step B (tsc -b error reduction). Drops 22 errors: 14 unused imports/vars + 5 missing name fields + 2 cytoscape mock cast issues + 1 unmountComponentAtNode reference. - EngineControlPanel/InjectEventModal/SimulationMain/SimulationPage/ GlobalMetrics: add name to MOCK_SIMULATION (now matches SimulationRun) - FactionMapView: drop unmountComponentAtNode (deprecated in @testing-library 18), drop unused React/afterEach, add `unknown` to cytoscape→Mock cast - UIFlowSpec/PropagationAnimation/EngineControlPanel/InjectEventModal: drop unused React, vi, act, afterEach, Routes, Route imports - glossary.test.ts: rename unused destructured `key` → `_key` in 3 it.each blocks tsc -b errors: 45 → 23. All remaining are in source code (next batch). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Batch 2 of Step B. Drops 5 tsc -b errors:
- constants.ts: re-export SimulationStatus (was locally imported only,
consumers like SimulationListPage couldn't reach it through constants)
- api/client.ts: add local `import type { MemoryRecord }` — the existing
`export type { MemoryRecord }` re-export doesn't bring the symbol into
local scope when verbatimModuleSyntax is on
- CommunitiesDetailPage: define an explicit LocalCommunity interface and
replace the stale `typeof COMMUNITIES[number]` return annotation (the
local COMMUNITIES array was removed in an earlier refactor but the
annotation lingered). Properly typing influencers and emotions also
fixes the `inf: any` and `unknown → ReactNode` errors downstream
tsc -b errors: 23 → 18.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Batch 3 of Step B. Drops 6 tsc -b errors:
- EngineControlPanel: add `unknown` hop to the mutateAsync result cast
(Record<string, unknown> → EngineControlResponse doesn't overlap directly)
- useProjectScenarioSync: same `unknown` hop for the SimulationRun → Record
inspection pattern
- CommunityPanel: drop the `metrics.size` branch (field never existed on
CommunityStepMetrics — speculative API shape that was never materialized);
fall through to the adoption_rate derivation
- EgoGraph: `cytoscape.Stylesheet[]` → `cytoscape.StylesheetStyle[]`
(Stylesheet was removed from the type union; StylesheetStyle is the
variant that carries a `style:` block)
- GraphPanel: drop the `"ResizeObserver" in window` fallback. Modern
lib.dom declares ResizeObserver as always present on Window, so the
`window.addEventListener("resize", ...)` branch is unreachable and
tsc narrowed the `window` symbol to `never` inside it.
tsc -b errors: 18 → 12.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Batch 4 of Step B. Drops 7 tsc -b errors: Recharts 3 tightened its Formatter signature (ValueType widened beyond number, 1-element tuple returns no longer accepted). Fix each call site by dropping the explicit `v: number` parameter annotation (contextual typing handles it) and either: - returning a plain string for ReactNode formatters (AnalyticsPage:288) - keeping the [value, label] 2-tuple, using String()/Number() at use sites (GlobalMetricsPage polarization + sentiment tooltips) - casting the payload param at its access site for TopInfluencersPage's custom tooltip + Bar onClick handlers Also: SimulationReportModal's useMutation wrapped a void-returning `apiClient.simulations.export()` (which only opens a window). Wrap it in an async fn so the mutation function returns Promise<void> as TanStack Query expects. tsc -b errors: 12 → 5. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…olyfill Batch 5 (final) of Step B — reaches tsc -b 0 errors. - GlobalMetricsPage: `latestStep > 0` was comparing a StepResult object to a number. Read `latestStep.step` instead so the throttle actually checks the step counter - vite.config.ts: drop the `defineConfig` wrapper and export a plain object. `defineConfig` from 'vite' rejects the vitest `test` field, and the alternative `defineConfig` from 'vitest/config' pulls in a nested copy of vite that collides with the real project's vite when typing plugins like react() / tailwindcss(). A plain object works identically at runtime for both tools and unblocks tsc -b. Added an explicit `manualChunks(id: string)` annotation since we lose contextual typing - test/setup.ts: polyfill ResizeObserver for jsdom so GraphPanel's ResizeObserver-based sizing works in tests (needed after the previous batch dropped the legacy `"ResizeObserver" in window` fallback) Final state: - tsc -b: 0 errors (was 130 at start of Step B) - eslint: 0 errors, 0 warnings - vitest: 562/562 passing - npm run build: succeeds, produces production bundle Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two related changes that complete the type-safety feedback loop: 1. Replace `tsc --noEmit` with `tsc -b`. The old command was a silent no-op because the root tsconfig.json is references-only — see the Health Stack warning note added to CLAUDE.md earlier in this PR. 2. Add a Build step that runs `npm run build`. This catches bundler regressions (missing imports, chunk config issues, asset paths) that tsc alone wouldn't surface, and also acts as a second gate on tsc -b since `npm run build` is `tsc -b && vite build`. Prerequisite satisfied: the 130 tsc -b errors that had accumulated before this feedback loop existed were all fixed in the five fix(types/ tests) commits that land with this one. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file was created as part of the 20_CLEAN_ARCHITECTURE_SPEC #4.1 refactor (extracting inline types from api/client.ts) but it was never committed — only existed locally. CI caught this after Batch 2 of the Step B fixes added a local `import type { MemoryRecord } from '../types/api'` that turned the missing file into a hard failure. Contents: 225 lines of request/response interfaces for Simulation / Agent / Community / Thread / Settings / LLM endpoints. Zero imports from other files (pure type definitions), so landing this in isolation is safe. Unblocks the tsc -b stage of CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…per row
Closes a UX gap where /simulation was a flat list with no project context
and "New Simulation" dropped users into /setup with no project
pre-selected, forcing them to discover the requirement mid-form.
## Changes
- SimulationListPage header gains a project filter <select> (default
"All projects"). Selecting a project filters the list client-side and
changes the "New Simulation" button route from /setup to /setup/:pid.
- Each row renders the owning project inline below the sim name as
`{simulation_id} · {project_name}`. Orphan sims (unknown project_id)
render the id alone — no middle-dot, no deleted-project leak.
- Filtered empty state gets its own copy ("No simulations in this
project") and CTA ("Create in this project" → /setup/:pid).
- SimulationRun type gains optional `project_id?: string | null` so the
TypeScript compiler can see the field that was already on API responses.
- 11 new tests in SimulationListPage.test.tsx cover SL-AC-01 through
SL-AC-09 (default filter state, filter application, navigation routing
both branches, per-row project name, orphan fallback, filtered empty
state + CTA, projects query loading, projects query error fallback).
## SPEC
New section 18_FRONTEND_PERFORMANCE_SPEC.md §10 defines SL-01 through
SL-05 contracts plus SL-AC-01~09 acceptance criteria. The SPEC file
itself is .gitignore'd per the project's IP protection rule, so this
commit only carries the code that implements it.
## Non-goals
- No server-side filtering: apiClient.simulations.list() stays parameter-
less. Projects stay in the low-double-digits, sims in the low hundreds —
the client filter is faster than an extra round trip.
- No persistent filter state: no URL query param, no localStorage.
Ephemeral state keeps returning users from hitting stale filters.
- No changes to /projects or /projects/:id/scenarios pages.
## Verification
- npx tsc -b: 0 errors
- npx eslint src/pages/SimulationListPage.tsx: 0 errors, 0 warnings
- npx vitest run src/__tests__/SimulationListPage.test.tsx: 11/11 green
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the 201-line Apache-2.0 text with the standard MIT License, prefaced by a one-line @showjihyun tagline. README, shields, and pyproject/package manifests already reference MIT, so this resolves the prior LICENSE-vs-README mismatch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
README/CLAUDE/CONTRIBUTING test-count claims updated to the real numbers: 1,002 backend (was 961/981/1,234 in various places) and 656 frontend (was 521/609). CHANGELOG [Unreleased] gains the session's graph animation fix (UUID↔node_id translation), low-centrality propagation restore (influence floor + sigmoid smoothing deduped into propagation_calibration.py), startup deadlock hardening, dynamic community palette, and the LICENSE Apache-2.0→MIT switch. The existing [0.1.1.0] entry is untouched. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This is a large consolidation commit (106 files, +12,054/-900) that bundles
two logical layers of work that accumulated on the feat/graph-3d branch:
## Prior-session work (clean-architecture + feature-dev)
- Repositories layer: backend/app/repositories/ — simulation_repo, project_repo,
memory_repo, protocols, simulation_persistence split out of engine code so
the orchestrator no longer owns its own DB sessions
- Services layer: backend/app/services/ — simulation_service, community_opinion_service,
notification_service, ports — clean separation between HTTP handlers and
orchestrator/engine, with session lifecycle managed at the service boundary
- Conversation threads: thread_capture pipeline, ThreadMessageRow ORM,
22_CONVERSATION_THREAD_SPEC tests (test_22_conversation_threads.py)
- Expert LLM engine: richer expert evaluation with SLM fallback
(23_EXPERT_LLM_SPEC tests in test_23_expert_llm.py)
- Community opinion feature: community_opinion model, service, API,
CommunityOpinionPanel and pages, migration e1_community_opinion
- LLM cache observability: test_24_cache_observability covering the vcache hit
path + tier distribution
- Frontend component split-outs: DecidePanel, EmergentEventsPanel,
EliteLLMNarrativePanel, OverallOpinionPanel, FormProgressBanner,
SimilarityWarningBanner, WorkflowStepper, GraphLegend, ZoomTierBadge —
all extracted from their parent containers with matching test files
- ArchitectureInvariants.test.ts + communitySimilarity.test.ts structural guards
- test_21_simulation_quality split into p1/p2 files + test_21_memory_pgvector
- test_25_simulation_service + test_26_community_opinion service-layer tests
- Bigint random_seed migration (d1_bigint_random_seed) — seed column widened
so values outside int32 range don't overflow
## This session's fixes (documented in CHANGELOG Unreleased)
- **Propagation animation restored** — GraphPanel's active-link keys were
built from agent UUIDs while linkDirectionalParticles looked them up by
graph node_ids, so particles never drew. Translation now lives in
propagationAnimationUtils.ts (buildAgentIdToNodeId + buildActivePropLinks)
and is exercised by the same regression tests the component uses
- **Low-centrality agents propagate again** — the agent tick path used
InfluenceLayer which missed the Round 7-d floor + sigmoid emotion
smoothing that propagation_model.py already had. Both paths now call
the shared propagation_calibration.propagation_probability() so future
calibration tweaks live in one file
- **Startup deadlock on /api/v1/projects/** — lifespan split into two
short-lived transactions with SET LOCAL lock_timeout = '10s', and
metadata.create_all is skipped entirely when alembic_version is present
- **community_name in agent responses** — AgentDetailResponse gains a
community_name field, resolved via a cached community_uuid → cc.name
map (was O(N) graph walk per inspector click, now O(1))
- **Dynamic community palette in 3D graph** — palette is derived from
the live graph instead of the hardcoded A/B/C/D/E default, so real sims
with "mainstream"/"skeptics"/etc get real colors. Fallback color is
hashed from the community id for stability across re-fetches
- **Graph node labels** use the graph node_id (Agent #42) instead of the
first-8-chars of the agent UUID, which were identical for every
deterministic-seed agent
- **Graph overlay layout** — 3D Controls hint moved to bottom-right,
community legend raised 200px, GraphLegend gained a bottomOffsetPx prop
so the stacking stays coherent
- **Regression tests** — test_01_influence gains
test_round_7d_low_influence_agents_still_propagate and
test_round_7d_negative_emotion_factor_still_propagates; PropagationAnimation
test suite gains 5 tests that exercise the real utility functions
## Verification at commit time
- backend: test_01_influence (10/10), test_07_propagation_pairs (29/29),
test_04_community_orchestrator, and the agent/influence/propagation/project
filter (205/205) all green earlier in the session
- frontend: 656/656 pass (40 files) — fresh run at commit time
- tsc -b: clean on all touched files (pre-existing baseline errors in
communitySimilarity.test.ts and DecidePanel.tsx are not mine)
- Live end-to-end: fresh sim produces non-empty propagation_pairs with
agent UUIDs that resolve to valid node_ids in the network graph;
/api/v1/simulations/{id}/agents/{id} returns community_name: "Alpha"
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rtial CI's tsc -b step caught two errors I wrongly dismissed during the /ship self-review as "pre-existing baseline". They were pre-existing in the working tree, not in the tracked tree — my consolidation commit (3e3ba11) staged the test file that surfaced them, so the failing tsc only appeared after the push. CI run 24282435136 was the signal. Root cause: `CommunityConfigInput` declared `personality_profile` as required with all five traits, but: - The backend fills missing traits with 0.5 at agent generation (`orchestrator._trait()` in `app/engine/simulation/orchestrator.py`). - `src/api/client.ts:130,132` already documents the field as optional with a partial `Record<string, number>` shape. - `communitySimilarity.personalityVector()` defensively uses `c.personality_profile ?? {}` and falls back to 0.5 per trait. - The failing test `falls back to default 0.5 when personality_profile is missing` intentionally exercises the missing-profile path. The type was lying about the runtime contract. `?: Partial<...>` matches reality and surfaced two real latent bugs in `CommunityConfigurationSection.tsx` where the component read `community.personality_profile[key].toFixed(2)` without any null guard. Those would have thrown at runtime on any community that omitted traits. Verification: - `npx tsc -b` → 0 errors - `npx vitest run` → 656/656 pass (40 files) - `npx eslint` (touched files) → clean Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend CI run 24282569392 failed with 13 test failures + 58 errors,
all of the form:
OSError: [Errno 111] Connect call failed ('127.0.0.1', 5432)
Not a code regression — the workflow had no Postgres service. The
backend job ran `uv run pytest tests/` on a clean Ubuntu runner that
has nothing listening on 5432, so every test that ends up invoking
`TestClient` (which triggers `app.main.lifespan` → real DB session)
died at connection time. Locally I had `prophet-db-1` Docker container
serving on 5432, so tests passed; CI didn't.
Fix:
1. `services.db` — `pgvector/pgvector:pg16` image, not vanilla
`postgres:16`, because the lifespan runs
`CREATE EXTENSION IF NOT EXISTS vector` on startup. A vanilla
image would fail with "could not open extension control file
'vector.control'" and leave the app in a half-initialised state.
2. Health check with `pg_isready` so pytest waits for the service
container to be ready to accept connections. GitHub Actions
holds job execution until health checks pass.
3. `DATABASE_URL: postgresql+asyncpg://prophet:secret@localhost:5432/prophet`
at the job level — every step inherits it. Port 5432 matches the
service container mapping (local dev compose uses 5433 on the
host to dodge developer-machine Postgres conflicts).
Valkey and Ollama are not added — none of the failing tests hit
those services. LLM tests use the SLM stub path, and LLM cache tests
either mock Valkey or gracefully degrade when it's unavailable. If
a future test regression requires either, add them the same way.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend already ships llama3.2:1b in `backend/app/config.py:31`,
`.env.example`, and `docker-compose.yml` (Round 8-5). The frontend
constants, SettingsPage defaults, README/CONTRIBUTING setup
instructions, and two test fixtures still referenced the old
`gemma4:latest` default — a stale Round 7 choice that the backend
already reverted because 0.20.x Ollama has a CPU-inference regression
on the gemma runner.
This commit closes the frontend-side gap so every surface tells the
same story:
- `frontend/src/config/constants.ts:117` DEFAULT_OLLAMA_MODEL
- `frontend/src/pages/SettingsPage.tsx:41-43` useState defaults for
ollamaDefaultModel, slmModel, ollamaEmbedModel
- `frontend/src/__tests__/SettingsPage.test.tsx:24-26` mock response
- `frontend/src/__tests__/UIFlowSpec.test.tsx:303` FLOW-29 assertion
- `frontend/src/__tests__/EliteLLMNarrativePanel.test.tsx:68` mock
- `frontend/src/__tests__/OverallOpinionPanel.test.tsx:49,91` mocks
Doc sync so users don't pull the wrong model:
- `README.md` — Quick Start ("Pull LLM model") block: gemma4:latest
(~9.6 GB) → llama3.2:1b (~1.3 GB), plus the matching acknowledgment
in the Ollama credits section
- `CONTRIBUTING.md` — "Run it" bootstrap step
The historical comment in `backend/app/config.py:21` that mentions
Round 7 briefly switching to gemma4:latest is intentionally preserved
as documentation of the decision history.
Verification:
- npx vitest run on the 4 touched test files → 56/56 pass
- npx tsc -b → clean
- grep "gemma4" across md/yml/ts/tsx/py/toml → only the intentional
historical comment remains
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ling
Two related robustness fixes for `CommunityOpinionService`, both
discovered while watching real small-LLM runs produce malformed output
or two synthesis requests hit the DB at the exact same step.
## LLM response normalisation (the small-LLM hostile-output path)
Small models (1-3B params) frequently:
1. Echo the schema literal instead of picking a value — e.g. return
`"rising|stable|polarising|collapsing"` verbatim in
`sentiment_trend`, which blows past the `VARCHAR(32)` column.
2. Return a single string or bare object where the schema says
"list of objects", crashing the frontend `.map()` renderers.
3. Mix garbage into `dominant_emotions` — integers, nulls, or empty
strings interleaved with real emotion words.
The existing `_parse_response` only guarded `summary` and
`sentiment_trend`. The new `_normalise_themes`, `_normalise_divisions`,
and `_normalise_key_quotes` helpers each:
- Drop the whole element if the shape is wrong (not a dict, missing
the required key, wrong type) rather than coercing garbage in.
- Clamp numeric fields (`weight`, `share`) to [0, 1].
- `_clip_str` every string field to a column-safe max length.
- Default missing optional fields to 0 or `[]`.
The rationale: better to lose one bad theme than persist garbage that
then crashes the renderer downstream.
## Unique-violation race (sqlstate 23505)
`_persist_row_with_retry` already handled PostgreSQL deadlocks
(`sqlstate 40P01`) but not the other race it can hit: two concurrent
synthesis requests both miss the `_find_cached` lookup, both build a
row for `(sim, community, step)`, and the second one trips the
`uq_community_opinions_sim_comm_step` unique constraint.
Retrying a doomed insert doesn't help — the constraint will reject the
second attempt too. Fix: on 23505, roll back, re-query
`_find_cached`, and return the winner's row. The API caller still
gets a canonical `CommunityOpinionSnapshot`, just built from the other
writer's data.
Also: on non-deadlock `DBAPIError`, explicitly `await
session.rollback()` before re-raising. The previous path left the
session dirty for the caller, which surfaced as cascading "session
already closed" errors downstream.
Return type change: `_persist_row_with_retry` now returns the
`CommunityOpinion` row (either the one inserted or the race winner)
instead of `None`. Both call sites updated to use the returned row
when constructing the snapshot — otherwise a race win would return a
snapshot built from the aborted row.
## Test coverage
`test_26_community_opinion.py` gains a `TestResponseNormalisation`
class (+170 lines) covering:
- `_normalise_sentiment_trend` — happy cases, American spelling,
the classic schema-literal echo, unknown values, None, integers
- `_clip_str` — clipping, empty, None, non-string coercion
- `_parse_response` — end-to-end with a hostile small-LLM payload
that mixes all the failure modes
Locally: `uv run pytest tests/test_26_community_opinion.py -q` →
**42 passed in 21s** (against prophet-db-1 Docker Postgres).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…aint
Completes the race-handling fix in `ae19716`. Without a DB-level
unique constraint, two concurrent synthesis requests that both miss
`_find_cached` will both INSERT successfully and both pay for a real
Tier-3 LLM call — silently doubling cost and producing conflicting
synthesis rows for the same `(sim, community, step)`.
The service-layer handler added in `ae19716` catches sqlstate 23505
and re-fetches the winner's row, but that handler never fires without
a constraint that actually rejects the second INSERT. These two
commits are load-bearing for each other.
## Changes
- `backend/app/models/community_opinion.py` — declare
`UniqueConstraint("simulation_id", "community_id", "step",
name="uq_community_opinions_sim_comm_step")` in `__table_args__`.
Keeps SQLAlchemy's reflection/introspection in sync with the real
DB schema so `Base.metadata` matches Alembic.
- `backend/migrations/versions/e2_community_opinion_unique.py` —
new Alembic migration:
1. `DELETE FROM community_opinions a USING community_opinions b
WHERE a.created_at < b.created_at AND ...` — defensive cleanup
of any pre-existing duplicates. Sequential code couldn't produce
them, but a database that happened to catch a race pre-fix
might have some lying around.
2. `op.create_unique_constraint("uq_community_opinions_sim_comm_step",
"community_opinions", ["simulation_id", "community_id", "step"])`.
3. `down_revision: e1_community_opinion` so the migration chain
stays linear.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… tables
CI run 24282780787 got past the Postgres-connection fix (`c19cc21`)
and hit the next wall: every API test errored with
asyncpg.exceptions.UndefinedTableError:
relation "simulations" does not exist
Not a code regression, and not a schema drift — the app's schema
bootstrap never runs at all under the test transport.
## Root cause
API tests use
transport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as c:
...
That drives the ASGI app directly, bypassing FastAPI's lifespan. So
`app.main.lifespan` — which normally does
`CREATE EXTENSION vector` + `metadata.create_all` + the stale-sim
cleanup — never fires during the test session. On a dev laptop this
goes unnoticed because Docker Postgres already has the schema from a
previous live run or from `alembic upgrade head`. On a fresh CI
Postgres container, nothing has ever created the tables, and the
first insert dies on `UndefinedTableError`.
The existing `_clean_simulation_db` autouse fixture tries to
`TRUNCATE TABLE simulations CASCADE` but silently swallows the
`UndefinedTableError` because the fixture is also running on tests
that don't touch a DB at all. That broad `except Exception: pass`
meant the truncate was a no-op on the broken schema too, so nothing
surfaced the missing tables until the actual query attempted the
same table later in the test body.
## Fix
New `_bootstrap_schema` fixture in `conftest.py`:
- `scope="session"` — runs once for the whole pytest session.
- `autouse=True` — every test gets it, whether or not it touches
the DB. Pure-unit tests just pay one `CREATE EXTENSION` round
trip (negligible vs total suite cost).
- Imports `app.models` so every ORM class is registered on
`Base.metadata` before `create_all` walks the table list.
- Runs the exact same DDL the lifespan would have run:
`CREATE EXTENSION IF NOT EXISTS vector` + `uuid-ossp` +
`Base.metadata.create_all`.
- Swallows exceptions so tests without a reachable DB (pure
harness unit tests, CI-less laptop runs) aren't blocked.
## Verification
Locally (prophet-db-1 Docker Postgres), the previously-failing suites:
$ uv run pytest tests/test_06_api_acceptance.py \
tests/test_06_api_simulations.py \
tests/test_06_api_agents.py \
tests/test_06_api_communities.py \
tests/test_06_api_ws.py \
tests/test_network_graph.py -q
............................................................ [ 69%]
.......... [ 100%]
12 + 85 = 97 passed in ~220s
Those were 71 failures/errors in the previous CI run. With this
fixture the schema is present before any test runs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
showjihyun
added a commit
that referenced
this pull request
Apr 11, 2026
…is severed
Ran 6 pilots (UC1/UC2/UC3 baseline+reframed) against the post-R8-3 engine
via a new reusable harness at backend/scripts/run_use_case_pilot.py. All
3 README use cases failed to reproduce their quantitative claims:
| Case | README claim | Actual |
|-------------------------|--------------------|----------|
| uc1_baseline | stall at 12% | 97.3% |
| uc1_reframed | 31% | 97.4% |
| uc2_strategy_b | echo chamber | cascade |
| uc2_strategy_c | viral cascade | cascade |
| uc3_rto_raw | -38% eng sentiment | +0.70 |
| uc3_rto_restructured | -60% opposition | -0.3 pts |
Every pilot produced an identical step-by-step trajectory within a given
population size — controversy swung 0.80 to 0.15, utility 0.20 to 0.85,
and the final adoption rate moved by 0.002. That's the smoking gun: the
campaign framing inputs have zero effect on the simulation.
Root cause: CampaignConfig.{novelty,utility,controversy} are read into
CampaignEvent in step_runner.py and then dropped at the
_build_environment_events() boundary. The agent tick loop builds
MessageStrength from agent-derived values (media_signal,
cognition.evaluation_score) and a campaign_controversy method
parameter that defaults to 0.0 and is never set by any caller. The
entire R8-3 formula reformulation was mathematically correct but
operating on values that never come from the actual user inputs.
What this commit adds:
* backend/scripts/run_use_case_pilot.py — reusable pilot runner with
6 named cases, deterministic seeds, httpx-based API driver, and
JSON-output to docs/pilot_results/{case}.json
* docs/USE_CASE_PILOTS.md — full side-by-side of README claims vs
actual engine output, root cause writeup pointing at the exact
lines in step_runner.py + tick.py, and 5 proposed follow-up items
(wire fix, regression tests, re-calibration, LLM hardening, README
disclaimer)
* docs/pilot_results/*.json — raw per-case artifacts so the analysis
can be re-verified from the source data
The opinion synthesis plumbing from PR #2 held up perfectly — all 6
pilots got non-stub llama3.2:1b responses through the unique-constraint
+ shape-guarded persistence path. The small LLM hallucinated narratives
that matched the README (e.g. "rapid cascade in early_adopters stalls
against skeptic resistance") while the actual metrics showed every
community at 86-100% adoption. That's a separate hardening follow-up.
Next P1 task is the wire fix. Estimated: ~30 min CC, then a fresh pilot
round to verify. Regression tests in test_04_simulation_acceptance.py
will pin the outcome so this can't silently regress again.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
showjihyun
added a commit
that referenced
this pull request
Apr 11, 2026
* docs(pilots): verify README use cases end-to-end, find campaign wire is severed
Ran 6 pilots (UC1/UC2/UC3 baseline+reframed) against the post-R8-3 engine
via a new reusable harness at backend/scripts/run_use_case_pilot.py. All
3 README use cases failed to reproduce their quantitative claims:
| Case | README claim | Actual |
|-------------------------|--------------------|----------|
| uc1_baseline | stall at 12% | 97.3% |
| uc1_reframed | 31% | 97.4% |
| uc2_strategy_b | echo chamber | cascade |
| uc2_strategy_c | viral cascade | cascade |
| uc3_rto_raw | -38% eng sentiment | +0.70 |
| uc3_rto_restructured | -60% opposition | -0.3 pts |
Every pilot produced an identical step-by-step trajectory within a given
population size — controversy swung 0.80 to 0.15, utility 0.20 to 0.85,
and the final adoption rate moved by 0.002. That's the smoking gun: the
campaign framing inputs have zero effect on the simulation.
Root cause: CampaignConfig.{novelty,utility,controversy} are read into
CampaignEvent in step_runner.py and then dropped at the
_build_environment_events() boundary. The agent tick loop builds
MessageStrength from agent-derived values (media_signal,
cognition.evaluation_score) and a campaign_controversy method
parameter that defaults to 0.0 and is never set by any caller. The
entire R8-3 formula reformulation was mathematically correct but
operating on values that never come from the actual user inputs.
What this commit adds:
* backend/scripts/run_use_case_pilot.py — reusable pilot runner with
6 named cases, deterministic seeds, httpx-based API driver, and
JSON-output to docs/pilot_results/{case}.json
* docs/USE_CASE_PILOTS.md — full side-by-side of README claims vs
actual engine output, root cause writeup pointing at the exact
lines in step_runner.py + tick.py, and 5 proposed follow-up items
(wire fix, regression tests, re-calibration, LLM hardening, README
disclaimer)
* docs/pilot_results/*.json — raw per-case artifacts so the analysis
can be re-verified from the source data
The opinion synthesis plumbing from PR #2 held up perfectly — all 6
pilots got non-stub llama3.2:1b responses through the unique-constraint
+ shape-guarded persistence path. The small LLM hallucinated narratives
that matched the README (e.g. "rapid cascade in early_adopters stalls
against skeptic resistance") while the actual metrics showed every
community at 86-100% adoption. That's a separate hardening follow-up.
Next P1 task is the wire fix. Estimated: ~30 min CC, then a fresh pilot
round to verify. Regression tests in test_04_simulation_acceptance.py
will pin the outcome so this can't silently regress again.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(pilots): fix campaign framing wire + switch to GPU llama3.1:8b
The first pilot round in docs/USE_CASE_PILOTS.md found that every
Prophet simulation produced identical step-by-step trajectories
regardless of campaign framing — controversy=0.8 and controversy=0.2
both landed at final_adoption=0.973±0.001. Root cause (traced to
exact lines in the previous session): CampaignConfig.novelty and
.utility were read into CampaignEvent in step_runner.py and then
silently dropped before reaching the tick loop. Only .controversy was
forwarded, and it was forwarded as a method parameter that defaulted
to 0.0 and was never set by any caller. The entire campaign-framing
UI was effectively decoration.
This commit fixes the wire end-to-end across three layers, then
re-runs all six pilots on GPU to verify the fix.
## Wire fix (Round 8-6)
**1. community_orchestrator.py** — extract all three framing values
from the CampaignEvent and pass them into both AgentTick.tick() and
AgentTick.async_tick() alongside the existing campaign_controversy
forwarding.
**2. tick.py** — MessageStrength construction now blends:
novelty = 0.6 * campaign_novelty + 0.4 * media_signal
utility = 0.6 * campaign_utility + 0.4 * (evaluation_score / 2)
controversy = campaign_controversy (pure campaign — it's the
objective polarising-ness of the message, not an
agent-perception quantity)
The 0.6/0.4 weights were tuned so a controversy=0.8 to controversy=0.2
swing produces a ~0.42 point delta in raw score (before clamp),
which is enough to move adoption 20+ points on the early steps.
**3. cognition.py** — Tier-1 rule engine gained a campaign_bonus term:
bonus = 0.3 * (utility - 0.5) + 0.2 * (novelty - 0.5)
evaluation += bonus * 2.0
This is centered at 0 for neutral campaigns so prior fixtures stay
green, but shifts evaluation_score by ±0.25 on extreme framings —
enough to move the ADOPT decision threshold meaningfully. evaluate()
and evaluate_async() both take new campaign_novelty + campaign_utility
parameters and the Tier-3 LLM fallback path also threads them through.
## Regression test
test_04_step_runner.py::TestCampaignFramingAffectsOutcome runs two
sims with identical seeds + populations but opposite framings
(friendly: novelty=0.85, utility=0.85, controversy=0.15 vs
hostile: novelty=0.15, utility=0.15, controversy=0.85) and asserts:
abs(friendly.adoption_rate - hostile.adoption_rate) >= 0.02
friendly.adoption_rate > hostile.adoption_rate
Without the wire fix the delta is 0.0000 (bit-identical). With the
fix it's +0.1817 at step 4, which would have caught the regression
immediately.
## Post-fix pilot deltas
| Pair | Pre-fix step-0 delta | Post-fix step-0 delta | Post-fix final delta |
|------|:---:|:---:|:---:|
| UC1 baseline -> reframed | +0.000 | **+0.236** | +0.017 |
| UC2 Strategy B -> Strategy C | +0.000 | **+0.264** | +0.017 |
| UC3 raw -> restructured | +0.000 | **+0.147** | **+0.185** |
UC3 raw is the clearest win — the hostile RTO mandate now produces
zero viral_cascade events and ends at 74.5% adoption vs 93.1% for
the restructured version. That's a real stall pattern, not just a
faster trajectory. UC1/UC2 still saturate at ~97% because the
1030-agent population crosses cascade critical mass even with
hostile framing; a 5K-10K run at the same weights would likely
produce sharper stalls.
## GPU + model upgrade (Round 8-6 stack changes)
* Ollama moved to GPU mode via `docker-compose.gpu.yml` — RTX 4070
SUPER 12 GiB runs llama3.1:8b at ~75 tok/s (CPU mode was ~4-8
tok/s). Every agent tick + opinion synthesis now completes in
sub-second wall time.
* Default model upgraded from llama3.2:1b to llama3.1:8b across
config.py, .env.example, docker-compose.yml, frontend/config/
constants.ts and four test files. llama3.1:8b is large enough
to stay anchored to the provided numeric evidence in the
opinion-synthesis prompt; the 1B model hallucinated narratives
matching the README claims instead of the actual metrics.
* Opinion synthesis timeout reverted from 120s (CPU fallback) back
to 30s now that GPU inference finishes in ~1-2s.
* README + CLAUDE.md Quick Start section rewritten with GPU as the
recommended path and CPU-only as a documented fallback with the
env-var overrides to flip back to llama3.2:1b.
## Runner + artifacts
`backend/scripts/run_use_case_pilot.py` was retuned to use the
llama3.1:8b default. All six result blobs under
`docs/pilot_results/*.json` regenerated with post-fix trajectories.
`docs/USE_CASE_PILOTS.md` gained a "Post-fix results (Round 8-6)"
section with before/after tables and an updated follow-up list
(population scaling + campaign_bonus weight tuning for sharper
stalls + echo-chamber detector gap).
## Test + CI
* Backend: `uv run pytest tests/ -q` → **1029 passed, 2 skipped**
(+1 new regression test, no regressions across the suite)
* The new test_04_step_runner.py::TestCampaignFramingAffectsOutcome
is the guardrail for this fix — it would have caught the original
wire gap immediately.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
showjihyun
added a commit
that referenced
this pull request
Apr 13, 2026
…n, validation) Two-pass code review found 11 issues across 6 backend files: Critical: - #1 registry._call_adapter: wrap raw str→LLMPrompt before adapter.complete() - #2 persist_step retry: re-insert EmergentEvent rows on rollback retry - #8 deps.py singletons: add threading.Lock + double-checked locking - #9 load_steps: bound EmergentEvent query with step≤max + limit Important: - #3 MC endpoint: asyncio.wait_for(300s) + 504 on timeout - #4 settings PUT: str() coercion on Chinese LLM provider fields - #5 monte_carlo.py: remove fragile iscoroutine guard, plain await - #6 _config_to_dict: dataclasses.asdict for community serialization - #7 UUID parse: _safe_uuid try/except replaces len>8 heuristic - #10 persist_step retry: also re-insert agent_states + propagation_events - #11 settings PUT: str() coercion on Anthropic/OpenAI/Gemini fields too All 57 targeted tests pass (test_29 + test_06 + test_05). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
3D Graph Visualization
TanStack Query Migration
queries.ts) with 30+ hooksParallel Monte Carlo
API Error Handling Hardening
except HTTPExceptioncatches replaced with 404-only filtersEngine Improvements
DX & Docs
/simulationPre-Landing Review
12 issues found (5 critical, 7 informational) — all 5 critical issues fixed:
except HTTPExceptioncatches → filtered to 404 only (7 locations)Test plan
🤖 Generated with Claude Code
Documentation
Session doc audit (2026-04-11) — committed in
0d87478.Files updated
961|521→1002|656; Tech Stack tablepytest (961), Vitest (521)→pytest (1,002), Vitest (656); "What's working today"1,482+ automated tests (961 backend + 521 frontend)→1,658+ automated tests (1,002 backend + 656 frontend).1,658+ GREEN (Backend 1,002 + Frontend 656); backend/frontend run-output lines updated to the current numbers (40 files frontend, 2 skipped backend).1,658+.## [Unreleased]section (it was empty) with the session's Added/Changed/Fixed entries: shared diffusion calibration module,community_namefield, propagation utilities, LICENSE switch, dynamic community palette, agent label fix, graph overlay rearrangement, lifespan split, graph propagation animation fix, low-centrality propagation restore, deadlock root-cause narrowing,get_agentO(1) cache, regression test rewrite.[0.1.1.0]section left untouched.Not modified
ARCHITECTURE.md— doesn't exist in this repo.TODOS.md— doesn't exist in this repo.VERSION— stays at0.1.1.0. User chose to keep session fixes in[Unreleased]rather than bump or fold.[0.1.1.0],[0.1.0.0],[0.1.0]) — preserved verbatim per skill rules.Verified
npx vitest run— 656/656 pass (40 files).uv run pytest tests/— 1,002 pass, 2 skip.Ship Log (2026-04-11)
Consolidation commit
3e3ba11pushed on top of the existing branch. 106 files, +12,054/-900. This bundles two logical layers of work.Prior-session work (clean-architecture refactor + features)
backend/app/repositories/(simulation_repo, project_repo, memory_repo, protocols, simulation_persistence) andbackend/app/services/(simulation_service, community_opinion_service, ports). Session lifecycle is now owned at the service boundary instead of deep inside the engine.thread_capture.py,ThreadMessageRowORM,test_22_conversation_threads.pycovering the full capture → storage → API pipeline.23_EXPERT_LLM_SPECtests intest_23_expert_llm.py.community_opinionmodel + service + API + frontend panels + migratione1_community_opinion.test_24_cache_observabilitycovering vcache hit path + tier distribution.ArchitectureInvariants.test.ts+communitySimilarity.test.ts.test_21_simulation_qualitysplit into_p1+_p2+test_21_memory_pgvector; addedtest_25_simulation_service+test_26_community_opinion.d1_bigint_random_seedwidensrandom_seedto bigint so values outside int32 don't overflow.This session's fixes (2026-04-11)
🔧 Graph propagation animation restored. Particles weren't drawing during live sims because GraphPanel built
activePropLinksRefkeys from agent UUIDs whilelinkDirectionalParticleslooked them up by graph node_ids. Translation now lives inpropagationAnimationUtils.ts(buildAgentIdToNodeId,buildActivePropLinks) and the real utility is exercised by 5 regression tests — a copy in the test file would have passed silently while the component broke.🔧 Low-centrality agents propagate again.
InfluenceLayer(the path the agent tick uses) was missing the Round 7-d influence floor and sigmoid emotion smoothing thatPropagationModelalready had. Typical agents (influence ≈ 0.04–0.1 on small graphs, balancedef) were producing ~0.2% per-target probability → emptypropagation_pairsevery step. Both paths now callpropagation_calibration.propagation_probability()so there's exactly one file to edit when the calibration is tuned again. Two new regression tests (test_round_7d_low_influence_agents_still_propagate,test_round_7d_negative_emotion_factor_still_propagates) guard against silent drift.🔧 Startup deadlock on
GET /api/v1/projects/narrowed by splitting the lifespan into two short-lived transactions withSET LOCAL lock_timeout = '10s', and skippingmetadata.create_allentirely whenalembic_versionis present. Production boot never holds DDL locks on user tables.✨
community_namefield on agent detail.AgentDetailResponsegained astr | Nonecommunity_name, resolved via a cachedcommunity_uuid → cc.namemap inSimulationOrchestrator._community_name_map(). Was O(N) graph walk per inspector click, now O(1).🎨 Dynamic community palette in 3D graph. Derived from the live graph instead of the hardcoded A/B/C/D/E profile — sims with custom ids ("mainstream", "skeptics", etc.) get real colors. Fallback colors are hashed from the community id so "mainstream" always picks the same slot across re-fetches. Node labels use the graph node_id (
Agent #42) instead of the first-8-chars of the deterministic UUID (which was identical for every agent).AgentInspectorshows the resolved human name instead of a raw UUID.🎨 Graph overlay layout. 3D Controls hint moved to bottom-right, community legend raised 200px, full
GraphLegendoverlay gained abottomOffsetPxprop so it stacks above the controls hint without hardcoded magic numbers.📜 LICENSE: Apache-2.0 → MIT with a project tagline header. Commercial use, forking, embedding, and downstream redistribution all stay simple. (Committed earlier as
742e2c7.)Verification evidence
uv run pytest tests/npx vitest runnpx tsc -bcommunitySimilarity.test.ts+ 1 inDecidePanel.tsxare unrelated to session changes)propagation_pairs; every UUID resolves to a valid graph node_id;/agents/{id}returnscommunity_name: "Alpha"/api/v1/projects/Self-review pass applied
Full self-review ran with the code-review-excellence skill. All 🔴 blocking items and 🟡 important items were addressed:
get_agentcommunity lookup cachedmain.pydeadlock docstring softened (root cause not confirmed, only narrowed)LEFT_LEGEND_OFFSET_PXextractedAgentInspectorno font swapKnown gh CLI quirk
gh pr edit --body-filefails on this gh version withGraphQL: Projects (classic) is being deprecated(exit 1). Worked around withgh api PATCH repos/.../pulls/2 --input payload.json. Upgradeghto a post-May-2024 build to drop theprojectCardsGraphQL call.Ship Log (2026-04-11, session 2)
Four commits on top of the prior ship log bringing opinion synthesis + calibration to full E2E verification.
What landed
✨ Cross-community opinion synthesis (R8-2 extension). The R8-2 per-community endpoint already existed; this session added the cross-community aggregate.
POST /api/v1/simulations/{sim}/communities/__overall__/opinion-summarysynthesises each community as a side-effect then rolls up into a headline narrative. Returns{ overall, communities: [...] }in one round-trip. NewOverallOpinionPanelcomponent mounted onScenarioOpinionsPagewith per-community collapsible breakdown.🔧 Diffusion calibration strengthened (R8-3).
MessageStrength.scorewas0.4·u + 0.4·n − 0.4·c + 0.5— spread was 0.94 vs 0.58 (1.62×) and the worst case saturated at 0.10, which meant "stuck at 12%" scenarios couldn't emerge from campaign design alone. Reformulated to0.6·u + 0.5·n − 0.7·c + 0.3: spread now 0.86 vs 0.31 (2.77×) and the worst case saturates at 0.0, giving the propagation multiplier real headroom to stall. Docstring rewritten to match the actual math (the prior one contradicted the code). 5 parametric tests intest_01_schema.pypin the new coefficients.🔧 Ollama stack swapped for VRAM-friendly host (R8-4).
gemma4:latest(9.6 GB, multimodal) →llama3.2:1b(1.3 GB, text). User reported their PC VRAM wasn't enough for gemma4.latest(0.20.x) → pinned to0.11.10. The 0.20.x series has a llama-runner regression that crashes CPU inference on Ryzen 7500F with "llama runner process has terminated" and no stack trace. Pre-regression build runs cleanly.backend/app/config.py,backend/.env.example,docker-compose.yml,frontend/src/config/constants.ts,frontend/src/pages/SettingsPage.tsx, and three test files.🔒 Opinion cache race fix (review C1).
community_opinionshad non-unique indices but no UNIQUE constraint, so two concurrent requests for the same(sim_id, community_id, step)both missed_find_cachedand both paid for a real Tier-3 LLM call — the cache contract was advisory. Added migratione2_community_opinion_uniquewithUniqueConstraint("simulation_id", "community_id", "step")._persist_row_with_retrynow catchesIntegrityErrorwithsqlstate=23505, rolls back, and re-fetches the winner's row via_find_cachedso the loser's call returns the canonical existing row instead of retrying a doomed insert. ORM model carries the constraint for consistency. Two new tests (test_unique_violation_returns_winner_row,test_unique_violation_no_winner_row_propagates) cover the race path.🔒 LLM structured-output shape guards (review C2).
_parse_responsewas normalisingsentiment_trendand clippingsummarybutthemes,divisions,key_quoteswent to JSONB untouched. Small LLMs (llama3.2:1b especially) routinely return single strings, dicts, orNonewhere the schema says "list of objects", and the frontend.map()calls were crashing on render. Added three normaliser helpers (_normalise_themes,_normalise_divisions,_normalise_key_quotes) that drop any non-dict elements, require the key fields (theme,faction,agent_id+content), coerce numeric values with safe defaults, clamp strings to column limits, and clamp weight/share to[0, 1]. Seven new parametric tests cover garbage list elements, non-list inputs, missing fields, out-of-range values, and non-string concerns.🔧 Non-deadlock rollback gap (review I1).
_persist_row_with_retryraised on non-deadlock errors without rolling back, leaving the session in a dirty state the caller couldn't reuse. Now rolls back before re-raising.📜 Frontend default model drift (review C3). Backend moved to
llama3.2:1bbutconstants.ts(DEFAULT_OLLAMA_MODEL),SettingsPage.tsx(useState defaults), and two test files still hardcodedgemma4:latest. Aligned all five references.Verification evidence
uv run pytest tests/npx vitest runnpx tsc --noEmitllama3.2:1bresponse, no stub, real parsed JSONopinion_id, DB has exactly 1 row per(sim, community, step)Files changed this session
backend/migrations/versions/e2_community_opinion_unique.py(new)backend/app/models/community_opinion.py(+UniqueConstraint)backend/app/services/community_opinion_service.py(+OverallOpinionSnapshot, +build_overall_prompt, +3 shape normalisers, retry helper now instance method handling IntegrityError/23505, returns canonical row, unique sentinelOVERALL_COMMUNITY_ID)backend/app/api/communities.py(+/__overall__/opinion-summaryroute declared before the parameterised per-community route so FastAPI matches it first)backend/app/api/schemas.py(+OverallOpinionResponse)backend/app/api/deps.py(+get_llm_gateway, +get_community_opinion_service)backend/app/engine/agent/influence.py(R8-3 formula reformulation + docstring rewrite)backend/app/config.py,backend/.env.example,docker-compose.yml(model + Ollama image pin)backend/tests/test_26_community_opinion.py(+26 tests: shape guards, retry, unique-violation, cross-community)backend/tests/test_01_schema.py(+4 parametric MessageStrength tests for R8-3 coefficients)frontend/src/types/api.ts(+CommunityOpinion,OverallOpiniontypes)frontend/src/api/client.ts+queries.ts(+communityOpinionclient +useCommunityOpinionSynthesis+useOverallOpinionSynthesis)frontend/src/components/community/EliteLLMNarrativePanel.tsx+OverallOpinionPanel.tsx(new)frontend/src/pages/CommunityOpinionPage.tsx+ScenarioOpinionsPage.tsx(mount panels)frontend/src/__tests__/EliteLLMNarrativePanel.test.tsx+OverallOpinionPanel.test.tsx(new, 16 tests)frontend/src/config/constants.ts,frontend/src/pages/SettingsPage.tsx, and three test files (llama3.2:1b alignment)