fix(detectors): polarization threshold + cascade state leak + 18-step pilots #9
Merged
…18-step pilots

Round 8-8: two emergent-event detector issues that were silently making README claims about "polarization auto-detection" false.

## Polarization threshold: 0.4 → 0.05

The prior threshold (0.4 sentiment_variance) assumed belief variance on [-1, 1] would routinely reach 0.4, i.e. a strongly bimodal community (a full split, half at +1 and half at -1, gives variance 1.0). Actual pilot data shows community variance sits in the 0.03-0.07 range: the belief update dynamics don't push communities to full bimodality. With the old threshold, the polarization detector never fired in any of the 6 pilot scenarios. The new 0.05 threshold fires cleanly on hostile-framing pilots (UC1/UC2 at 0.065-0.067).

Updated test: test_no_polarization_below_threshold now tests 0.01 (well below 0.05). Added test_realistic_pilot_variance_fires_polarization (variance=0.065, which is the actual UC1 baseline value).

## CascadeDetector.reset() + slow_adoption guard leak

The orchestrator holds one StepRunner (and therefore one CascadeDetector) for its entire process lifetime. The _slow_adoption_fired one-shot guard was leaking across simulations: a slow_adoption event from simulation A would silently suppress the detector for simulation B unless B happened to recover above the threshold first.

Added CascadeDetector.reset() and called it from SimulationOrchestrator.create_simulation() so each sim starts with a clean detector state. Added test_reset_clears_slow_adoption_guard.

## echo_chamber_ratio: left at 10.0 (documented as non-firing)

Prophet's default NetworkGenerator (Watts-Strogatz + Barabasi-Albert with cross_community_prob=0.02) produces community internal/external edge ratios of 0.4-0.6: the preferential attachment creates MORE cross-community bridges than intra-community ties. The echo_chamber detector's ratio > 10 condition can never fire on this topology. The proper fix is a belief-isolation metric (low variance + large deviation from global mean), tracked as a follow-up. Documented in the CascadeConfig docstring and USE_CASE_PILOTS.md.

## 18-step pilot results

All 6 pilots were re-run at 18 steps to match the README's "step 18" reference. Key findings:

| Case | step 2 | step 12 | step 17 | sentiment |
|---|:---:|:---:|:---:|:---:|
| uc1_baseline (hostile) | 13.5% | 93.6% | 97.5% | +0.79 |
| uc1_reframed (friendly) | 79.5% | 98.6% | 99.4% | +0.92 |
| uc3_rto_raw (hostile) | 0.0% | 0.1% | 0.2% | **-0.32** |
| uc3_rto_restructured | 57.8% | 94.4% | 97.4% | +0.90 |

UC3 raw holds a steady stall through all 18 steps, with sentiment deepening to -0.32. UC1 baseline shows the "13% stall" at step 2 but cascades by step 12 (the 15% skeptic minority can't sustain resistance against 80% adopter-leaning agents).

slow_adoption now fires correctly (uc1 + uc2 hostile both register it).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
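To illustrate why a ratio > 10 condition cannot fire when internal/external ratios sit around 0.4-0.6, here is a minimal sketch. It assumes the ratio is defined as intra-community edges divided by edges crossing the community boundary; the function name `internal_external_ratio`, the edge-list representation, and the toy graph are hypothetical and are not taken from Prophet's code.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]

ECHO_CHAMBER_RATIO = 10.0  # threshold value named in the commit message


def internal_external_ratio(edges: Iterable[Edge], members: Set[Hashable]) -> float:
    """Ratio of edges inside a community to edges crossing its boundary.

    Assumption: this is how the ratio is defined; the real detector may differ.
    """
    internal = 0
    external = 0
    for u, v in edges:
        u_in, v_in = u in members, v in members
        if u_in and v_in:
            internal += 1      # both endpoints inside the community
        elif u_in or v_in:
            external += 1      # exactly one endpoint inside: a bridge
    if external == 0:
        # No bridges: fully isolated if it has any internal edges at all.
        return float("inf") if internal else 0.0
    return internal / external


# Toy graph: community {a, b, c} has 2 internal edges and 4 bridges.
edges = [
    ("a", "b"), ("b", "c"),                          # internal
    ("a", "x"), ("b", "y"), ("c", "x"), ("c", "z"),  # bridges
    ("x", "y"),                                      # outside the community
]
ratio = internal_external_ratio(edges, {"a", "b", "c"})
print(ratio)                       # 0.5
print(ratio > ECHO_CHAMBER_RATIO)  # False
```

With bridges outnumbering internal ties, the ratio stays below 1, far from the 10.0 cutoff, which is consistent with the detector being documented as non-firing on this topology.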
…nt util

- Add utils/sentiment.ts: sentimentTone, sentimentTextClass, formatDelta, deltaChangeType
- ScenarioOpinionsPage: replace hard-coded demo deltas with real prev-vs-current step deltas; clamp mean_belief to [-1,1]; inverted polarization changeType
- CommunityOpinionPage: wire useCommunityThreads API (preferred over step-derived synthetic); use per-community new_propagation_count for message_count fallback
- ConversationThreadPage: breadcrumb derived from routed communityId (not hard-coded "Alpha")
- All three pages use sentimentTextClass (no inline threshold ternaries)
- AgentInspector: widen drawer from w-80 (320px) to w-[370px] (+50px)
- Extend tests: 8 new contract tests (36 total across 3 files), all Green
- tsc -b: 0 errors, ESLint: 0 errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
showjihyun added a commit that referenced this pull request on Apr 13, 2026
…n, validation)

Two-pass code review found 11 issues across 6 backend files:

Critical:
- #1 registry._call_adapter: wrap raw str→LLMPrompt before adapter.complete()
- #2 persist_step retry: re-insert EmergentEvent rows on rollback retry
- #8 deps.py singletons: add threading.Lock + double-checked locking
- #9 load_steps: bound EmergentEvent query with step≤max + limit

Important:
- #3 MC endpoint: asyncio.wait_for(300s) + 504 on timeout
- #4 settings PUT: str() coercion on Chinese LLM provider fields
- #5 monte_carlo.py: remove fragile iscoroutine guard, plain await
- #6 _config_to_dict: dataclasses.asdict for community serialization
- #7 UUID parse: _safe_uuid try/except replaces len>8 heuristic
- #10 persist_step retry: also re-insert agent_states + propagation_events
- #11 settings PUT: str() coercion on Anthropic/OpenAI/Gemini fields too

All 57 targeted tests pass (test_29 + test_06 + test_05).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
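Two of the fixes listed above (#8 double-checked locking for singletons, #7 try/except UUID parsing) follow well-known patterns. The sketch below shows one plausible shape for each. The names `get_singleton` and `safe_uuid`, the module-level registry, and the factory argument are hypothetical; the actual code in deps.py and the real `_safe_uuid` helper may look different.

```python
import threading
import uuid
from typing import Callable, Dict, Optional

_lock = threading.Lock()
_instances: Dict[str, object] = {}  # hypothetical registry of shared instances


def get_singleton(key: str, factory: Callable[[], object]) -> object:
    """Return one shared instance per key, creating it at most once."""
    instance = _instances.get(key)          # first check, without the lock
    if instance is None:
        with _lock:
            instance = _instances.get(key)  # second check, under the lock
            if instance is None:
                instance = factory()
                _instances[key] = instance
    return instance


def safe_uuid(value: object) -> Optional[uuid.UUID]:
    """Parse a UUID, returning None instead of raising on malformed input."""
    try:
        return uuid.UUID(str(value))
    except ValueError:
        return None


# Usage: the factory runs once even when the getter is called repeatedly.
calls = []


def make_service() -> object:
    calls.append(1)
    return object()


first = get_singleton("service", make_service)
second = get_singleton("service", make_service)
print(first is second, len(calls))  # True 1

print(safe_uuid("not-a-uuid"))      # None
print(safe_uuid("12345678-1234-5678-1234-567812345678"))
```

The unlocked first check keeps the common path cheap once the instance exists, while the second check under the lock prevents two threads from both running the factory. Parsing with `uuid.UUID` and catching `ValueError` validates the full format, which a length check alone cannot do.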
Summary
Round 8-8 fixes two emergent-event detector issues that silently made
README claims about "polarization auto-detection" false, extends pilots
to 18 steps, and documents the MC runner status.
Polarization threshold: 0.4 → 0.05 — the old default never fired because
belief variance in real pilots sits at 0.03-0.07. Added two new unit tests.
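As a rough illustration of the threshold change, the sketch below compares the variance values quoted in this PR against the old and new thresholds. It assumes sentiment_variance is the population variance of per-agent beliefs on [-1, 1] and that the detector uses a strict greater-than comparison; both are assumptions, and `belief_variance` and `fires_polarization` are hypothetical names rather than Prophet's API.

```python
from statistics import pvariance
from typing import Sequence

POLARIZATION_THRESHOLD = 0.05  # new default from this PR (previously 0.4)


def belief_variance(beliefs: Sequence[float]) -> float:
    """Population variance of beliefs, each expected to lie in [-1, 1]."""
    return pvariance(beliefs)


def fires_polarization(variance: float, threshold: float = POLARIZATION_THRESHOLD) -> bool:
    # Assumption: strict ">" comparison; the real detector may use ">=".
    return variance > threshold


# A fully split community (half at +1, half at -1) has variance 1.0,
# far above what the pilots actually produce (0.03-0.07).
full_split = belief_variance([1.0, -1.0, 1.0, -1.0])
print(full_split)  # 1.0

print(fires_polarization(0.065))                 # True  (UC1 baseline value)
print(fires_polarization(0.01))                  # False (value used in the updated test)
print(fires_polarization(0.065, threshold=0.4))  # False (old threshold never fired)
```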
CascadeDetector.reset(): the _slow_adoption_fired one-shot guard leaked across simulations because the orchestrator held a singleton detector. Added reset(), called on create_simulation(), plus a test.
18-step pilots: README cites "step 18" for UC1's stall. UC3 raw confirmed: 0.2% adoption + sentiment -0.32 at step 17, a complete sustained stall. UC1 baseline hits 13.5% at step 2 (matches README) but cascades to 97.5% by step 17.
MC smoke test: the run-all endpoint works (81.6% adoption, 2 emergent events). The MonteCarloRunner class at monte_carlo.py is dead code (zero callers); either remove or wire up in a future PR.
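The guard leak fixed by CascadeDetector.reset() can be reproduced with a small stand-in class. This is a sketch under stated assumptions, not the project's detector: `SlowAdoptionGuard`, its `check` method, and the 0.15 threshold are hypothetical, and only the `_slow_adoption_fired` flag and `reset()` names come from the PR.

```python
from typing import Optional


class SlowAdoptionGuard:
    """Hypothetical stand-in for the one-shot guard inside CascadeDetector."""

    def __init__(self, threshold: float = 0.15) -> None:
        self.threshold = threshold          # illustrative value only
        self._slow_adoption_fired = False

    def check(self, adoption_rate: float) -> Optional[str]:
        if adoption_rate >= self.threshold:
            # Recovery above the threshold re-arms the guard.
            self._slow_adoption_fired = False
            return None
        if self._slow_adoption_fired:
            return None                     # already reported during this stall
        self._slow_adoption_fired = True
        return "slow_adoption"

    def reset(self) -> None:
        """Clear per-simulation state so a new run starts clean."""
        self._slow_adoption_fired = False


detector = SlowAdoptionGuard()              # one instance shared across simulations

# Simulation A stalls: the event fires once, then is suppressed.
sim_a = [detector.check(0.02), detector.check(0.03)]

# Simulation B reuses the detector without a reset: the event is lost.
sim_b_leaky = detector.check(0.01)

# Calling reset() when a simulation is created restores the event.
detector.reset()
sim_b_clean = detector.check(0.01)

print(sim_a)        # ['slow_adoption', None]
print(sim_b_leaky)  # None
print(sim_b_clean)  # slow_adoption
```

Resetting at simulation creation keeps the one-shot behaviour within a run while removing the dependency on whatever the previous run left behind.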
Test plan
🤖 Generated with Claude Code