fix(detectors): polarization threshold + cascade state leak + 18-step pilots #9

Merged
showjihyun merged 2 commits into main from feat/pilot-hardening on Apr 12, 2026

@showjihyun (Owner)

Summary

Round 8-8 fixes two emergent-event detector issues that silently made
README claims about "polarization auto-detection" false, extends pilots
to 18 steps, and documents the MC runner status.

Polarization threshold: 0.4 → 0.05 — the old default never fired because
belief variance in real pilots sits at 0.03-0.07. Added two new unit tests.

CascadeDetector.reset() — the _slow_adoption_fired one-shot guard leaked
across simulations because the orchestrator held a singleton detector. Added
reset() called on create_simulation() + test.

18-step pilots — README cites "step 18" for UC1's stall. UC3 raw confirmed:
0.2% adoption + sentiment -0.32 at step 17, complete sustained stall. UC1 baseline
hits 13.5% at step 2 (matches README) but cascades to 97.5% by step 17.

MC smoke test: the run-all endpoint works (81.6% adoption, 2 emergent events).
The MonteCarloRunner class in monte_carlo.py is dead code (zero callers); either
remove it or wire it up in a future PR.

Test plan

  • Backend: 1031 passed, 2 skipped (+2 new cascade detector tests)
  • 6 pilots at 18 steps, all non-stub
  • Polarization detector now fires on realistic variance values
  • slow_adoption fires correctly on UC1/UC2 hostile pilots

🤖 Generated with Claude Code

showjihyun and others added 2 commits April 12, 2026 13:24
…18-step pilots

Round 8-8: two emergent-event detector issues that were silently
making README claims about "polarization auto-detection" false.

## Polarization threshold: 0.4 → 0.05

The prior threshold (0.4 sentiment_variance) assumed belief variance
on [-1, 1] would routinely reach 0.4 (half the community at +1, half
at -1). Actual pilot data shows community variance sits in the
0.03-0.07 range — the belief update dynamics don't push communities
to full bimodality. With the old threshold, the polarization detector
never fired in any of the 6 pilot scenarios. The new 0.05 threshold
fires cleanly on hostile-framing pilots (UC1/UC2 at 0.065-0.067).

Updated test: test_no_polarization_below_threshold now tests 0.01
(well below 0.05). Added test_realistic_pilot_variance_fires_polarization
(variance=0.065, which is the actual UC1 baseline value).
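
As a minimal sketch of why the old default stayed silent, assume the detector compares the population variance of a community's beliefs against the threshold. The function name `polarization_fires`, the strict `>` comparison, and the sample beliefs below are hypothetical and only illustrate the arithmetic; they are not taken from Prophet's code.

```python
from statistics import pvariance


def polarization_fires(beliefs: list[float], threshold: float) -> bool:
    """Hypothetical check: fire when belief variance exceeds the threshold."""
    return pvariance(beliefs) > threshold


# Illustrative community: two mild camps 0.255 either side of a 0.5 mean,
# so the variance is 0.255**2, roughly 0.065 (the range reported for UC1/UC2).
beliefs = [0.755] * 50 + [0.245] * 50

old_fires = polarization_fires(beliefs, threshold=0.4)   # old default
new_fires = polarization_fires(beliefs, threshold=0.05)  # new default
print(old_fires, new_fires)
```

With variance near 0.065, the check stays silent at 0.4 and fires at 0.05, which is the behavior the two unit tests above are described as covering.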

## CascadeDetector.reset() + slow_adoption guard leak

The orchestrator holds one StepRunner (and therefore one CascadeDetector)
for its entire process lifetime. The _slow_adoption_fired one-shot guard
was leaking across simulations: a slow_adoption event from simulation A
would silently suppress the detector for simulation B unless B happened
to recover above the threshold first.

Added CascadeDetector.reset() and called it from
SimulationOrchestrator.create_simulation() so each sim starts with a
clean detector state. Added test_reset_clears_slow_adoption_guard.
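
The leak can be sketched as follows. This is an illustration under assumptions, not the actual CascadeDetector: the class name `CascadeDetectorSketch`, the method `check_slow_adoption`, the threshold value, and the re-arm rule (the guard clears once adoption recovers above the threshold) are all hypothetical stand-ins for the behavior described above.

```python
class CascadeDetectorSketch:
    """Hypothetical one-shot slow_adoption guard with an explicit reset()."""

    def __init__(self, slow_adoption_threshold: float = 0.05) -> None:
        self.slow_adoption_threshold = slow_adoption_threshold
        self._slow_adoption_fired = False

    def check_slow_adoption(self, adoption_rate: float) -> bool:
        """Return True only the first time adoption is below the threshold."""
        if adoption_rate >= self.slow_adoption_threshold:
            # Assumed re-arm rule: recovering above the threshold clears the guard.
            self._slow_adoption_fired = False
            return False
        if self._slow_adoption_fired:
            return False  # already reported; stay quiet
        self._slow_adoption_fired = True
        return True

    def reset(self) -> None:
        """Clear per-simulation state so a shared detector starts clean."""
        self._slow_adoption_fired = False


detector = CascadeDetectorSketch()

# Simulation A: the event fires once, then the guard suppresses repeats.
sim_a_first = detector.check_slow_adoption(0.01)
sim_a_second = detector.check_slow_adoption(0.01)

# Simulation B on the same instance, without reset(): silently suppressed.
sim_b_leaked = detector.check_slow_adoption(0.02)

# Simulation B after reset(): the event fires as expected.
detector.reset()
sim_b_clean = detector.check_slow_adoption(0.02)
```

Calling `reset()` at simulation creation keeps the singleton detector while scoping its one-shot state to a single run.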

## echo_chamber_ratio: left at 10.0 (documented as non-firing)

Prophet's default NetworkGenerator (Watts-Strogatz + Barabasi-Albert
with cross_community_prob=0.02) produces community internal/external
edge ratios of 0.4-0.6 — the preferential attachment creates MORE
cross-community bridges than intra-community ties. The echo_chamber
detector's ratio > 10 condition can never fire on this topology.
The proper fix is a belief-isolation metric (low variance + large
deviation from global mean) — tracked as a follow-up. Documented in
the CascadeConfig docstring and USE_CASE_PILOTS.md.
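
To make the ratio concrete, here is a small sketch that counts internal versus external edges for one community. How Prophet actually counts edges is not shown in this PR, so the function `internal_external_ratio`, the edge-list representation, and the toy graph are assumptions for illustration only.

```python
def internal_external_ratio(
    edges: list[tuple[str, str]],
    community_of: dict[str, str],
    community: str,
) -> float:
    """Hypothetical ratio: edges inside the community / edges leaving it."""
    internal = 0
    external = 0
    for u, v in edges:
        u_in = community_of[u] == community
        v_in = community_of[v] == community
        if u_in and v_in:
            internal += 1
        elif u_in or v_in:
            external += 1
    # No outgoing edges means a perfectly closed community.
    return float("inf") if external == 0 else internal / external


# Toy graph: community "A" has 2 internal edges and 4 bridges to "B".
community_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
edges = [
    ("a1", "a2"), ("a2", "a3"),                               # inside A
    ("a1", "b1"), ("a2", "b1"), ("a3", "b2"), ("a1", "b2"),   # A-to-B bridges
    ("b1", "b2"),                                             # inside B
]

ratio_a = internal_external_ratio(edges, community_of, "A")
echo_chamber_fires = ratio_a > 10.0
```

A ratio in the 0.4-0.6 range is more than an order of magnitude short of the 10.0 condition, which is why the detector is documented as non-firing on this topology.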

## 18-step pilot results

All 6 pilots were re-run at 18 steps to match the README's "step 18"
reference. Key findings:

| Case | step 2 | step 12 | step 17 | sentiment |
|---|:---:|:---:|:---:|:---:|
| uc1_baseline (hostile) | 13.5% | 93.6% | 97.5% | +0.79 |
| uc1_reframed (friendly) | 79.5% | 98.6% | 99.4% | +0.92 |
| uc3_rto_raw (hostile) | 0.0% | 0.1% | 0.2% | **-0.32** |
| uc3_rto_restructured | 57.8% | 94.4% | 97.4% | +0.90 |

UC3 raw holds a steady stall through all 18 steps, with sentiment
deepening to -0.32. UC1 baseline shows the "13% stall" at step 2
but cascades by step 12 (the 15% skeptic minority can't sustain
resistance against 80% adopter-leaning agents). slow_adoption now
fires correctly (the uc1 and uc2 hostile pilots both register it).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nt util

- Add utils/sentiment.ts: sentimentTone, sentimentTextClass, formatDelta, deltaChangeType
- ScenarioOpinionsPage: replace hard-coded demo deltas with real prev-vs-current step deltas;
  clamp mean_belief to [-1,1]; inverted polarization changeType
- CommunityOpinionPage: wire useCommunityThreads API (preferred over step-derived synthetic);
  use per-community new_propagation_count for message_count fallback
- ConversationThreadPage: breadcrumb derived from routed communityId (not hard-coded "Alpha")
- All three pages use sentimentTextClass (no inline threshold ternaries)
- AgentInspector: widen drawer from w-80 (320px) to w-[370px] (+50px)
- Extend tests: 8 new contract tests (36 total across 3 files), all Green
- tsc -b: 0 errors, ESLint: 0 errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
showjihyun merged commit c6ba97f into main on Apr 12, 2026
2 checks passed
showjihyun deleted the feat/pilot-hardening branch on April 12, 2026 04:51
showjihyun added a commit that referenced this pull request Apr 13, 2026
…n, validation)

Two-pass code review found 11 issues across 6 backend files:

Critical:
- #1  registry._call_adapter: wrap raw str→LLMPrompt before adapter.complete()
- #2  persist_step retry: re-insert EmergentEvent rows on rollback retry
- #8  deps.py singletons: add threading.Lock + double-checked locking
- #9  load_steps: bound EmergentEvent query with step≤max + limit
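
Item #8 refers to double-checked locking for lazily built singletons. A minimal sketch of that pattern follows; `OrchestratorSketch` and `get_orchestrator` are hypothetical names, and the real deps.py may be structured differently.

```python
import threading


class OrchestratorSketch:
    """Hypothetical stand-in for an expensive singleton dependency."""

    instances_built = 0

    def __init__(self) -> None:
        OrchestratorSketch.instances_built += 1


_orchestrator = None
_orchestrator_lock = threading.Lock()


def get_orchestrator() -> OrchestratorSketch:
    """Return the shared instance, building it at most once across threads."""
    global _orchestrator
    if _orchestrator is None:              # first check: skip the lock when built
        with _orchestrator_lock:
            if _orchestrator is None:      # second check: another thread may have won
                _orchestrator = OrchestratorSketch()
    return _orchestrator
```

The unlocked first check keeps the common path cheap, while the second check under the lock prevents two threads that both saw `None` from each constructing an instance.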

Important:
- #3  MC endpoint: asyncio.wait_for(300s) + 504 on timeout
- #4  settings PUT: str() coercion on Chinese LLM provider fields
- #5  monte_carlo.py: remove fragile iscoroutine guard, plain await
- #6  _config_to_dict: dataclasses.asdict for community serialization
- #7  UUID parse: _safe_uuid try/except replaces len>8 heuristic
- #10 persist_step retry: also re-insert agent_states + propagation_events
- #11 settings PUT: str() coercion on Anthropic/OpenAI/Gemini fields too
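
Item #3 pairs `asyncio.wait_for` with a 504 response. The sketch below shows the general shape without any web framework; `run_mc_with_timeout`, `fake_mc_run`, and the returned payloads are hypothetical, and the real endpoint presumably raises its framework's HTTP error instead of returning a tuple.

```python
import asyncio


async def fake_mc_run(delay_s: float) -> dict:
    """Hypothetical stand-in for a long-running Monte Carlo job."""
    await asyncio.sleep(delay_s)
    return {"status": "done"}


async def run_mc_with_timeout(job, timeout_s: float = 300.0) -> tuple[int, dict]:
    """Await the job, mapping a timeout to an HTTP-style 504 status."""
    try:
        result = await asyncio.wait_for(job, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the job before raising, so nothing keeps running.
        return 504, {"detail": "Monte Carlo run timed out"}
    return 200, result


timed_out = asyncio.run(run_mc_with_timeout(fake_mc_run(0.5), timeout_s=0.01))
completed = asyncio.run(run_mc_with_timeout(fake_mc_run(0.0), timeout_s=1.0))
```

Bounding the await means a stuck run surfaces to the client as a gateway timeout rather than holding the request open indefinitely.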

All 57 targeted tests pass (test_29 + test_06 + test_05).
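
For item #7, a try/except parse is more reliable than a length heuristic because any string longer than 8 characters would pass a `len > 8` test. A minimal sketch, assuming the helper returns `None` on failure (the name `safe_uuid_sketch` and that return convention are assumptions, not the actual `_safe_uuid`):

```python
import uuid
from typing import Optional


def safe_uuid_sketch(value: object) -> Optional[uuid.UUID]:
    """Return a UUID if the value parses as one, otherwise None."""
    try:
        return uuid.UUID(value)  # type: ignore[arg-type]
    except (ValueError, AttributeError, TypeError):
        # ValueError: malformed string; the others: non-string input.
        return None


valid = safe_uuid_sketch("12345678-1234-5678-1234-567812345678")
too_long_but_invalid = safe_uuid_sketch("not-a-uuid-at-all")  # len > 8, still invalid
missing = safe_uuid_sketch(None)
```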

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>