
Task02f/control group and proxy refinement#6

Merged
MaksymDS merged 4 commits into master from task02f/control-group-and-proxy-refinement
Apr 27, 2026
@MaksymDS (Contributor)

No description provided.

MaksymDS and others added 4 commits April 27, 2026 14:04
All trades in the DB have outcome_index=1 (the NO token), so the raw
VWAP was storing the NO price. Fix: YES_price = 1 - VWAP.

Result: Brier 0.5484 → 0.1347, calibration bins now monotone.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
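
The inversion described in the commit above can be sketched as follows. This is a minimal illustration, not the repository's code: the `Trade` class, its field names, and the helpers `vwap`, `yes_price`, and `brier` are all hypothetical, and the sketch assumes a binary market where the YES and NO prices sum to 1.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    # Hypothetical record layout; the real DB schema is not shown in the PR.
    price: float        # price paid per share of the traded token
    size: float         # number of shares traded
    outcome_index: int  # 1 = NO token, as stated in the commit message


def vwap(trades):
    """Volume-weighted average price of the traded token."""
    total_size = sum(t.size for t in trades)
    if total_size == 0:
        raise ValueError("cannot compute VWAP with zero volume")
    return sum(t.price * t.size for t in trades) / total_size


def yes_price(trades):
    """Convert a VWAP over NO-token trades into an implied YES price."""
    if any(t.outcome_index != 1 for t in trades):
        raise ValueError("this sketch only handles NO-token trades")
    return 1.0 - vwap(trades)


def brier(forecasts, outcomes):
    """Mean squared error between YES probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


if __name__ == "__main__":
    trades = [Trade(0.80, 100, 1), Trade(0.90, 100, 1)]
    raw = vwap(trades)         # NO price, wrongly treated as a YES forecast
    fixed = yes_price(trades)  # implied YES price after the fix
    # For a market that resolved NO (outcome 0), the raw value scores far worse.
    print(brier([raw], [0]), brier([fixed], [0]))
```

Scoring the raw NO price as if it were a YES forecast penalizes exactly the markets the crowd priced correctly, which is consistent with the Brier score dropping once the price is inverted.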
Pilot (event_resolved, n=725): median=-0.084, %pos=15.2%
Control (unclassifiable, n=683): median=-0.043, %pos=21.4%

Mann-Whitney p=0.000001, r=0.132, CI [-0.066,-0.023] entirely negative.
Verdict: REVERSED SEPARATION. Pilot ILS is significantly LOWER than
the null baseline. Positive ILS in event_resolved markets is not
elevated above the unclassifiable baseline.

Root cause: the resolved_at-24h proxy is structurally better suited to
sports/behavioral markets (resolution IS the event) than to political/
regulatory markets (news precedes resolution by days).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
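
A pilot-versus-control comparison of this kind can be sketched in plain Python. The PR does not say how the test or the effect size r was computed, so everything here is an assumption: the function names are hypothetical, the p-value uses the normal approximation without a tie correction, and r is taken as |z| / sqrt(n1 + n2).

```python
import math


def midranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U for x, z, p, r). No tie or continuity correction is applied,
    so treat the p-value as approximate for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    r = abs(z) / math.sqrt(n1 + n2)       # assumed effect-size definition
    return u_x, z, p, r


if __name__ == "__main__":
    pilot = [-0.30, -0.12, -0.08, -0.05, 0.02]    # illustrative ILS values
    control = [-0.10, -0.04, -0.03, 0.01, 0.06]   # illustrative ILS values
    print(mann_whitney(pilot, control))
```

A negative z for the pilot sample means its ILS values tend to rank below the control's, which is the direction the commit reports as reversed separation.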
Key results on 725 pilot markets:
  24h: %pos=15.2%, median=-0.084, %|ILS|>1=13.9%
   6h: %pos=11.0%, median=-0.134, %|ILS|>1=19.3%
   2h: %pos=0.0%,  median=-0.332, %|ILS|>1=25.3%
   1h: %pos=0.0%,  median=-0.350, %|ILS|>1=27.1%

Spearman ρ(24h,1h)=0.542 — moderate proxy sensitivity.
Monotone increase 24h→1h: only 14/221 markets (6.3%).

Epstein cluster highly unstable: AOC 24h=+0.933 vs 6h=-4.241.
Interpretation: the news was released >24h before resolution, so the
final 6h captures post-disclosure price noise, not pre-event leakage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
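
The two stability checks in the commit above, rank correlation between windows and a per-market monotonicity test, can be sketched like this. The helper names are hypothetical, and the PR does not say whether "monotone increase" means strictly increasing; the sketch assumes strict.

```python
import math


def average_ranks(values):
    """1-based ranks with ties replaced by their average rank."""
    first_pos, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first_pos.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first_pos[v] + (count[v] - 1) / 2 for v in values]


def pearson(a, b):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rho: Pearson correlation of the rank-transformed inputs."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length inputs with at least 2 points")
    return pearson(average_ranks(a), average_ranks(b))


def is_monotone_increasing(ils_by_window):
    """True if ILS strictly rises across windows ordered 24h, 6h, 2h, 1h."""
    return all(x < y for x, y in zip(ils_by_window, ils_by_window[1:]))


if __name__ == "__main__":
    # Illustrative per-market ILS values, not data from the PR.
    ils_24h = [-0.08, 0.10, -0.30, 0.40, -0.02]
    ils_1h = [-0.35, -0.10, -0.50, 0.05, -0.40]
    print(spearman(ils_24h, ils_1h))
    print(is_monotone_increasing([-0.50, -0.20, 0.10, 0.30]))
```

A rho well below 1 between the 24h and 1h windows indicates that the choice of proxy window reorders markets, not just rescales their scores.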
…let analysis

Phase 3 of Task 02F: deep dive on the 3 Epstein markets (AOC, Sanders, Barak)
that showed highest ILS in the pilot (0.55–0.93).

Key findings:
- AOC/Sanders high ILS is a formula edge effect (p_open≥0.94, denominator≤0.06)
- Barak (17%→63%) shows genuine price discovery with anomalous Dec 20 crash
- One wallet (0x4bfb41d5, a veteran with 5115 markets) dominated all 3 from day 1
- 4 wallets appear in all 3 markets; none are newly created

Also adds TASK_02F_FINAL.md, which synthesizes all 3 phases. Phase 4
(LLM Tier 3) is gated on explicit user approval.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
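
The formula edge effect noted for AOC/Sanders can be illustrated with a small sketch. The PR does not state the ILS formula, so this assumes a form of (p_proxy - p_open) / (p_resolved - p_open), which is consistent with "denominator≤0.06" when p_open≥0.94 and the market resolves at 1. The function names and the `min_denominator` threshold are hypothetical.

```python
def ils(p_open, p_proxy, p_resolved):
    """Assumed ILS form: share of the open-to-resolution move made by the proxy time."""
    denominator = p_resolved - p_open
    if denominator == 0:
        raise ValueError("ILS is undefined when the price never had to move")
    return (p_proxy - p_open) / denominator


def is_edge_case(p_open, p_resolved, min_denominator=0.10):
    """Flag markets whose denominator is small enough to inflate ILS.

    min_denominator is an illustrative threshold, not a value from the PR.
    """
    return abs(p_resolved - p_open) < min_denominator


if __name__ == "__main__":
    move = 0.05  # the same absolute price move in both markets
    # Market that opened near certainty: tiny denominator, large ILS.
    print(ils(0.94, 0.94 + move, 1.0), is_edge_case(0.94, 1.0))
    # Market that opened far from resolution: same move, small ILS.
    print(ils(0.17, 0.17 + move, 1.0), is_edge_case(0.17, 1.0))
```

Under this assumed form, an identical 5-point move yields a much larger score in the market that opened near 0.94, so a high ILS there says little about information leakage.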
@MaksymDS MaksymDS merged commit 62243c3 into master Apr 27, 2026
1 check failed