Task 02D: price reconstruction, UMA fix, ILS scoring, Phase 0 fixture#4
Merged
Runs SubgraphCollector + ClobCollector for all 24 markets across the documented case set (fficd-001 through fficd-008), regardless of volume threshold. Market IDs are resolved by prefix LIKE lookup so real condition IDs are used at runtime.

Features: idempotency skip (trades exist + resolved >24h), "bad indexers" fast-fail, 4h runtime cap, append-only JSONL log, markdown status report.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…des, 0 errors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…des, 0 errors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…des, 3 indexer skips Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…des, rate 3.3 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…des, 3.5 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ades, 3.7 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ades, 3.7 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ades, 4.2 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…es, 5.6 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…des, 7.8 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rades, 9.7 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rades, 11.1 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rades, 13.9 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rades, 15.7 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rades, 19.6 m/min Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rades, 23 m/min, ETA ~1.8h Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…des, 26.4 m/min, ~31min left Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- scripts/make_foresightflow_fixture.py: reusable JSONL fixture generator for coordination experiment. Phase 0 (50 markets) and Phase 1A (2000). Primary baselineMidPrice from CLOB prices table; --allow-trade-vwap flag enables VWAP fallback from trades table >24h before resolution. 6-category mapping: keyword → fflow taxonomy → fallback.
- TASK_02C_RESULTS.md: Phase 3B marked COMPLETE (17.9M trades, 796K wallets, 10,410 markets, 11 bad-indexer skips). Phase 3C marked READY TO RUN. Data collection table updated with final subgraph count.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Etherscan deprecated the V1 Polygonscan endpoint (api.polygonscan.com/api). V2 is at api.etherscan.io/v2/api with chainid=137 injected per-request.

- config.py: default polygonscan_url → https://api.etherscan.io/v2/api
- polygonscan.py: _get() prepends chainid=137 to every request
- .env: FFLOW_POLYGONSCAN_URL updated (was api.polygonscan.com)

BLOCKER: local DNS resolver returns NXDOMAIN for api.etherscan.io.
Workaround (requires user sudo):
echo "23.92.68.154 api.etherscan.io" | sudo tee -a /etc/hosts
Or change system DNS to 8.8.8.8 in Network Preferences. API confirmed working via resolved IP.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
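A minimal sketch of the per-request chainid injection described in this commit. The helper name `build_v2_url` and the example query parameters are illustrative assumptions, not the actual `_get()` implementation in polygonscan.py.

```python
from urllib.parse import urlencode

ETHERSCAN_V2_URL = "https://api.etherscan.io/v2/api"
POLYGON_CHAIN_ID = 137


def build_v2_url(params: dict, base_url: str = ETHERSCAN_V2_URL) -> str:
    """Return a V2 request URL with chainid placed ahead of the caller's params.

    Hypothetical helper: the real _get() may build its request differently.
    """
    # Drop any caller-supplied chainid so the Polygon value always wins.
    caller_params = {k: v for k, v in params.items() if k != "chainid"}
    # dicts preserve insertion order, so chainid is emitted first.
    merged = {"chainid": POLYGON_CHAIN_ID, **caller_params}
    return f"{base_url}?{urlencode(merged)}"


url = build_v2_url({"module": "account", "action": "txlist", "address": "0xabc"})
print(url)
```

Keeping the chain ID in one place means callers written against the V1 endpoint do not need to change their parameter dictionaries.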
polygonscan:
- run() and _get_stale_wallets() gain min_trades param; queries trades table via JOIN to select only wallets with >= N trades (ordered by trade count DESC, most active first)
- Progress log every 100 wallets during batch (polygonscan_batch_progress)
- CLI: --min-trades flag wired through

With --min-trades 100: 11,393 wallets (~6.3h) vs 796K full set (440h+)

make_foresightflow_fixture.py:
- Progress log every 500 candidates scanned
- os.makedirs for output directory if needed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
scripts/diagnose_clob_coverage.py: 7-step diagnostic covering:
1. Basic prices table stats (409 markets, 1.12M rows, Apr 13-26 window)
2. ILS-target coverage: 3/11,263 markets (0.0%) — confirms gap
3. FFICD validation set: 0/24 markets have CLOB prices
4-5. data_collection_runs analysis: 727 runs = 409 distinct markets,
all from April 2026 open-market monitoring pilot
6. Trade VWAP feasibility: 100% of 17.9M trades have valid 0-1 price
7. Recommendation: trade VWAP unblocks ILS now; CLOB batch is Option A
reports/TASK_02C_CLOB_DIAGNOSTICS.md: generated output
Root cause of TASK_02C_RESULTS.md contradiction:
- 727 CLOB runs / 1.55M rows → open market monitoring pilot
- 0% ILS coverage → CLOB never ran for historical resolved markets
Both statements were correct about different market sets.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…line

fflow/scoring/price_series.py (new):
- reconstruct_price_series(): CLOB first, trade VWAP fallback with 1-min bucketing, forward-fill gaps. Returns DataFrame with source col.
- get_price_at(): CLOB→trade VWAP two-tier lookup, ±5min tolerance.

fflow/scoring/pipeline.py:
- compute_market_label() gains price_source='auto' param. 'auto' = CLOB first, trade VWAP fallback; 'clob' = CLOB only; 'trade_vwap' = force trades. Stores actual source in label row.

fflow/models.py: MarketLabel.price_source TEXT column added.
alembic/versions/0003_price_source.py: Migration 0002→0003, applied to DB.
tests/test_price_series.py: 9 tests, all pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
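The CLOB-first, trade-VWAP-fallback reconstruction can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in fflow/scoring/price_series.py: the function names, the tuple layouts, the use of epoch seconds, and the all-or-nothing choice of source per market are all hypothetical, and the real function returns a DataFrame.

```python
from collections import defaultdict

BUCKET = 60  # one-minute buckets, in seconds


def vwap_by_minute(trades):
    """trades: iterable of (epoch_seconds, price, size). Returns {bucket_start: vwap}."""
    notional = defaultdict(float)
    volume = defaultdict(float)
    for ts, price, size in trades:
        bucket = ts - ts % BUCKET
        notional[bucket] += price * size
        volume[bucket] += size
    return {b: notional[b] / volume[b] for b in volume if volume[b] > 0}


def reconstruct(clob, trades, start, end):
    """Return [(bucket_start, price, source)] for each minute in [start, end).

    clob: list of (bucket_start, price), assumed already minute-aligned.
    Assumption: trades are used only when no CLOB prices exist at all.
    """
    if clob:
        observed, source = dict(clob), "clob"
    else:
        observed, source = vwap_by_minute(trades), "trade_vwap"
    series, last = [], None
    for bucket in range(start - start % BUCKET, end, BUCKET):
        if bucket in observed:
            last = observed[bucket]
        if last is not None:  # forward-fill; leading minutes with no price are skipped
            series.append((bucket, last, source))
    return series


trades = [(0, 0.40, 10), (30, 0.60, 30), (130, 0.70, 5)]
series = reconstruct([], trades, 0, 240)
```

Tagging each row with its source makes it possible to store the source that was actually used alongside the label, as the `price_source` column does.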
…fallback

- Key in URL path via _uma_subgraph_url() (not just Authorization header)
- On subgraph failure, fall back to eth_getLogs on UMA OOv2 at 1rpc.io/matic
- ABI-decode Settle event non-indexed data (bytes ancillaryData, int256 resolvedPrice)
- Default FFLOW_POLYGON_RPC_URL changed from polygon-rpc.com to 1rpc.io/matic
- Add RetryableHTTPClient.post() method

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
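For readers unfamiliar with ABI decoding, here is a minimal sketch of decoding a `(bytes, int256)` tuple by hand, following the standard Solidity ABI head/tail layout. It assumes the log data contains exactly those two fields in that order, as the commit message lists them; the real Settle log may carry further non-indexed fields, which would shift the head offsets. Both function names are hypothetical.

```python
def decode_bytes_int256(data: bytes):
    """Decode ABI-encoded (bytes, int256) and return (payload, value)."""
    # Head: word 0 is the byte offset of the dynamic `bytes` tail,
    # word 1 is the int256 in two's complement.
    offset = int.from_bytes(data[0:32], "big")
    value = int.from_bytes(data[32:64], "big", signed=True)
    # Tail: a length word followed by the payload, right-padded to 32 bytes.
    length = int.from_bytes(data[offset:offset + 32], "big")
    payload = data[offset + 32:offset + 32 + length]
    if len(payload) != length:
        raise ValueError("truncated ABI payload")
    return payload, value


def encode_bytes_int256(payload: bytes, value: int) -> bytes:
    """Inverse helper, used here only to build a test vector."""
    padded = payload + b"\x00" * (-len(payload) % 32)
    return (
        (64).to_bytes(32, "big")                  # offset of the tail
        + value.to_bytes(32, "big", signed=True)  # int256
        + len(payload).to_bytes(32, "big")        # bytes length
        + padded
    )


raw = encode_bytes_int256(b"q: title", 10**18)
ancillary, price = decode_bytes_int256(raw)
```

Reading the integer with `signed=True` matters because the field is an `int256`, so negative values are encoded in two's complement.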
- `fflow news tier1-batch [--limit N]`: bulk Tier 1 for all markets with resolution_evidence_url but no existing news_timestamps row
- `fflow news seed-proxy [--market-ids ...] [--category ...] [--offset-days N]`: seed synthetic T_news from end_date-N days (tier=2, confidence=0.50) for admin-resolved markets without UMA evidence

Seeded 24 FFICD validation markets with tier=2 proxy (end_date-1d).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
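The proxy rule itself is simple arithmetic. A sketch, assuming a hypothetical helper `proxy_t_news` and a plain dict in place of the real news_timestamps row:

```python
from datetime import datetime, timedelta, timezone


def proxy_t_news(end_date: datetime, offset_days: int) -> dict:
    """Build a synthetic T_news record as end_date minus offset_days.

    Hypothetical helper; the field names mirror the commit message only.
    """
    if offset_days < 0:
        raise ValueError("offset_days must be non-negative")
    return {
        "t_news": end_date - timedelta(days=offset_days),
        "tier": 2,           # proxy tier, per the commit message
        "confidence": 0.50,  # fixed confidence for synthetic timestamps
    }


end = datetime(2025, 11, 5, 12, 0, tzinfo=timezone.utc)
row = proxy_t_news(end, offset_days=1)
```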
- Catch PriceLookupError from compute_ils, return None gracefully
- Snap t_open to first available trade when price series starts late (common for long-running markets with sparse early trading)
- Guard t_news < t_open: return None with t_news_predates_t_open warning
- Fix MarketLabel.pre_news_max_jump: NUMERIC(8,6) → NUMERIC(20,6) (it's a USDC amount, not a price; can exceed 99.999999)
- Migration 0004 to alter column type

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
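The two window guards above can be sketched as a single function. The name `scoring_window` is hypothetical, and comparing t_news against the snapped t_open (rather than the original one) is an assumption; the commit message does not say which is used.

```python
from datetime import datetime, timezone


def scoring_window(t_open, t_news, first_trade_ts):
    """Return (effective_t_open, t_news), or None when the market cannot be scored."""
    # Snap t_open forward when the price series starts after the market opened.
    effective_open = max(t_open, first_trade_ts)
    # Guard: news that predates the usable window leaves no baseline price.
    if t_news < effective_open:
        return None
    return effective_open, t_news


t_open = datetime(2025, 1, 1, tzinfo=timezone.utc)
first_trade = datetime(2025, 3, 1, tzinfo=timezone.utc)
ok = scoring_window(t_open, datetime(2025, 6, 1, tzinfo=timezone.utc), first_trade)
skipped = scoring_window(t_open, datetime(2025, 2, 1, tzinfo=timezone.utc), first_trade)
```

Returning None instead of raising lets a batch run record the market as unscored and continue.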
4/24 markets scored (all others: no trade data or proxy T_news failures). Key finding: T_news proxy quality (end_date-1d) is the dominant error source; high |ILS| values reflect price-convergence noise, not informed trading signal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- scripts/make_foresightflow_fixture.py: full rewrite: hard cutoff 2025-09-15 invariant, NegRisk + secondary bucket exclusion, per-category quota sampling (crypto=8, politics=8, sports=8, economics=8, geopolitics=9, entertainment=9), Brier calibration check
- data/fixture_phase0.jsonl: 50 markets, 26 YES / 24 NO, 0 pre-cutoff, 0 bucket markets; all baselines from trade_vwap (CLOB not backfilled)
- CHARTER_v0.3.md: updated project charter
- .gitignore: exclude .claude/ and memory/ directories

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
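A rough sketch of per-category quota sampling and a Brier score over baseline prices. The quotas are the ones listed in the commit message; the function names, the candidate tuple layout, the seeded sampling, and the reading of "Brier calibration check" as a mean squared error between baseline price and outcome are assumptions, not the script's actual logic.

```python
import random

QUOTAS = {"crypto": 8, "politics": 8, "sports": 8,
          "economics": 8, "geopolitics": 9, "entertainment": 9}


def sample_by_quota(candidates, quotas, seed=0):
    """candidates: list of (market_id, category). Returns sampled market IDs."""
    rng = random.Random(seed)  # seeded so the fixture is reproducible
    picked = []
    for category, quota in quotas.items():
        pool = [mid for mid, cat in candidates if cat == category]
        # Take the whole pool when a category has fewer markets than its quota.
        picked.extend(rng.sample(pool, min(quota, len(pool))))
    return picked


def brier_score(rows):
    """rows: iterable of (baseline_price, outcome), outcome 1 for YES and 0 for NO."""
    rows = list(rows)
    if not rows:
        raise ValueError("no rows to score")
    return sum((p - y) ** 2 for p, y in rows) / len(rows)


candidates = [(f"{cat}-{i}", cat) for cat in QUOTAS for i in range(20)]
fixture_ids = sample_by_quota(candidates, QUOTAS)
```

A baseline that always quotes 0.5 scores 0.25, so a fixture whose baselines score well below that carries real price information.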
Summary
- …`reconstruct_price_series`), `price_source` field added to `market_labels`
- `eth_getLogs` fallback with ABI decoding of Settle events
- `fflow news tier1-batch` (seeds T_news from UMA evidence URLs) and `fflow news seed-proxy` (seeds synthetic T_news as `end_date - N days`, tier=2)
- `reports/TASK_02D_ILS_FFICD_RESULTS.md`
- `scripts/make_foresightflow_fixture.py` rewritten: hard cutoff 2025-09-15 invariant, NegRisk + secondary bucket exclusion, per-category quota (6 categories), Brier calibration check; generates `data/fixture_phase0.jsonl` (50 markets, 52% YES)
- `PriceLookupError` handling in pipeline, t_open snapping to first trade, `NUMERIC(20,6)` for `pre_news_max_jump`, migration 0004

Phase 5 status
BLOCKED pending review of Phase 4 results. See `reports/TASK_02D_ILS_FFICD_RESULTS.md`: the recommendation is to improve T_news quality before running the control group comparison.

Test plan
- `uv run pytest`: all existing tests pass

🤖 Generated with Claude Code