Phase 2A: Add 6 new data sources (ACLED, ADS-B, GDELT, InSight Crime, OpenSanctions, SpiderFoot, Unusual Whales)#2
Conversation
… SpiderFoot, InSight Crime) - ACLED: Updated auth to support API key (ACLED_API_KEY), added FLASH alert for fatalities > 50 - ADS-B: Switched to free opendata.adsb.fi endpoint, added military aircraft markers (ISR, bombers, tankers, VIP), PRIORITY alerts for sensitive regions - OpenSanctions: Added API key support, cross-referencing logic, entity matching - GDELT: Full integration with regional tone scoring, geographic event clustering, PRIORITY alerts for tone deterioration - SpiderFoot: Docker container integration, scan management, findings panel - InSight Crime: RSS feed integration (main + Mexico/Colombia/Central America), entity extraction, OpenSanctions cross-referencing, PRIORITY alerts - Dashboard: New globe markers for military aircraft, GDELT clusters - Delta engine: Added alert evaluation for all new sources - News feed: Merged InSight Crime articles into ticker Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…anctions 401, safeFetch truncation - InSight Crime: Changed /category/ to /tag/ for Mexico/Colombia feeds, removed Central America (404) - GDELT: Reduced tone scoring to 3 regions and shortened delay to 3s to avoid 30s timeout - OpenSanctions: Handle 401 gracefully when no API key provided - safeFetch: Remove 500-char truncation on rawText (broke RSS parsing) Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
- SpiderFoot: Use /scanlist, /scaneventresults, /scanstatus, /scansummary (not /api?func=) - GDELT: Reduce tone scoring to 2 regions, shorten delay to 2s to stay under 30s timeout Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…calls Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…PI calls Removes sequential rate-limited tone API calls that caused 60s+ timeouts. Instead, computes regional tone scores from article-level tone data already fetched in the initial query. Reverts SLOW_SOURCE_TIMEOUT_MS back to standard 30s. Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
- Article fetch timeout: 20s (was 15s default) with 0 retries - Geo fetch timeout: 15s with 0 retries - Removed unused toneTrend/volumeTrend functions - Reduced inter-request delay to 500ms Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
GDELT API rejects queries with OR terms unless wrapped in parentheses. Added auto-parenthesization in searchEvents() and geoEvents(). Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…a.com) api.acleddata.com doesn't resolve. The correct ACLED API endpoint is https://acleddata.com/api/acled/read per their official docs. Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
- Add retry with 6s delay when GDELT returns rate-limit text response - Increase inter-request delay to 6s (GDELT requires 5s between requests) - Give GDELT 60s source timeout to accommodate retry + geo query - Fixes intermittent 0-article results from rate limiting Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…ne data) GDELT ArtList mode doesn't include tone scores in responses. Changed regional monitoring to track article coverage counts. PRIORITY alerts now fire on high coverage (10+ articles) per region. Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
- Add 3s initial delay to avoid rate-limit collisions with concurrent sources - Increase retry count from 2 to 3 for resilience against rate limiting - GDELT enforces 1 request per 5 seconds; concurrent sweeps can trigger this Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…age) Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…es to 4 Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…nSight Crime to news feed sources Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…ry for InSight Crime Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…e D) Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…rades, dark pool - New collector: apis/sources/unusualwhales.mjs (3 endpoints, 15min refresh) - Market Intelligence panel in dashboard with top 10 options flow, congressional trades (defense/energy/aerospace), dark pool >M - PRIORITY alert: options flow >M in defense/energy/maritime stocks - FLASH alert: defense committee member trades within 48h of geopolitical alert - Financial event globe markers at company HQ locations (cyan) - Sensor Grid entry for Market Intel with live counts - Map legend updated with Market Intel marker Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…rid, tension index, red markers - Default globe view changed from WORLD to ASIA PACIFIC (lat 20, lon 118) - PRC Intelligence section in Sensor Grid: PLAN Vessels, Taiwan ADIZ, SCS Activity, PRC State Media - PRC WATCH status indicator in topbar alongside existing WARTIME STAGFLATION RISK - Red markers for PRC-attributed activity (Chinese aircraft in Taiwan Strait/SCS) - PRC military aircraft from ADS-B rendered as large red markers - Strategic labels: PRC WATCH ZONE, TAIWAN ADIZ, SOUTH CHINA SEA - PRC Tension Index gauge in Signal Core panel (score 0-100, REDUCED/NORMAL/HEIGHTENED/ELEVATED) - Map legend updated with PRC Activity entry (red) - CSS for PRC gauge bar and indicator dot Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
…limited Root cause: D.air regions depended entirely on OpenSky hotspots, which returns HTTP 429 (rate limited). ADS-B had 331 military aircraft with valid coordinates but only fed D.adsbMilitary, never D.air. Fix: - Add buildAirHotspotsFromADSB() in inject.mjs that counts ADS-B military aircraft into the same 10 geographic regions as OpenSky - Use ADS-B data as automatic fallback when OpenSky is unavailable - Increase militaryAircraft cap from 50 to 200 for better regional coverage - Update airMeta to show 'ADS-B Military' as source when fallback is active Co-Authored-By: Matt Suma <matthew.suma@gmail.com>
Original prompt from Jake
|
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
| // ─── ACLED FLASH alerts (fatalities > 50 in a single event) ───────── | ||
|
|
||
| const acledFlash = current.acled?.flashAlerts || []; | ||
| for (const alert of acledFlash) { | ||
| signals.new.push({ | ||
| key: `acled_flash:${alert.headline}`, | ||
| reason: alert.headline, | ||
| detail: alert.detail, | ||
| severity: 'critical', | ||
| tier: 'FLASH', | ||
| }); | ||
| criticalChanges += 3; | ||
| } | ||
|
|
||
| // ─── ADS-B PRIORITY alerts (military over sensitive regions) ─────── | ||
|
|
||
| const adsbAlerts = current.adsbMilitary?.priorityAlerts || []; | ||
| for (const alert of adsbAlerts.slice(0, 5)) { | ||
| signals.new.push({ | ||
| key: `adsb_priority:${alert.headline}`, | ||
| reason: alert.headline, | ||
| detail: alert.detail, | ||
| severity: 'high', | ||
| tier: 'PRIORITY', | ||
| }); | ||
| criticalChanges++; | ||
| } | ||
|
|
||
| // ─── GDELT tone deterioration PRIORITY alerts ───────────────────── | ||
|
|
||
| const gdeltAlerts = current.gdelt?.priorityAlerts || []; | ||
| for (const alert of gdeltAlerts) { | ||
| signals.new.push({ | ||
| key: `gdelt_tone:${alert.headline}`, | ||
| reason: alert.headline, | ||
| detail: alert.detail, | ||
| severity: 'high', | ||
| tier: 'PRIORITY', | ||
| }); | ||
| criticalChanges++; | ||
| } | ||
|
|
||
| // ─── InSight Crime sanctions cross-ref PRIORITY alerts ──────────── | ||
|
|
||
| const icAlerts = current.insightCrime?.priorityAlerts || []; | ||
| for (const alert of icAlerts) { | ||
| signals.new.push({ | ||
| key: `ic_sanctions:${alert.headline}`, | ||
| reason: alert.headline, | ||
| detail: alert.detail, | ||
| severity: 'high', | ||
| tier: 'PRIORITY', | ||
| }); | ||
| criticalChanges++; | ||
| } | ||
|
|
||
| // ─── Unusual Whales PRIORITY alerts (options flow >$5M) ───────────── | ||
|
|
||
| const uwAlerts = current.unusualWhales?.priorityAlerts || []; | ||
| for (const alert of uwAlerts) { | ||
| signals.new.push({ | ||
| key: `uw_options:${alert.ticker}`, | ||
| reason: alert.headline, | ||
| detail: alert.detail, | ||
| severity: 'high', | ||
| tier: 'PRIORITY', | ||
| }); | ||
| criticalChanges++; | ||
| } | ||
|
|
||
| // ─── Unusual Whales FLASH alerts (defense committee + geopolitical) ── | ||
|
|
||
| const defenseCommitteeTrades = current.unusualWhales?.congressTrades?.defenseCommittee || []; | ||
| const geopoliticalAlerts = [ | ||
| ...(current.gdelt?.priorityAlerts || []), | ||
| ...(current.acled?.flashAlerts || []), | ||
| ...(current.adsbMilitary?.priorityAlerts || []), | ||
| ]; | ||
|
|
||
| if (geopoliticalAlerts.length > 0) { | ||
| const now = new Date(); | ||
| const last48h = new Date(now.getTime() - 48 * 60 * 60 * 1000); | ||
|
|
||
| for (const trade of defenseCommitteeTrades) { | ||
| const tradeDate = new Date(trade.transactionDate); | ||
| if (tradeDate >= last48h) { | ||
| signals.new.push({ | ||
| key: `uw_flash:${trade.name}:${trade.ticker}`, | ||
| reason: `DEFENSE COMMITTEE TRADE: ${trade.name} traded ${trade.ticker} within 48h of geopolitical alert`, | ||
| detail: `${trade.txnType} ${trade.amounts} on ${trade.transactionDate}`, | ||
| severity: 'critical', | ||
| tier: 'FLASH', | ||
| }); | ||
| criticalChanges += 3; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
🔴 Delta engine alerts lack dedup — persistent alerts re-reported as "new" every sweep, inflating criticalChanges
All new alert types (ACLED flash, ADS-B priority, GDELT tone, InSight Crime sanctions, Unusual Whales options, Unusual Whales FLASH) are unconditionally pushed to signals.new without any dedup against the previous run. By contrast, the existing Telegram dedup (lines 183-199) correctly checks prevHashes before adding to signals.new. A mass casualty event (fatalities > 50) persists in ACLED data for days, so every sweep will re-add criticalChanges += 3 for the same event. Similarly, a military aircraft over a sensitive region will increment criticalChanges every ~15 minutes as long as it's airborne. This inflated count is consumed by the dashboard (jarvis.html:1885), Telegram alerts (server.mjs:103), and LLM idea generation (lib/llm/ideas.mjs:147), misrepresenting the situation as continuously escalating. Additionally, _compactForStorage in lib/delta/memory.mjs:196-221 does not preserve flashAlerts, priorityAlerts, adsbMilitary, insightCrime, or unusualWhales, so even if dedup were added, the previous run data wouldn't contain these fields to compare against.
Prompt for agents
The new alert types (ACLED flash, ADS-B priority, GDELT tone, InSight Crime sanctions, Unusual Whales) in computeDelta() are pushed unconditionally to signals.new without dedup against prior runs. This is inconsistent with the Telegram dedup pattern at lines 183-199 which builds prevHashes from prior runs.
Two changes are needed:
1. In lib/delta/engine.mjs: For each new alert category, build a set of prior alert keys from the previous run(s) and only push to signals.new if the key is not already present. For example, collect previous acled flash headlines, previous adsb priority headlines, etc. from the previous/priorRuns data and skip alerts whose key already appeared.
2. In lib/delta/memory.mjs _compactForStorage(): Preserve the alert fields needed for dedup in subsequent runs. Add at minimum: acled.flashAlerts (array of headlines), adsbMilitary.priorityAlerts, gdelt.priorityAlerts, insightCrime.priorityAlerts, unusualWhales.priorityAlerts, and unusualWhales.congressTrades.defenseCommittee. Without storing these, the dedup in step 1 would have nothing to compare against in prior runs.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Valid finding — all new alert types are pushed unconditionally to signals.new without dedup, unlike the Telegram dedup pattern. And _compactForStorage doesn't preserve the alert fields needed for comparison.
Since this is a direct sync of upstream code from ltmsuma/CrucixDEVIN PR #1, the fix should be a follow-up PR. Two changes needed:
- In
engine.mjs: Build prior alert key sets from previous runs and skip duplicates - In
memory.mjs: Preserveacled.flashAlerts,adsbMilitary.priorityAlerts,gdelt.priorityAlerts,insightCrime.priorityAlerts,unusualWhales.priorityAlerts, andunusualWhales.congressTrades.defenseCommittee
Happy to implement this as a follow-up if requested.
|
|
||
| export async function fullBriefing() { | ||
| console.error('[Crucix] Starting intelligence sweep — 29 sources...'); | ||
| console.error('[Crucix] Starting intelligence sweep — 30 sources...'); |
There was a problem hiding this comment.
🟡 Source count log message says 30 but there are 32 sources
The fullBriefing() function logs "Starting intelligence sweep — 30 sources" but the allPromises array contains 32 entries (29 original + SpiderFoot, InSightCrime, UnusualWhales). The computed sourcesQueried in the output object is correct (derived from sources.length), but the startup log is misleading.
| console.error('[Crucix] Starting intelligence sweep — 30 sources...'); | |
| console.error('[Crucix] Starting intelligence sweep — 32 sources...'); |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Correct — the log says "30 sources" but allPromises actually has 32 entries after this PR adds SpiderFoot, InSight Crime, and Unusual Whales. The runtime sourcesQueried count is correct, but the startup log is misleading. Upstream bug — can fix in a follow-up.
| } | ||
|
|
||
| export async function briefing() { | ||
| const apiKey = process.env.UNUSUAL_WHALES_API_KEY; |
There was a problem hiding this comment.
🔴 New environment variables not documented in .env.example, violating CONTRIBUTING.md
CONTRIBUTING.md mandates: "document any required environment variables in .env.example and README.md". This PR introduces four new environment variables — ACLED_API_KEY, OPENSANCTIONS_API_KEY, UNUSUAL_WHALES_API_KEY, and SPIDERFOOT_URL — but neither .env.example nor README.md were updated. This is a violation of the mandatory contributing rules.
Prompt for agents
CONTRIBUTING.md requires that new environment variables are documented in both .env.example and README.md.
The following new env vars need to be added to .env.example:
- ACLED_API_KEY: Used in apis/sources/acled.mjs as the preferred auth method (set alongside ACLED_EMAIL)
- OPENSANCTIONS_API_KEY: Required for opensanctions.mjs search endpoint (free key at opensanctions.org/api)
- UNUSUAL_WHALES_API_KEY: Required for unusualwhales.mjs options flow, congressional trades, dark pool data
- SPIDERFOOT_URL: Optional, defaults to http://localhost:5001, used by apis/sources/spiderfoot.mjs
Also update README.md with setup instructions for these new sources.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Acknowledged — this is a valid gap. Four new env vars (ACLED_API_KEY, OPENSANCTIONS_API_KEY, UNUSUAL_WHALES_API_KEY, SPIDERFOOT_URL) are used in the source code but not documented in .env.example or README.md. This is inherited from the upstream PR. Can fix in a follow-up.
| try { | ||
| sanctionsHits = await crossReference(entityList); | ||
| // PRIORITY alerts for sanctions matches | ||
| priorityAlerts = sanctionsHits.map(hit => ({ | ||
| tier: 'PRIORITY', | ||
| headline: `SANCTIONS MATCH: "${hit.name}" found in InSight Crime + OpenSanctions`, | ||
| detail: `Matched: ${hit.matches.map(m => m.caption).join(', ')} (datasets: ${hit.matches.flatMap(m => m.datasets || []).slice(0, 3).join(', ')})`, | ||
| })); | ||
| } catch (e) { | ||
| // Cross-referencing is best-effort | ||
| } |
There was a problem hiding this comment.
🟡 crossReference makes dozens of failing API calls when OPENSANCTIONS_API_KEY is not set
insightcrime.mjs:120 calls crossReference(entityList) which invokes matchEntity for up to 30 entities. Each matchEntity call goes through safeFetch with default retries: 1, hitting the OpenSanctions API. Without OPENSANCTIONS_API_KEY, the API returns 401 and safeFetch retries each call once (with 2s delay). This wastes ~6-12 seconds of the source's 30-second timeout budget on requests guaranteed to fail. The opensanctions.mjs briefing() function at opensanctions.mjs:127 has a proper guard for missing API key, but crossReference() does not, and insightcrime.mjs calls it directly without checking.
| try { | |
| sanctionsHits = await crossReference(entityList); | |
| // PRIORITY alerts for sanctions matches | |
| priorityAlerts = sanctionsHits.map(hit => ({ | |
| tier: 'PRIORITY', | |
| headline: `SANCTIONS MATCH: "${hit.name}" found in InSight Crime + OpenSanctions`, | |
| detail: `Matched: ${hit.matches.map(m => m.caption).join(', ')} (datasets: ${hit.matches.flatMap(m => m.datasets || []).slice(0, 3).join(', ')})`, | |
| })); | |
| } catch (e) { | |
| // Cross-referencing is best-effort | |
| } | |
| if (process.env.OPENSANCTIONS_API_KEY) { | |
| try { | |
| sanctionsHits = await crossReference(entityList); | |
| // PRIORITY alerts for sanctions matches | |
| priorityAlerts = sanctionsHits.map(hit => ({ | |
| tier: 'PRIORITY', | |
| headline: `SANCTIONS MATCH: "${hit.name}" found in InSight Crime + OpenSanctions`, | |
| detail: `Matched: ${hit.matches.map(m => m.caption).join(', ')} (datasets: ${hit.matches.flatMap(m => m.datasets || []).slice(0, 3).join(', ')})`, | |
| })); | |
| } catch (e) { | |
| // Cross-referencing is best-effort | |
| } | |
| } |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Valid — crossReference() doesn't guard on OPENSANCTIONS_API_KEY, so it fires up to 30 safeFetch calls that all 401 with retries, wasting 6-12 seconds. The suggested fix of wrapping the call in if (process.env.OPENSANCTIONS_API_KEY) would prevent this. Upstream code — can fix in a follow-up.
| cluster.lat = (cluster.lat + points[j].lat) / 2; | ||
| cluster.lon = (cluster.lon + points[j].lon) / 2; |
There was a problem hiding this comment.
🟡 Geo-clustering centroid drifts toward later points instead of computing true average
In clusterGeoPoints, the centroid update cluster.lat = (cluster.lat + points[j].lat) / 2 gives equal weight to the current centroid and the new point regardless of cluster size. With points at latitudes [0, 0, 90], the centroid after all three is 45 instead of the correct 30. This biases centroids toward later-added points, and because subsequent point-to-cluster distance checks use the drifted centroid, it can cause incorrect cluster assignments — points that should be grouped together may end up in separate clusters.
| cluster.lat = (cluster.lat + points[j].lat) / 2; | |
| cluster.lon = (cluster.lon + points[j].lon) / 2; | |
| // Update centroid as weighted average | |
| const n = cluster.points.length; | |
| cluster.lat = (cluster.lat * (n - 1) + points[j].lat) / n; | |
| cluster.lon = (cluster.lon * (n - 1) + points[j].lon) / n; |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Good catch on the centroid drift. The running average (cluster.lat + points[j].lat) / 2 gives equal weight to the centroid and the new point regardless of cluster size. The suggested fix using (cluster.lat * (n-1) + points[j].lat) / n would compute the true weighted average. This is upstream code — can fix in a follow-up.
…e, OpenSanctions, SpiderFoot, Unusual Whales)
Summary
Syncs the Phase 2A data source expansion from ltmsuma/CrucixDEVIN PR #1 into the COG-GTM fork. This adds 3 new source modules, enhances 4 existing sources, and adds corresponding dashboard panels (Narco & Organized Crime, Market Intelligence, PRC Tension Index).
Depends on PR #1 (upstream master sync) — must be merged first.
21 commits across 12 files (+1,629 / -130 lines)
Why
The upstream repo added a Phase 2A data expansion that significantly broadens intelligence coverage. This sync brings those capabilities into the COG-GTM fork so we stay at full parity.
Scope
What Changed
New Sources
apis/sources/insightcrime.mjsapis/sources/spiderfoot.mjsSPIDERFOOT_API_KEYapis/sources/unusualwhales.mjsUNUSUAL_WHALES_API_KEYEnhanced Existing Sources
api.acleddata.com, FLASH alerts for events with >50 fatalitiesopendata.adsb.fias free primary (no key), sensitive region detection (7 regions), PRIORITY alerts for military over hotspotstotalResultsas object, guards against collections 404Dashboard
Delta Engine
criticalseverityValidation
git merge-treeconfirms zero conflict markers against COG-GTM masterConfig and Docs
.env.example— not updated in this PR (new env varsSPIDERFOOT_API_KEY,UNUSUAL_WHALES_API_KEY,ACLED_API_KEYshould be documented; may need a follow-up)briefing.mjsupdated (29 → 30, though actual count is higher)Source Additions
InSight Crime: Adds Central American organized crime coverage — a region not covered by existing sources. No API key required (public RSS). Degrades to empty result on fetch failure. Cross-references entities against OpenSanctions for sanctions overlap. Wired into
briefing.mjsas Tier 7 andinject.mjsfor the Narco panel.SpiderFoot: Adds OSINT enrichment (IP/domain/email reconnaissance). Requires
SPIDERFOOT_API_KEY; returns stub when missing. Wired intobriefing.mjsas Tier 7.Unusual Whales: Adds market intelligence — options flow anomalies, congressional trading (filtered to defense/energy), dark pool large prints. Requires
UNUSUAL_WHALES_API_KEY; degrades gracefully. Wired intobriefing.mjsas Tier 8 andinject.mjsfor the Market Intelligence panel.Human Review Checklist
innerHTMLwithcleanText(). VerifycleanText()adequately sanitizes all injected content per CONTRIBUTING.md requirements.acleddata.com/api/acled/readtoapi.acleddata.com/acled/read. Confirm this is the correct current ACLED API endpoint.air * 1.5 + GDELT * 8 + ISR * 15, max 100). Review whether these weights produce meaningful results or could be misleading.lib/delta/engine.mjs) — Generates FLASH alerts when defense committee members trade within 48h of any geopolitical event. Review for false positive risk and whether the correlation logic is sound.canProbeApiblock now always fetches from/api/dataeven when inline data exists. Verify this doesn't cause flash-of-content or data race issues.briefing.mjssays "30 sources" but this PR adds 3 new sources to the existing set. Verify the actual count is correct..env.exampleupdates — New env vars (ACLED_API_KEY,SPIDERFOOT_API_KEY,UNUSUAL_WHALES_API_KEY) are referenced in source code but.env.exampleis not updated in this PR.Checklist
Link to Devin session: https://app.devin.ai/sessions/a046d43371284e3f8175cb76185f3ced
Requested by: @jakexcosme