outbox: add Thompson+CG3 (+4.9pp recall), NIP-66 filtering (~45% faster), connection cap #387

Open
alltheseas wants to merge 3 commits into nostr-dev-kit:master from alltheseas:feat/thompson-cg3-nip66

alltheseas (Contributor) commented Mar 11, 2026

Summary

NDK's outbox relay selection doesn't learn from delivery outcomes, can't filter dead relays, and has no connection cap. This PR adds three opt-in enhancements: Thompson Sampling with CG3 coverage guarantee (+4.9pp event recall over plain Thompson, 22.0% → 26.9%), NIP-66 dead relay filtering (~45% faster loading), and a configurable connection cap. Benchmarked across 6 profiles in the nostrability/outbox suite.

All features are opt-in. Default behavior is unchanged when no new options are set.

What's added

Thompson Sampling (enableThompsonSampling: true)

Relays vary dramatically in reliability, but NDK's static popularity ranking can't distinguish a relay that delivers 100% of the time from one that delivers 10%. Thompson Sampling fixes this by maintaining Bayesian priors per relay, updated from binary hit/miss observations at each closeOnEose subscription EOSE. Relays that consistently deliver get selected more; unreliable ones get deprioritized while still being explored occasionally.
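The update-and-sample loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RelayPrior`, `observe`, `rankRelays`, and the Gamma-ratio Beta sampler are hypothetical names chosen for the example. The `(1 + ln(N))` author-count weight and the 0.5 fallback on invalid input follow the commit messages in this PR.

```typescript
// Illustrative sketch only; all names are hypothetical, not NDK's API.
interface RelayPrior {
    alpha: number; // 1 + weighted hits
    beta: number; // 1 + weighted misses
}

type Rand = () => number; // uniform draw in [0, 1)

// Standard normal draw (Box-Muller).
function sampleNormal(rand: Rand): number {
    const u = 1 - rand(); // shift to (0, 1] so log(u) is finite
    return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * rand());
}

// Gamma(shape, 1) draw (Marsaglia-Tsang), with the usual boost for shape < 1.
function sampleGamma(shape: number, rand: Rand): number {
    if (shape < 1) {
        return sampleGamma(shape + 1, rand) * Math.pow(1 - rand(), 1 / shape);
    }
    const d = shape - 1 / 3;
    const c = 1 / Math.sqrt(9 * d);
    for (;;) {
        const x = sampleNormal(rand);
        const v = Math.pow(1 + c * x, 3);
        if (v <= 0) continue;
        const u = 1 - rand();
        if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
    }
}

// Beta(alpha, beta) draw as a ratio of Gamma draws; 0.5 on invalid input.
function sampleBeta(alpha: number, beta: number, rand: Rand = Math.random): number {
    if (!isFinite(alpha) || !isFinite(beta) || alpha <= 0 || beta <= 0) return 0.5;
    const x = sampleGamma(alpha, rand);
    const y = sampleGamma(beta, rand);
    return x + y > 0 ? x / (x + y) : 0.5;
}

// Fold one hit/miss observation (seen at EOSE) into a relay's prior.
function observe(prior: RelayPrior, hit: boolean, weight: number = 1): RelayPrior {
    return hit
        ? { alpha: prior.alpha + weight, beta: prior.beta }
        : { alpha: prior.alpha, beta: prior.beta + weight };
}

interface RelayStats {
    prior: RelayPrior;
    authorCount: number; // N: followed authors that list this relay
}

// Sample each relay once per round, then sort by the sampled score.
function rankRelays(relays: Record<string, RelayStats>, rand: Rand = Math.random): string[] {
    const scored = Object.keys(relays).map((url) => {
        const { prior, authorCount } = relays[url];
        const weight = 1 + Math.log(Math.max(1, authorCount));
        return { url, score: weight * sampleBeta(prior.alpha, prior.beta, rand) };
    });
    scored.sort((a, b) => b.score - a.score);
    return scored.map((s) => s.url);
}
```

Because the score is a random draw rather than the posterior mean, a relay with few observations still wins a round occasionally, which is where the exploration comes from.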

CG3 coverage guarantee (default on when Thompson enabled) — Pareto-superior to plain Thompson

Plain Thompson can over-penalize relays that serve hard-to-reach authors. Authors with only one write relay are especially vulnerable: if their sole relay scores poorly on other authors, Thompson deprioritizes it — orphaning them entirely. CG3 addresses this by force-selecting sole-source relays before the main selection loop, with 0.3× observation weight to avoid distorting scores. The resulting budget reallocation also improves recall for multi-relay authors whose relays overlap with sole-source ones. In benchmarks, CG3 is Pareto-superior to plain Thompson across all tested profiles.
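A rough sketch of the force-selection step, under stated assumptions: a "sole-source" relay is taken to be the only write relay of at least one author, and the guarantee is skipped as a whole when the forced set does not fit the budget. The function names and the `WriteRelays` shape are hypothetical; only the 0.3× weight and the skip-on-budget behavior come from this PR.

```typescript
// Illustrative sketch only; names and data shapes are hypothetical.
type WriteRelays = Record<string, string[]>; // author pubkey -> write relay URLs

const SOLE_SOURCE_WEIGHT = 0.3; // observation weight for force-selected relays

// Relays that are the only write relay of at least one author.
function soleSourceRelays(writeRelays: WriteRelays): string[] {
    const forced: string[] = [];
    for (const author of Object.keys(writeRelays)) {
        const relays = writeRelays[author];
        if (relays.length === 1 && forced.indexOf(relays[0]) === -1) {
            forced.push(relays[0]);
        }
    }
    return forced;
}

// Force-select sole-source relays first, then fill the rest of the budget
// from the (already Thompson-ranked) list.
function selectWithCoverageGuarantee(
    writeRelays: WriteRelays,
    ranked: string[],
    budget: number,
): { selected: string[]; forced: string[] } {
    let forced = soleSourceRelays(writeRelays);
    // Assumption: skip the guarantee entirely when it cannot fit the budget.
    if (forced.length > budget) forced = [];
    const selected = forced.slice();
    for (const url of ranked) {
        if (selected.length >= budget) break;
        if (selected.indexOf(url) === -1) selected.push(url);
    }
    return { selected, forced };
}

// Force-selected relays are observed at reduced weight so that being picked
// unconditionally does not inflate their score.
function observationWeight(url: string, forced: string[]): number {
    return forced.indexOf(url) !== -1 ? SOLE_SOURCE_WEIGHT : 1;
}
```

The reduced weight matters because a forced relay is observed every round regardless of its score, so full-weight observations would let it accumulate evidence faster than relays that had to earn selection.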

NIP-66 liveness filtering (nip66MonitorRelays) — ~45% faster loading, removes dead relays

A significant fraction of relays in kind-10002 lists are permanently offline. Connecting to dead relays wastes connection slots and forces subscriptions to wait for timeouts before EOSE, directly inflating load times. NIP-66 filtering eliminates these dead relays before selection, cutting ~45% off loading times in benchmarks. Gracefully degrades: skips filtering entirely if monitor data is stale (>4h), incomplete (<100 relays), or would orphan an author. .onion relays always pass through.
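The degradation rules above might look something like this in code. It is a sketch under assumptions: `MonitorSnapshot` and `filterDeadRelays` are hypothetical names, relay URLs are compared without normalization, and the orphan check is applied per author (an author whose relays would all be removed keeps their original list), which is one possible reading of the rule.

```typescript
// Illustrative sketch only; names and data shapes are hypothetical.
interface MonitorSnapshot {
    aliveRelays: string[]; // relay URLs reported alive by NIP-66 monitors
    fetchedAt: number; // ms since epoch
}

const MAX_AGE_MS = 4 * 60 * 60 * 1000; // older than 4h counts as stale
const MIN_RELAYS = 100; // fewer than this counts as incomplete

// True for ws:// or wss:// URLs whose host ends in .onion.
function isOnion(url: string): boolean {
    return /^wss?:\/\/[^\/:]+\.onion(?::\d+)?(?:\/|$)/i.test(url);
}

function filterDeadRelays(
    writeRelays: Record<string, string[]>,
    snapshot: MonitorSnapshot | undefined,
    now: number = Date.now(),
): Record<string, string[]> {
    // Pass everything through when monitor data is missing, stale, or too sparse.
    if (
        !snapshot ||
        now - snapshot.fetchedAt > MAX_AGE_MS ||
        snapshot.aliveRelays.length < MIN_RELAYS
    ) {
        return writeRelays;
    }
    const alive: Record<string, boolean> = {};
    for (const url of snapshot.aliveRelays) alive[url] = true;

    const result: Record<string, string[]> = {};
    for (const author of Object.keys(writeRelays)) {
        const kept = writeRelays[author].filter(
            (url) => alive[url] === true || isOnion(url),
        );
        // Never orphan an author: fall back to the unfiltered list.
        result[author] = kept.length > 0 ? kept : writeRelays[author];
    }
    return result;
}
```

Keeping `.onion` relays unconditionally reflects that clearnet monitors generally cannot probe them, so their absence from monitor data says nothing about liveness.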

Connection cap (maxOutboxRelays) — works independently

Caps total unique outbox relays selected. Recommended: 20. Adaptive connection limit benchmarks (cap@10/15/30 across small, medium, and large follow graphs) confirmed 20 as the practical middle ground:

  • Small graphs (<200 follows): Saturate at 10–15 relays; cap@20 is more than sufficient.
  • Medium graphs (300–500 follows): Benefit most from higher caps. cap@10 → cap@30 gained +17pp for a 399-follow profile, with most gains realized by cap@20.
  • Large graphs (>1000 follows): Gains from more connections are real but noisy — session-to-session variance dominates beyond cap@20.

cap@30 provides marginal additional gains for medium graphs but increases connection overhead for all profiles. cap@10 loses roughly 6–15pp vs cap@20 for non-trivial follow lists (estimates; cross-session variance is high). 20 balances relay diversity against connection cost.
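To make the cap-versus-coverage trade-off concrete, here is a small sketch that truncates a ranked relay list and measures what fraction of authors still have at least one selected write relay. `capRelays` and `authorCoverage` are hypothetical helpers for illustration and are not part of this PR.

```typescript
// Illustrative sketch only; helper names are hypothetical.

// Keep the first `maxOutboxRelays` unique URLs from a ranked list.
// With no cap set, every unique URL is kept (default behavior unchanged).
function capRelays(ranked: string[], maxOutboxRelays?: number): string[] {
    const unique: string[] = [];
    for (const url of ranked) {
        if (maxOutboxRelays !== undefined && unique.length >= maxOutboxRelays) break;
        if (unique.indexOf(url) === -1) unique.push(url);
    }
    return unique;
}

// Fraction of authors with at least one write relay among the selected set.
function authorCoverage(writeRelays: Record<string, string[]>, selected: string[]): number {
    const authors = Object.keys(writeRelays);
    if (authors.length === 0) return 1;
    let covered = 0;
    for (const author of authors) {
        if (writeRelays[author].some((url) => selected.indexOf(url) !== -1)) covered++;
    }
    return covered / authors.length;
}
```

Running `authorCoverage` at several caps over a real follow list is one way to check whether 20 is also the right value for a given profile.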

Usage

const ndk = new NDK({
    explicitRelayUrls: ['wss://relay.damus.io', 'wss://nos.lol'],
    enableThompsonSampling: true,
    maxOutboxRelays: 20,
    nip66MonitorRelays: ['wss://relay.nostr.watch'],
});

// Persist scores across sessions
const saved = localStorage.getItem('thompson-priors');
if (saved) {
    try {
        ndk.thompsonSampler!.importPriors(JSON.parse(saved));
    } catch {
        // Corrupt or unreadable data: discard it and start from fresh priors
        localStorage.removeItem('thompson-priors');
    }
}
window.addEventListener('beforeunload', () => {
    localStorage.setItem('thompson-priors',
        JSON.stringify(ndk.thompsonSampler!.exportPriors()));
});

Benchmark evidence

Plain Thompson optimizes for mean recall but regresses the fiatjaf profile (-18pp) due to NDK's priority cascade. CG3 eliminates that regression while matching or improving every individual profile, which makes it Pareto-superior within the same benchmark conditions:

| Algorithm | Event Recall | Notes |
| --- | --- | --- |
| NDK + Thompson | 22.0% | 30-run measurement; regresses fiatjaf |
| NDK + Thompson + CG3 | 26.9% (+4.9pp) | No per-profile regressions; Pareto-superior |

Both figures from the same 30-run measurement (6 EN profiles × 1yr × 5 sessions, NIP-66 liveness, cap@20). CG3 beats plain Thompson on all 6 profiles. NDK without Thompson scores ~16–22% depending on conditions (see OUTBOX-REPORT §8.2).

NIP-66 liveness filtering reduced median loading times by ~45% by eliminating timeout waits on dead relays.

See OUTBOX-REPORT §8.5c.

Test plan

  • 41 new unit + integration tests, 68 existing tests in modified areas — all pass, 0 regressions
  • Integration: Thompson priors populate after EOSE, with sole-source 0.3× weighting
  • Integration: maxOutboxRelays: 5 returns ≤5 unique relay URLs
  • Integration: NIP-66 passes through on empty/stale monitor data

Closes #385
Closes #386

🤖 Generated with Claude Code

alltheseas changed the title from "feat(outbox): add Thompson Sampling (CG3) + NIP-66 liveness filtering" to "outbox: improve coverage by +35% thompson learning, reduce load times by 40% nip-66" on Mar 11, 2026
alltheseas force-pushed the feat/thompson-cg3-nip66 branch from b037628 to 8a2acaa on March 11, 2026 06:57
alltheseas changed the title from "outbox: improve coverage by +35% thompson learning, reduce load times by 40% nip-66" to "outbox: improve coverage by +35%, reduce load times by 40% nip-66" on Mar 11, 2026
alltheseas force-pushed the feat/thompson-cg3-nip66 branch 2 times, most recently from d64759c to 6c7735f on March 11, 2026 07:14
alltheseas and others added 3 commits March 11, 2026 02:19
… utility

- NIP-66: filter dead relays using kind-30166 monitor data before outbox
  selection. Auto-refreshes on each relay selection round (non-blocking).
  Graceful degradation: passes through on stale/insufficient data, .onion
  always passes, never orphans an author.
- maxOutboxRelays: hard cap on unique outbox relay count, enforced for all
  code paths including the missing-relay fallback. Works independently of
  Thompson. Benchmark data shows 20 covers 93-97% of authors.
- sample-beta: Beta distribution sampling (Jöhnk + Marsaglia-Tsang). Returns
  0.5 on invalid inputs instead of throwing.

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ntee

- Thompson Sampling: Bayesian relay scoring using Beta(α,β) priors. Learns
  from delivery outcomes observed at subscription EOSE (closeOnEose only).
  Scores are sampled once per relay per round to ensure stable sort order.
  Weighted by relay author count: (1 + ln(N)) × sampleBeta(α,β).
- CG3 (Coverage Guarantee v3): force-selects sole-source relays before
  the main selection loop. Uses 0.3× observation weight to prevent
  over-crediting sole-source deliveries. Conditional skip when sole-source
  count exceeds budget.
- Delivery observation: tracks which relays delivered events for which
  authors (including duplicate deliveries), observes hit/miss at EOSE.
  Skips inactive authors (P4), deduplicates per-round (P8),
  non-blocking (P10).

All features opt-in via NDK constructor options.

Closes nostr-dev-kit#385
Closes nostr-dev-kit#386
Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
41 tests covering:
- sample-beta: uniform prior, known distributions, invalid input handling
- NIP-66: pass-through on stale/empty data, dead relay filtering, .onion bypass
- Thompson: weighted scores, observation updates, dedup, decay, export/import
- Coverage guarantee: force-selection, budget skip/cap, ordering
- Integration: Thompson priors at EOSE, sole-source 0.3× weight,
  maxOutboxRelays cap, NIP-66 outbox pass-through

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

  • add thompson learning to outbox
  • add nip-66 relay liveness check to outbox
