outbox: add Thompson+CG3 (+4.9pp recall), NIP-66 filtering (~45% faster), connection cap by alltheseas · Pull Request #387 · nostr-dev-kit/ndk

alltheseas · 2026-03-11T06:42:11Z

Summary

NDK's outbox relay selection doesn't learn from delivery outcomes, can't filter dead relays, and has no connection cap. This PR adds three opt-in enhancements: Thompson Sampling with CG3 coverage guarantee (+4.9pp event recall over plain Thompson, 22.0% → 26.9%), NIP-66 dead relay filtering (~45% faster loading), and a configurable connection cap. Benchmarked across 6 profiles in the nostrability/outbox suite.

All features are opt-in. Default behavior is unchanged when no new options are set.

What's added

Thompson Sampling (`enableThompsonSampling: true`)

Relays vary dramatically in reliability, but NDK's static popularity ranking can't distinguish a relay that delivers 100% of the time from one that delivers 10%. Thompson Sampling fixes this by maintaining a Bayesian Beta(α,β) prior per relay, updated from binary hit/miss observations recorded when each closeOnEose subscription reaches EOSE. Relays that consistently deliver get selected more often; unreliable ones get deprioritized while still being explored occasionally.
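As a rough illustration of the idea above, here is a minimal sketch of per-relay Beta scoring. All names (`RelayScorer`, `observe`, `score`, `sampleBeta`) are hypothetical and not taken from the PR's code. The uniform Beta(1,1) starting prior is an assumption. The sampler uses Marsaglia-Tsang gamma draws only, not the Jöhnk + Marsaglia-Tsang combination mentioned in the commits. The `(1 + ln(N))` author-count weight and the 0.5 fallback on invalid input follow the commit messages.

```typescript
// Hypothetical sketch; none of these names are NDK's internal API.
type Prior = { alpha: number; beta: number };
type Rng = () => number;

// Standard normal draw via Box-Muller.
function randNormal(rng: Rng): number {
    const u = 1 - rng(); // (0, 1], so Math.log never sees 0
    return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * rng());
}

// Gamma(shape, 1) draw via Marsaglia-Tsang; shapes below 1 are boosted.
function randGamma(shape: number, rng: Rng): number {
    if (shape < 1) {
        return randGamma(shape + 1, rng) * Math.pow(1 - rng(), 1 / shape);
    }
    const d = shape - 1 / 3;
    const c = 1 / Math.sqrt(9 * d);
    for (;;) {
        const x = randNormal(rng);
        const v = Math.pow(1 + c * x, 3);
        if (v <= 0) continue;
        const u = 1 - rng();
        if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
    }
}

// Beta(alpha, beta) draw as X / (X + Y) with X, Y Gamma-distributed.
// Invalid parameters return the neutral 0.5 instead of throwing.
function sampleBeta(alpha: number, beta: number, rng: Rng = Math.random): number {
    if (!Number.isFinite(alpha) || !Number.isFinite(beta) || alpha <= 0 || beta <= 0) return 0.5;
    const x = randGamma(alpha, rng);
    const y = randGamma(beta, rng);
    return x + y > 0 ? x / (x + y) : 0.5;
}

class RelayScorer {
    private priors = new Map<string, Prior>();

    // Record one hit/miss observation; `weight` allows fractional updates.
    observe(relay: string, hit: boolean, weight = 1): void {
        const p = this.priors.get(relay) ?? { alpha: 1, beta: 1 }; // assumed uniform start
        if (hit) p.alpha += weight;
        else p.beta += weight;
        this.priors.set(relay, p);
    }

    // One sampled score for a relay, weighted by its author count N.
    score(relay: string, authorCount: number, rng: Rng = Math.random): number {
        const p = this.priors.get(relay) ?? { alpha: 1, beta: 1 };
        return (1 + Math.log(Math.max(1, authorCount))) * sampleBeta(p.alpha, p.beta, rng);
    }
}
```

Because each score is a random draw, a relay with few observations still wins some rounds, which is where the occasional exploration comes from. A caller would sample each relay once per round and sort on those values so the ordering stays stable within the round.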

CG3 coverage guarantee (default on when Thompson enabled) — Pareto-superior to plain Thompson

Plain Thompson can over-penalize relays that serve hard-to-reach authors. Authors with only one write relay are especially vulnerable: if their sole relay scores poorly on other authors, Thompson deprioritizes it — orphaning them entirely. CG3 addresses this by force-selecting sole-source relays before the main selection loop, with 0.3× observation weight to avoid distorting scores. The resulting budget reallocation also improves recall for multi-relay authors whose relays overlap with sole-source ones. In benchmarks, CG3 is Pareto-superior to plain Thompson across all tested profiles.
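The force-selection step can be sketched as follows. This is an illustrative reading of the description, not the PR's implementation: `selectWithCoverageGuarantee`, `observationWeight`, and the injected `score` callback are hypothetical names, and the "skip when sole-source relays exceed the budget" rule is taken from the commit message.

```typescript
// Hypothetical sketch of CG3-style selection; not NDK's actual code.
const SOLE_SOURCE_WEIGHT = 0.3; // observation weight stated in the PR

function selectWithCoverageGuarantee(
    authorRelays: Map<string, string[]>, // author -> write relays
    score: (relay: string) => number,    // e.g. a sampled Thompson score
    budget: number,                      // max relays to select
): { selected: string[]; forced: Set<string> } {
    // 1. Find relays that are the only write relay of some author.
    const forced = new Set<string>();
    authorRelays.forEach((relays) => {
        const unique = Array.from(new Set(relays));
        if (unique.length === 1) forced.add(unique[0]);
    });

    // 2. Conditional skip: if they cannot all fit, fall back to plain ranking.
    if (forced.size > budget) forced.clear();

    // 3. Forced relays go first; the main loop fills the remaining budget by score.
    const selected = Array.from(forced);
    const candidates = new Set<string>();
    authorRelays.forEach((relays) => {
        relays.forEach((r) => {
            if (!forced.has(r)) candidates.add(r);
        });
    });
    const ranked = Array.from(candidates)
        .map((relay) => ({ relay, value: score(relay) }))
        .sort((a, b) => b.value - a.value);
    for (const { relay } of ranked) {
        if (selected.length >= budget) break;
        selected.push(relay);
    }
    return { selected, forced };
}

// Deliveries from force-selected relays count for less when updating scores.
function observationWeight(relay: string, forced: Set<string>): number {
    return forced.has(relay) ? SOLE_SOURCE_WEIGHT : 1;
}
```

The reduced weight matters because a force-selected relay is queried regardless of its score, so crediting its hits at full weight would inflate it relative to relays that had to earn their slot.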

NIP-66 liveness filtering (`nip66MonitorRelays`) — ~45% faster loading, removes dead relays

A significant fraction of relays in kind-10002 lists are permanently offline. Connecting to dead relays wastes connection slots and forces subscriptions to wait for timeouts before EOSE, directly inflating load times. NIP-66 filtering eliminates these dead relays before selection, cutting ~45% off loading times in benchmarks. The filter degrades gracefully: it skips filtering entirely if monitor data is stale (>4h) or incomplete (<100 relays), or if filtering would orphan an author. .onion relays always pass through.
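A minimal sketch of the degradation rules above, under stated assumptions: `MonitorSnapshot` and `filterDeadRelays` are hypothetical names, relay URLs are assumed to be already normalized, and the orphan rule is applied per author (an author whose relays would all be removed keeps the original list), which is one possible reading of the description.

```typescript
// Hypothetical shapes; the PR's real types may differ.
interface MonitorSnapshot {
    liveRelays: Set<string>; // relay URLs reported alive by NIP-66 monitors
    fetchedAt: number;       // ms since epoch
}

const MAX_AGE_MS = 4 * 60 * 60 * 1000; // stale threshold from the PR text (>4h)
const MIN_RELAYS = 100;                // incomplete threshold from the PR text (<100 relays)

// True when the URL's host ends in .onion (monitors usually cannot reach these).
function isOnion(url: string): boolean {
    return /^wss?:\/\/[^/:]+\.onion(?=[:/]|$)/i.test(url);
}

function filterDeadRelays(
    authorRelays: Map<string, string[]>,
    snapshot: MonitorSnapshot | undefined,
    now: number = Date.now(),
): Map<string, string[]> {
    // Pass everything through when the monitor data cannot be trusted.
    if (!snapshot || now - snapshot.fetchedAt > MAX_AGE_MS || snapshot.liveRelays.size < MIN_RELAYS) {
        return authorRelays;
    }
    const live = snapshot.liveRelays;
    const result = new Map<string, string[]>();
    authorRelays.forEach((relays, author) => {
        const kept = relays.filter((r) => isOnion(r) || live.has(r));
        // Assumed per-author fallback: never leave an author with zero relays.
        result.set(author, kept.length > 0 ? kept : relays);
    });
    return result;
}
```

Running the filter before selection means dead relays never compete for a slot, so a capped budget is spent only on relays that can answer.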

Connection cap (`maxOutboxRelays`) — works independently

Caps total unique outbox relays selected. Recommended: 20. Adaptive connection limit benchmarks (cap@10/15/30 across small, medium, and large follow graphs) confirmed 20 as the practical middle ground:

Small graphs (<200 follows): Saturate at 10–15 relays; cap@20 is more than sufficient.
Medium graphs (300–500 follows): Benefit most from higher caps. cap@10 → cap@30 gained +17pp for a 399-follow profile, with most gains realized by cap@20.
Large graphs (>1000 follows): Gains from more connections are real but noisy — session-to-session variance dominates beyond cap@20.

cap@30 provides marginal additional gains for medium graphs but increases connection overhead for all profiles. cap@10 loses roughly 6–15pp vs cap@20 for non-trivial follow lists (estimates; cross-session variance is high). 20 balances relay diversity against connection cost.
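To make the trade-off concrete, here is a small sketch of applying a cap to an already ranked relay list and measuring how many authors remain reachable. `capRelays` and `authorCoverage` are hypothetical helpers, not part of NDK, and author coverage is only a structural proxy: it is not the event-recall metric used in the benchmarks.

```typescript
// Hypothetical helpers for reasoning about a cap; not NDK's API.

// Keep at most `maxOutboxRelays` unique relays, preserving rank order.
function capRelays(rankedRelays: string[], maxOutboxRelays: number): string[] {
    const unique = Array.from(new Set(rankedRelays)); // Set preserves first-seen order
    return unique.slice(0, Math.max(0, maxOutboxRelays));
}

// Fraction of authors with at least one write relay among the selected relays.
function authorCoverage(authorRelays: Map<string, string[]>, selected: string[]): number {
    if (authorRelays.size === 0) return 1;
    const chosen = new Set(selected);
    let covered = 0;
    authorRelays.forEach((relays) => {
        if (relays.some((r) => chosen.has(r))) covered++;
    });
    return covered / authorRelays.size;
}
```

Evaluating `authorCoverage` at several caps for a given follow list is a cheap way to see where the curve flattens before committing to a value.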

Usage

import NDK from '@nostr-dev-kit/ndk';

const ndk = new NDK({
    explicitRelayUrls: ['wss://relay.damus.io', 'wss://nos.lol'],
    enableThompsonSampling: true,                    // learn relay reliability from delivery outcomes
    maxOutboxRelays: 20,                             // cap on unique outbox relays
    nip66MonitorRelays: ['wss://relay.nostr.watch'], // source of NIP-66 liveness data
});

// Persist scores across sessions
const saved = localStorage.getItem('thompson-priors');
if (saved) ndk.thompsonSampler!.importPriors(JSON.parse(saved));
window.addEventListener('beforeunload', () => {
    localStorage.setItem('thompson-priors',
        JSON.stringify(ndk.thompsonSampler!.exportPriors()));
});

Benchmark evidence

Plain Thompson maximizes mean recall but regresses fiatjaf (-18pp) due to NDK's priority cascade. CG3 eliminates that regression while matching or improving every individual profile — it is Pareto-superior within the same benchmark conditions:

| Algorithm | Event Recall | Notes |
|---|---|---|
| NDK + Thompson | 22.0% | 30-run measurement; regresses fiatjaf |
| NDK + Thompson + CG3 | 26.9% (+4.9pp) | No per-profile regressions; Pareto-superior |

Both figures come from the same 30-run measurement (6 EN profiles × 1yr × 5 sessions, NIP-66 liveness, cap@20). CG3 beats plain Thompson on all 6 profiles. NDK without Thompson scores ~16–22% depending on conditions (see OUTBOX-REPORT §8.2).
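For readers comparing the headline numbers, the arithmetic behind the "+4.9pp" figure is worked out here. The relative-gain figure is derived only from the two recall values in this PR and is not a separately reported result.

$$
26.9\% - 22.0\% = 4.9\ \text{pp}, \qquad \frac{26.9 - 22.0}{22.0} \approx 0.223
$$

So the absolute gain is 4.9 percentage points, which corresponds to roughly a 22% relative improvement over plain Thompson.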

NIP-66 liveness filtering reduced median loading times by ~45% by eliminating timeout waits on dead relays.

See OUTBOX-REPORT §8.5c.

Test plan

41 new unit + integration tests, 68 existing tests in modified areas — all pass, 0 regressions
Integration: Thompson priors populate after EOSE, with sole-source 0.3× weighting
Integration: maxOutboxRelays: 5 returns ≤5 unique relay URLs
Integration: NIP-66 passes through on empty/stale monitor data

Closes #385
Closes #386

🤖 Generated with Claude Code

… utility

- NIP-66: filter dead relays using kind-30166 monitor data before outbox selection. Auto-refreshes on each relay selection round (non-blocking). Graceful degradation: passes through on stale/insufficient data, .onion always passes, never orphans an author.
- maxOutboxRelays: hard cap on unique outbox relay count, enforced for all code paths including the missing-relay fallback. Works independently of Thompson. Benchmark data shows 20 covers 93-97% of authors.
- sample-beta: Beta distribution sampling (Jöhnk + Marsaglia-Tsang). Returns 0.5 on invalid inputs instead of throwing.

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ntee

- Thompson Sampling: Bayesian relay scoring using Beta(α,β) priors. Learns from delivery outcomes observed at subscription EOSE (closeOnEose only). Scores are sampled once per relay per round to ensure stable sort order. Weighted by relay author count: (1 + ln(N)) × sampleBeta(α,β).
- CG3 (Coverage Guarantee v3): force-selects sole-source relays before the main selection loop. Uses 0.3× observation weight to prevent over-crediting sole-source deliveries. Conditional skip when sole-source count exceeds budget.
- Delivery observation: tracks which relays delivered events for which authors (including duplicate deliveries), observes hit/miss at EOSE. Skips inactive authors (P4), deduplicates per-round (P8), non-blocking (P10).

All features opt-in via NDK constructor options.

Closes nostr-dev-kit#385
Closes nostr-dev-kit#386

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

41 tests covering:

- sample-beta: uniform prior, known distributions, invalid input handling
- NIP-66: pass-through on stale/empty data, dead relay filtering, .onion bypass
- Thompson: weighted scores, observation updates, dedup, decay, export/import
- Coverage guarantee: force-selection, budget skip/cap, ordering
- Integration: Thompson priors at EOSE, sole-source 0.3× weight, maxOutboxRelays cap, NIP-66 outbox pass-through

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

alltheseas changed the title ~~feat(outbox): add Thompson Sampling (CG3) + NIP-66 liveness filtering~~ outbox: improve coverage by +35% thompson learning, reduce load times by 40% nip-66 Mar 11, 2026

alltheseas force-pushed the feat/thompson-cg3-nip66 branch from b037628 to 8a2acaa on March 11, 2026 06:57

alltheseas changed the title ~~outbox: improve coverage by +35% thompson learning, reduce load times by 40% nip-66~~ outbox: improve coverage by +35%, reduce load times by 40% nip-66 Mar 11, 2026

alltheseas force-pushed the feat/thompson-cg3-nip66 branch 2 times, most recently from d64759c to 6c7735f on March 11, 2026 07:14

alltheseas and others added 3 commits March 11, 2026 02:19

alltheseas force-pushed the feat/thompson-cg3-nip66 branch from 6c7735f to 8cb4a88 on March 11, 2026 07:20

alltheseas changed the title ~~outbox: improve coverage by +35%, reduce load times by 40% nip-66~~ outbox: add Thompson+CG3 (+4.9pp recall), NIP-66 filtering (~45% faster), connection cap Mar 11, 2026

alltheseas wants to merge 3 commits into nostr-dev-kit:master from alltheseas:feat/thompson-cg3-nip66
