outbox: add Thompson+CG3 (+4.9pp recall), NIP-66 filtering (~45% faster), connection cap #387
Open
alltheseas wants to merge 3 commits into nostr-dev-kit:master
… utility

- NIP-66: filter dead relays using kind-30166 monitor data before outbox selection. Auto-refreshes on each relay selection round (non-blocking). Graceful degradation: passes through on stale/insufficient data, .onion always passes, never orphans an author.
- maxOutboxRelays: hard cap on unique outbox relay count, enforced for all code paths including the missing-relay fallback. Works independently of Thompson. Benchmark data shows 20 covers 93-97% of authors.
- sample-beta: Beta distribution sampling (Jöhnk + Marsaglia-Tsang). Returns 0.5 on invalid inputs instead of throwing.

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
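For readers unfamiliar with the two named algorithms, here is a minimal sketch of a Beta sampler that combines Jöhnk's method with Marsaglia-Tsang gamma sampling and returns 0.5 on invalid input. It is not the PR's implementation: the function names, the injectable `rng` parameter, and the rule for switching between the two methods (Jöhnk only when both shape parameters are below 1) are assumptions made for illustration.

```typescript
// Illustrative sketch only; names and the method-switching rule are assumptions.
type Rng = () => number; // uniform on [0, 1)

// Standard normal via Box-Muller. 1 - rng() lies in (0, 1], so the log is finite.
function sampleNormal(rng: Rng): number {
    const u = 1 - rng();
    const v = rng();
    return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Gamma(shape, 1) via Marsaglia-Tsang; shapes below 1 use the usual boost
// Gamma(a) = Gamma(a + 1) * U^(1/a).
function sampleGamma(shape: number, rng: Rng): number {
    if (shape < 1) {
        const u = 1 - rng();
        return sampleGamma(shape + 1, rng) * Math.pow(u, 1 / shape);
    }
    const d = shape - 1 / 3;
    const c = 1 / Math.sqrt(9 * d);
    for (;;) {
        let x = 0;
        let v = 0;
        do {
            x = sampleNormal(rng);
            v = 1 + c * x;
        } while (v <= 0);
        v = v * v * v;
        const u = 1 - rng();
        // Cheap squeeze test first, exact log test second.
        if (u < 1 - 0.0331 * x * x * x * x) return d * v;
        if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
    }
}

function sampleBeta(alpha: number, beta: number, rng: Rng = Math.random): number {
    // Invalid parameters (non-positive, NaN, infinite) yield 0.5 instead of throwing.
    if (!(alpha > 0) || !(beta > 0) || !isFinite(alpha) || !isFinite(beta)) return 0.5;

    if (alpha < 1 && beta < 1) {
        // Jöhnk: accept (X, Y) = (U^(1/alpha), V^(1/beta)) when X + Y <= 1.
        for (;;) {
            const x = Math.pow(rng(), 1 / alpha);
            const y = Math.pow(rng(), 1 / beta);
            const s = x + y;
            if (s > 0 && s <= 1) return x / s;
        }
    }

    // Otherwise use the ratio of two gamma variates.
    const ga = sampleGamma(alpha, rng);
    const gb = sampleGamma(beta, rng);
    const total = ga + gb;
    return total > 0 ? ga / total : 0.5;
}
```

Passing the random source as a parameter keeps the sampler testable with a seeded generator; whether the PR does the same is not stated.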
…ntee

- Thompson Sampling: Bayesian relay scoring using Beta(α,β) priors. Learns from delivery outcomes observed at subscription EOSE (closeOnEose only). Scores are sampled once per relay per round to ensure stable sort order. Weighted by relay author count: (1 + ln(N)) × sampleBeta(α,β).
- CG3 (Coverage Guarantee v3): force-selects sole-source relays before the main selection loop. Uses 0.3× observation weight to prevent over-crediting sole-source deliveries. Conditional skip when sole-source count exceeds budget.
- Delivery observation: tracks which relays delivered events for which authors (including duplicate deliveries), observes hit/miss at EOSE. Skips inactive authors (P4), deduplicates per-round (P8), non-blocking (P10).

All features opt-in via NDK constructor options.

Closes nostr-dev-kit#385
Closes nostr-dev-kit#386

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
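The scoring rule in this commit message, (1 + ln(N)) × sampleBeta(α,β) with one sample per relay per round, can be sketched as below. This is a simplified illustration, not the PR's code: the type and function names are hypothetical, the Beta(1,1) starting prior is an assumption, and the sampler is passed in so the example stays self-contained.

```typescript
// Illustrative sketch; names and the Beta(1,1) default prior are assumptions.
interface RelayPrior {
    alpha: number; // pseudo-count of observed deliveries (hits)
    beta: number;  // pseudo-count of observed misses
}

type BetaSampler = (alpha: number, beta: number) => number;

// Update a prior from one hit/miss observation. `weight` scales the update,
// e.g. 0.3 for sole-source deliveries as described for CG3.
function observe(prior: RelayPrior, delivered: boolean, weight = 1): RelayPrior {
    return delivered
        ? { alpha: prior.alpha + weight, beta: prior.beta }
        : { alpha: prior.alpha, beta: prior.beta + weight };
}

// Draw exactly one score per relay for this round, so that sorting compares
// fixed numbers instead of re-sampling inside the comparator.
function scoreRelays(
    authorCounts: Map<string, number>,
    priors: Map<string, RelayPrior>,
    sample: BetaSampler,
): Map<string, number> {
    const scores = new Map<string, number>();
    authorCounts.forEach((n, url) => {
        const prior = priors.get(url) ?? { alpha: 1, beta: 1 };
        const popularity = 1 + Math.log(Math.max(n, 1));
        scores.set(url, popularity * sample(prior.alpha, prior.beta));
    });
    return scores;
}

// Rank relay URLs by their pre-sampled scores, highest first.
function rankRelays(scores: Map<string, number>): string[] {
    return Array.from(scores.keys()).sort(
        (a, b) => (scores.get(b) ?? 0) - (scores.get(a) ?? 0),
    );
}
```

Sampling once and sorting on the stored values matters because a comparator that re-samples on every call can return inconsistent orderings for the same pair of relays.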
41 tests covering:

- sample-beta: uniform prior, known distributions, invalid input handling
- NIP-66: pass-through on stale/empty data, dead relay filtering, .onion bypass
- Thompson: weighted scores, observation updates, dedup, decay, export/import
- Coverage guarantee: force-selection, budget skip/cap, ordering
- Integration: Thompson priors at EOSE, sole-source 0.3× weight, maxOutboxRelays cap, NIP-66 outbox pass-through

Signed-off-by: alltheseas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This was referenced Mar 13, 2026
Summary
NDK's outbox relay selection doesn't learn from delivery outcomes, can't filter dead relays, and has no connection cap. This PR adds three opt-in enhancements: Thompson Sampling with CG3 coverage guarantee (+4.9pp event recall over plain Thompson, 22.0% → 26.9%), NIP-66 dead relay filtering (~45% faster loading), and a configurable connection cap. Benchmarked across 6 profiles in the nostrability/outbox suite.
All features are opt-in. Default behavior is unchanged when no new options are set.
What's added
Thompson Sampling (enableThompsonSampling: true)
Relays vary dramatically in reliability, but NDK's static popularity ranking can't distinguish a relay that delivers 100% of the time from one that delivers 10%. Thompson Sampling fixes this by maintaining Bayesian priors per relay, updated from binary hit/miss observations at each closeOnEose subscription EOSE. Relays that consistently deliver get selected more; unreliable ones get deprioritized while still being explored occasionally.
CG3 coverage guarantee (default on when Thompson enabled): Pareto-superior to plain Thompson
Plain Thompson can over-penalize relays that serve hard-to-reach authors. Authors with only one write relay are especially vulnerable: if their sole relay scores poorly on other authors, Thompson deprioritizes it — orphaning them entirely. CG3 addresses this by force-selecting sole-source relays before the main selection loop, with 0.3× observation weight to avoid distorting scores. The resulting budget reallocation also improves recall for multi-relay authors whose relays overlap with sole-source ones. In benchmarks, CG3 is Pareto-superior to plain Thompson across all tested profiles.
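A minimal sketch of the force-selection step, assuming relays have already been ranked by their Thompson scores. The function name and data shapes are hypothetical, and the reading of the budget rule (skip force-selection entirely when there are more sole-source relays than budget) is one interpretation of the commit message, not confirmed behavior.

```typescript
// Illustrative sketch; names, shapes, and the skip interpretation are assumptions.
function selectWithCoverageGuarantee(
    authorRelays: Map<string, string[]>, // author pubkey -> write relay URLs
    ranked: string[],                    // relay URLs, best score first
    budget: number,                      // maximum relays to select
): string[] {
    // A relay is sole-source if some author lists it as their only write relay.
    const soleSource: string[] = [];
    authorRelays.forEach((relays) => {
        if (relays.length === 1 && soleSource.indexOf(relays[0]) === -1) {
            soleSource.push(relays[0]);
        }
    });

    // Force-select sole-source relays first, unless they alone exceed the budget.
    const selected = soleSource.length <= budget ? soleSource.slice() : [];

    // Fill the remaining budget from the score-ranked list.
    for (const url of ranked) {
        if (selected.length >= budget) break;
        if (selected.indexOf(url) === -1) selected.push(url);
    }
    return selected;
}
```

In this sketch a poorly scored relay such as the sole write relay of a single author is still selected whenever the budget allows, which is what keeps that author reachable.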
NIP-66 liveness filtering (nip66MonitorRelays): ~45% faster loading, removes dead relays
A significant fraction of relays in kind-10002 lists are permanently offline. Connecting to dead relays wastes connection slots and forces subscriptions to wait for timeouts before EOSE, directly inflating load times. NIP-66 filtering eliminates these dead relays before selection, cutting ~45% off loading times in benchmarks. It degrades gracefully: filtering is skipped entirely if monitor data is stale (>4h), incomplete (<100 relays), or would orphan an author. .onion relays always pass through.
Connection cap (maxOutboxRelays): works independently
Caps total unique outbox relays selected. Recommended: 20. Adaptive connection limit benchmarks (cap@10/15/30 across small, medium, and large follow graphs) confirmed 20 as the practical middle ground:
cap@30 provides marginal additional gains for medium graphs but increases connection overhead for all profiles. cap@10 loses roughly 6–15pp vs cap@20 for non-trivial follow lists (estimates; cross-session variance is high). 20 balances relay diversity against connection cost.
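The cap itself amounts to keeping at most maxOutboxRelays unique relay URLs from a ranked list. A minimal sketch under that reading follows; the function name and the relay-to-authors map shape are hypothetical and do not describe NDK's internal structures.

```typescript
// Illustrative sketch; the function name and data shape are assumptions.
function capOutboxRelays(
    relayToAuthors: Map<string, string[]>, // relay URL -> authors routed to it
    ranked: string[],                      // relay URLs, most preferred first
    maxOutboxRelays: number,
): Map<string, string[]> {
    const capped = new Map<string, string[]>();
    for (const url of ranked) {
        if (capped.size >= maxOutboxRelays) break;
        const authors = relayToAuthors.get(url);
        // Skip unknown URLs and duplicates so the cap counts unique relays only.
        if (authors !== undefined && !capped.has(url)) {
            capped.set(url, authors);
        }
    }
    return capped;
}
```

Because the cap is applied to whatever ranking it is given, it works the same with Thompson scores or with a static popularity order.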
Usage
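A minimal sketch of opting in to all three features, using the option names given in this description. The interface below is a local stand-in rather than NDK's actual type, the `string[]` type for nip66MonitorRelays is an assumption, and the monitor URL is a placeholder.

```typescript
// Local stand-in for the relevant constructor options; not NDK's real type.
interface OutboxOptions {
    enableThompsonSampling?: boolean; // turns on Thompson Sampling (CG3 defaults on with it)
    nip66MonitorRelays?: string[];    // assumed: relay URLs serving kind-30166 monitor data
    maxOutboxRelays?: number;         // hard cap on unique outbox relays
}

const outboxOptions: OutboxOptions = {
    enableThompsonSampling: true,
    nip66MonitorRelays: ["wss://monitor.example.com"], // placeholder URL
    maxOutboxRelays: 20, // the recommended cap from the benchmarks
};

// Assumed usage, since the features are described as NDK constructor options:
// const ndk = new NDK({ ...outboxOptions });
```

Omitting all three fields leaves the default behavior unchanged, as stated in the summary.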
Benchmark evidence
Plain Thompson maximizes mean recall but regresses fiatjaf (-18pp) due to NDK's priority cascade. CG3 eliminates that regression while matching or improving every individual profile — it is Pareto-superior within the same benchmark conditions:
Both figures from the same 30-run measurement (6 EN profiles × 1yr × 5 sessions, NIP-66 liveness, cap@20). CG3 beats plain Thompson on all 6 profiles. NDK without Thompson scores ~16–22% depending on conditions (see OUTBOX-REPORT §8.2).
NIP-66 liveness filtering reduced median loading times by ~45% by eliminating timeout waits on dead relays.
See OUTBOX-REPORT §8.5c.
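The graceful-degradation rules for NIP-66 filtering (pass through on stale or insufficient monitor data, always keep .onion relays, never orphan an author) can be sketched per author as below. The snapshot shape, constant names, and function names are hypothetical. Treating "<100 relays" as the size of the live-relay set and applying the orphan rule per author are both assumptions.

```typescript
// Illustrative sketch; shapes, names, and the per-author orphan rule are assumptions.
interface MonitorSnapshot {
    fetchedAt: number;       // ms since epoch when monitor data was last refreshed
    liveRelays: Set<string>; // relay URLs reported alive by kind-30166 events
}

const MAX_SNAPSHOT_AGE_MS = 4 * 60 * 60 * 1000; // stale after 4 hours
const MIN_SNAPSHOT_RELAYS = 100;                // below this, data counts as incomplete

// True when the URL's host ends in .onion (optionally followed by a port).
function isOnionRelay(url: string): boolean {
    return /^wss?:\/\/[^/]*\.onion(?::\d+)?(?:\/|$)/i.test(url);
}

function filterLiveRelays(
    authorRelays: string[],
    snapshot: MonitorSnapshot | undefined,
    now: number,
): string[] {
    // Pass through when monitor data is missing, stale, or too small to trust.
    if (snapshot === undefined) return authorRelays;
    if (now - snapshot.fetchedAt > MAX_SNAPSHOT_AGE_MS) return authorRelays;
    if (snapshot.liveRelays.size < MIN_SNAPSHOT_RELAYS) return authorRelays;

    const kept = authorRelays.filter(
        (url) => isOnionRelay(url) || snapshot.liveRelays.has(url),
    );

    // Never orphan an author: if every relay was filtered out, keep the original list.
    return kept.length > 0 ? kept : authorRelays;
}
```

Every failure mode falls back to the unfiltered list, so a broken or unreachable monitor can only cost speed, not coverage.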
Test plan
maxOutboxRelays: 5 returns ≤5 unique relay URLs
Closes #385
Closes #386
🤖 Generated with Claude Code