Problem
Algorithm recall drops from 83% at 7 days to 24% at 1 year. Relay retention is not the main cause — median relay event availability at 1yr is 77% (measured across 277 relays, 42 cache files, 4 profiles via bench/retention-profile.ts).
The dominant factor is relay discovery: a user's current kind-10002 (NIP-65) lists where they write now, not where they wrote a year ago. When users change relays, the link to their old content is lost.
Evidence
- Relay-side drop (7d→1yr): ~20% relative (e.g. relay.damus.io 94%→76%, nos.lol 95%→77%)
- Algorithm-side drop (7d→1yr): ~70% relative (83%→24%)
- Relay retention explains ~1/4 of the recall collapse
- Historical kind-10002 events are gone — probed 6 relays, all correctly discard old versions per NIP-01 replaceable event semantics
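The ~1/4 figure can be reproduced from the numbers above. This is a rough sketch that assumes the drops are relative (fraction of the 7-day value lost by 1 year) and uses relay.damus.io as the representative relay; `relativeDrop` is a helper name introduced here for illustration.

```typescript
// Fraction of the starting value that is lost between two measurements.
function relativeDrop(before: number, after: number): number {
  return (before - after) / before;
}

// relay.damus.io availability: 94% at 7d, 76% at 1yr
const relayDrop = relativeDrop(94, 76); // ≈ 0.19

// Algorithm recall: 83% at 7d, 24% at 1yr
const algoDrop = relativeDrop(83, 24); // ≈ 0.71

// Share of the recall collapse attributable to relay retention
const share = relayDrop / algoDrop; // ≈ 0.27, i.e. roughly 1/4

console.log(relayDrop.toFixed(2), algoDrop.toFixed(2), share.toFixed(2));
```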
Why existing mechanisms don't help
- Kind-10002 is replaceable: old relay lists are discarded. No relay archives them.
- NIP-66 (current): monitors relay liveness and capabilities, but not retention depth.
- Hardcoded archival relay lists: centralizing and fragile.
Proposed solutions
Two independent, complementary protocol changes:
1. NIP-65 archive marker (author-level relay history)
PR: nostr-protocol/nips#2243
When a user migrates from relay A to relay B, their client marks relay A as archive instead of removing it. Clients fetching historical content check archive-tagged relays.
["r", "wss://old-relay.example.com", "archive"]
- No new event kinds, no NIP-01 changes
- Backward compatible: clients that don't understand archive ignore unknown markers
- Opt-in: clients SHOULD NOT add archive tags without user consent
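A minimal sketch of how a client might read the proposed marker from a kind-10002 event. It assumes the standard NIP-65 convention that an "r" tag without a marker means both read and write; the `archive` marker is the proposal in this issue, and `parseRelayList` and `relaysForAuthor` are hypothetical helper names.

```typescript
type Tag = string[];

interface RelayList {
  read: string[];
  write: string[];
  archive: string[];
}

// Sort the "r" tags of a kind-10002 event into read/write/archive buckets.
function parseRelayList(tags: Tag[]): RelayList {
  const out: RelayList = { read: [], write: [], archive: [] };
  for (const [name, url, marker] of tags) {
    if (name !== "r" || !url) continue;
    if (marker === undefined) {
      // No marker: the relay is used for both reading and writing.
      out.read.push(url);
      out.write.push(url);
    } else if (marker === "read") {
      out.read.push(url);
    } else if (marker === "write") {
      out.write.push(url);
    } else if (marker === "archive") {
      // Proposed marker: the author used to write here.
      out.archive.push(url);
    }
    // Any other marker is ignored, which keeps older clients compatible.
  }
  return out;
}

// Relays to query for an author's posts. Historical fetches also
// include archive-tagged relays; recent fetches use write relays only.
function relaysForAuthor(tags: Tag[], historical: boolean): string[] {
  const list = parseRelayList(tags);
  return historical ? [...list.write, ...list.archive] : list.write;
}
```

With this shape, a client that only wants recent notes never touches archive relays, so the extra tags cost nothing on the common path.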
2. NIP-66 retention-depth tag (relay-level retention discovery)
Monitors probe relays at increasing time depths and publish results:
["retention-depth", "kind:1", "31536000"]
Meaning: "this relay returned kind-1 events at least 1yr old."
- No baseline computation needed (unlike recall-at-window)
- Any NIP-66 monitor can add this today
- bench/retention-profile.ts is the offline proof-of-concept
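As an illustration of the monitor side, the sketch below turns a set of probe outcomes into the proposed tag. The `ProbeResult` shape, the probe depths, and `retentionDepthTag` are assumptions made for this example; they are not taken from bench/retention-profile.ts or from NIP-66.

```typescript
// Hypothetical probe outcome: did the relay return any event of the
// probed kind that is at least `ageSeconds` old?
interface ProbeResult {
  ageSeconds: number;
  found: boolean;
}

const DAY = 86400;

// Illustrative probe depths: 7 days, 30 days, 90 days, 1 year.
const PROBE_DEPTHS = [7 * DAY, 30 * DAY, 90 * DAY, 365 * DAY];

// Build a retention-depth tag from the deepest successful probe.
// Returns null when no probe found anything, so no tag is published.
function retentionDepthTag(kind: number, probes: ProbeResult[]): string[] | null {
  let deepest = 0;
  for (const p of probes) {
    if (p.found && p.ageSeconds > deepest) deepest = p.ageSeconds;
  }
  if (deepest === 0) return null;
  return ["retention-depth", `kind:${kind}`, String(deepest)];
}
```

Because the tag records only the deepest depth at which events were observed, a monitor needs no per-author baseline, which matches the first bullet above.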
How they complement each other
| Scenario | archive | retention-depth |
| --- | --- | --- |
| Author migrated, old relay has events | Finds old relay directly | Doesn't know author was there |
| Author stayed, current relay pruned | No archive tags to follow | Flags relay as low-retention |
| Client doesn't support archive | No help | Still guides relay selection |
Either alone improves recall. Together they close the gap from both sides. Both are independent and can ship separately.
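One way a client could combine the two signals when fetching old content is sketched below. The scoring weights are arbitrary, `rankHistoricalRelays` is a hypothetical name, and the sketch assumes retention depths per relay have already been collected from monitor events into a plain map.

```typescript
// Order an author's relays for a historical fetch.
// relayTags: "r" tags from the author's kind-10002 event.
// retentionDepth: relay URL -> deepest observed retention in seconds.
// targetAgeSeconds: how old the content being looked for is.
function rankHistoricalRelays(
  relayTags: string[][],
  retentionDepth: Record<string, number>,
  targetAgeSeconds: number,
): string[] {
  const scored: { url: string; score: number }[] = [];
  for (const [name, url, marker] of relayTags) {
    if (name !== "r" || !url) continue;
    // Keep unmarked, write, and archive relays; skip read-only and unknown markers.
    if (marker !== undefined && marker !== "write" && marker !== "archive") continue;
    let score = 0;
    if (marker === "archive") score += 2; // author-level hint
    if ((retentionDepth[url] ?? 0) >= targetAgeSeconds) score += 1; // relay-level hint
    scored.push({ url, score });
  }
  scored.sort((a, b) => b.score - a.score);
  return scored.map((s) => s.url);
}
```

Either signal alone still produces a usable ordering: with no archive tags the ranking falls back to retention depth, and with no monitor data it falls back to archive tags.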
Data
Full retention profile: deno task retention (zero network, reads from phase2 cache)
Top relays by 1yr availability (≥20 authors):
- relay.ditto.pub: 98% (61 authors)
- nostr-pub.wellorder.net: 93% (289 authors)
- nostr21.com: 86% (56 authors)
- nos.lol: 77% (1295 authors)
- relay.damus.io: 76% (1451 authors)
Related