Not a first step. Layer 0+1 (liveness + scope) must ship and prove the pattern first. See #14. Recall is documented here so people can see where this goes, not because anyone should build it today.
What recall measures
Liveness answers: "is this relay up?" Recall answers: "does this relay actually deliver the events it should have?"
A client that tracks per-relay results can compute: "I expected 47 events from relay X, I got 38." That's a recall of 38/47 = 80.9%.
Published event
```json
{
  "kind": 30166,
  "tags": [
    ["d", "wss://relay.example.com/"],
    ["scope", "follow-graph"],
    ["recall", "38", "47", "604800"]
  ],
  "content": "",
  "created_at": 1234567890
}
```
`["recall", "<received>", "<expected>", "<window_seconds>"]`: in the example above, "over the last 7 days (604800 seconds), I got 38 of 47 expected events from this relay."
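To make the tag layout concrete, here is a minimal sketch of building and reading a `recall` tag. The names `RecallObservation`, `toRecallTag`, and `recallRatio` are hypothetical, and the validation rules (integer counts, received never above expected) are assumptions rather than anything this proposal specifies. Only the tag order (received, expected, window in seconds) is taken from the example event.

```typescript
// Hypothetical helpers; only the tag order comes from the example event.
type RecallTag = ["recall", string, string, string];

interface RecallObservation {
  received: number;      // events this relay actually delivered
  expected: number;      // events the client believes it should have delivered
  windowSeconds: number; // observation window, e.g. 604800 for 7 days
}

// Serialize an observation into the tag form used in the example event.
function toRecallTag(obs: RecallObservation): RecallTag {
  const { received, expected, windowSeconds } = obs;
  // Assumption: counts are non-negative integers and received <= expected.
  if (
    !Number.isInteger(received) || !Number.isInteger(expected) ||
    received < 0 || expected < received || windowSeconds <= 0
  ) {
    throw new RangeError("invalid recall observation");
  }
  return ["recall", String(received), String(expected), String(windowSeconds)];
}

// Read the ratio back out of a tag; null when it cannot be computed.
function recallRatio(tag: RecallTag): number | null {
  const received = Number(tag[1]);
  const expected = Number(tag[2]);
  if (!Number.isFinite(received) || !Number.isFinite(expected) || expected <= 0) {
    return null;
  }
  return received / expected;
}
```

Publishing the two raw counts rather than a precomputed percentage lets a reader weigh a report by sample size: 38 of 47 says more than 4 of 5.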
Why it's hard
Most clients today merge relay results into a single feed stream. To compute recall, a client needs to:
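```typescript
// Hypothetical pool interface: assumes the pool reports which relay delivered each event.
declare const pool: {
  on(type: "event", handler: (relay: string, event: { id: string }) => void): void;
};

// 1. Track which events came from which relay
const perRelayEvents = new Map<string, Set<string>>();
pool.on("event", (relay, event) => {
  const set = perRelayEvents.get(relay) ?? new Set<string>();
  set.add(event.id);
  perRelayEvents.set(relay, set);
});

// Call this later, once events have arrived (running it at startup sees empty maps).
function computeRecall(): Map<string, number> {
  // 2. Compute baseline: the union of event ids seen across ALL relays
  //    (you need to query multiple relays to know what you're missing)
  const allEventIds = new Set<string>();
  for (const ids of perRelayEvents.values()) {
    for (const id of ids) allEventIds.add(id);
  }

  // 3. Per-relay recall = what fraction of the baseline did this relay deliver?
  const recallByRelay = new Map<string, number>();
  for (const [relay, ids] of perRelayEvents) {
    // e.g. 38 of 47 known events from "wss://relay.example.com/" gives ~0.809
    recallByRelay.set(relay, allEventIds.size > 0 ? ids.size / allEventIds.size : 0);
  }
  return recallByRelay;
}
```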
This is an architectural change — not a feature flag. The client must:
- Maintain per-relay event tracking (step 1)
- Query multiple relays for the same pubkeys to establish a baseline (step 2)
- Periodically compute and publish recall (step 3)
Most clients don't do step 1 at all today. Step 2 means intentionally querying relays you wouldn't otherwise connect to, just to know what's missing.
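The published tag carries a time window, which the tracking logic also has to respect. The following is a rough sketch of one way to do that, not a prescribed design: `RecallTracker`, `record`, and `snapshot` are hypothetical names, and filtering by the event's own `created_at` (rather than by arrival time) is an assumption made so that the same event falls inside the same window for every relay.

```typescript
// Sketch only: class and method names are hypothetical.
interface RecallCounts {
  received: number; // events from this relay inside the window
  expected: number; // size of the cross-relay baseline inside the window
}

class RecallTracker {
  // relay URL -> (event id -> the event's created_at, in unix seconds)
  private seen = new Map<string, Map<string, number>>();

  // Step 1: remember which relay delivered which event.
  record(relay: string, eventId: string, createdAt: number): void {
    let events = this.seen.get(relay);
    if (!events) {
      events = new Map<string, number>();
      this.seen.set(relay, events);
    }
    events.set(eventId, createdAt);
  }

  // Steps 2 and 3: baseline and per-relay counts over [now - windowSeconds, now].
  snapshot(now: number, windowSeconds: number): Map<string, RecallCounts> {
    const cutoff = now - windowSeconds;
    const baseline = new Set<string>();
    const inWindow = new Map<string, number>();

    for (const [relay, events] of this.seen) {
      let count = 0;
      for (const [id, createdAt] of events) {
        if (createdAt >= cutoff && createdAt <= now) {
          count++;
          baseline.add(id); // union across all relays
        }
      }
      inWindow.set(relay, count);
    }

    const result = new Map<string, RecallCounts>();
    for (const [relay, received] of inWindow) {
      result.set(relay, { received, expected: baseline.size });
    }
    return result;
  }
}
```

Note that the baseline here is only what the client has observed across the relays it queried. An event that no queried relay returned is invisible to it, which is why step 2 needs those extra relays.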
Why it must come from clients, not a service
Publishing recall from a centralized benchmark recreates the "trusted third party" problem this whole approach is designed to eliminate. Each user's recall observations come from their own follow graph, their own relay connections, their own vantage point. That's the point.
What it enables
Recall is the signal that separates "alive but useless" from "alive and delivering." A relay can respond to pings and WebSocket connections (passes liveness) but drop 40% of events (fails recall). Only clients who actually use the relay for real queries can observe this.
Combined with web of trust (WoT): if 15 people you follow report 90%+ recall for a relay, that's a strong signal. If 15 people report <50% recall, something is wrong, even if the relay passes every liveness check.
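One way a client might fold these reports together is sketched below. The names `RecallReport` and `summarizeReports` are hypothetical, and counting at most one report per followed pubkey is an assumption. The 90% and 50% thresholds are the ones used in the paragraph above, not values defined by the proposal.

```typescript
// Sketch only: names are hypothetical, thresholds mirror the prose above.
interface RecallReport {
  reporter: string; // pubkey of the user who published the report
  received: number;
  expected: number;
}

interface RecallSummary {
  counted: number;  // usable reports from followed pubkeys
  healthy: number;  // reports with recall >= 90%
  degraded: number; // reports with recall < 50%
}

function summarizeReports(reports: RecallReport[], followed: Set<string>): RecallSummary {
  const summary: RecallSummary = { counted: 0, healthy: 0, degraded: 0 };
  const seenReporters = new Set<string>();

  for (const report of reports) {
    // Ignore strangers, empty baselines, and repeat reports from one pubkey.
    if (!followed.has(report.reporter)) continue;
    if (report.expected <= 0) continue;
    if (seenReporters.has(report.reporter)) continue;
    seenReporters.add(report.reporter);

    const ratio = report.received / report.expected;
    summary.counted++;
    if (ratio >= 0.9) summary.healthy++;
    else if (ratio < 0.5) summary.degraded++;
  }
  return summary;
}
```

A client could then treat a relay as suspect when `degraded` dominates `counted`, regardless of what its liveness checks say.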