ceac: skip peer-cache rows outside the verdict lookback window by catoneone · Pull Request #540 · AffineFoundation/affine-cortex

catoneone · 2026-06-09T11:31:11Z

Why

`scores_index` has no TTL — it accumulates rows for every miner
(hotkey, revision) that ever ran through anti-copy, including
deregistered ones. Observed on the validator at the time of writing:

- `tick: 10/652 rows pending` → 652 R2 blob fetches
- ~4 MB / blob → 2.6 GB of R2 I/O on every refresh-service
  restart
- ~70 % of those rows are older than `verdict_lookback_days`, and
  `_pick_origin` already filters them out during the verdict math

This was making the post-restart "build peer cache" step take
10-15+ minutes on a large table, blocking the first batch of
verdicts behind it.
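
For reference, the I/O figures above follow from simple arithmetic. The snippet below is only a back-of-the-envelope check using the approximate numbers quoted in this description; the variable names are illustrative and do not come from the codebase.

```python
# Approximate figures quoted in the description above.
rows = 652             # rows reported by the tick log
mb_per_blob = 4        # ~4 MB per R2 blob
stale_fraction = 0.70  # ~70 % of rows older than verdict_lookback_days

total_mb = rows * mb_per_blob         # 2608 MB fetched per restart
total_gb = total_mb / 1000            # roughly 2.6 GB
stale_gb = total_gb * stale_fraction  # share spent on rows that are filtered anyway

print(f"{total_gb:.1f} GB per restart, ~{stale_gb:.1f} GB of it on stale rows")
```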

What

Compute a single floor up front:

```
peer_fb_floor = min(pending.first_block) - lookback_blocks
```

`min()` (not `max()`) anchors off the earliest pending candidate, so
the resulting floor is valid for every pending row this tick will
judge. Rows whose `first_block` is below `peer_fb_floor` are skipped
before the R2 fetch. The log line now reports both the cache size and
the skip count so trim depth is visible per tick:

```
[anticopy.verdict] peer cache built: 192 blobs (skipped 461 outside 7d lookback)
```
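
To illustrate why the floor is anchored on `min()` rather than `max()`, here is a toy example with made-up block numbers. It assumes each candidate's window is simply "peers with `first_block >= candidate.first_block - lookback_blocks`", which is a simplification of whatever `_pick_origin` checks; the helper names are hypothetical.

```python
# Hypothetical numbers, chosen only to illustrate the anchor choice.
LOOKBACK_BLOCKS = 100
pending = [1_000, 1_250]          # first_block of two pending candidates
peers = [880, 950, 1_100, 1_200]  # first_block of rows in the index


def needed_by(candidate_fb, peer_fbs, lookback):
    """Peers inside one candidate's lookback window (assumed lower bound only)."""
    return {fb for fb in peer_fbs if fb >= candidate_fb - lookback}


def cached(peer_fbs, floor):
    """Peers that survive the pre-fetch filter: first_block not below the floor."""
    return {fb for fb in peer_fbs if fb >= floor}


min_floor = min(pending) - LOOKBACK_BLOCKS  # 900
max_floor = max(pending) - LOOKBACK_BLOCKS  # 1150

cache_min = cached(peers, min_floor)
cache_max = cached(peers, max_floor)

# With the min() anchor, every candidate finds all of its in-window peers.
for fb in pending:
    assert needed_by(fb, peers, LOOKBACK_BLOCKS) <= cache_min

# With a max() anchor, the earliest candidate would lose peers it still needs.
missing = needed_by(min(pending), peers, LOOKBACK_BLOCKS) - cache_max
print(sorted(missing))  # [950, 1100]
```

In this toy setup the row at block 880 is the only one skipped under the `min()` anchor, and it lies outside every pending candidate's window.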

`verdict_lookback_days=0` keeps the original "load everything"
behaviour intact (validated by a new test alongside the affirmative
case).
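
A minimal sketch of the filter as described, not the actual diff. The `Row` type, the function names, and the `BLOCKS_PER_DAY` conversion are assumptions made for illustration; the description does not say how `lookback_blocks` is derived from `verdict_lookback_days`.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Assumption: ~12 s blocks, i.e. 7200 blocks per day. Not stated in the PR.
BLOCKS_PER_DAY = 7200


@dataclass(frozen=True)
class Row:
    """Hypothetical stand-in for a scores_index row."""
    hotkey: str
    revision: str
    first_block: int


def peer_fb_floor(pending: Sequence[Row], lookback_days: int) -> Optional[int]:
    """Return the floor, or None when the filter is disabled or nothing is pending."""
    if lookback_days <= 0 or not pending:
        return None
    lookback_blocks = lookback_days * BLOCKS_PER_DAY
    return min(r.first_block for r in pending) - lookback_blocks


def split_rows(rows: Sequence[Row], floor: Optional[int]) -> Tuple[List[Row], int]:
    """Return (rows to fetch, number skipped). A None floor keeps every row."""
    if floor is None:
        return list(rows), 0
    keep = [r for r in rows if r.first_block >= floor]
    return keep, len(rows) - len(keep)


def format_log(blobs: int, skipped: int, lookback_days: int) -> str:
    """Mirror the shape of the log line quoted in this description."""
    return (f"[anticopy.verdict] peer cache built: {blobs} blobs "
            f"(skipped {skipped} outside {lookback_days}d lookback)")
```

The point of computing the floor before any fetch is that skipped rows cost only a comparison on `first_block`, not an R2 round trip.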

Test plan

- `pytest tests/test_anticopy_verdict_backfill.py`: 8 passed
  (5 existing + 2 new: skip-outside-window + lookback-zero-disables-filter).
- Validator deploy: confirm the
  `peer cache built: X blobs (skipped Y outside Nd lookback)` log line
  appears with X+Y == `list_all` size, and that the tick completes
  proportionally faster than the pre-PR 2.6 GB pull.
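
The deploy check can be scripted against a captured log line. The helper below is a hypothetical convenience, not part of this PR; it assumes the log line keeps the exact shape quoted above and that the `list_all` size is obtained separately.

```python
import re
from typing import Optional, Tuple

# Matches the log line shape quoted in this description.
PEER_CACHE_RE = re.compile(
    r"peer cache built: (?P<blobs>\d+) blobs "
    r"\(skipped (?P<skipped>\d+) outside (?P<days>\d+)d lookback\)"
)


def parse_peer_cache_line(line: str) -> Optional[Tuple[int, int, int]]:
    """Return (blobs, skipped, lookback_days), or None if the line does not match."""
    m = PEER_CACHE_RE.search(line)
    if m is None:
        return None
    return int(m["blobs"]), int(m["skipped"]), int(m["days"])


def counts_add_up(line: str, list_all_size: int) -> bool:
    """True when X + Y from the log line equals the given list_all size."""
    parsed = parse_peer_cache_line(line)
    if parsed is None:
        return False
    blobs, skipped, _days = parsed
    return blobs + skipped == list_all_size


sample = ("[anticopy.verdict] peer cache built: 192 blobs "
          "(skipped 461 outside 7d lookback)")
print(parse_peer_cache_line(sample))  # (192, 461, 7)
```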

Notes

Doesn't touch the actual verdict math or row-level decision criteria —
strictly an I/O / memory optimisation for the cache-build step. The
verdict outcomes are identical (the rows we now skip were already
being filtered out by `_pick_origin`'s own lookback check).

scores_index has no TTL, so the table grows unbounded as miners register/dereg/upload new revisions. Observed: 653 rows / ~2.6 GB of R2 pulls on every refresh-service restart even though >70% are older than verdict_lookback_days and can never influence a verdict. _pick_origin already filters peers older than the same window during the verdict math, so the R2 work was pure overhead.

Compute peer_fb_floor = min(pending.first_block) - lookback_blocks once per tick and skip rows whose first_block is below it before issuing the R2 fetch. Earliest-pending anchor keeps the floor valid for every pending candidate this tick will judge.

lookback_days=0 disables the filter (matches the existing 'all rows always loaded' behaviour expected by the older fixtures).

Logged at INFO: peer cache size + how many rows were skipped, so an operator can sanity-check the trim depth across a deploy.

catoneone wants to merge 1 commit into main from ceac/anticopy-verdict-lookback-peer-cache
