Add sealed / active split to reconciliation report #78
Open
AnIrishDuck wants to merge 2 commits into
Reconciliation previously ran every segment through the strict size-diff
pipeline, which flagged the currently active (writeable) tail segment of
every partition on every pass: writes accumulate on disk before the
manifest flush, so the active segment legitimately diverges. That noise
blocks running reconcile continuously alongside writes.
Split the output into two buckets keyed off each partition's sealed_ix
watermark, read once per partition pass:
- sealed bucket: segments at or below sealed_ix go through the existing
strict-diff + CAS-fix pipeline, unchanged.
- active bucket: segments above sealed_ix (or all segments when the
partition has never sealed one) are recorded informationally as
{ topic, partition, manifest_size, disk_size, delta } and never fixed.
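The bucketing rule above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SegmentInfo`, `ActiveSegment`, `Bucket`, `classify`, and `split_partition` are hypothetical names, the watermark is assumed to be an `Option<u64>` (with `None` meaning the partition has never sealed a segment), and `delta` is assumed to mean disk size minus manifest size.

```rust
/// Hypothetical per-segment view; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct SegmentInfo {
    ix: u64,
    manifest_size: u64,
    disk_size: u64,
}

/// Informational record for an active segment, mirroring the fields
/// listed in the description.
#[derive(Debug, Clone, PartialEq)]
struct ActiveSegment {
    topic: String,
    partition: u32,
    manifest_size: u64,
    disk_size: u64,
    delta: i64,
}

#[derive(Debug, PartialEq)]
enum Bucket {
    Sealed,
    Active,
}

/// A segment is sealed only if a watermark exists and the segment index
/// is at or below it; everything else is treated as active.
fn classify(segment_ix: u64, sealed_ix: Option<u64>) -> Bucket {
    match sealed_ix {
        Some(watermark) if segment_ix <= watermark => Bucket::Sealed,
        _ => Bucket::Active,
    }
}

/// Splits one partition's segments using a watermark read once by the caller.
/// Sealed segments are returned for the strict-diff pipeline; active ones
/// are only recorded and never fixed.
fn split_partition(
    topic: &str,
    partition: u32,
    sealed_ix: Option<u64>,
    segments: &[SegmentInfo],
) -> (Vec<SegmentInfo>, Vec<ActiveSegment>) {
    let mut sealed = Vec::new();
    let mut active = Vec::new();
    for seg in segments {
        match classify(seg.ix, sealed_ix) {
            Bucket::Sealed => sealed.push(seg.clone()),
            Bucket::Active => active.push(ActiveSegment {
                topic: topic.to_string(),
                partition,
                manifest_size: seg.manifest_size,
                disk_size: seg.disk_size,
                // Assumed sign convention: positive when disk is ahead of the manifest.
                // Sizes are assumed to fit in i64 for this sketch.
                delta: seg.disk_size as i64 - seg.manifest_size as i64,
            }),
        }
    }
    (sealed, active)
}
```

Passing the watermark in as a plain value reflects the "read once per partition pass" point: every segment in the pass is judged against the same snapshot, even if the partition seals another segment mid-pass.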
Introduce ReconcileReport { sealed: ReconcileStats, active: Vec<..> } and
replace ReconcileJob::stats() with report(). The /info handler now also
returns the active-segment bucket (additive, serde-default field).
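As a rough picture of the shape described here, the sketch below uses only the standard library. The fields inside `ReconcileStats`, the `current` field, and the `note_active` helper are hypothetical; only the `ReconcileReport { sealed, active }` layout and the `report()` accessor come from the description.

```rust
/// Counters for the sealed bucket. These field names are hypothetical;
/// the description only names the type.
#[derive(Debug, Default, Clone, PartialEq)]
struct ReconcileStats {
    checked: u64,
    mismatched: u64,
    fixed: u64,
}

/// Informational record for one active segment.
#[derive(Debug, Clone, PartialEq)]
struct ActiveSegment {
    topic: String,
    partition: u32,
    manifest_size: u64,
    disk_size: u64,
    delta: i64,
}

#[derive(Debug, Default, Clone, PartialEq)]
struct ReconcileReport {
    sealed: ReconcileStats,
    // In a serde-based type, `#[serde(default)]` on this field would let
    // payloads that lack it deserialize to an empty Vec, which is what
    // keeps the change additive.
    active: Vec<ActiveSegment>,
}

#[derive(Default)]
struct ReconcileJob {
    current: ReconcileReport,
}

impl ReconcileJob {
    /// Hypothetical helper: record an active segment without touching
    /// the sealed counters.
    fn note_active(&mut self, seg: ActiveSegment) {
        self.current.active.push(seg);
    }

    /// Stands in for the accessor that replaces `stats()`.
    fn report(&self) -> &ReconcileReport {
        &self.current
    }
}
```

Keeping `sealed` as the existing stats type means callers that only read the strict-diff counters need a one-field hop (`report().sealed`) rather than a new schema.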
last_manifest_update_age is left as an optional None field with a TODO:
there is no durable-update timestamp hook in the write path yet, and this
PR deliberately does not add one.
6f1deba to
db332ba
JONBRWN
approved these changes
Jun 8, 2026
JONBRWN
left a comment
Collaborator
Get the Rust check resolved, but looks good.
3808dad to
09c612f
No plans to populate it in the near term; can be re-added if/when the write path grows a durable-update timestamp hook.
09c612f to
8e220e2
Create a new ReconcileReport struct to split out the stats. Then, never run any fixes on active (unsealed) segments. This is the first and larger half of getting reconciliation concurrent-ready. The other, smaller half is properly interacting with retention.
Claude's summary (edited to remove irrelevant removed code):
Reconciliation previously ran every segment through the strict size-diff
pipeline, which flagged the currently active (writeable) tail segment of
every partition on every pass: writes accumulate on disk before the
manifest flush, so the active segment legitimately diverges. That noise
blocks running reconcile continuously alongside writes.
Split the output into two buckets keyed off each partition's sealed_ix
watermark, read once per partition pass:
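- sealed bucket: segments at or below sealed_ix go through the existing
strict-diff + CAS-fix pipeline, unchanged.
- active bucket: segments above sealed_ix (or all segments when the
partition has never sealed one) are recorded informationally as
{ topic, partition, manifest_size, disk_size, delta } and never fixed.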
Introduce ReconcileReport { sealed: ReconcileStats, active: Vec<..> } and
replace ReconcileJob::stats() with report(). The /info handler now also
returns the active-segment bucket (additive, serde-default field).