Add sealed / active split to reconciliation report #78

Open: AnIrishDuck wants to merge 2 commits into main from fmurphy/reconcile-sealed-active

Conversation

@AnIrishDuck (Collaborator)

Create a new ReconcileReport struct to split out the stats, and never run any fixes on active (unsealed) segments. This is the first and larger part of making reconciliation concurrent-ready. The other, smaller part is interacting properly with retention.

Claude's summary (edited to remove irrelevant removed code):

Reconciliation previously ran every segment through the strict size-diff
pipeline, which flagged the currently active (writeable) tail segment of
every partition on every pass: writes accumulate on disk before the
manifest flush, so the active segment legitimately diverges. That noise
blocks running reconcile continuously alongside writes.

Split the output into two buckets keyed off each partition's sealed_ix
watermark, read once per partition pass:

  • sealed bucket: segments at or below sealed_ix go through the existing
    strict-diff + CAS-fix pipeline, unchanged.
  • active bucket: segments above sealed_ix (or all segments when the
    partition has never sealed one) are recorded informationally as
    { topic, partition, manifest_size, disk_size, delta } and never fixed.
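
A minimal sketch of the bucketing rule described above, not the code in this PR. The names `SegmentSizes`, `Bucket`, `classify`, `size_delta`, and `split_segments` are hypothetical, `sealed_ix` is modeled as an `Option<u64>` that is `None` when the partition has never sealed a segment, and the sign convention for `delta` (disk minus manifest) is an assumption.

```rust
/// One segment as seen by a reconcile pass (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SegmentSizes {
    pub ix: u64,
    pub manifest_size: u64,
    pub disk_size: u64,
}

/// Which bucket a segment lands in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Bucket {
    /// At or below the watermark: eligible for strict diff and fixes.
    Sealed,
    /// Above the watermark (or no watermark yet): report only, never fix.
    Active,
}

/// Classify a segment index against the partition's sealed watermark.
/// `sealed_ix` is `None` when the partition has never sealed a segment,
/// in which case every segment counts as active.
pub fn classify(segment_ix: u64, sealed_ix: Option<u64>) -> Bucket {
    match sealed_ix {
        Some(watermark) if segment_ix <= watermark => Bucket::Sealed,
        _ => Bucket::Active,
    }
}

/// Signed size difference. Assumption: delta = disk_size - manifest_size.
/// The wrapping subtraction is exact whenever the true difference fits in i64.
pub fn size_delta(seg: &SegmentSizes) -> i64 {
    seg.disk_size.wrapping_sub(seg.manifest_size) as i64
}

/// Split a partition's segments into (sealed, active), using a watermark
/// value that the caller read once for this partition pass.
pub fn split_segments(
    segments: &[SegmentSizes],
    sealed_ix: Option<u64>,
) -> (Vec<SegmentSizes>, Vec<SegmentSizes>) {
    segments
        .iter()
        .copied()
        .partition(|s| classify(s.ix, sealed_ix) == Bucket::Sealed)
}
```

Passing the watermark in as a plain value mirrors the "read once per partition pass" wording: every segment in the pass is classified against the same snapshot, even if a writer seals another segment in the meantime.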

Introduce ReconcileReport { sealed: ReconcileStats, active: Vec<..> } and
replace ReconcileJob::stats() with report(). The /info handler now also
returns the active-segment bucket (additive, serde-default field).
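
For readers skimming the shape of the new type, here is a rough sketch of how the report could be laid out. Only `ReconcileReport { sealed, active }`, `ReconcileStats`, `ReconcileJob::report()`, and the five active-segment field names come from the description above. The `ActiveSegment` type name, the counters inside `ReconcileStats`, the field types, and `record_active` are assumptions made for illustration, and the serde derives that the `/info` handler would need are left out to keep the sketch dependency-free.

```rust
/// Stats for sealed segments. The real fields are not shown in this PR
/// description; these two counters are hypothetical placeholders.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ReconcileStats {
    pub segments_checked: u64,
    pub mismatches_fixed: u64,
}

/// Informational record for one active segment (type name and field types assumed).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ActiveSegment {
    pub topic: String,
    pub partition: u32,
    pub manifest_size: u64,
    pub disk_size: u64,
    /// Assumption: disk_size - manifest_size, signed.
    pub delta: i64,
}

/// Two-bucket report: sealed stats plus a list of active-segment records.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ReconcileReport {
    pub sealed: ReconcileStats,
    pub active: Vec<ActiveSegment>,
}

/// Stand-in for the job that owns the report (internal layout assumed).
#[derive(Debug, Default)]
pub struct ReconcileJob {
    report: ReconcileReport,
}

impl ReconcileJob {
    /// Record an active segment without attempting any fix (hypothetical helper).
    pub fn record_active(
        &mut self,
        topic: &str,
        partition: u32,
        manifest_size: u64,
        disk_size: u64,
    ) {
        // Exact whenever the true difference fits in i64.
        let delta = disk_size.wrapping_sub(manifest_size) as i64;
        self.report.active.push(ActiveSegment {
            topic: topic.to_string(),
            partition,
            manifest_size,
            disk_size,
            delta,
        });
    }

    /// Snapshot of both buckets; this accessor replaces the old `stats()`.
    pub fn report(&self) -> ReconcileReport {
        self.report.clone()
    }
}
```

Because `active` defaults to an empty `Vec`, a consumer that only knows the old sealed stats can ignore the new bucket, which matches the "additive" framing of the `/info` change.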

last_manifest_update_age is left as an optional None field with a TODO:
there is no durable-update timestamp hook in the write path yet, and this
PR deliberately does not add one.
@AnIrishDuck AnIrishDuck requested review from JONBRWN and imp June 5, 2026 14:50
@AnIrishDuck AnIrishDuck force-pushed the fmurphy/reconcile-sealed-active branch from 6f1deba to db332ba Compare June 8, 2026 20:44

@JONBRWN (Collaborator) left a comment

Get the Rust check resolved, but looks good.

@AnIrishDuck AnIrishDuck force-pushed the fmurphy/reconcile-sealed-active branch 2 times, most recently from 3808dad to 09c612f Compare June 8, 2026 22:19
No plans to populate it in the near term; can be re-added if/when the
write path grows a durable-update timestamp hook.
@AnIrishDuck AnIrishDuck force-pushed the fmurphy/reconcile-sealed-active branch from 09c612f to 8e220e2 Compare June 8, 2026 22:51

3 participants