fix an issue when a full page is filtered out #43

Merged
lucila merged 4 commits into main from lucila/skip-empty-filtered-pages
Apr 7, 2026

@lucila lucila commented Apr 6, 2026

What and why? 🤔

When a fetched batch contains only filtered-out elements (e.g., moderated posts), CursorlessFilteredStream counts it against the retry budget. After retry_count retries, pagination stops with an empty result and no cursor — even though the inner stream has more content.

This PR adds a skip_empty_pages option (default true) that preserves the retry budget when a page yields zero retained results. Retries are only consumed when partial progress is made. The inner stream's natural exhaustion still guarantees termination.

  • skip_empty_pages: true (default) — fully-filtered pages don't decrement retry depth
  • skip_empty_pages: false — previous behavior
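
The rule in the two bullets above can be sketched as a small pure function. This is only an illustration under stated assumptions: `next_retry_depth` and its parameter names are hypothetical and not part of the library; only the condition itself mirrors the description in this PR.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not library code): returns the retry depth
 * to carry into the next fetch.
 *
 * @param int  $depth            Remaining retry budget before this page.
 * @param int  $retained_count   Elements that survived filtering on this page.
 * @param bool $skip_empty_pages When true, fully filtered pages cost nothing.
 */
function next_retry_depth(int $depth, int $retained_count, bool $skip_empty_pages = true): int
{
    // A page with zero retained elements keeps the budget intact.
    if ($skip_empty_pages && $retained_count === 0) {
        return $depth;
    }
    // Any other page (or any page when the option is off) consumes one retry.
    return $depth - 1;
}

var_dump(next_retry_depth(2, 0));        // int(2): fully filtered page, option on
var_dump(next_retry_depth(2, 3));        // int(1): partial results
var_dump(next_retry_depth(2, 0, false)); // int(1): option off, previous behavior
```

Keeping the option as an explicit boolean means the previous behavior stays one flag away if the new default causes trouble.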

Testing Steps ✍️

  • Enumerate a CursorlessFilteredStream where multiple consecutive batches are entirely filtered — verify pagination continues past them
  • Verify skip_empty_pages: false preserves the old behavior
  • Verify retry depth still decrements normally when batches have partial results

Unit tests should pass

You're it! 👑

Tag a random reviewer by adding @Automattic/stream-builders to the reviewers section, but mention them here for visibility as well.

coveralls commented Apr 6, 2026

Coverage Status

coverage: 80.255% (+0.05%) from 80.208% — lucila/skip-empty-filtered-pages into main

@lucila lucila marked this pull request as ready for review April 6, 2026 20:14
@lucila lucila requested a review from a team as a code owner April 6, 2026 20:14
@lucila lucila requested a review from koke April 6, 2026 20:14
@lucila lucila added the enhancement (New feature or request) and needs review labels Apr 6, 2026
$tracer && $tracer->filter_retry($this, $inner_cursor, $retry_cursor, $depth, $want_count, $inner_result->get_size(), count($retained));

// Don't count fully-filtered pages against retry budget
$new_depth = count($retained) === 0 && $this->skip_empty_pages ? $depth : $depth - 1;
Contributor

Since we are now potentially delegating exiting the recursion to the inner stream being exhausted, is there a risk of running out of memory/running into stack overflow issues?

Contributor Author

Yes, that could happen if we have a huge block of filtered elements.

Let's say that we paginate by 10 elements and we have 30 consecutive elements being filtered.

This change addresses that issue by not counting "depth" for empty pages.

Current behavior (retries exhausted):
page 1 => 8 elements - depth 2
page 2 => 10 filtered elements - depth 1

With this fix:
page 1 => 8 elements - depth 2
page 2 => 10 filtered elements - depth 1
page 3 => 10 filtered elements - depth 1
page 4 => 10 filtered elements - depth 1
page 5 => 5 elements - depth 0

If we make a little progress by finding some elements in subsequent pages (inner enumeration), we decrement the retry count (depth). But when we hit a big chunk of filtered elements, we can skip the whole block and fetch a few more elements from a subsequent page.
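
To make the scenario concrete, here is a rough simulation of 30 consecutive filtered elements with pages of 10. It is a simplified, iterative model and not the library's recursive implementation: `simulate_pagination`, its parameters, and the stop conditions are assumptions made for illustration. Only the `$new_depth` expression is taken from the diff.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical model of the retry loop (not library code).
 *
 * @param int[] $retained_per_page Retained element count for each inner page.
 * @return array{collected: int, fetches: int, trace: string[]}
 */
function simulate_pagination(array $retained_per_page, int $want_count, int $retry_count, bool $skip_empty_pages): array
{
    $depth = $retry_count;
    $collected = 0;
    $fetches = 0;
    $trace = [];

    // The loop also ends when the page list runs out, which stands in
    // for the inner stream being exhausted.
    foreach ($retained_per_page as $index => $retained) {
        if ($depth <= 0 || $collected >= $want_count) {
            break; // budget spent, or enough elements gathered
        }
        $fetches++;
        $collected += $retained;
        // Same condition as the diff: empty pages are free when the option is on.
        $new_depth = $retained === 0 && $skip_empty_pages ? $depth : $depth - 1;
        $trace[] = sprintf('page %d => %d retained, depth %d -> %d', $index + 1, $retained, $depth, $new_depth);
        $depth = $new_depth;
    }

    return ['collected' => $collected, 'fetches' => $fetches, 'trace' => $trace];
}

// 8 retained, then three fully filtered pages (30 elements), then 5 retained.
$pages = [8, 0, 0, 0, 5];

$old = simulate_pagination($pages, 10, 2, false);
$new = simulate_pagination($pages, 10, 2, true);

echo implode("\n", $old['trace']), "\n";
echo "collected with skip_empty_pages off: {$old['collected']}\n";
echo implode("\n", $new['trace']), "\n";
echo "collected with skip_empty_pages on: {$new['collected']}\n";
```

In this model the run with the option off stops after two fetches with 8 elements, while the run with the option on fetches all five pages and collects 13. If each fetch corresponds to one level of recursion in the real code, the `fetches` count gives a rough sense of how deep a long filtered block could push the stack.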

If we notice that this goes too deep and causes memory issues, we can turn it off by setting skip_empty_pages to false.

I want skip_empty_pages to be true by default so that we fix this bug in most places while still having a way to turn the behaviour off.

@lengare

Contributor

Sounds good, thank you for explaining. Let's call this out in release notes and carefully monitor affected templates during deploy.

@lucila lucila merged commit f7de47f into main Apr 7, 2026
11 checks passed
@lucila lucila deleted the lucila/skip-empty-filtered-pages branch April 7, 2026 10:49
lucila added a commit that referenced this pull request Apr 7, 2026