fix an issue when a full page is filtered out #43
Conversation
…vailable in the inner stream
```php
$tracer && $tracer->filter_retry( $this, $inner_cursor, $retry_cursor, $depth, $want_count, $inner_result->get_size(), count( $retained ) );

// Don't count fully-filtered pages against retry budget
$new_depth = count( $retained ) === 0 && $this->skip_empty_pages ? $depth : $depth - 1;
```
Since we are now potentially delegating exiting the recursion to the inner stream being exhausted, is there a risk of running out of memory/running into stack overflow issues?
Yes, that could happen if we have a huge block of filtered elements.
Let's say we paginate by 10 elements, and 30 consecutive elements are filtered out.
This change addresses that issue by not counting "depth" for empty pages.
Current exhaustion:
page 1 => 8 elements - depth 2
page 2 => 10 filtered elements - depth 1
With this fix:
page 1 => 8 elements - depth 2
page 2 => 10 filtered elements - depth 1
page 3 => 10 filtered elements - depth 1
page 4 => 10 filtered elements - depth 1
page 5 => 5 elements - depth 0
If we make partial progress by finding some elements in subsequent pages (inner enumeration), we decrement the retry count (depth). But when we hit a big chunk of filtered elements, we can now skip the whole block and fetch a few more elements from a subsequent page.
If we notice that this goes too deep and causes memory issues, we can turn it off by setting skip_empty_pages to false.
I want skip_empty_pages to be true by default so that we fix this bug in most places while still keeping a way to stop the behaviour.
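To make the rule concrete, here is a minimal Python sketch of the retry-depth logic described above. The actual PR is PHP; the function name, the `keep` predicate, and the page representation here are invented for illustration, only the "empty pages don't consume the budget" rule mirrors the diff.

```python
def fetch_filtered(pages, keep, want_count, depth, skip_empty_pages=True):
    """Pull items from `pages` (a sequence of batches), retaining only items
    accepted by `keep`, until `want_count` items are retained, the retry
    budget `depth` is spent, or the stream of pages is exhausted."""
    retained = []
    for page in pages:
        page_retained = [item for item in page if keep(item)]
        retained.extend(page_retained)
        if len(retained) >= want_count:
            break
        # A fully-filtered page does not consume the retry budget when
        # skip_empty_pages is on (mirrors the $new_depth line in the diff).
        if not (len(page_retained) == 0 and skip_empty_pages):
            depth -= 1
        if depth <= 0:
            break
    return retained[:want_count]


# Mirror the walkthrough: one page of 8 kept items, three pages that are
# fully filtered out, then a page of 5 kept items; ask for 10 with depth 2.
pages = [[1] * 8, [0] * 10, [0] * 10, [0] * 10, [1] * 5]
print(len(fetch_filtered(pages, bool, 10, 2, skip_empty_pages=True)))   # 10
print(len(fetch_filtered(pages, bool, 10, 2, skip_empty_pages=False)))  # 8
```

With skip_empty_pages on, the three filtered pages are skipped for free and the caller gets all 10 requested items; with it off, the budget runs out right after page 2 and only the first 8 items come back, which is the bug this PR fixes.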
Sounds good, thank you for explaining. Let's call this out in release notes and carefully monitor affected templates during deploy.
What and why? 🤔
When a fetched batch contains only filtered-out elements (e.g., moderated posts), CursorlessFilteredStream counts it against the retry budget. After `retry_count` retries, pagination stops with an empty result and no cursor, even though the inner stream has more content.

This PR adds a `skip_empty_pages` option (default true) that preserves the retry budget when a page yields zero retained results. Retries are only consumed when partial progress is made. The inner stream's natural exhaustion still guarantees termination.

- `skip_empty_pages: true` (default): fully-filtered pages don't decrement retry depth
- `skip_empty_pages: false`: previous behavior

Testing Steps ✍️
Unit tests should pass
You're it! 👑
Tag a random reviewer by adding @Automattic/stream-builders to the reviewers section, but mention them here for visibility as well.