Problem Description
Running openmeta scout --limit 5 or any command that triggers fetchTrendingIssues() quickly hits GitHub's Search API rate limit (403). The retry mechanism then discards all previously fetched pages and restarts from page 0, creating a fatal retry loop that eventually crashes the process with zero results returned.
Reproduction
openmeta scout --refresh --limit 5
Observed: the command hits 403 within seconds, retries 3 times, then crashes with "GitHub issue discovery failed."Add the "--refresh" argument is to make sure the local cache to be skipped so that the bug can be reproduced.
Root Cause Analysis
Three compounding issues in paginateSearchWithRetry() (github.ts):
A. Unbounded greedy pagination (root cause)
octokit.paginate.iterator walks every available page with no upper bound. Labels like good first issue have hundreds of pages. GitHub's Search API allows only 30 authenticated requests per minute. The iterator fires requests back-to-back with no throttling, exhausting the quota within seconds.
B. All-or-nothing data loss on retry (fatal amplifier)
const items = [] is declared inside the try block, scoped to each retry attempt. When page N triggers a 403:
Attempt 1: fetch pages 1..4 successfully (120 items) → page 5 hits 403
→ catch block: items is out of scope, 120 items GONE
→ wait ~60s for rate limit window reset
Attempt 2: fetch pages 1..4 again → page 5 hits 403 again
→ 120 items GONE again
Attempt 3: same → crash with zero results
The retry always starts from page 0 with a fresh empty array. Since the rate limit window (30 requests) is exactly consumed by the pages before the crash point, every retry hits the same wall.
C. Rate-limit retry targets the wrong problem
The old code reads X-RateLimit-Reset headers and applies exponential backoff (1s → 2s → 4s) specifically for 403/429 errors. But waiting and retrying from scratch is futile when the underlying problem is unbounded fetch depth — the quota will always be exhausted before all pages are collected.
Proposed Solutions
1. Preserve already-fetched data across retries
Move const items = [] outside the while loop so that when a rate-limit error occurs mid-pagination, the method can gracefully return whatever has been collected so far instead of discarding it.
// Before (broken): items dies with the try block
while (attempt < MAX_PAGINATION_RETRIES) {
try {
const items = []; // ← scoped here, lost on error
for await (...) { items.push(...); }
return items;
} catch { /* items is gone */ }
}
// After (fixed): items survives across retry attempts
const items = [];
while (attempt < MAX_PAGINATION_RETRIES) {
try {
for await (...) { items.push(...); }
return items;
} catch (error) {
if (isRateLimit && items.length > 0) {
return items; // ← graceful degradation
}
// ...
}
}
Getting 120 results with a warning is infinitely better than crashing with 0 results.
2. [For discussion] Add a maximum page limit to avoid triggering rate limits entirely
Introduce MAX_SEARCH_PAGES to cap pagination depth. With the current setup:
| Parameter |
Value |
| Authenticated Search API quota |
30 requests/min |
Results per page (SEARCH_RESULTS_PER_PAGE) |
30 |
Label groups (FILTER_LABEL_GROUPS) |
2 |
Math for a safe limit:
- If we set
MAX_SEARCH_PAGES = 13 per label group: 13 × 2 = 26 requests total
- Results: 13 × 30 = ~390 results per label group, ~780 total
- Leaves 4 requests (of 30) as safety margin
3. [For discussion] If full pagination is truly desired, the retry must resume from the breakpoint
If the authors genuinely want to fetch all pages (hundreds), the current retry-from-page-0 approach is fundamentally broken regardless of backoff strategy. The fix would require tracking the last successful page and resuming from there on retry:
const items: SearchIssueItem[] = [];
let currentPage = 1;
while (attempt < MAX_PAGINATION_RETRIES) {
try {
for await (const response of this.octokit.paginate.iterator(
this.octokit.rest.search.issuesAndPullRequests,
{
q: searchQuery,
sort: 'updated',
order: 'desc',
per_page: SEARCH_RESULTS_PER_PAGE,
page: currentPage, // ← resume from breakpoint
},
)) {
const pageItems = Array.isArray((response as any).data) ? (response as any).data : [];
items.push(...pageItems);
currentPage++;
}
return items;
} catch (error) {
// currentPage stays where it was — retry picks up from the breakpoint
// items survives across attempts — no data loss
}
}
octokit.paginate.iterator accepts a page parameter that is passed through to the underlying API call, so resuming from a breakpoint does not require abandoning the iterator — just tracking currentPage and passing it on retry.
Problem Description
Running
openmeta scout --limit 5or any command that triggersfetchTrendingIssues()quickly hits GitHub's Search API rate limit (403). The retry mechanism then discards all previously fetched pages and restarts from page 0, creating a fatal retry loop that eventually crashes the process with zero results returned.Reproduction
Observed: the command hits 403 within seconds, retries 3 times, then crashes with "GitHub issue discovery failed."Add the "--refresh" argument is to make sure the local cache to be skipped so that the bug can be reproduced.
Root Cause Analysis
Three compounding issues in
paginateSearchWithRetry()(github.ts):A. Unbounded greedy pagination (root cause)
octokit.paginate.iteratorwalks every available page with no upper bound. Labels likegood first issuehave hundreds of pages. GitHub's Search API allows only 30 authenticated requests per minute. The iterator fires requests back-to-back with no throttling, exhausting the quota within seconds.B. All-or-nothing data loss on retry (fatal amplifier)
const items = []is declared inside thetryblock, scoped to each retry attempt. When page N triggers a 403:The retry always starts from page 0 with a fresh empty array. Since the rate limit window (30 requests) is exactly consumed by the pages before the crash point, every retry hits the same wall.
C. Rate-limit retry targets the wrong problem
The old code reads
X-RateLimit-Resetheaders and applies exponential backoff (1s → 2s → 4s) specifically for 403/429 errors. But waiting and retrying from scratch is futile when the underlying problem is unbounded fetch depth — the quota will always be exhausted before all pages are collected.Proposed Solutions
1. Preserve already-fetched data across retries
Move
const items = []outside thewhileloop so that when a rate-limit error occurs mid-pagination, the method can gracefully return whatever has been collected so far instead of discarding it.Getting 120 results with a warning is infinitely better than crashing with 0 results.
2. [For discussion] Add a maximum page limit to avoid triggering rate limits entirely
Introduce
MAX_SEARCH_PAGESto cap pagination depth. With the current setup:SEARCH_RESULTS_PER_PAGE)FILTER_LABEL_GROUPS)Math for a safe limit:
MAX_SEARCH_PAGES = 13per label group: 13 × 2 = 26 requests total3. [For discussion] If full pagination is truly desired, the retry must resume from the breakpoint
If the authors genuinely want to fetch all pages (hundreds), the current retry-from-page-0 approach is fundamentally broken regardless of backoff strategy. The fix would require tracking the last successful page and resuming from there on retry:
octokit.paginate.iteratoraccepts apageparameter that is passed through to the underlying API call, so resuming from a breakpoint does not require abandoning the iterator — just trackingcurrentPageand passing it on retry.