source-stripe-native: speed up fetching connected account ids with concurrent worker system by Alex-Bair · Pull Request #4229 · estuary/connectors

Alex-Bair · 2026-04-13T16:03:50Z

Description:

The previous _fetch_connected_account_ids implementation was a single sequential paginator through GET /v1/accounts. For platforms with 36k+ connected accounts, this was taking over an hour. That's too slow since all connected account ids are fetched each time the capture starts up; we'd rather not spend an hour fetching connected account ids before capturing any data.

This commit replaces that sequential paginator with a concurrent worker system modeled after source-klaviyo-native's events backfill. The concurrent worker system partitions the time range into chunks using Stripe's created[gte]/created[lte] query parameters and has multiple workers paginate through their respective chunks in parallel.
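As a rough illustration of the partition-and-paginate idea (not the connector's actual code), the sketch below splits an inclusive range of Unix timestamps into non-overlapping chunks and lets a small pool of asyncio workers drain them. `partition_range`, `fetch_ids_concurrently`, and the `fetch_page` callback are hypothetical names; `fetch_page(lo, hi, cursor)` is assumed to wrap a GET /v1/accounts call with `created[gte]=lo`, `created[lte]=hi`, and `starting_after=cursor`, and to return the page's account ids plus Stripe's `has_more` flag.

```python
import asyncio


def partition_range(start, end, num_chunks):
    """Split the inclusive range [start, end] (Unix seconds) into contiguous,
    non-overlapping inclusive chunks. Both created[gte] and created[lte] are
    inclusive, so each chunk ends one second before the next one starts."""
    total = end - start + 1
    if total <= 0:
        return []
    num_chunks = max(1, min(num_chunks, total))
    size, extra = divmod(total, num_chunks)
    chunks = []
    lo = start
    for i in range(num_chunks):
        width = size + (1 if i < extra else 0)
        chunks.append((lo, lo + width - 1))
        lo += width
    return chunks


async def fetch_ids_concurrently(fetch_page, start, end, num_workers=4, num_chunks=16):
    """Drain all chunks with num_workers concurrent paginators.

    fetch_page is a hypothetical async callback (see the lead-in); it is an
    assumption of this sketch, not part of any Stripe SDK.
    """
    queue = asyncio.Queue()
    for chunk in partition_range(start, end, num_chunks):
        queue.put_nowait(chunk)

    ids = set()

    async def worker():
        while True:
            try:
                lo, hi = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # no chunks left for this worker
            cursor = None
            while True:
                page_ids, has_more = await fetch_page(lo, hi, cursor)
                ids.update(page_ids)
                if not has_more or not page_ids:
                    break
                cursor = page_ids[-1]  # last id on the page becomes starting_after

    await asyncio.gather(*(worker() for _ in range(num_workers)))
    return ids
```

Collecting ids into a set keeps the result correct even if a chunk boundary or retry causes the same account to be seen twice.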

Workers detect dense time windows (chunks that take >30s to paginate) and, when idle workers are available, submit the remaining unprocessed range to a subdivision worker that splits it into smaller chunks for other workers to pick up.
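The handoff decision described above might look something like the following minimal sketch. It assumes pages arrive newest first, so the unprocessed part of a chunk runs from the chunk's lower bound up to the oldest `created` value seen so far. The 30 second threshold comes from the description; `split_inclusive`, `plan_handoff`, and the `pieces` parameter are hypothetical names, and the real subdivision worker may split ranges differently.

```python
DENSE_THRESHOLD_SECONDS = 30.0  # threshold quoted in the PR description


def split_inclusive(lo, hi, pieces):
    """Split the inclusive range [lo, hi] into up to `pieces` contiguous,
    non-overlapping inclusive sub-ranges."""
    total = hi - lo + 1
    if total <= 0:
        return []
    pieces = max(1, min(pieces, total))
    size, extra = divmod(total, pieces)
    out = []
    for i in range(pieces):
        width = size + (1 if i < extra else 0)
        out.append((lo, lo + width - 1))
        lo += width
    return out


def plan_handoff(chunk_lo, oldest_created_seen, elapsed_seconds, idle_workers,
                 pieces=4, threshold=DENSE_THRESHOLD_SECONDS):
    """Return sub-chunks to hand to other workers, or [] to keep paginating.

    A chunk counts as dense once paginating it has taken longer than
    `threshold`; it is only handed off when some worker is idle to take it.
    The remaining range includes oldest_created_seen itself, because other
    accounts sharing that timestamp may not have been returned yet, so the
    caller is assumed to deduplicate ids.
    """
    if elapsed_seconds <= threshold or idle_workers <= 0:
        return []
    return split_inclusive(chunk_lo, oldest_created_seen, pieces)
```

For example, a worker 31 seconds into the chunk `[100, 199]` that has not yet advanced past timestamp 199, with two idle workers, would hand off four 25-second sub-chunks; with no idle workers it would simply keep paginating.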

Notes for reviewers:

This is an isolated change that's a drop-in replacement for how the connector fetches connected account ids. The very similar concurrent worker system in source-klaviyo-native has been working well for multiple months, and I anticipate it'll work well in source-stripe-native too, speeding up fetching all connected account ids by at least 5x, likely more.

Alex-Bair · 2026-04-13T16:22:09Z

Note: most of the CI checks failed due to an intermittent GitHub issue, but the one for source-stripe-native did succeed once GitHub stopped returning 504s.

Alex-Bair marked this pull request as ready for review April 13, 2026 16:22

Alex-Bair requested a review from a team April 13, 2026 16:22

Alex-Bair wants to merge 1 commit into main from bair/source-stripe-native-speed-up-fetching-connected-accounts
