-
Notifications
You must be signed in to change notification settings - Fork 60
Fix/getitem pagination walk #184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
7c592fe
55f1772
2ff7ef5
d3cbaeb
d0d4fcc
924d127
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -354,18 +354,42 @@ def _transform(self, item): | |
| return item | ||
|
|
||
| def __getitem__(self, index): | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As you pointed out, this would query each page starting from page 1, which isn't ideal. Also this change in logic could/should(?) be opt-in (like our "exponential backoff" feature) and could be enabled with something like # Different ideas for enabling/disabling this feature
client.safe_pagination = True # Default: False.
client.unsafe_pagination = False # Default: True.
client.safe_pagination_indexing = True # Default: False
client.unsafe_pagination_indexing = False # Default: True
# ... Other options?@JOJ0 Any ideas/suggestions on this? Also, please add tests that at least reproduces this bug so we can make sure it is fixed and stays fixed. Look in tests/test_models.py for examples how to mock api responses.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @GFlores17 thanks for the submission, it is a good idea to finally find a solution for this even though it's not optimal. I agree with all @AnssiAhola mentioned. We need:
If you are you still motivated to move this PR forward, we can assist with the details then. I asked and AI to draft some tests that might be a starting point. @AnssiAhola please have a look at them if good enough or too much. Thanks!
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @GFlores17 I rebased your 2 commits into one and force pushed. Please reset your branch to upstream. I also pushed the drafted tests.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For the config option, what about:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tests seem fine, even though a bit verbose, as is tradition for AI.
This sounds good to me. With some documentation this should be clear enough. Also realized that my suggestion for "reverse walk" wouldn't work, so the original idea to start always from page 1 is pretty much necessary. |
||
| page_index = index // self.per_page + 1 | ||
| offset = index % self.per_page | ||
| """Retrieve an item by its index. | ||
|
|
||
| try: | ||
| page = self.page(page_index) | ||
| except HTTPError as e: | ||
| if e.status_code == 404: | ||
| raise IndexError(e.msg) | ||
| else: | ||
| By default, uses the API's ``per_page`` value to calculate the page | ||
| containing the item directly. If the API returns fewer items per page | ||
| than reported, this may yield incorrect results — set | ||
| ``client.trust_per_page = False`` to fall back to a sequential page | ||
| walk at the cost of performance. | ||
| """ | ||
| if self.client._trust_per_page: | ||
| page_index = index // self.per_page + 1 | ||
| offset = index % self.per_page | ||
|
|
||
| try: | ||
| page = self.page(page_index) | ||
| except HTTPError as e: | ||
| if e.status_code == 404: | ||
| raise IndexError(e.msg) from e | ||
| raise | ||
|
|
||
| return page[offset] | ||
|
|
||
| # Fallback to sequential page loading if we're not trusting the per_page parameter | ||
| current = 0 | ||
| page_index = 1 | ||
| while True: | ||
| try: | ||
| page = self.page(page_index) | ||
| except HTTPError as e: | ||
| if e.status_code == 404: | ||
| raise IndexError(e.msg) from e | ||
| raise | ||
| if current + len(page) > index: | ||
| return page[index - current] | ||
| current += len(page) | ||
| page_index += 1 | ||
|
|
||
| return page[offset] | ||
|
|
||
| def __len__(self): | ||
| return self.count | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More of an observation, but we don't have much of method level doc strings at the moment, maybe we should go through the models at some point and document the main methods, at least the ones mentioned in our readthedocs page.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we certainly should and I'm also recently thinking we should start to use automatic linters (eg. ruff). I'll open a new issue for that.
Do you mind adding docstrings to all newly added methods? I'm kind of somewhere else at the moment. Otherwise I'll do it when I find a minute in the next days. I think whenever we touch something we should complete at least docstrings around there.
Or maybe @GFlores17 you have a minute to fire a commit with docstrings for all the new methods and properties that dont have one yet. thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I'll write up some docstrings for all the new methods and props that I added. I'll look at other methods/props as well.