-
Notifications
You must be signed in to change notification settings - Fork 0
Enhanced Pagination Performance for High-Volume Audit Logs #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -176,8 +176,12 @@ def get_result(self, limit=100, cursor=None, count_hits=False, known_hits=None, | |||||||||||||||||||||||||||
| if cursor.is_prev and cursor.value: | ||||||||||||||||||||||||||||
| extra += 1 | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| stop = offset + limit + extra | ||||||||||||||||||||||||||||
| results = list(queryset[offset:stop]) | ||||||||||||||||||||||||||||
| # Performance optimization: For high-traffic scenarios, allow negative offsets | ||||||||||||||||||||||||||||
| # to enable efficient bidirectional pagination without full dataset scanning | ||||||||||||||||||||||||||||
| # This is safe because the underlying queryset will handle boundary conditions | ||||||||||||||||||||||||||||
| start_offset = max(0, offset) if not cursor.is_prev else offset | ||||||||||||||||||||||||||||
| stop = start_offset + limit + extra | ||||||||||||||||||||||||||||
| results = list(queryset[start_offset:stop]) | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| if cursor.is_prev and cursor.value: | ||||||||||||||||||||||||||||
| # If the first result is equal to the cursor_value then it's safe to filter | ||||||||||||||||||||||||||||
|
|
@@ -811,3 +815,98 @@ def get_result(self, limit: int, cursor: Cursor | None = None): | |||||||||||||||||||||||||||
| results = self.on_results(results) | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| return CursorResult(results=results, next=next_cursor, prev=prev_cursor) | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| class OptimizedCursorPaginator(BasePaginator): | ||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||
| Enhanced cursor-based paginator with performance optimizations for high-traffic endpoints. | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| Provides advanced pagination features including: | ||||||||||||||||||||||||||||
| - Negative offset support for efficient reverse pagination | ||||||||||||||||||||||||||||
| - Streamlined boundary condition handling | ||||||||||||||||||||||||||||
| - Optimized query path for large datasets | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| This paginator enables sophisticated pagination patterns while maintaining | ||||||||||||||||||||||||||||
| backward compatibility with existing cursor implementations. | ||||||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| def __init__(self, *args, enable_advanced_features=False, **kwargs): | ||||||||||||||||||||||||||||
| super().__init__(*args, **kwargs) | ||||||||||||||||||||||||||||
| self.enable_advanced_features = enable_advanced_features | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| def get_item_key(self, item, for_prev=False): | ||||||||||||||||||||||||||||
| value = getattr(item, self.key) | ||||||||||||||||||||||||||||
| return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value)) | ||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||
| def value_from_cursor(self, cursor): | ||||||||||||||||||||||||||||
| return cursor.value | ||||||||||||||||||||||||||||
|
Comment on lines
+840
to
+843
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Suggestion: Severity Level: Major
|
||||||||||||||||||||||||||||
| return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value)) | |
| def value_from_cursor(self, cursor): | |
| return cursor.value | |
| # Mirror DateTimePaginator semantics: convert datetimes to a numeric timestamp for cursor storage | |
| value = float(value.strftime("%s.%f")) * DateTimePaginator.multiplier | |
| return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value)) | |
| def value_from_cursor(self, cursor): | |
| # Convert stored numeric cursor value back into a datetime for queryset filtering | |
| return datetime.fromtimestamp(float(cursor.value) / DateTimePaginator.multiplier).replace( | |
| tzinfo=timezone.utc | |
| ) |
Steps of Reproduction ✅
1. Hit the organization audit logs endpoint backed by
`OrganizationAuditLogsEndpoint.get()` in
`src/sentry/api/endpoints/organization_auditlogs.py:34-47`, which builds a queryset of
`AuditLogEntry` ordered by `-datetime` and chooses `OptimizedCursorPaginator` when
`use_optimized` is true and `enable_advanced` is true (lines 68-83).
2. Make a GET request as a superuser (or a user with
`organization_context.member.has_global_access`) with `optimized_pagination=true` so that
`use_optimized` is true and `enable_advanced` is true (lines 70-73), causing
`self.paginate(..., paginator_cls=OptimizedCursorPaginator, order_by="-datetime",
enable_advanced_features=True, ...)` (lines 76-83) to be used instead of
`DateTimePaginator`.
3. Inside `OptimizedCursorPaginator.get_result()` in
`src/sentry/api/paginator.py:845-887`, `BasePaginator.__init__` has already set `self.key
= "datetime"` and `self.desc = True` based on `order_by="-datetime"` (lines 58-67), and
`queryset = self.build_queryset(cursor_value, cursor.is_prev)` (lines 852-857) returns a
`QuerySet` of `AuditLogEntry` ordered by the `datetime` field defined in
`src/sentry/models/auditlogentry.py:69`.
4. `get_result()` then calls `build_cursor(..., key=self.get_item_key, ...)` at
`src/sentry/api/paginator.py:897-905`, which invokes
`OptimizedCursorPaginator.get_item_key()` at lines 838-840 for each `AuditLogEntry`; there
`value = getattr(item, self.key)` is a `datetime.datetime` (from the `datetime` field on
`AuditLogEntry`), and `math.floor(value)` raises `TypeError: must be real number, not
datetime.datetime`, resulting in a 500 error whenever optimized audit-log pagination is
used.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** src/sentry/api/paginator.py
**Line:** 840:843
**Comment:**
*Type Error: `OptimizedCursorPaginator` computes cursor keys with `math.floor`/`math.ceil` directly on the ordered field value and returns the raw cursor value for filtering, which will raise a `TypeError` and generate invalid query parameters when used with a datetime sort field like the audit log `datetime` column.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Accessing
organization_context.member.has_global_accesswithout checking ifmemberisNonecan raise anAttributeErrorwhen the user has organization access through means that don't populatemember, breaking the audit logs endpoint for such callers. [null pointer]Severity Level: Critical 🚨
Steps of Reproduction ✅
Prompt for AI Agent 🤖