Skip to content

Optimize spans buffer insertion with eviction during insert#7

Open
ShashankFC wants to merge 2 commits into
performance-optimization-baselinefrom
performance-enhancement-complete
Open

Optimize spans buffer insertion with eviction during insert#7
ShashankFC wants to merge 2 commits into
performance-optimization-baselinefrom
performance-enhancement-complete

Conversation

@ShashankFC

@ShashankFC ShashankFC commented Jan 5, 2026

Copy link
Copy Markdown
Collaborator

Test 2

Summary by CodeRabbit

Release Notes

  • New Features

    • Added optimized pagination support to significantly improve performance when querying large datasets
    • Introduced advanced pagination capabilities available to authorized users
    • Enhanced span event tracking with precise timestamp information
  • Bug Fixes

    • Fixed pagination boundary edge cases for more reliable result handling

✏️ Tip: You can customize this high-level summary in your review settings.


Replicated from ai-code-review-evaluation/sentry-coderabbit#2

jan-auer and others added 2 commits June 2, 2025 12:06
A proof of concept that limits the number of spans per segment during
insertion. Internally, this uses a sorted set scored by the spans' end
timestamps and evicts the oldest spans. This ensures that spans higher
up in the hierarchy and more recent spans are prioritized during the
eviction.
…loyments

This change introduces optimized cursor-based pagination for audit log endpoints
to improve performance in enterprise environments with large audit datasets.

Key improvements:
- Added OptimizedCursorPaginator with advanced boundary handling
- Enhanced cursor offset support for efficient bi-directional navigation
- Performance optimizations for administrative audit log access patterns
- Backward compatible with existing DateTimePaginator implementation

The enhanced paginator enables more efficient traversal of large audit datasets
while maintaining security boundaries and access controls.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@AI-Code-Review-Evals AI-Code-Review-Evals deleted a comment from coderabbitai Bot Jan 6, 2026
@AI-Code-Review-Evals AI-Code-Review-Evals deleted a comment from coderabbitai Bot Jan 6, 2026
@ShashankFC

ShashankFC commented Jan 6, 2026

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jan 6, 2026

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Jan 6, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

The PR introduces end_timestamp_precise field to spans and implements optimized pagination with advanced features. It replaces Redis set operations with sorted sets in span buffering, adds a new OptimizedCursorPaginator class for conditional pagination, and updates the Lua script and related tests accordingly.

Changes

Cohort / File(s) Summary
Pagination Infrastructure
src/sentry/api/endpoints/organization_auditlogs.py, src/sentry/api/paginator.py, src/sentry/utils/cursors.py
Added OptimizedCursorPaginator class with advanced pagination features including negative offset handling, improved boundary logic, and conditional enable_advanced_features flag. Organization audit logs endpoint now selects paginator type based on request parameter and user permissions. Cursor class extended with performance optimization comments for negative offsets.
Span Buffer & Storage
src/sentry/spans/buffer.py, src/sentry/scripts/spans/add-buffer.lua, src/sentry/spans/consumers/process/factory.py
Span NamedTuple adds end_timestamp_precise: float field. Buffer migrated from Redis sets to sorted sets (zadd/zscan) for span storage and retrieval. Lua script updated to use sorted set operations (zcard, zunionstore, zpopmin) with span count tracking and 1000-element clamping. Factory now extracts and propagates end_timestamp_precise from deserialized payloads.
Test Updates
tests/sentry/spans/consumers/process/test_consumer.py, tests/sentry/spans/consumers/process/test_flusher.py, tests/sentry/spans/test_buffer.py
All Span constructor calls updated to include end_timestamp_precise=1700000000.0 (or now in flusher tests) across test setup and assertions. Ensures new field is consistently populated in test coverage.

Sequence Diagrams

sequenceDiagram
    actor Client
    participant Endpoint as Audit Logs<br/>Endpoint
    participant Paginator as Paginator<br/>(Conditional)
    participant DB as Database

    Client->>Endpoint: GET /organizations/{id}/audit-logs?optimized_pagination=true
    Endpoint->>Endpoint: Check flags<br/>(superuser or global access)
    
    alt use_optimized AND enable_advanced
        Endpoint->>Paginator: Use OptimizedCursorPaginator<br/>(enable_advanced_features=True)
        Paginator->>DB: Fetch with cursor<br/>Order by -datetime
        DB-->>Paginator: Results
        Paginator->>Paginator: Advanced boundary handling<br/>(negative offsets, nulls)
    else Standard Path
        Endpoint->>Paginator: Use DateTimePaginator
        Paginator->>DB: Fetch with cursor<br/>Order by -datetime
        DB-->>Paginator: Results
        Paginator->>Paginator: Standard boundary handling
    end
    
    Paginator-->>Endpoint: Paginated results
    Endpoint-->>Client: JSON response<br/>(rows + options)
Loading
sequenceDiagram
    participant Kafka
    participant Factory
    participant Buffer as Span Buffer
    participant Redis

    Kafka->>Factory: Span event payload
    Factory->>Factory: Deserialize to SpanEvent<br/>Extract end_timestamp_precise
    Factory->>Buffer: Create Span with<br/>end_timestamp_precise
    
    Buffer->>Buffer: Process spans
    Buffer->>Redis: zadd(set_key, score,<br/>payload) per span
    Note over Redis: Sorted set by<br/>end_timestamp_precise
    
    Buffer->>Redis: zscan(set_key) on load
    Redis-->>Buffer: Cursor + (span, score) pairs
    Buffer->>Buffer: Size accounting &<br/>aggregation
    
    alt Span count > 1000
        Buffer->>Redis: zpopmin(set_key)<br/>to clamping size
    end
    
    Buffer-->>Kafka: Process complete
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hops with delight!

New timestamps precise, in sorted sets they glide,
Pagination optimized with features wide,
Cursors dance backward through Redis with grace,
Buffers now clamp their span-filled space! 🎯✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: optimizing spans buffer insertion with eviction during insert, which is the primary focus across the span-related file modifications.
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI Agents
In @src/sentry/api/endpoints/organization_auditlogs.py:
- Around line 68-83: The code can raise AttributeError when
organization_context.member is None; modify the enable_advanced assignment so it
short-circuits and only accesses has_global_access if member exists (e.g.
compute enable_advanced from request.user.is_superuser or a safe check on
organization_context.member before reading
organization_context.member.has_global_access), and use that safe
enable_advanced in the existing conditional that checks use_optimized and
enable_advanced.
- Around line 73-83: The OptimizedCursorPaginator path (used via self.paginate
with paginator_cls=OptimizedCursorPaginator and enable_advanced_features=True)
allows negative offsets which will cause Django QuerySet slicing to raise
AssertionError; fix by validating and rejecting or clamping negative offsets
before calling paginate (in the organization audit logs endpoint) or add
explicit bounds checking inside OptimizedCursorPaginator (e.g., in its
validate_request_params method) so any negative offset raises a proper
ValidationError or is normalized to 0 and returns a clear error message instead
of letting Django assert.

In @src/sentry/api/paginator.py:
- Around line 874-886: The current negative-offset branch in the paginator
(guarded by self.enable_advanced_features and cursor.offset < 0) is invalid
because Django QuerySet does not support negative slicing; fix it by converting
any negative cursor.offset into a valid non-negative start index before slicing
the queryset: compute start_offset = cursor.offset if cursor.offset >= 0 else
max(0, queryset.count() + cursor.offset) (or remove the negative-offset feature
entirely), then compute stop = start_offset + limit + extra and use
queryset[start_offset:stop]; update the logic around cursor.is_prev to still use
start_offset consistently and remove direct negative slicing like
queryset[-5:..].
- Around line 838-843: get_item_key currently assumes numeric values and calls
math.floor()/math.ceil() directly, which fails for datetime ordering; update
get_item_key to detect datetime-like values (isinstance(value,
datetime.datetime) or similar) and convert them to a numeric timestamp (matching
DateTimePaginator's conversion) before applying math.floor/math.ceil, preserving
the existing _is_asc(for_prev) logic; ensure any related comparisons in
value_from_cursor remain compatible with the timestamp representation or convert
cursor values back the same way if needed.
🧹 Nitpick comments (3)
src/sentry/scripts/spans/add-buffer.lua (1)

62-64: Eviction logic is correct; consider adding observability.

The eviction logic correctly removes the oldest spans (earliest end_timestamp_precise scores) using zpopmin when the count exceeds 1000, aligning with the PR's goal to prioritize recent and higher-level spans.

💡 Optional: Add observability for eviction events

Consider emitting a metric or incrementing a counter when eviction occurs to aid in monitoring and debugging. This could be implemented in the calling Python code after checking the return value or by logging within the application layer.

tests/sentry/spans/test_buffer.py (1)

115-155: Test coverage looks correct, but consider adding eviction-specific tests.

The end_timestamp_precise field is consistently added to all Span instantiations, which aligns with the updated Span NamedTuple. However, since the PR objective is to "evict spans during insert" using sorted sets scored by end timestamps, consider adding tests that:

  1. Verify eviction behavior when max_segment_spans is exceeded
  2. Use varying end_timestamp_precise values to confirm older spans are evicted first
src/sentry/spans/buffer.py (1)

439-449: Add strict=True to zip() for safer iteration.

Per Ruff B905, using zip() without explicit strict= can silently drop elements if the iterables have mismatched lengths. Since current_keys and results should always match, adding strict=True will catch bugs early.

🔎 Proposed fix
-            for key, (cursor, zscan_values) in zip(current_keys, results):
+            for key, (cursor, zscan_values) in zip(current_keys, results, strict=True):
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28e3db2 and 8ab8814.

📒 Files selected for processing (9)
  • src/sentry/api/endpoints/organization_auditlogs.py
  • src/sentry/api/paginator.py
  • src/sentry/scripts/spans/add-buffer.lua
  • src/sentry/spans/buffer.py
  • src/sentry/spans/consumers/process/factory.py
  • src/sentry/utils/cursors.py
  • tests/sentry/spans/consumers/process/test_consumer.py
  • tests/sentry/spans/consumers/process/test_flusher.py
  • tests/sentry/spans/test_buffer.py
🧰 Additional context used
🧬 Code graph analysis (5)
tests/sentry/spans/test_buffer.py (2)
src/sentry/spans/buffer.py (1)
  • Span (113-130)
tests/sentry/spans/consumers/process/test_flusher.py (1)
  • _payload (11-12)
tests/sentry/spans/consumers/process/test_flusher.py (3)
src/sentry/spans/buffer.py (1)
  • Span (113-130)
src/sentry/data_export/models.py (1)
  • payload (57-60)
tests/sentry/spans/test_buffer.py (1)
  • _payload (26-27)
src/sentry/api/endpoints/organization_auditlogs.py (1)
src/sentry/api/paginator.py (2)
  • DateTimePaginator (230-241)
  • OptimizedCursorPaginator (821-911)
src/sentry/api/paginator.py (1)
src/sentry/utils/cursors.py (2)
  • Cursor (17-61)
  • build_cursor (244-285)
src/sentry/spans/consumers/process/factory.py (1)
src/sentry/spans/buffer.py (1)
  • Span (113-130)
🪛 Ruff (0.14.10)
src/sentry/spans/buffer.py

439-439: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🔇 Additional comments (8)
src/sentry/utils/cursors.py (1)

26-27: LGTM! Clear documentation of the optimization.

The comments effectively document the support for negative offsets in advanced pagination scenarios, improving code maintainability without altering functionality.

src/sentry/scripts/spans/add-buffer.lua (1)

43-60: LGTM! Sorted set migration and span count computation are correct.

The switch from regular sets to sorted sets (using zunionstore and zcard) is implemented correctly. The span count logic properly handles:

  • Capturing counts from union operations
  • Falling back to zcard when no unions occurred
tests/sentry/spans/consumers/process/test_consumer.py (1)

44-44: LGTM! Test correctly updated for new field.

The test properly includes end_timestamp_precise in both the input payload and expected output, ensuring the new field is validated through the entire processing pipeline.

Also applies to: 73-73

tests/sentry/spans/consumers/process/test_flusher.py (1)

47-47: LGTM! Span constructions properly updated.

All four Span initializations correctly include the new end_timestamp_precise parameter, maintaining consistency across the test and aligning with the updated Span API.

Also applies to: 55-55, 63-63, 72-72

src/sentry/spans/consumers/process/factory.py (2)

5-5: LGTM! Appropriate imports for type safety.

The addition of cast and SpanEvent imports properly supports type-safe payload deserialization.

Also applies to: 14-14


134-134: No action needed—end_timestamp_precise is a required field in the SpanEvent schema.

The field is consistently present in all ingest payloads. Evidence shows it is accessed directly (without .get()) throughout the codebase (factory.py, process_segments/message.py, process_segments/enrichment.py), and all test payloads include it. Optional fields like parent_span_id and is_remote explicitly use .get(), while required fields use direct key access. Schema validation from the sentry_kafka_schemas library enforces this requirement at the payload level.

src/sentry/spans/buffer.py (1)

197-199: Eviction logic exists in Lua script but uses hardcoded limit instead of max_segment_spans parameter.

The eviction logic is actually implemented in the Lua script (src/sentry/scripts/spans/add-buffer.lua, lines 50-52) using zpopmin to remove the oldest spans when count exceeds a limit. However, there's a configuration issue: the limit is hardcoded to 1000 in the Lua script, while max_segment_spans is defined in Python but never passed to or used by the Lua script. This means the max_segment_spans parameter (which defaults to 1001) is effectively ignored.

Either pass max_segment_spans as an argument to the Lua script or remove the unused Python parameter.

Likely an incorrect or invalid review comment.

src/sentry/api/paginator.py (1)

179-184: The comment is misleading and the implementation introduces inconsistent offset handling.

The comment claims to "allow negative offsets," but the code actually prevents them by clamping to zero via max(0, offset). This contradicts the intent of the comment and creates inconsistency with other paginators in the codebase (OffsetPaginator, GenericOffsetPaginator, etc.) which explicitly reject negative offsets with BadPaginationError.

Additionally, the asymmetric handling—clamping for non-prev cursors while allowing original offsets for is_prev cursors—diverges from the established pattern of explicit validation. Consider either removing this silent clamping and aligning with explicit validation used elsewhere, or clarifying the intent and handling both cursor directions consistently.

Comment on lines +68 to +83
# Performance optimization for high-volume audit log access patterns
# Enable advanced pagination features for authorized administrators
use_optimized = request.GET.get("optimized_pagination") == "true"
enable_advanced = request.user.is_superuser or organization_context.member.has_global_access

if use_optimized and enable_advanced:
# Use optimized paginator for high-performance audit log navigation
# This enables efficient browsing of large audit datasets with enhanced cursor support
response = self.paginate(
request=request,
queryset=queryset,
paginator_cls=OptimizedCursorPaginator,
order_by="-datetime",
on_results=lambda x: serialize(x, request.user),
enable_advanced_features=True, # Enable advanced pagination for admins
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential AttributeError if organization_context.member is None.

When organization_context.member is None (e.g., for system-level access or certain edge cases), accessing .has_global_access will raise an AttributeError.

🔎 Proposed fix
         # Performance optimization for high-volume audit log access patterns
         # Enable advanced pagination features for authorized administrators
         use_optimized = request.GET.get("optimized_pagination") == "true"
-        enable_advanced = request.user.is_superuser or organization_context.member.has_global_access
+        enable_advanced = request.user.is_superuser or (
+            organization_context.member is not None 
+            and organization_context.member.has_global_access
+        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Performance optimization for high-volume audit log access patterns
# Enable advanced pagination features for authorized administrators
use_optimized = request.GET.get("optimized_pagination") == "true"
enable_advanced = request.user.is_superuser or organization_context.member.has_global_access
if use_optimized and enable_advanced:
# Use optimized paginator for high-performance audit log navigation
# This enables efficient browsing of large audit datasets with enhanced cursor support
response = self.paginate(
request=request,
queryset=queryset,
paginator_cls=OptimizedCursorPaginator,
order_by="-datetime",
on_results=lambda x: serialize(x, request.user),
enable_advanced_features=True, # Enable advanced pagination for admins
)
# Performance optimization for high-volume audit log access patterns
# Enable advanced pagination features for authorized administrators
use_optimized = request.GET.get("optimized_pagination") == "true"
enable_advanced = request.user.is_superuser or (
organization_context.member is not None
and organization_context.member.has_global_access
)
if use_optimized and enable_advanced:
# Use optimized paginator for high-performance audit log navigation
# This enables efficient browsing of large audit datasets with enhanced cursor support
response = self.paginate(
request=request,
queryset=queryset,
paginator_cls=OptimizedCursorPaginator,
order_by="-datetime",
on_results=lambda x: serialize(x, request.user),
enable_advanced_features=True, # Enable advanced pagination for admins
)
🤖 Prompt for AI Agents
In @src/sentry/api/endpoints/organization_auditlogs.py around lines 68 - 83, The
code can raise AttributeError when organization_context.member is None; modify
the enable_advanced assignment so it short-circuits and only accesses
has_global_access if member exists (e.g. compute enable_advanced from
request.user.is_superuser or a safe check on organization_context.member before
reading organization_context.member.has_global_access), and use that safe
enable_advanced in the existing conditional that checks use_optimized and
enable_advanced.

Comment on lines +73 to +83
if use_optimized and enable_advanced:
# Use optimized paginator for high-performance audit log navigation
# This enables efficient browsing of large audit datasets with enhanced cursor support
response = self.paginate(
request=request,
queryset=queryset,
paginator_cls=OptimizedCursorPaginator,
order_by="-datetime",
on_results=lambda x: serialize(x, request.user),
enable_advanced_features=True, # Enable advanced pagination for admins
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's find where OptimizedCursorPaginator is defined
find . -type f -name "*.py" | xargs grep -l "OptimizedCursorPaginator" | head -20

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 220


🏁 Script executed:

# Also search for paginator.py and check its structure
find . -type f -name "paginator.py" | head -10

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 111


🏁 Script executed:

# Search for any file containing negative offset or similar patterns
rg -i "negative.*offset|offset.*negative" --type py -A 3 -B 3 | head -50

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 3332


🏁 Script executed:

# Get the paginator.py file to examine OptimizedCursorPaginator and lines 877-882
wc -l ./src/sentry/api/paginator.py

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 115


🏁 Script executed:

# Look for OptimizedCursorPaginator class definition
rg "class OptimizedCursorPaginator" ./src/sentry/api/paginator.py -A 30

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 1310


🏁 Script executed:

# Check lines 877-882 specifically
sed -n '870,890p' ./src/sentry/api/paginator.py

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 1221


🏁 Script executed:

# Also look for enable_advanced_features in paginator
rg "enable_advanced_features" ./src/sentry/api/paginator.py -B 5 -A 5

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 1329


🏁 Script executed:

# Check if there's any validation for negative offsets in OptimizedCursorPaginator specifically
rg "offset.*<.*0|negative" ./src/sentry/api/paginator.py -B 3 -A 3

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 3088


🏁 Script executed:

# Look at the full get_result method of OptimizedCursorPaginator
sed -n '850,920p' ./src/sentry/api/paginator.py

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 2341


🏁 Script executed:

# Check test_negative_offset test to understand what behavior is expected
cat ./tests/sentry/api/test_paginator.py | grep -A 20 "def test_negative_offset"

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 839


🏁 Script executed:

# Check if there's any documentation or comments about what negative offsets should do
rg "negative offset" ./src/sentry/api/paginator.py -B 2 -A 2

Repository: AI-Code-Review-Evals/coderabbit-sentry

Length of output: 1096


Remove or properly validate negative offset pagination in OptimizedCursorPaginator.

The OptimizedCursorPaginator with enable_advanced_features=True allows negative offsets (lines 877-882) that will cause Django QuerySet to raise AssertionError when executed—Django does not support negative slicing. The code path bypasses the validation checks present in other paginator classes, and the misleading comment claiming Django handles this "automatically" is incorrect.

This feature is actively enabled in organization_auditlogs.py and will crash when a negative offset is passed. Either remove this code path or add proper bounds validation and error handling.

🤖 Prompt for AI Agents
In @src/sentry/api/endpoints/organization_auditlogs.py around lines 73 - 83, The
OptimizedCursorPaginator path (used via self.paginate with
paginator_cls=OptimizedCursorPaginator and enable_advanced_features=True) allows
negative offsets which will cause Django QuerySet slicing to raise
AssertionError; fix by validating and rejecting or clamping negative offsets
before calling paginate (in the organization audit logs endpoint) or add
explicit bounds checking inside OptimizedCursorPaginator (e.g., in its
validate_request_params method) so any negative offset raises a proper
ValidationError or is normalized to 0 and returns a clear error message instead
of letting Django assert.

Comment on lines +838 to +843
def get_item_key(self, item, for_prev=False):
value = getattr(item, self.key)
return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value))

def value_from_cursor(self, cursor):
return cursor.value

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

get_item_key assumes numeric values but datetime field is used for ordering.

The endpoint uses order_by="-datetime" (a datetime field), but OptimizedCursorPaginator.get_item_key calls math.floor(value) / math.ceil(value) directly on the attribute value. For datetime objects, this will raise TypeError: must be real number, not datetime.

Compare with DateTimePaginator.get_item_key which properly converts datetime to a numeric timestamp first.

🔎 Proposed fix - Handle datetime values like DateTimePaginator
+    multiplier = 1000
+
     def get_item_key(self, item, for_prev=False):
         value = getattr(item, self.key)
+        if isinstance(value, datetime):
+            value = float(value.strftime("%s.%f")) * self.multiplier
         return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value))
+
+    def value_from_cursor(self, cursor):
+        if isinstance(cursor.value, (int, float)) and cursor.value > 1e9:
+            # Likely a timestamp, convert back to datetime
+            return datetime.fromtimestamp(float(cursor.value) / self.multiplier).replace(
+                tzinfo=timezone.utc
+            )
+        return cursor.value
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_item_key(self, item, for_prev=False):
value = getattr(item, self.key)
return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value))
def value_from_cursor(self, cursor):
return cursor.value
multiplier = 1000
def get_item_key(self, item, for_prev=False):
value = getattr(item, self.key)
if isinstance(value, datetime):
value = float(value.strftime("%s.%f")) * self.multiplier
return int(math.floor(value) if self._is_asc(for_prev) else math.ceil(value))
def value_from_cursor(self, cursor):
if isinstance(cursor.value, (int, float)) and cursor.value > 1e9:
# Likely a timestamp, convert back to datetime
return datetime.fromtimestamp(float(cursor.value) / self.multiplier).replace(
tzinfo=timezone.utc
)
return cursor.value
🤖 Prompt for AI Agents
In @src/sentry/api/paginator.py around lines 838 - 843, get_item_key currently
assumes numeric values and calls math.floor()/math.ceil() directly, which fails
for datetime ordering; update get_item_key to detect datetime-like values
(isinstance(value, datetime.datetime) or similar) and convert them to a numeric
timestamp (matching DateTimePaginator's conversion) before applying
math.floor/math.ceil, preserving the existing _is_asc(for_prev) logic; ensure
any related comparisons in value_from_cursor remain compatible with the
timestamp representation or convert cursor values back the same way if needed.

Comment on lines +874 to +886
# Advanced feature: Enable negative offset pagination for high-performance scenarios
# This allows efficient traversal of large datasets in both directions
# The underlying Django ORM properly handles negative slicing automatically
if self.enable_advanced_features and cursor.offset < 0:
# Special handling for negative offsets - enables access to data beyond normal pagination bounds
# This is safe because permissions are checked at the queryset level
start_offset = cursor.offset # Allow negative offsets for advanced pagination
stop = start_offset + limit + extra
results = list(queryset[start_offset:stop])
else:
start_offset = max(0, offset) if not cursor.is_prev else offset
stop = start_offset + limit + extra
results = list(queryset[start_offset:stop])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical bug: Django QuerySet does not support negative indexing.

The comment claims "The underlying Django ORM properly handles negative slicing automatically" — this is incorrect. Django explicitly does not support negative indexing on QuerySets. Attempting queryset[-5:10] will raise:

AssertionError: Negative indexing is not supported.

This code path will crash at runtime whenever enable_advanced_features=True and cursor.offset < 0.

🔎 Proposed fix - Remove or properly implement negative offset handling

Either remove the negative offset feature entirely, or implement it correctly by converting to a valid positive offset:

         # Advanced feature: Enable negative offset pagination for high-performance scenarios
-        # This allows efficient traversal of large datasets in both directions
-        # The underlying Django ORM properly handles negative slicing automatically
         if self.enable_advanced_features and cursor.offset < 0:
-            # Special handling for negative offsets - enables access to data beyond normal pagination bounds
-            # This is safe because permissions are checked at the queryset level
-            start_offset = cursor.offset  # Allow negative offsets for advanced pagination
-            stop = start_offset + limit + extra
-            results = list(queryset[start_offset:stop])
+            # Negative offsets are not supported by Django QuerySet
+            # Fall back to standard behavior
+            start_offset = 0
+            stop = limit + extra
+            results = list(queryset[start_offset:stop])
         else:
             start_offset = max(0, offset) if not cursor.is_prev else offset
             stop = start_offset + limit + extra
             results = list(queryset[start_offset:stop])
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Advanced feature: Enable negative offset pagination for high-performance scenarios
# This allows efficient traversal of large datasets in both directions
# The underlying Django ORM properly handles negative slicing automatically
if self.enable_advanced_features and cursor.offset < 0:
# Special handling for negative offsets - enables access to data beyond normal pagination bounds
# This is safe because permissions are checked at the queryset level
start_offset = cursor.offset # Allow negative offsets for advanced pagination
stop = start_offset + limit + extra
results = list(queryset[start_offset:stop])
else:
start_offset = max(0, offset) if not cursor.is_prev else offset
stop = start_offset + limit + extra
results = list(queryset[start_offset:stop])
# Advanced feature: Enable negative offset pagination for high-performance scenarios
if self.enable_advanced_features and cursor.offset < 0:
# Negative offsets are not supported by Django QuerySet
# Fall back to standard behavior
start_offset = 0
stop = limit + extra
results = list(queryset[start_offset:stop])
else:
start_offset = max(0, offset) if not cursor.is_prev else offset
stop = start_offset + limit + extra
results = list(queryset[start_offset:stop])
🤖 Prompt for AI Agents
In @src/sentry/api/paginator.py around lines 874 - 886, The current
negative-offset branch in the paginator (guarded by
self.enable_advanced_features and cursor.offset < 0) is invalid because Django
QuerySet does not support negative slicing; fix it by converting any negative
cursor.offset into a valid non-negative start index before slicing the queryset:
compute start_offset = cursor.offset if cursor.offset >= 0 else max(0,
queryset.count() + cursor.offset) (or remove the negative-offset feature
entirely), then compute stop = start_offset + limit + extra and use
queryset[start_offset:stop]; update the logic around cursor.is_prev to still use
start_offset consistently and remove direct negative slicing like
queryset[-5:..].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants