feat(upsampling) - Support upsampled error count with performance optimizations by akshayutture-augment · Pull Request #13 · ai-code-review-evaluations/coderabbit-sentry

akshayutture-augment · 2025-11-14T23:42:42Z

Test 3

Summary by CodeRabbit

New Features
- Added error upsampling functionality for improved query performance on error events. The system now intelligently applies sample-weight transformations based on per-organization project allowlists to scale aggregated error metrics appropriately.
- Introduced new upsampled_count aggregation function for discovering upsampled error metrics.
Tests
- Added comprehensive test coverage for error upsampling behavior including allowlist validation and dataset-specific handling.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Replicated from ai-code-review-evaluation/sentry-coderabbit#3

…(#94376) Part of the Error Upsampling project: https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link Events-stats API will now check if all projects in the query are allowlisted for upsampling, and convert the count query to a sum over `sample_weight` in Snuba, this is done by defining a new SnQL function `upsampled_count()`. I noticed there are also eps() and epm() functions in use in this endpoint. I considered (and even worked on) also supporting swapping eps() and epm() which for correctness should probably also not count naively and use `sample_weight`, but this caused some complications and since they are only in use by specific dashboard widgets and not available in discover I decided to defer changing them until we realize it is needed.

- Add 60-second cache for upsampling eligibility checks to improve performance - Separate upsampling eligibility check from query transformation for better optimization - Remove unnecessary null checks in upsampled_count() function per schema requirements - Add cache invalidation utilities for configuration management This improves performance during high-traffic periods by avoiding repeated expensive allowlist lookups while maintaining data consistency.

akshayutture-augment · 2025-11-14T23:47:39Z

@coderabbitai review

akshayutture-augment · 2025-11-18T20:48:20Z

@coderabbitai review

coderabbitai · 2025-11-18T20:48:25Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2025-11-18T20:48:42Z

Walkthrough

Introduces error upsampling functionality to Sentry's event statistics system. Adds helper utilities to determine query eligibility based on organization allowlists and dataset type, applies transformations to rewrite count aggregations to upsampled_count, integrates these checks into the organization events stats endpoint, introduces a new upsampled_count function to the discover dataset, and includes comprehensive test coverage for the new functionality.

Changes

Cohort / File(s)	Summary
Configuration & Build `pyproject.toml`, `sentry-repo`	Updated mypy overrides and test references for error_upsampling modules; minor git submodule version update.
Error Upsampling Core `src/sentry/api/helpers/error_upsampling.py`	New module providing error upsampling helper functions: eligibility checking via organization allowlists with 60-second caching, query column transformation (count → upsampled_count), sample weight decision logic for datasets (errors/transactions/discover), and cache invalidation utilities.
Query Transformation & Integration `src/sentry/api/endpoints/organization_events_stats.py`, `src/sentry/search/events/datasets/discover.py`	Applies upsampling eligibility checks and column transformations in organization events stats endpoint across RPC and non-RPC query paths; adds upsampled_count aggregation function to discover dataset config.
Test Infrastructure `src/sentry/testutils/factories.py`	Adds helper to propagate error_sampling.client_sample_rate to normalized event data in store_event flow.
Test Coverage `tests/sentry/api/helpers/test_error_upsampling.py`, `tests/snuba/api/endpoints/test_organization_events_stats.py`	Comprehensive test suite for error upsampling eligibility, column transformation logic, query type detection, sample weight application rules, and end-to-end upsampling behavior across allowlist scenarios.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Endpoint as organization_events_stats<br/>endpoint
    participant Helper as error_upsampling<br/>helper
    participant Cache as Eligibility<br/>Cache
    participant Snuba
    
    Client->>Endpoint: GET /organization/events/stats
    Endpoint->>Helper: is_errors_query_for_error_upsampled_projects()
    Helper->>Cache: Check eligibility cache
    alt Cache Miss
        Cache-->>Helper: cache miss
        Helper->>Helper: _are_all_projects_error_upsampled()
        Helper->>Helper: _should_apply_sample_weight_transform()
        Helper->>Cache: Store result (60s TTL)
    else Cache Hit
        Cache-->>Helper: cached eligibility
    end
    Helper-->>Endpoint: upsampling_enabled: bool
    
    alt Upsampling Enabled
        Endpoint->>Helper: transform_query_columns_for_error_upsampling()
        Helper-->>Endpoint: count() → upsampled_count() as count
        Endpoint->>Snuba: Query with transformed columns
    else Upsampling Disabled
        Endpoint->>Snuba: Query with original columns
    end
    
    Snuba-->>Endpoint: Results
    Endpoint-->>Client: Stats response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

error_upsampling.py: Verify caching logic correctness, cache invalidation coverage, and decision-making across dataset types (errors/transactions/discover)
organization_events_stats.py: Confirm upsampling transformations applied at all query path junctures (RPC and non-RPC, before each query execution)
discover.py: Validate upsampled_count aggregation function definition and behavior (sample_weight summation via toInt64)
Test comprehensiveness: Ensure all allowlist scenarios, query types, and edge cases are covered across both unit and integration tests

Poem

A rabbit hops through Snuba's fields so bright, 🐰
Where upsampled counts now dance in light,
Sample weights ascend with cache so keen,
Error events glow in queries clean,
Allowlists guide the path just right!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 34.78% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly describes the main change: implementing support for upsampled error counts with performance optimizations, which aligns with the core functionality added across multiple files.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch error-upsampling-race-condition

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

akshayutture-augment · 2025-11-22T01:32:53Z

@coderabbitai review

coderabbitai · 2025-11-22T01:32:58Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (6)

sentry-repo (1)

49-64: Span link sanitization and dropped-attribute accounting look solid

The new links handling, _sanitize_span_link, and corresponding test correctly preserve known attributes and accumulate sentry.dropped_attributes_count (including pre-existing drops). The JSON-encoding behavior is validated by the test and matches the intended storage model.

If you want to harden this further, you could defensively coerce sentry.dropped_attributes_count to int inside _sanitize_span_link to avoid unexpected types from Relay input, but that's strictly optional.

Also applies to: 74-100, 111-133
src/sentry/api/helpers/error_upsampling.py (1)
13-41: Error upsampling helpers are well-structured; a couple of small nits

The overall flow (allowlist check + short cache, dataset gating, and count() → upsampled_count() rewrite) looks coherent and matches the intended behavior.

Two minor points to consider:
Unused organization parameter in _are_all_projects_error_upsampled
organization is currently unused. If the allowlist is meant to be organization-scoped, you probably want:
allowlist = options.get(
    "issues.client_error_sampling.project_allowlist",
    [],
    organization=organization,
)
Otherwise, consider dropping the argument to avoid confusion.
_is_error_focused_query uses a raw substring match
Checking only "event.type:error" in query is simple but will also treat queries like !event.type:error or mixed-type queries as “error focused”. If you need stricter semantics later (e.g., “only errors”), you might want to tighten this to parse individual terms rather than doing a plain substring search.
Neither issue is blocking, but they’re worth keeping in mind as the feature evolves.

Also applies to: 43-65, 67-100, 103-140
src/sentry/search/events/datasets/discover.py (1)
1041-1052: upsampled_count implementation looks correct; tweak lambda arg to avoid lint noise

The aggregate toInt64(sum(sample_weight)) matches the intended upsampled-count semantics and aligns with the transform in transform_query_columns_for_error_upsampling.

You can silence the Ruff ARG005 warning and make intent clearer by ignoring the unused args parameter:
-                SnQLFunction(
-                    "upsampled_count",
-                    required_args=[],
-                    # Optimized aggregation for error upsampling - assumes sample_weight
-                    # exists for all events in allowlisted projects as per schema design
-                    snql_aggregate=lambda args, alias: Function(
-                        "toInt64",
-                        [Function("sum", [Column("sample_weight")])],
-                        alias,
-                    ),
-                    default_result_type="number",
-                ),
+                SnQLFunction(
+                    "upsampled_count",
+                    required_args=[],
+                    # Optimized aggregation for error upsampling - assumes sample_weight
+                    # exists for all events in allowlisted projects as per schema design
+                    snql_aggregate=lambda _, alias: Function(
+                        "toInt64",
+                        [Function("sum", [Column("sample_weight")])],
+                        alias,
+                    ),
+                    default_result_type="number",
+                ),
tests/sentry/api/helpers/test_error_upsampling.py (1)

19-101: Helper tests are solid; consider a small cache/invalidation round-trip test

The tests nicely cover allowlist evaluation, column transformation (including case/whitespace), and dataset/query gating.

If you want to future-proof the caching logic, an extra unit test exercising is_errors_query_for_error_upsampled_projects together with invalidate_upsampling_cache (e.g., toggling the allowlist and asserting the cached result updates after invalidation) would help catch regressions in the memoization layer without relying solely on higher-level endpoint tests.

tests/snuba/api/endpoints/test_organization_events_stats.py (2)

3554-3722: Consider expanding test coverage for edge cases.

The current test suite covers basic scenarios well, but consider adding tests for:

Events without error_sampling context (fallback behavior)

Different sample rates (0.01, 0.5, 1.0)

Edge case: client_sample_rate = 0 or = 1

Multiple yAxis parameters with upsampling (e.g., ["count()", "count_unique(user)"])

Integration with topEvents parameter

Comparison with comparisonDelta parameter

These additional test cases would ensure the error upsampling feature handles various scenarios correctly.

3617-3617: Clarify dataset parameter usage for consistency.

The test at line 3687 explicitly specifies "dataset": "discover" for transaction events, but other tests querying error events don't specify a dataset parameter. For clarity and consistency, consider explicitly specifying "dataset": "errors" in tests that query error events, or add a comment explaining the default dataset behavior.

Also applies to: 3641-3641, 3687-3687, 3711-3711

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a440b4 and 6ad6fe3.

📒 Files selected for processing (8)

pyproject.toml (2 hunks)
sentry-repo (1 hunks)
src/sentry/api/endpoints/organization_events_stats.py (6 hunks)
src/sentry/api/helpers/error_upsampling.py (1 hunks)
src/sentry/search/events/datasets/discover.py (1 hunks)
src/sentry/testutils/factories.py (3 hunks)
tests/sentry/api/helpers/test_error_upsampling.py (1 hunks)
tests/snuba/api/endpoints/test_organization_events_stats.py (1 hunks)

🧰 Additional context used

🪛 Ruff (0.14.5)

src/sentry/api/helpers/error_upsampling.py

44-44: Unused function argument: organization

(ARG001)

src/sentry/search/events/datasets/discover.py

1046-1046: Unused lambda argument: args

(ARG005)

src/sentry/testutils/factories.py

351-352: try-except-pass detected, consider logging the exception

(S110)

351-351: Do not catch blind exception: Exception

(BLE001)

356-357: try-except-pass detected, consider logging the exception

(S110)

356-356: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (2)

pyproject.toml (1)

176-177: Aligning new helpers/tests with stronger typing is appropriate

Including sentry.api.helpers.error_upsampling and its tests in the stricter mypy overrides is consistent with the rest of the configuration and should help keep this logic well-typed.

Also applies to: 464-465

tests/snuba/api/endpoints/test_organization_events_stats.py (1)

3610-3621: Feature flags are not required for error upsampling.

The error upsampling functionality is controlled via configuration options (allowlist), not feature flags. The helpers in src/sentry/api/helpers/error_upsampling.py check options.get("issues.client_error_sampling.project_allowlist") and verify dataset/query type—there are no feature flag checks anywhere in this code path. The test methods correctly mock the options to simulate different allowlist states, making feature flag wrapping unnecessary. Unlike other tests in this file that feature-gate specific functionality, these tests verify option-controlled behavior, which is a different mechanism.

Likely an incorrect or invalid review comment.

coderabbitai · 2025-11-22T01:37:28Z

+from sentry.api.helpers.error_upsampling import (
+    is_errors_query_for_error_upsampled_projects,
+    transform_query_columns_for_error_upsampling,
+)


⚠️ Potential issue | 🟠 Major

Upsampling eligibility should use scoped_dataset, not outer dataset

The new wiring in _get_event_stats correctly centralizes the upsampling decision and applies the column transform right before query execution. However, there’s an important edge case:

should_upsample = is_errors_query_for_error_upsampled_projects( snuba_params, organization, dataset, request )

Here you pass the outer dataset, even though _get_event_stats is parameterized by scoped_dataset and get_event_stats_factory explicitly calls it with different datasets (notably split_dataset = discover when splitting a metrics-backed dashboard widget into error vs transaction-like queries).

When the original dataset is a metrics dataset and split_dataset is discover, _should_apply_sample_weight_transform inside is_errors_query_for_error_upsampled_projects will see the metrics dataset and return False, so upsampling is silently disabled for those split error queries even though they’re executed against discover where sample_weight is available.

You likely want to base the eligibility on the dataset actually being queried:

- should_upsample = is_errors_query_for_error_upsampled_projects( - snuba_params, organization, dataset, request - ) + should_upsample = is_errors_query_for_error_upsampled_projects( + snuba_params, organization, scoped_dataset, request + )

This keeps behavior unchanged in the simple path (where scoped_dataset == dataset), while enabling correct upsampling for the additional split queries the factory issues.

Also applies to: 124-124, 209-298, 326-499

🤖 Prompt for AI Agents

In src/sentry/api/endpoints/organization_events_stats.py around lines 14-17, 124, 209-298, and 326-499, the calls to is_errors_query_for_error_upsampled_projects (and any related upsampling eligibility checks) incorrectly pass the outer dataset variable; change these calls to pass scoped_dataset (the dataset actually being queried) so the upsampling eligibility and subsequent sample_weight/column transform logic use the query-specific dataset. Update any nearby uses that infer dataset for transform or eligibility to use scoped_dataset, and run tests to confirm behavior is unchanged when scoped_dataset == dataset but correct for split queries (e.g., metrics->discover).

coderabbitai · 2025-11-22T01:37:28Z

 from base64 import b64encode
 from binascii import hexlify
-from collections.abc import Mapping, Sequence
+from collections.abc import Mapping, MutableMapping, Sequence


⚠️ Potential issue | 🟡 Minor

Tighten _set_sample_rate_from_error_sampling semantics and exception handling

The propagation of contexts.error_sampling.client_sample_rate into sample_rate is useful, but there are two edge cases worth fixing:

0 sample rate is skipped
Using if client_sample_rate: means a valid 0 (or "0") will be ignored. You likely want to treat 0 as a valid explicit value.

Bare except Exception: pass hides malformed data
Swallowing all exceptions without logging makes it hard to notice bad payloads. Narrowing the exception types keeps tests robust without masking real problems.

A minimal adjustment that preserves resilience while fixing these issues:

-def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None: - """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid.""" - client_sample_rate = None - try: - client_sample_rate = ( - normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate") - ) - except Exception: - pass - if client_sample_rate: - try: - normalized_data["sample_rate"] = float(client_sample_rate) - except Exception: - pass +def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None: + """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid.""" + client_sample_rate = ( + normalized_data.get("contexts", {}) + .get("error_sampling", {}) + .get("client_sample_rate") + ) + if client_sample_rate is not None: + try: + normalized_data["sample_rate"] = float(client_sample_rate) + except (TypeError, ValueError): + # Ignore invalid values but don't break tests + pass

The call in store_event then benefits from the stricter but still safe behavior without further changes.

Also applies to: 344-357, 1047-1050

🤖 Prompt for AI Agents

In src/sentry/testutils/factories.py around lines 11 (and also apply same changes to 344-357 and 1047-1050), tighten `_set_sample_rate_from_error_sampling` by treating explicit zero as a valid sample rate (replace the truthy check like `if client_sample_rate:` with an explicit None/empty check so values 0 and "0" are accepted) and remove the bare `except Exception:` in favor of narrowly catching expected parsing/errors (e.g., ValueError, TypeError, KeyError) and surface or log the error instead of silently passing; update the handling accordingly so malformed data does not get silently swallowed while preserving resilience.

coderabbitai · 2025-11-22T01:37:28Z

+                "event_id": "a" * 32,
+                "message": "very bad",
+                "type": "error",
+                "exception": [{"type": "ValueError", "value": "Something went wrong 1"}],


⚠️ Potential issue | 🔴 Critical

Fix the exception field structure.

The exception field should be a dictionary with a values key, not a direct list. The correct format is:

"exception": { "values": [{"type": "ValueError", "value": "Something went wrong 1"}] }

This matches the Sentry event schema and the pattern used elsewhere in this file (see lines 2479-2487).

Apply this diff:

- "exception": [{"type": "ValueError", "value": "Something went wrong 1"}], + "exception": { + "values": [{"type": "ValueError", "value": "Something went wrong 1"}] + },

And similarly for line 3588:

- "exception": [{"type": "ValueError", "value": "Something went wrong 2"}], + "exception": { + "values": [{"type": "ValueError", "value": "Something went wrong 2"}] + },

Also applies to: 3588-3588

🤖 Prompt for AI Agents

In tests/snuba/api/endpoints/test_organization_events_stats.py around lines 3575 and 3588, the test uses "exception" as a direct list but it must be an object with a "values" key per Sentry schema; change each "exception": [ ... ] to "exception": {"values": [ ... ]} so the exception payload matches the expected structure used elsewhere in the file.

coderabbitai · 2025-11-22T01:37:28Z

+        assert data[0][1][0]["count"] == 10  # First bucket has 1 event
+        assert data[1][1][0]["count"] == 10  # Second bucket has 1 event


⚠️ Potential issue | 🟡 Minor

Clarify misleading comments.

The comments state "First bucket has 1 event" and "Second bucket has 1 event", but the assertions expect count == 10. The comments should clarify that each bucket contains 1 sampled event that is upsampled 10x due to the client_sample_rate: 0.1.

Apply this diff:

- assert data[0][1][0]["count"] == 10 # First bucket has 1 event - assert data[1][1][0]["count"] == 10 # Second bucket has 1 event + assert data[0][1][0]["count"] == 10 # First bucket: 1 event upsampled 10x (sample_rate=0.1) + assert data[1][1][0]["count"] == 10 # Second bucket: 1 event upsampled 10x (sample_rate=0.1)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert data[0][1][0]["count"] == 10 # First bucket has 1 event

assert data[1][1][0]["count"] == 10 # Second bucket has 1 event

assert data[0][1][0]["count"] == 10 # First bucket: 1 event upsampled 10x (sample_rate=0.1)

assert data[1][1][0]["count"] == 10 # Second bucket: 1 event upsampled 10x (sample_rate=0.1)

🤖 Prompt for AI Agents

In tests/snuba/api/endpoints/test_organization_events_stats.py around lines 3626 to 3627, the inline comments are misleading: they say "First/Second bucket has 1 event" while the assertions expect count == 10 due to upsampling from client_sample_rate: 0.1. Update the comments to explain that each bucket contains 1 sampled event which is upsampled by a factor of 10 (client_sample_rate 0.1) resulting in count == 10, so replace the existing comments with a concise note describing sampled event + upsampling.

yuvmen and others added 2 commits July 25, 2025 09:48

ai-code-review-evaluations deleted a comment from coderabbitai Bot Nov 18, 2025

coderabbitai Bot reviewed Nov 22, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(upsampling) - Support upsampled error count with performance optimizations#13

feat(upsampling) - Support upsampled error count with performance optimizations#13
akshayutture-augment wants to merge 2 commits into
masterfrom
error-upsampling-race-condition

akshayutture-augment commented Nov 14, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

akshayutture-augment commented Nov 14, 2025

Uh oh!

akshayutture-augment commented Nov 18, 2025

Uh oh!

coderabbitai Bot commented Nov 18, 2025

Uh oh!

coderabbitai Bot commented Nov 18, 2025 •

edited

Loading

Uh oh!

akshayutture-augment commented Nov 22, 2025

Uh oh!

coderabbitai Bot commented Nov 22, 2025

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Nov 22, 2025

Uh oh!

coderabbitai Bot Nov 22, 2025

Uh oh!

coderabbitai Bot Nov 22, 2025

Uh oh!

coderabbitai Bot Nov 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		assert data[0][1][0]["count"] == 10 # First bucket has 1 event
		assert data[1][1][0]["count"] == 10 # Second bucket has 1 event

Conversation

akshayutture-augment commented Nov 14, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

akshayutture-augment commented Nov 14, 2025

Uh oh!

akshayutture-augment commented Nov 18, 2025

Uh oh!

coderabbitai Bot commented Nov 18, 2025

Uh oh!

coderabbitai Bot commented Nov 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

akshayutture-augment commented Nov 22, 2025

Uh oh!

coderabbitai Bot commented Nov 22, 2025

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Nov 22, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Nov 22, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Nov 22, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Nov 22, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

akshayutture-augment commented Nov 14, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Nov 18, 2025 •

edited

Loading