feat(upsampling) - Support upsampled error count with performance optimizations#13
feat(upsampling) - Support upsampled error count with performance optimizations#13akshayutture-augment wants to merge 2 commits into
Conversation
…(#94376) Part of the Error Upsampling project: https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link Events-stats API will now check if all projects in the query are allowlisted for upsampling, and convert the count query to a sum over `sample_weight` in Snuba, this is done by defining a new SnQL function `upsampled_count()`. I noticed there are also eps() and epm() functions in use in this endpoint. I considered (and even worked on) also supporting swapping eps() and epm() which for correctness should probably also not count naively and use `sample_weight`, but this caused some complications and since they are only in use by specific dashboard widgets and not available in discover I decided to defer changing them until we realize it is needed.
- Add 60-second cache for upsampling eligibility checks to improve performance - Separate upsampling eligibility check from query transformation for better optimization - Remove unnecessary null checks in upsampled_count() function per schema requirements - Add cache invalidation utilities for configuration management This improves performance during high-traffic periods by avoiding repeated expensive allowlist lookups while maintaining data consistency.
|
@coderabbitai review |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
WalkthroughIntroduces error upsampling functionality to Sentry's event statistics system. Adds helper utilities to determine query eligibility based on organization allowlists and dataset type, applies transformations to rewrite count aggregations to upsampled_count, integrates these checks into the organization events stats endpoint, introduces a new upsampled_count function to the discover dataset, and includes comprehensive test coverage for the new functionality. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Endpoint as organization_events_stats<br/>endpoint
participant Helper as error_upsampling<br/>helper
participant Cache as Eligibility<br/>Cache
participant Snuba
Client->>Endpoint: GET /organization/events/stats
Endpoint->>Helper: is_errors_query_for_error_upsampled_projects()
Helper->>Cache: Check eligibility cache
alt Cache Miss
Cache-->>Helper: cache miss
Helper->>Helper: _are_all_projects_error_upsampled()
Helper->>Helper: _should_apply_sample_weight_transform()
Helper->>Cache: Store result (60s TTL)
else Cache Hit
Cache-->>Helper: cached eligibility
end
Helper-->>Endpoint: upsampling_enabled: bool
alt Upsampling Enabled
Endpoint->>Helper: transform_query_columns_for_error_upsampling()
Helper-->>Endpoint: count() → upsampled_count() as count
Endpoint->>Snuba: Query with transformed columns
else Upsampling Disabled
Endpoint->>Snuba: Query with original columns
end
Snuba-->>Endpoint: Results
Endpoint-->>Client: Stats response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (6)
sentry-repo (1)
49-64: Span link sanitization and dropped-attribute accounting look solidThe new
linkshandling,_sanitize_span_link, and corresponding test correctly preserve known attributes and accumulatesentry.dropped_attributes_count(including pre-existing drops). The JSON-encoding behavior is validated by the test and matches the intended storage model.If you want to harden this further, you could defensively coerce
sentry.dropped_attributes_counttointinside_sanitize_span_linkto avoid unexpected types from Relay input, but that's strictly optional.Also applies to: 74-100, 111-133
src/sentry/api/helpers/error_upsampling.py (1)
13-41: Error upsampling helpers are well-structured; a couple of small nitsThe overall flow (allowlist check + short cache, dataset gating, and
count()→upsampled_count()rewrite) looks coherent and matches the intended behavior.Two minor points to consider:
Unused
organizationparameter in_are_all_projects_error_upsampled
organizationis currently unused. If the allowlist is meant to be organization-scoped, you probably want:allowlist = options.get( "issues.client_error_sampling.project_allowlist", [], organization=organization, )Otherwise, consider dropping the argument to avoid confusion.
_is_error_focused_queryuses a raw substring match
Checking only"event.type:error" in queryis simple but will also treat queries like!event.type:erroror mixed-type queries as “error focused”. If you need stricter semantics later (e.g., “only errors”), you might want to tighten this to parse individual terms rather than doing a plain substring search.Neither issue is blocking, but they’re worth keeping in mind as the feature evolves.
Also applies to: 43-65, 67-100, 103-140
src/sentry/search/events/datasets/discover.py (1)
1041-1052:upsampled_countimplementation looks correct; tweak lambda arg to avoid lint noiseThe aggregate
toInt64(sum(sample_weight))matches the intended upsampled-count semantics and aligns with the transform intransform_query_columns_for_error_upsampling.You can silence the Ruff
ARG005warning and make intent clearer by ignoring the unusedargsparameter:- SnQLFunction( - "upsampled_count", - required_args=[], - # Optimized aggregation for error upsampling - assumes sample_weight - # exists for all events in allowlisted projects as per schema design - snql_aggregate=lambda args, alias: Function( - "toInt64", - [Function("sum", [Column("sample_weight")])], - alias, - ), - default_result_type="number", - ), + SnQLFunction( + "upsampled_count", + required_args=[], + # Optimized aggregation for error upsampling - assumes sample_weight + # exists for all events in allowlisted projects as per schema design + snql_aggregate=lambda _, alias: Function( + "toInt64", + [Function("sum", [Column("sample_weight")])], + alias, + ), + default_result_type="number", + ),tests/sentry/api/helpers/test_error_upsampling.py (1)
19-101: Helper tests are solid; consider a small cache/invalidation round-trip testThe tests nicely cover allowlist evaluation, column transformation (including case/whitespace), and dataset/query gating.
If you want to future-proof the caching logic, an extra unit test exercising
is_errors_query_for_error_upsampled_projectstogether withinvalidate_upsampling_cache(e.g., toggling the allowlist and asserting the cached result updates after invalidation) would help catch regressions in the memoization layer without relying solely on higher-level endpoint tests.tests/snuba/api/endpoints/test_organization_events_stats.py (2)
3554-3722: Consider expanding test coverage for edge cases.The current test suite covers basic scenarios well, but consider adding tests for:
- Events without
error_samplingcontext (fallback behavior)- Different sample rates (0.01, 0.5, 1.0)
- Edge case:
client_sample_rate = 0or= 1- Multiple yAxis parameters with upsampling (e.g.,
["count()", "count_unique(user)"])- Integration with
topEventsparameter- Comparison with
comparisonDeltaparameterThese additional test cases would ensure the error upsampling feature handles various scenarios correctly.
3617-3617: Clarify dataset parameter usage for consistency.The test at line 3687 explicitly specifies
"dataset": "discover"for transaction events, but other tests querying error events don't specify a dataset parameter. For clarity and consistency, consider explicitly specifying"dataset": "errors"in tests that query error events, or add a comment explaining the default dataset behavior.Also applies to: 3641-3641, 3687-3687, 3711-3711
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
pyproject.toml(2 hunks)sentry-repo(1 hunks)src/sentry/api/endpoints/organization_events_stats.py(6 hunks)src/sentry/api/helpers/error_upsampling.py(1 hunks)src/sentry/search/events/datasets/discover.py(1 hunks)src/sentry/testutils/factories.py(3 hunks)tests/sentry/api/helpers/test_error_upsampling.py(1 hunks)tests/snuba/api/endpoints/test_organization_events_stats.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.5)
src/sentry/api/helpers/error_upsampling.py
44-44: Unused function argument: organization
(ARG001)
src/sentry/search/events/datasets/discover.py
1046-1046: Unused lambda argument: args
(ARG005)
src/sentry/testutils/factories.py
351-352: try-except-pass detected, consider logging the exception
(S110)
351-351: Do not catch blind exception: Exception
(BLE001)
356-357: try-except-pass detected, consider logging the exception
(S110)
356-356: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (2)
pyproject.toml (1)
176-177: Aligning new helpers/tests with stronger typing is appropriateIncluding
sentry.api.helpers.error_upsamplingand its tests in the stricter mypy overrides is consistent with the rest of the configuration and should help keep this logic well-typed.Also applies to: 464-465
tests/snuba/api/endpoints/test_organization_events_stats.py (1)
3610-3621: Feature flags are not required for error upsampling.The error upsampling functionality is controlled via configuration options (allowlist), not feature flags. The helpers in
src/sentry/api/helpers/error_upsampling.pycheckoptions.get("issues.client_error_sampling.project_allowlist")and verify dataset/query type—there are no feature flag checks anywhere in this code path. The test methods correctly mock the options to simulate different allowlist states, making feature flag wrapping unnecessary. Unlike other tests in this file that feature-gate specific functionality, these tests verify option-controlled behavior, which is a different mechanism.Likely an incorrect or invalid review comment.
| from sentry.api.helpers.error_upsampling import ( | ||
| is_errors_query_for_error_upsampled_projects, | ||
| transform_query_columns_for_error_upsampling, | ||
| ) |
There was a problem hiding this comment.
Upsampling eligibility should use scoped_dataset, not outer dataset
The new wiring in _get_event_stats correctly centralizes the upsampling decision and applies the column transform right before query execution. However, there’s an important edge case:
should_upsample = is_errors_query_for_error_upsampled_projects(
snuba_params, organization, dataset, request
)Here you pass the outer dataset, even though _get_event_stats is parameterized by scoped_dataset and get_event_stats_factory explicitly calls it with different datasets (notably split_dataset = discover when splitting a metrics-backed dashboard widget into error vs transaction-like queries).
When the original dataset is a metrics dataset and split_dataset is discover, _should_apply_sample_weight_transform inside is_errors_query_for_error_upsampled_projects will see the metrics dataset and return False, so upsampling is silently disabled for those split error queries even though they’re executed against discover where sample_weight is available.
You likely want to base the eligibility on the dataset actually being queried:
- should_upsample = is_errors_query_for_error_upsampled_projects(
- snuba_params, organization, dataset, request
- )
+ should_upsample = is_errors_query_for_error_upsampled_projects(
+ snuba_params, organization, scoped_dataset, request
+ )This keeps behavior unchanged in the simple path (where scoped_dataset == dataset), while enabling correct upsampling for the additional split queries the factory issues.
Also applies to: 124-124, 209-298, 326-499
🤖 Prompt for AI Agents
In src/sentry/api/endpoints/organization_events_stats.py around lines 14-17,
124, 209-298, and 326-499, the calls to
is_errors_query_for_error_upsampled_projects (and any related upsampling
eligibility checks) incorrectly pass the outer dataset variable; change these
calls to pass scoped_dataset (the dataset actually being queried) so the
upsampling eligibility and subsequent sample_weight/column transform logic use
the query-specific dataset. Update any nearby uses that infer dataset for
transform or eligibility to use scoped_dataset, and run tests to confirm
behavior is unchanged when scoped_dataset == dataset but correct for split
queries (e.g., metrics->discover).
| from base64 import b64encode | ||
| from binascii import hexlify | ||
| from collections.abc import Mapping, Sequence | ||
| from collections.abc import Mapping, MutableMapping, Sequence |
There was a problem hiding this comment.
Tighten _set_sample_rate_from_error_sampling semantics and exception handling
The propagation of contexts.error_sampling.client_sample_rate into sample_rate is useful, but there are two edge cases worth fixing:
-
0sample rate is skipped
Usingif client_sample_rate:means a valid0(or"0") will be ignored. You likely want to treat0as a valid explicit value. -
Bare
except Exception: passhides malformed data
Swallowing all exceptions without logging makes it hard to notice bad payloads. Narrowing the exception types keeps tests robust without masking real problems.
A minimal adjustment that preserves resilience while fixing these issues:
-def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None:
- """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid."""
- client_sample_rate = None
- try:
- client_sample_rate = (
- normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate")
- )
- except Exception:
- pass
- if client_sample_rate:
- try:
- normalized_data["sample_rate"] = float(client_sample_rate)
- except Exception:
- pass
+def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None:
+ """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid."""
+ client_sample_rate = (
+ normalized_data.get("contexts", {})
+ .get("error_sampling", {})
+ .get("client_sample_rate")
+ )
+ if client_sample_rate is not None:
+ try:
+ normalized_data["sample_rate"] = float(client_sample_rate)
+ except (TypeError, ValueError):
+ # Ignore invalid values but don't break tests
+ passThe call in store_event then benefits from the stricter but still safe behavior without further changes.
Also applies to: 344-357, 1047-1050
🤖 Prompt for AI Agents
In src/sentry/testutils/factories.py around lines 11 (and also apply same
changes to 344-357 and 1047-1050), tighten
`_set_sample_rate_from_error_sampling` by treating explicit zero as a valid
sample rate (replace the truthy check like `if client_sample_rate:` with an
explicit None/empty check so values 0 and "0" are accepted) and remove the bare
`except Exception:` in favor of narrowly catching expected parsing/errors (e.g.,
ValueError, TypeError, KeyError) and surface or log the error instead of
silently passing; update the handling accordingly so malformed data does not get
silently swallowed while preserving resilience.
| "event_id": "a" * 32, | ||
| "message": "very bad", | ||
| "type": "error", | ||
| "exception": [{"type": "ValueError", "value": "Something went wrong 1"}], |
There was a problem hiding this comment.
Fix the exception field structure.
The exception field should be a dictionary with a values key, not a direct list. The correct format is:
"exception": {
"values": [{"type": "ValueError", "value": "Something went wrong 1"}]
}This matches the Sentry event schema and the pattern used elsewhere in this file (see lines 2479-2487).
Apply this diff:
- "exception": [{"type": "ValueError", "value": "Something went wrong 1"}],
+ "exception": {
+ "values": [{"type": "ValueError", "value": "Something went wrong 1"}]
+ },And similarly for line 3588:
- "exception": [{"type": "ValueError", "value": "Something went wrong 2"}],
+ "exception": {
+ "values": [{"type": "ValueError", "value": "Something went wrong 2"}]
+ },Also applies to: 3588-3588
🤖 Prompt for AI Agents
In tests/snuba/api/endpoints/test_organization_events_stats.py around lines 3575
and 3588, the test uses "exception" as a direct list but it must be an object
with a "values" key per Sentry schema; change each "exception": [ ... ] to
"exception": {"values": [ ... ]} so the exception payload matches the expected
structure used elsewhere in the file.
| assert data[0][1][0]["count"] == 10 # First bucket has 1 event | ||
| assert data[1][1][0]["count"] == 10 # Second bucket has 1 event |
There was a problem hiding this comment.
Clarify misleading comments.
The comments state "First bucket has 1 event" and "Second bucket has 1 event", but the assertions expect count == 10. The comments should clarify that each bucket contains 1 sampled event that is upsampled 10x due to the client_sample_rate: 0.1.
Apply this diff:
- assert data[0][1][0]["count"] == 10 # First bucket has 1 event
- assert data[1][1][0]["count"] == 10 # Second bucket has 1 event
+ assert data[0][1][0]["count"] == 10 # First bucket: 1 event upsampled 10x (sample_rate=0.1)
+ assert data[1][1][0]["count"] == 10 # Second bucket: 1 event upsampled 10x (sample_rate=0.1)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| assert data[0][1][0]["count"] == 10 # First bucket has 1 event | |
| assert data[1][1][0]["count"] == 10 # Second bucket has 1 event | |
| assert data[0][1][0]["count"] == 10 # First bucket: 1 event upsampled 10x (sample_rate=0.1) | |
| assert data[1][1][0]["count"] == 10 # Second bucket: 1 event upsampled 10x (sample_rate=0.1) |
🤖 Prompt for AI Agents
In tests/snuba/api/endpoints/test_organization_events_stats.py around lines 3626
to 3627, the inline comments are misleading: they say "First/Second bucket has 1
event" while the assertions expect count == 10 due to upsampling from
client_sample_rate: 0.1. Update the comments to explain that each bucket
contains 1 sampled event which is upsampled by a factor of 10
(client_sample_rate 0.1) resulting in count == 10, so replace the existing
comments with a concise note describing sampled event + upsampling.
Test 3
Summary by CodeRabbit
New Features
upsampled_countaggregation function for discovering upsampled error metrics.Tests
✏️ Tip: You can customize this high-level summary in your review settings.
Replicated from ai-code-review-evaluation/sentry-coderabbit#3