Skip to content

feat(upsampling) - Support upsampled error count with performance optimizations#8

Open
ShashankFC wants to merge 2 commits into
masterfrom
error-upsampling-race-condition
Open

feat(upsampling) - Support upsampled error count with performance optimizations#8
ShashankFC wants to merge 2 commits into
masterfrom
error-upsampling-race-condition

Conversation

@ShashankFC

@ShashankFC ShashankFC commented Feb 13, 2026

Copy link
Copy Markdown

User description

Test 3

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced error upsampling support for organization event statistics queries, improving accuracy of error metrics for eligible projects through sample-weighted counting.
    • Upsampling applies selectively based on project allowlisting and query type.
  • Tests

    • Added comprehensive test coverage for error upsampling across various project allowlist scenarios and event dataset types.

✏️ Tip: You can customize this high-level summary in your review settings.


Replicated from ai-code-review-evaluation/sentry-coderabbit#3


CodeAnt-AI Description

Support error upsampling in organization event statistics with cached eligibility checks

What Changed

  • When all queried projects are allowlisted and the query targets error events, count() aggregations in the organization events-stats endpoint are replaced with an upsampled count that sums sample_weight, so sampled error events are counted at their effective weight.
  • The allowlist eligibility check is cached for 60 seconds to avoid repeated expensive lookups and improve query performance during bursts.
  • A new SnQL aggregation (upsampled_count) is available in discover to compute the sum of sample_weight for upsampled counts.
  • Event creation in test factories preserves client-provided error sampling rates so stored events can be upsampled in tests.
  • Comprehensive tests cover: all-projects-allowlisted behavior, partial/no-allowlist fallbacks, and correct handling of transaction vs. error queries.

Impact

✅ Accurate error counts for allowlisted sampled events
✅ Unchanged counts when not all projects are allowlisted
✅ Lower overhead for repeated allowlist checks during high-traffic queries

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

yuvmen and others added 2 commits July 25, 2025 09:48
…(#94376)

Part of the Error Upsampling project:
https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link

Events-stats API will now check if all projects in the query are
allowlisted for upsampling, and convert the count query to a sum over
`sample_weight` in Snuba, this is done by defining a new SnQL function
`upsampled_count()`.

I noticed there are also eps() and epm() functions in use in this
endpoint. I considered (and even worked on) also supporting
swapping eps() and epm() which for correctness should probably also not
count naively and use `sample_weight`, but this
caused some complications and since they are only in use by specific
dashboard widgets and not available in discover
I decided to defer changing them until we realize it is needed.
- Add 60-second cache for upsampling eligibility checks to improve performance
- Separate upsampling eligibility check from query transformation for better optimization
- Remove unnecessary null checks in upsampled_count() function per schema requirements
- Add cache invalidation utilities for configuration management

This improves performance during high-traffic periods by avoiding repeated
expensive allowlist lookups while maintaining data consistency.
@ShashankFC

Copy link
Copy Markdown
Author

@CodeAnt-AI: review

@codeant-ai

codeant-ai Bot commented Feb 13, 2026

Copy link
Copy Markdown

CodeAnt AI is running the review.

@codeant-ai codeant-ai Bot added the size:L This PR changes 100-499 lines, ignoring generated files label Feb 13, 2026
@codeant-ai

codeant-ai Bot commented Feb 13, 2026

Copy link
Copy Markdown

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Fragile aggregation detection
    The column transform logic only matches exact count() (case-insensitive after lower/strip) and replaces it with upsampled_count() as count. This misses common forms like count() AS cnt, COUNT() as count, count(column), or count() as "count", and it discards any original alias. This can cause incorrect query rewriting or broken aliases in results.

  • Missing null fallback
    The new upsampled_count aggregation sums sample_weight directly. If sample_weight is NULL for any events (e.g. non-allowlisted projects or older rows), the sum could be NULL or yield unexpected results. The aggregator should match existing semantics (fall back to 1 when sample_weight is missing) to preserve correct counts.

  • Integer cast/truncation
    The aggregator wraps the sum in toInt64, which may truncate fractional sample weights. Summed sample weights are often fractional and should be represented as a float (or rounded intentionally). Confirm whether truncation is intended; otherwise use a float cast (e.g., toFloat64) or preserve numeric precision.

  • Allowlist membership types
    The allowlist is read from options.get(...) and used with project_id in allowlist. If the stored allowlist contains strings (e.g., config serialized values) while project_id is an int, the membership test will fail. Consider normalizing types (cast allowlist entries to ints) before membership checks.

  • Missing query-level assertion
    Tests check the endpoint response counts but don't assert that the underlying SnQL aggregation was changed to the upsampled aggregation (sum of sample_weight / upsampled_count). This could miss regressions where counts appear correct by coincidence but the query uses the wrong aggregation.

# Early upsampling eligibility check for performance optimization
# This cached result ensures consistent behavior across query execution
should_upsample = is_errors_query_for_error_upsampled_projects(
snuba_params, organization, dataset, request

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The upsampling eligibility check currently always uses the outer dataset value instead of the per-call scoped_dataset, so in dashboard widget split flows (where _get_event_stats is invoked with different datasets like discover for error-only queries) error upsampling may incorrectly be disabled or enabled for the actual query being run; this should use scoped_dataset so the decision matches the dataset of each query execution. [logic error]

Severity Level: Critical 🚨
- ❌ Error dashboards undercount events for upsampled error projects.
- ⚠️ `sentry-api-0-organization-events-stats` misreports metricsEnhanced error series.
- ⚠️ Dashboard widgets using metricsEnhanced errors show inconsistent counts.
Suggested change
snuba_params, organization, dataset, request
snuba_params, organization, scoped_dataset, request
Steps of Reproduction ✅
1. Trigger the metrics-enhanced events stats endpoint using the existing tests at
`tests/snuba/api/endpoints/test_organization_events_stats_mep.py`, for example
`OrganizationEventsStatsMetricsEnhancedPerformanceEndpointTest.test_split_decision_for_top_events_errors_widget`
(around lines 859–883), which issues a GET to the `sentry-api-0-organization-events-stats`
endpoint with `dataset=metricsEnhanced`, `topEvents=5`, and an error-focused query
(`field=[\"count()\",\"error.type\"]`, `query=\"error.type:test_error\"`) and passes
`dashboardWidgetId=<widget id>`.

2. The request is handled by `OrganizationEventsStatsEndpoint.get` in
`src/sentry/api/endpoints/organization_events_stats.py` (lines 105–195). It resolves
`dataset` to the metrics-enhanced dataset (e.g. `metrics_enhanced_performance`), sets
`use_rpc`/`use_metrics` flags, then constructs `get_event_stats =
get_event_stats_factory(dataset)` at line 499, closing over this outer `dataset` variable.

3. Inside `get_event_stats_factory` (lines 326–497), because `metrics_enhanced` is true
and `dashboardWidgetId` is provided, the inner `fn` (lines 335–495) takes the dashboard
widget path: it computes `errors_only_query` and calls `_get_event_stats(discover,
query_columns, errors_only_query, ...)` at lines 382–394, i.e. with `scoped_dataset` equal
to the `discover` dataset module while the closed-over `dataset` remains the
metrics-enhanced dataset module.

4. In `_get_event_stats` (definition at lines 209–324), when invoked for this error-only
query, it runs the existing code at lines 218–222: `should_upsample =
is_errors_query_for_error_upsampled_projects(snuba_params, organization, dataset,
request)`. The helper `is_errors_query_for_error_upsampled_projects` (in
`src/sentry/api/helpers/error_upsampling.py:13–40`) calls
`_should_apply_sample_weight_transform(dataset, request)` (line 32), which checks the
*dataset module* (lines 103–127) and only returns true for `errors` or an error-focused
`discover` query. Because `_get_event_stats` passes the outer `dataset`
(metrics-enhanced), not the per-call `scoped_dataset` (`discover`),
`_should_apply_sample_weight_transform` returns False even when the actual query is
against `discover` with an error-only filter and all projects are allowlisted for
upsampling, so `should_upsample` is False and upsampling is incorrectly disabled for these
split error queries.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/sentry/api/endpoints/organization_events_stats.py
**Line:** 221:221
**Comment:**
	*Logic Error: The upsampling eligibility check currently always uses the outer `dataset` value instead of the per-call `scoped_dataset`, so in dashboard widget split flows (where `_get_event_stats` is invoked with different datasets like `discover` for error-only queries) error upsampling may incorrectly be disabled or enabled for the actual query being run; this should use `scoped_dataset` so the decision matches the dataset of each query execution.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
👍 | 👎

@codeant-ai

codeant-ai Bot commented Feb 13, 2026

Copy link
Copy Markdown

CodeAnt AI finished running the review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants