feat(upsampling) - Support upsampled error count with performance optimizations by ShashankFC · Pull Request #3 · AI-Code-Review-Evals/claude_code-sentry

ShashankFC · 2025-12-31T09:46:16Z

Test 3nn---nReplicated from ai-code-review-evaluation/sentry-greptile#3

…(#94376) Part of the Error Upsampling project: https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link Events-stats API will now check if all projects in the query are allowlisted for upsampling, and convert the count query to a sum over `sample_weight` in Snuba, this is done by defining a new SnQL function `upsampled_count()`. I noticed there are also eps() and epm() functions in use in this endpoint. I considered (and even worked on) also supporting swapping eps() and epm() which for correctness should probably also not count naively and use `sample_weight`, but this caused some complications and since they are only in use by specific dashboard widgets and not available in discover I decided to defer changing them until we realize it is needed.

- Add 60-second cache for upsampling eligibility checks to improve performance - Separate upsampling eligibility check from query transformation for better optimization - Remove unnecessary null checks in upsampled_count() function per schema requirements - Add cache invalidation utilities for configuration management This improves performance during high-traffic periods by avoiding repeated expensive allowlist lookups while maintaining data consistency.

ShashankFC · 2026-01-07T19:13:48Z

@claude

claude · 2026-01-07T19:14:03Z

Claude finished @ShashankFC's task in 7m 14s —— View job

PR Review

Todo List

Gather context and review PR changes
Read modified and new files
Analyze implementation for issues and improvements
Provide comprehensive review feedback

Overview

This PR implements error upsampling functionality for the events-stats API with performance optimizations including caching. The implementation allows Sentry to report upsampled error counts based on sample_weight for allowlisted projects.

Critical Issues

1. Race Condition in Cache Implementation ⚠️

Location: src/sentry/api/helpers/error_upsampling.py:27-40

There's a critical race condition in the caching logic that matches the PR branch name error-upsampling-race-condition:

cached_result = cache.get(cache_key)
if cached_result is not None:
    return cached_result and _should_apply_sample_weight_transform(dataset, request)

# Cache miss - perform fresh allowlist check
is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization)
cache.set(cache_key, is_eligible, 60)

Problem: The function _are_all_projects_error_upsampled() (lines 50-54) reads the allowlist configuration fresh every time it's called, even though the result is cached. This creates a window where:

Thread A checks cache (miss), reads allowlist (returns True)
Thread B checks cache (miss), reads allowlist (returns True)
Allowlist configuration changes (project removed from allowlist)
Thread A caches True
Thread B caches True
Both continue with stale data for 60 seconds

Additionally, there's a second race condition between the eligibility check and the dataset check:

Eligibility is checked and cached (project is allowlisted)
_should_apply_sample_weight_transform is called later
Configuration could change between these two checks

Impact: Events could be incorrectly upsampled for projects that were just removed from the allowlist, or vice versa, for up to 60 seconds.

Recommendation:

# Option 1: Cache the complete decision
should_upsample = is_errors_query_for_error_upsampled_projects(...)
# Store this boolean and use it consistently throughout request

# Option 2: Document this as acceptable eventual consistency
# Add comment explaining the 60-second window is intentional

Fix race condition →

2. Inconsistent Cache Key Generation

Location: src/sentry/api/helpers/error_upsampling.py:27 and error_upsampling.py:73

The cache key uses hash(tuple(sorted(project_ids))) which can have collisions:

cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"

Problem: Python's hash() is not cryptographically secure and can produce collisions, especially for similar project ID combinations. Different sets of project IDs could hash to the same value, causing incorrect cache hits.

Recommendation: Use a deterministic string representation instead:

cache_key = f"error_upsampling_eligible:{organization.id}:{','.join(map(str, sorted(snuba_params.project_ids)))}"

Fix cache key →

3. Missing Error Handling

Location: src/sentry/api/endpoints/organization_events_stats.py:220-226

The upsampling check has no error handling:

should_upsample = is_errors_query_for_error_upsampled_projects(
    snuba_params, organization, dataset, request
)

Problem: If the options service fails, cache fails, or any exception occurs, the entire query will fail. This could cause a production outage.

Recommendation: Add defensive error handling with fallback to non-upsampled queries:

try:
    should_upsample = is_errors_query_for_error_upsampled_projects(...)
except Exception as e:
    sentry_sdk.capture_exception(e)
    should_upsample = False  # Fail safe - use regular count

Add error handling →

Major Issues

4. Type Safety Issue with ModuleType

Location: src/sentry/api/helpers/error_upsampling.py:16

def is_errors_query_for_error_upsampled_projects(
    snuba_params: SnubaParams,
    organization: Organization,
    dataset: ModuleType,  # ❌ Too generic
    request: Request,
) -> bool:

Problem: Using ModuleType for the dataset parameter is overly permissive and loses type safety. Any Python module can be passed, not just dataset modules.

Recommendation: Define a protocol or use a more specific type:

from typing import Protocol

class DatasetModule(Protocol):
    """Protocol for dataset modules"""
    pass

def is_errors_query_for_error_upsampled_projects(
    snuba_params: SnubaParams,
    organization: Organization,
    dataset: DatasetModule,  # ✅ More specific
    request: Request,
) -> bool:

5. Incomplete Test Coverage

Location: tests/sentry/api/helpers/test_error_upsampling.py

The unit tests don't cover:

Cache hit scenarios
Cache invalidation (invalidate_upsampling_cache is never tested)
Race conditions between eligibility and dataset checks
Edge cases like empty project lists with cached results

Recommendation: Add tests for:

def test_cache_hit_scenario(self):
    # Test that cached results are used
    
def test_invalidate_upsampling_cache(self):
    # Test cache invalidation works
    
def test_race_condition_scenario(self):
    # Test behavior when config changes between checks

6. Magic Number - Cache TTL

Location: src/sentry/api/helpers/error_upsampling.py:38

cache.set(cache_key, is_eligible, 60)  # ❌ Magic number

Recommendation: Define as a named constant:

ERROR_UPSAMPLING_CACHE_TTL = 60  # seconds - balance between performance and config freshness

cache.set(cache_key, is_eligible, ERROR_UPSAMPLING_CACHE_TTL)

Minor Issues

7. Redundant Variable Assignment

Location: src/sentry/api/endpoints/organization_events_stats.py:224-226

upsampling_enabled = should_upsample  # Redundant
final_columns = query_columns

Problem: Creates unnecessary intermediate variables that don't add clarity.

Recommendation: Use should_upsample directly throughout the function.

8. Code Duplication

Location: src/sentry/api/endpoints/organization_events_stats.py:232-233, 276-277, 295-296

The transformation is duplicated three times:

if upsampling_enabled:
    final_columns = transform_query_columns_for_error_upsampling(query_columns)

Recommendation: Extract to a helper function or apply transformation once at the beginning.

9. Comment Grammar

Location: src/sentry/api/helpers/error_upsampling.py:52-53

# NOTE: This function reads the allowlist configuration fresh each time,
# which means it can return different results between calls if the

The comment is helpful but could be more precise about why this behavior exists and the implications.

10. Missing Docstring for `upsampled_count`

Location: src/sentry/search/events/datasets/discover.py:1041-1052

SnQLFunction(
    "upsampled_count",
    required_args=[],
    # Optimized aggregation for error upsampling - assumes sample_weight
    # exists for all events in allowlisted projects as per schema design
    snql_aggregate=lambda args, alias: Function(
        "toInt64",
        [Function("sum", [Column("sample_weight")])],
        alias,
    ),
    default_result_type="number",
),

Recommendation: Add a docstring explaining when this function should be used and its assumptions.

Performance Concerns

11. Cache Key Complexity

The cache key computation involves:

Tuple creation
Sorting project IDs
Hashing

For queries with many projects, this could be expensive. Consider profiling for high project counts.

12. Redundant Dataset Checks

The _should_apply_sample_weight_transform function is called after caching, which means the dataset check happens on every request even for cache hits. Consider including the dataset type in the cache key.

Security Concerns

13. Query Injection Risk (Low)

Location: src/sentry/api/helpers/error_upsampling.py:135-136

query = request.GET.get("query", "").lower()
if "event.type:error" in query:

This is a simple string check, but it doesn't validate query syntax. An attacker could potentially craft a query that includes event.type:error as a comment or string literal to trigger upsampling inappropriately.

Recommendation: Use proper query parsing instead of string matching if available in the codebase.

Documentation Issues

14. Missing Architecture Documentation

The PR description is minimal ("Test 3nn---n*Replicated from..."). There should be documentation explaining:

How error upsampling works
When to use it
How to add projects to the allowlist
Performance characteristics
Known limitations (eventual consistency)

15. Unclear Function Naming

The function _is_error_focused_query (line 130) only checks for event.type:error but doesn't check for error-specific filters. The name suggests broader checking.

Test Issues

16. Mock Test Data Inconsistency

Location: tests/snuba/api/endpoints/test_organization_events_stats.py:3576-3580

"contexts": {"error_sampling": {"client_sample_rate": 0.1}},

The test data includes client_sample_rate: 0.1 suggesting events should be upsampled by 10x, but the assertions expect exactly 10:

assert data[0][1][0]["count"] == 10  # First bucket has 1 event

This works, but the comment "has 1 event" is misleading - it has 1 event that's being reported as 10 due to upsampling.

17. Missing Edge Case Tests

No tests for:

Queries with top_events > 0
Queries with comparison delta
Empty result sets
Very large project lists

Positive Observations ✅

Good separation of concerns - upsampling logic is in a dedicated module
Comprehensive test coverage for basic scenarios
Caching strategy is a good optimization approach
Code follows Sentry patterns from CLAUDE.md
Type hints are used throughout
Comments explain intent well in most places

Summary

This PR implements a valuable feature but has several critical issues that should be addressed before merging:

Must Fix:

Race condition in cache implementation (addresses PR branch name concern)
Cache key collision risk
Missing error handling

Should Fix:
4. Type safety with ModuleType
5. Test coverage gaps
6. Magic numbers

Nice to Have:
7. Code deduplication
8. Documentation improvements
9. Performance optimizations

The implementation is on the right track but needs refinement for production readiness, especially around the caching strategy and race condition handling.

--- | Branch: error-upsampling-race-condition

yuvmen and others added 2 commits July 25, 2025 09:48

This was referenced Jan 7, 2026

feat(uptime): Add ability to use queues to manage parallelism #9

Draft

Span Buffer Multiprocess Enhancement with Health Monitoring #6

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(upsampling) - Support upsampled error count with performance optimizations#3

feat(upsampling) - Support upsampled error count with performance optimizations#3
ShashankFC wants to merge 2 commits into
masterfrom
error-upsampling-race-condition

ShashankFC commented Dec 31, 2025

Uh oh!

ShashankFC commented Jan 7, 2026

Uh oh!

claude Bot commented Jan 7, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ShashankFC commented Dec 31, 2025

Uh oh!

ShashankFC commented Jan 7, 2026

Uh oh!

claude Bot commented Jan 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review

Todo List

Overview

Critical Issues

1. Race Condition in Cache Implementation ⚠️

2. Inconsistent Cache Key Generation

3. Missing Error Handling

Major Issues

4. Type Safety Issue with ModuleType

5. Incomplete Test Coverage

6. Magic Number - Cache TTL

Minor Issues

7. Redundant Variable Assignment

8. Code Duplication

9. Comment Grammar

10. Missing Docstring for upsampled_count

Performance Concerns

11. Cache Key Complexity

12. Redundant Dataset Checks

Security Concerns

13. Query Injection Risk (Low)

Documentation Issues

14. Missing Architecture Documentation

15. Unclear Function Naming

Test Issues

16. Mock Test Data Inconsistency

17. Missing Edge Case Tests

Positive Observations ✅

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

claude Bot commented Jan 7, 2026 •

edited

Loading

10. Missing Docstring for `upsampled_count`