feat(upsampling) - Support upsampled error count with performance optimizations#8
feat(upsampling) - Support upsampled error count with performance optimizations#8Frawatson wants to merge 3 commits into
Conversation
…(#94376) Part of the Error Upsampling project: https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link Events-stats API will now check if all projects in the query are allowlisted for upsampling, and convert the count query to a sum over `sample_weight` in Snuba, this is done by defining a new SnQL function `upsampled_count()`. I noticed there are also eps() and epm() functions in use in this endpoint. I considered (and even worked on) also supporting swapping eps() and epm() which for correctness should probably also not count naively and use `sample_weight`, but this caused some complications and since they are only in use by specific dashboard widgets and not available in discover I decided to defer changing them until we realize it is needed.
- Add 60-second cache for upsampling eligibility checks to improve performance - Separate upsampling eligibility check from query transformation for better optimization - Remove unnecessary null checks in upsampled_count() function per schema requirements - Add cache invalidation utilities for configuration management This improves performance during high-traffic periods by avoiding repeated expensive allowlist lookups while maintaining data consistency.
|
@grapple-pr review |
Applied by Grapple PR auto-fix https://www.grapple-pr.com Grapple-PR-Autofix: true
There was a problem hiding this comment.
🤖 Grapple PR Review
Summary
This PR adds 3 files (sentry-repo, error_upsampling.py, test_error_upsampling.py), modifies 5 files. Title: "feat(upsampling) - Support upsampled error count with performance optimizations"
Changes: 8 files (+480 / -6 lines)
- 🟢 3 files added:
sentry-repo,src/sentry/api/helpers/error_upsampling.py,tests/sentry/api/helpers/test_error_upsampling.py - 🟡 5 files modified:
pyproject.toml,src/sentry/api/endpoints/organization_events_stats.py,src/sentry/search/events/datasets/discover.py,src/sentry/testutils/factories.py,tests/snuba/api/endpoints/test_organization_events_stats.py
Resolves: #3 (ref(crons): Reorganize incident creation / issue occurrence logic)
CI: ⏳ 0 passed, 0 failed
Blast radius: 🔴 High (score: 58/100 · cross-module)
- 6 files affected outside this PR
- 19 functions in blast zone
Impact Diagram
graph TD
pr_0[/"🔀 PR Changes: 8 files"/]
f1["🟡 pyproject.toml +2/-0"]
pr_0 --> f1
f2["🟢 sentry-repo +1/-0"]
pr_0 --> f2
f3["🟡 organization_events_stats.py +33/-5"]
pr_0 --> f3
f4["🟢 error_upsampling.py +140/-0"]
pr_0 --> f4
f5["🟡 discover.py +12/-0"]
pr_0 --> f5
f6["🟡 factories.py +20/-1"]
pr_0 --> f6
f7["🟢 test_error_upsampling.py +101/-0"]
pr_0 --> f7
f8["🟡 test_organization_events_stats.py +171/-0"]
pr_0 --> f8
f1 -.- f2
b9["⚡ urls.py"]
b10["⚡ group_stream.py"]
b11["⚡ organization_events.py"]
b12["⚡ group.py"]
b13["⚡ organization_events_spans_performance.py"]
b14["⚡ organization_events.py"]
f1 -->|affects| b9
f2 -->|affects| b10
f3 -->|affects| b11
f4 -->|affects| b12
f5 -->|affects| b13
f6 -->|affects| b14
warn_15["⚠️ Cross-module impact"]
f1 -->|cross-module| warn_15
style f1 fill:#7c3aed,color:#fff,stroke:#7c3aed
style f2 fill:#7c3aed,color:#fff,stroke:#7c3aed
style f3 fill:#7c3aed,color:#fff,stroke:#7c3aed
style f4 fill:#7c3aed,color:#fff,stroke:#7c3aed
style f5 fill:#7c3aed,color:#fff,stroke:#7c3aed
style f6 fill:#7c3aed,color:#fff,stroke:#7c3aed
style f7 fill:#7c3aed,color:#fff,stroke:#7c3aed
style f8 fill:#7c3aed,color:#fff,stroke:#7c3aed
2 critical issues found — please address before merging.
| Severity | Count |
|---|---|
| 🔴 Critical | 2 |
| 🟠 Major | 8 |
| 🟡 Minor | 5 |
| 🔵 Info | 3 |
Risk score: 428 📈 241 → 428 (+187, Worse vs. previous review)
⚠️ 2 findings may have been introduced by a previously applied autofix. These are new issues at locations where fixes were just applied — review carefully before applying more fixes.
📌 1 finding on unchanged code (not shown inline)
These findings target code outside the PR diff — they can't be posted as inline comments but may still need attention.
| Severity | File | Agent | Issue |
|---|---|---|---|
| 🟠 major | src/sentry/api/endpoints/organization_events_stats.py:218 |
performance | transform_query_columns_for_error_upsampling is called up to 3 separate times within _get_event_stat… |
Review details
- Reviewed at: 2026-04-30T00:58:31.733Z
- Agents used: security, logic, style, architecture, performance, verification
- Context level: full
- Execution time: 112.2s
- Low-confidence findings suppressed: 4
- Blast radius: 58 (cross-module)
Powered by Grapple PR — AI that understands your code.
| ), | ||
| default_result_type="integer", | ||
| private=True, | ||
| ), |
There was a problem hiding this comment.
🔴 CRITICAL — Schema dependency / Cross-service impact (confidence: 95%)
The new upsampled_count SnQL function references a Column("sample_weight") which must exist in the Snuba/ClickHouse schema. There is no migration, schema change, or evidence in this PR that this column exists in the errors table. If the column doesn't exist, queries against non-allowlisted projects or any misconfiguration will cause hard query failures in Snuba, not graceful degradation.
Evidence:
- The SnQLFunction directly references Column("sample_weight") with no fallback or coalesce for NULL/missing values.
- Comments in error_upsampling.py line 88 state "We rely on the database schema to ensure sample_weight exists" but no schema migration is included in this PR.
- The discover dataset is shared across many query paths — registering a function here makes it callable from any discover query context, not just upsampling-enabled ones.
Agent: architecture
| ) | ||
| elif top_events <= 0: | ||
| return Response({"detail": "If topEvents needs to be at least 1"}, status=400) | ||
| return Response({"detail": "topEvents needs to be at least 1"}, status=400) |
There was a problem hiding this comment.
🔴 CRITICAL — API contract violation (confidence: 75%)
Breaking change to a public API error message. The error response string was changed from "If topEvents needs to be at least 1" to "topEvents needs to be at least 1". While fixing a typo, any downstream consumers (dashboards, CLI tools, SDK integrations) that match on this exact error string will break. This endpoint has ApiPublishStatus tracked, indicating it is a public API.
Evidence:
- The endpoint is decorated with @region_silo_endpoint and has ApiPublishStatus tracking, indicating it's a public-facing API.
- Error message strings in API responses are effectively part of the API contract.
- This change is unrelated to the upsampling feature and appears to be a drive-by fix bundled into the PR.
Agent: architecture
| expensive repeated option lookups during high-traffic periods. This is safe | ||
| because allowlist changes are infrequent and eventual consistency is acceptable. | ||
| """ | ||
| cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}" |
There was a problem hiding this comment.
🟠 MAJOR — Cache Correctness / Non-deterministic Cache Key (confidence: 100%)
The cache key uses Python's built-in hash() function on a tuple of project IDs. Python's hash() is randomized per process by default (PYTHONHASHSEED), so the same set of project IDs will produce different hash values in different worker processes or after restarts. This means cached allowlist results written by one worker cannot be read by another, defeating the caching optimization. Additionally, invalidate_upsampling_cache at line 62 reconstructs the same non-deterministic key, making cache invalidation across workers impossible.
Evidence:
- Line 27:
cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}" - Python docs: hash() values are randomized per interpreter session when PYTHONHASHSEED is not fixed (default in production).
- Line 62 in
invalidate_upsampling_cachereuses the samehash()construction, so invalidation from a different process will compute a different key and fail to invalidate the correct cache entry. - In a multi-worker gunicorn/uwsgi deployment (standard for Sentry), each worker has its own PYTHONHASHSEED, so cache keys are worker-specific.
Agent: security
| final_columns = query_columns | ||
|
|
||
| if top_events > 0: | ||
| # Apply upsampling transformation just before query execution |
There was a problem hiding this comment.
🟠 MAJOR — Module boundary violation (confidence: 100%)
The upsampling eligibility logic (dataset inspection, request parsing, option lookups, caching) is being invoked inside the inner closure _get_event_stats of an already high-complexity endpoint (complexity: 70). This couples query transformation decisions directly into the endpoint layer rather than the dataset/query-builder layer where column resolution belongs. The transformation is also applied redundantly in three separate branches (top_events, use_rpc, standard) with copy-pasted if-blocks.
Evidence:
- The same pattern
if upsampling_enabled: final_columns = transform_query_columns_for_error_upsampling(query_columns)appears at lines 228, 275, and 293 — three separate locations. - Column transformations are a query-building concern; the endpoint should not need to know about sample_weight semantics.
- The codebase intelligence flags this file at complexity 70, and this change adds branching logic that increases it further.
Agent: architecture
| cache.set(cache_key, is_eligible, 60) | ||
|
|
||
| return is_eligible and _should_apply_sample_weight_transform(dataset, request) | ||
|
|
There was a problem hiding this comment.
🟠 MAJOR — Cache correctness (confidence: 100%)
The cache key uses hash(tuple(sorted(project_ids))) which produces a Python hash value. Python's hash() for tuples is not stable across processes (due to PYTHONHASHSEED randomization since Python 3.3+). This means the same set of project IDs will produce different cache keys in different processes/workers, making the cache effectively useless and causing redundant option lookups rather than cache hits.
Evidence:
- Python 3 randomizes hash seeds by default (PEP 456), so hash((1,2,3)) differs between interpreter invocations.
- Sentry runs multiple worker processes, each with different hash seeds.
- The invalidate_upsampling_cache function at line 72 would also compute a different hash, making invalidation impossible across processes.
- cache.set stores only is_eligible (boolean for allowlist check), not the final should_upsample value
- Every cache hit for an eligible org still calls _should_apply_sample_weight_transform, which inspects request.GET and dataset
Agent: architecture
|
|
||
| # Check cache first for performance optimization | ||
| cached_result = cache.get(cache_key) | ||
| if cached_result is not None: |
There was a problem hiding this comment.
🟠 MAJOR — Cache key collision / incorrect caching (confidence: 100%)
The cache key uses Python's built-in hash() of a tuple of project IDs. hash() is not stable across Python processes or restarts (hash randomization is on by default in Python 3.3+), so two different processes or restarts can produce different hash values for the same project set, making the cache ineffective or causing cross-process key collisions if a shared cache (Redis/Memcached) is used.
Evidence:
- cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"
- Python's hash() is randomized per process by default (PYTHONHASHSEED). In a multi-worker deployment, every worker computes a different hash for the same project_ids set.
- If the cache backend is Redis (shared), different workers will write to different keys and never benefit from each other's cached results, effectively bypassing the caching optimization.
Agent: performance
| self.user2 = self.create_user() | ||
|
|
||
| # Store some error events with error_sampling context | ||
| self.store_event( |
There was a problem hiding this comment.
🟠 MAJOR — Variable shadowing / test correctness (confidence: 99%)
The test's setUp method shadows self.user — line 3569 assigns self.user = self.create_user() AFTER self.login_as(user=self.user) on line 3558. This means the login uses the original self.user from the parent class, but later references to self.user (e.g., self.user.email on line 3582) refer to the newly created, non-logged-in user. While this may not cause test failures if the email is just used as a tag value, it creates a confusing state where the authenticated user and self.user are different objects. More critically, self.authed_user = self.user on line 3559 captures the ORIGINAL user, then self.user is immediately overwritten on line 3569, so self.authed_user and self.user diverge silently.
Evidence:
- Line 3558:
self.login_as(user=self.user)— uses parent's self.user - Line 3559:
self.authed_user = self.user— captures parent's self.user - Line 3569:
self.user = self.create_user()— overwrites self.user with a NEW user object - Line 3570:
self.user2 = self.create_user()— creates another user - Line 3582:
"tags": {"sentry:user": self.user.email}— uses the NEW user, not the logged-in one
Agent: logic
| @@ -0,0 +1 @@ | |||
| Subproject commit a5d290951def84afdcc4c88d2f1f20023fc36e2a | |||
There was a problem hiding this comment.
🟠 MAJOR — Untracked submodule addition (confidence: 96%)
A new git submodule 'sentry-repo' is being added to the repository root pointing to commit a5d2909. This is a significant architectural change — adding a submodule affects all developers' clone/checkout workflows, CI pipelines, and deployment processes. There is no explanation for why this submodule is needed, and it appears unrelated to the error upsampling feature.
Evidence:
- The file shows 'Subproject commit a5d2909' which is git submodule syntax.
- No .gitmodules change is visible in the diff, suggesting this may be an accidental commit.
- This is at the repository root level, affecting all contributors.
Agent: architecture
| assert _should_apply_sample_weight_transform(discover, self.request) is True | ||
|
|
||
| self.request.GET = QueryDict("query=event.type:transaction") | ||
| assert _should_apply_sample_weight_transform(discover, self.request) is False |
There was a problem hiding this comment.
🟠 MAJOR — Test quality / Private API testing (confidence: 91%)
The tests directly import and test private functions (prefixed with underscore): _are_all_projects_error_upsampled, _is_error_focused_query, and _should_apply_sample_weight_transform. This creates tight coupling between tests and implementation details, making refactoring difficult. More importantly, there is no test for the main public function is_errors_query_for_error_upsampled_projects, which contains the caching logic — the most complex and error-prone part of the module.
Evidence:
- Lines 8-11 import _are_all_projects_error_upsampled, _is_error_focused_query, _should_apply_sample_weight_transform — all private functions.
- The public entry point is_errors_query_for_error_upsampled_projects is imported but never tested directly.
- The caching behavior (cache hit/miss, TTL, invalidation) has zero test coverage.
Agent: architecture
| status=400, | ||
| ) | ||
| elif top_events <= 0: | ||
| return Response({"detail": "If topEvents needs to be at least 1"}, status=400) | ||
| return Response({"detail": "topEvents needs to be at least 1"}, status=400) | ||
|
|
||
| comparison_delta = None | ||
| if "comparisonDelta" in request.GET: |
There was a problem hiding this comment.
✅ [Grapple PR] Auto-fix — architecture agent (Small fix (1 lines, 1 file))
Breaking change to a public API error message. The error response string was changed from "If topEvents needs to be at least 1" to "topEvents needs to be at least 1". While fixing a typo, any downstream consumers (dashboards, CLI tools, SDK integrations) that match on this exact error string will break. This endpoint has ApiPublishStatus tracked, indicating it is a public API.
| status=400, | |
| ) | |
| elif top_events <= 0: | |
| return Response({"detail": "If topEvents needs to be at least 1"}, status=400) | |
| return Response({"detail": "topEvents needs to be at least 1"}, status=400) | |
| comparison_delta = None | |
| if "comparisonDelta" in request.GET: | |
| status=400, | |
| ) | |
| elif top_events <= 0: | |
| return Response({"detail": "If topEvents needs to be at least 1"}, status=400) | |
| comparison_delta = None | |
| if "comparisonDelta" in request.GET: | |
🤖 Grapple PR auto-fix • critical • confidence: 75%
| expensive repeated option lookups during high-traffic periods. This is safe | ||
| because allowlist changes are infrequent and eventual consistency is acceptable. | ||
| """ | ||
| cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}" | ||
|
|
||
| # Check cache first for performance optimization | ||
| cached_result = cache.get(cache_key) |
There was a problem hiding this comment.
✅ [Grapple PR] Auto-fix — security agent (Small fix (2 lines, 1 file))
The cache key uses Python's built-in
hash()function on a tuple of project IDs. Python'shash()is randomized per process by default (PYTHONHASHSEED), so the same set of project IDs will produce different hash values in different worker processes or after restarts. This means cached allowlist results written by one worker cannot be read by another, defeating the caching optimization. Additionally,invalidate_upsampling_cacheat line 62 reconstructs the same non-deterministic key, making cache invalidation across workers impossible.
| expensive repeated option lookups during high-traffic periods. This is safe | |
| because allowlist changes are infrequent and eventual consistency is acceptable. | |
| """ | |
| cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}" | |
| # Check cache first for performance optimization | |
| cached_result = cache.get(cache_key) | |
| expensive repeated option lookups during high-traffic periods. This is safe | |
| because allowlist changes are infrequent and eventual consistency is acceptable. | |
| """ | |
| project_ids_str = ",".join(str(pid) for pid in sorted(snuba_params.project_ids)) | |
| cache_key = f"error_upsampling_eligible:{organization.id}:{project_ids_str}" | |
| # Check cache first for performance optimization | |
| cached_result = cache.get(cache_key) | |
🤖 Grapple PR auto-fix • major • confidence: 100%
| if not project_ids: | ||
| return False | ||
|
|
||
| allowlist = options.get("issues.client_error_sampling.project_allowlist", []) | ||
| if not allowlist: | ||
| return False | ||
|
|
||
| # All projects must be in the allowlist | ||
| result = all(project_id in allowlist for project_id in project_ids) | ||
| return result | ||
|
|
||
|
|
||
| def invalidate_upsampling_cache(organization_id: int, project_ids: Sequence[int]) -> None: |
There was a problem hiding this comment.
✅ [Grapple PR] Auto-fix — architecture agent (Small fix (7 lines, 1 file))
The allowlist check uses
all(project_id in allowlist for project_id in project_ids)where allowlist is a list from options.get(). For large allowlists, this is O(n*m) where n is project_ids count and m is allowlist size. The allowlist should be converted to a set for O(n) lookups.
| if not project_ids: | |
| return False | |
| allowlist = options.get("issues.client_error_sampling.project_allowlist", []) | |
| if not allowlist: | |
| return False | |
| # All projects must be in the allowlist | |
| result = all(project_id in allowlist for project_id in project_ids) | |
| return result | |
| def invalidate_upsampling_cache(organization_id: int, project_ids: Sequence[int]) -> None: | |
| if not project_ids: | |
| return False | |
| allowlist = set(options.get("issues.client_error_sampling.project_allowlist", [])) | |
| if not allowlist: | |
| return False | |
| # All projects must be in the allowlist | |
| result = all(project_id in allowlist for project_id in project_ids) | |
| return result | |
| def invalidate_upsampling_cache(organization_id: int, project_ids: Sequence[int]) -> None: | |
🤖 Grapple PR auto-fix • minor • confidence: 99%
| column_lower = column.lower().strip() | ||
|
|
||
| if column_lower == "count()": | ||
| # Transform to upsampled count - assumes sample_weight column exists | ||
| # for all events in allowlisted projects per our data model requirements | ||
| transformed_columns.append("upsampled_count() as count") | ||
|
|
||
| else: | ||
| transformed_columns.append(column) |
There was a problem hiding this comment.
✅ [Grapple PR] Auto-fix — logic agent (Small fix (3 lines, 1 file))
The transformation replaces
count()withupsampled_count() as count. Theas countalias is a string-level alias in the column expression. Whether this alias is correctly parsed and applied depends on the downstream query builder's parsing logic. If the query builder treats the entire string as a function call (looking for the patternfunction_name(args)), theas countsuffix may cause a parse error or be silently dropped, leading to results returned under the keyupsampled_countinstead ofcount, which would break the API response schema that callers expect.
| column_lower = column.lower().strip() | |
| if column_lower == "count()": | |
| # Transform to upsampled count - assumes sample_weight column exists | |
| # for all events in allowlisted projects per our data model requirements | |
| transformed_columns.append("upsampled_count() as count") | |
| else: | |
| transformed_columns.append(column) | |
| column_lower = column.lower().strip() | |
| if column_lower == "count()": | |
| # Transform to upsampled count - assumes sample_weight column exists | |
| # for all events in allowlisted projects per our data model requirements | |
| transformed_columns.append("upsampled_count() as count()") | |
| else: | |
| transformed_columns.append(column) | |
🤖 Grapple PR auto-fix • minor • confidence: 94%
| from collections.abc import Sequence | ||
| from types import ModuleType | ||
| from typing import Any | ||
|
|
||
| from rest_framework.request import Request |
There was a problem hiding this comment.
✅ [Grapple PR] Auto-fix — performance agent (Small fix (4 lines, 1 file))
_should_apply_sample_weight_transform performs inline imports of sentry.snuba.discover, sentry.snuba.errors, and sentry.snuba.transactions on every call. This function is invoked on every stats request for allowlisted organizations, making these repeated sys.modules lookups (even if cached by Python) unnecessary overhead.
| from collections.abc import Sequence | |
| from types import ModuleType | |
| from typing import Any | |
| from rest_framework.request import Request | |
| from collections.abc import Sequence | |
| from types import ModuleType | |
| from sentry.snuba import discover, errors, transactions | |
| from typing import Any | |
| from rest_framework.request import Request | |
🤖 Grapple PR auto-fix • minor • confidence: 80%
| return orjson.dumps(manifest).decode() | ||
|
|
||
|
|
||
| def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None: | ||
| """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid.""" | ||
| client_sample_rate = None | ||
| try: | ||
| client_sample_rate = ( | ||
| normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate") | ||
| ) | ||
| except Exception: | ||
| pass | ||
| if client_sample_rate: | ||
| try: | ||
| normalized_data["sample_rate"] = float(client_sample_rate) | ||
| except Exception: | ||
| pass | ||
|
|
||
|
|
||
| # TODO(dcramer): consider moving to something more scalable like factoryboy |
There was a problem hiding this comment.
✅ [Grapple PR] Auto-fix — architecture agent (Small fix (14 lines, 1 file))
The _set_sample_rate_from_error_sampling function is added to the shared test factories module but implements feature-specific logic (error sampling context extraction). This adds error-upsampling-specific behavior to a foundational test utility that is used across the entire test suite, potentially affecting unrelated tests that store events with error_sampling contexts.
| return orjson.dumps(manifest).decode() | |
| def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None: | |
| """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid.""" | |
| client_sample_rate = None | |
| try: | |
| client_sample_rate = ( | |
| normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate") | |
| ) | |
| except Exception: | |
| pass | |
| if client_sample_rate: | |
| try: | |
| normalized_data["sample_rate"] = float(client_sample_rate) | |
| except Exception: | |
| pass | |
| # TODO(dcramer): consider moving to something more scalable like factoryboy | |
| return orjson.dumps(manifest).decode() | |
| def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None: | |
| """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid.""" | |
| try: | |
| client_sample_rate = ( | |
| normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate") | |
| ) | |
| except (AttributeError, TypeError): | |
| return | |
| if client_sample_rate is not None: | |
| try: | |
| normalized_data["sample_rate"] = float(client_sample_rate) | |
| except (TypeError, ValueError): | |
| pass | |
| # TODO(dcramer): consider moving to something more scalable like factoryboy | |
🤖 Grapple PR auto-fix • minor • confidence: 95%
| self.user2 = self.create_user() | ||
|
|
||
| # Store some error events with error_sampling context | ||
| self.store_event( |
There was a problem hiding this comment.
🟡 [Grapple PR] Suggested fix — logic agent (Small fix (20 lines, 1 file))
The test's setUp method shadows
self.user— line 3569 assignsself.user = self.create_user()AFTERself.login_as(user=self.user)on line 3558. This means the login uses the originalself.userfrom the parent class, but later references toself.user(e.g.,self.user.emailon line 3582) refer to the newly created, non-logged-in user. While this may not cause test failures if the email is just used as a tag value, it creates a confusing state where the authenticated user andself.userare different objects. More critically,self.authed_user = self.useron line 3559 captures the ORIGINAL user, thenself.useris immediately overwritten on line 3569, soself.authed_userandself.userdiverge silently.
@@ -3559,20 +3559,20 @@
- self.login_as(user=self.user)
- self.authed_user = self.user
-
- self.day_ago = before_now(days=1).replace(hour=10, minute=0, second=0, microsecond=0)
-
- self.project = self.create_project()
- self.project2 = self.create_project()
- self.user = self.create_user()
- self.user2 = self.create_user()
-
- # Store some error events with error_sampling context
- self.store_event(
- data={
- "event_id": "a" * 32,
- "message": "very bad",
- "type": "error",
- "exception": [{"type": "ValueError", "value": "Something went wrong 1"}],
- "timestamp": (self.day_ago + timedelta(minutes=1)).isoformat(),
- "fingerprint": ["group1"],
- "tags": {"sentry:user": self.user.email},
+ self.login_as(user=self.user)
+ self.authed_user = self.user
+
+ self.day_ago = before_now(days=1).replace(hour=10, minute=0, second=0, microsecond=0)
+
+ self.project = self.create_project()
+ self.project2 = self.create_project()
+ self.user1 = self.create_user()
+ self.user2 = self.create_user()
+
+ # Store some error events with error_sampling context
+ self.store_event(
+ data={
+ "event_id": "a" * 32,
+ "message": "very bad",
+ "type": "error",
+ "exception": [{"type": "ValueError", "value": "Something went wrong 1"}],
+ "timestamp": (self.day_ago + timedelta(minutes=1)).isoformat(),
+ "fingerprint": ["group1"],
+ "tags": {"sentry:user": self.user1.email},
🤖 Grapple PR auto-fix • major • Review this diff before applying
| assert _should_apply_sample_weight_transform(discover, self.request) is True | ||
|
|
||
| self.request.GET = QueryDict("query=event.type:transaction") | ||
| assert _should_apply_sample_weight_transform(discover, self.request) is False |
There was a problem hiding this comment.
🟡 [Grapple PR] Suggested fix — architecture agent (Fix exceeds for major budget (153 lines, 1 file; budget 50 lines / 2 files) — review recommended)
The tests directly import and test private functions (prefixed with underscore): _are_all_projects_error_upsampled, _is_error_focused_query, and _should_apply_sample_weight_transform. This creates tight coupling between tests and implementation details, making refactoring difficult. More importantly, there is no test for the main public function is_errors_query_for_error_upsampled_projects, which contains the caching logic — the most complex and error-prone part of the module.
diff --git a/tests/sentry/api/helpers/test_error_upsampling.py b/tests/sentry/api/helpers/test_error_upsampling.py
--- a/tests/sentry/api/helpers/test_error_upsampling.py
+++ b/tests/sentry/api/helpers/test_error_upsampling.py
@@ -1,101 +1,153 @@
-from unittest.mock import Mock, patch
-
-from django.http import QueryDict
-from django.test import RequestFactory
-from rest_framework.request import Request
-
-from sentry.api.helpers.error_upsampling import (
- _are_all_projects_error_upsampled,
- _is_error_focused_query,
- _should_apply_sample_weight_transform,
- transform_query_columns_for_error_upsampling,
-)
-from sentry.models.organization import Organization
-from sentry.search.events.types import SnubaParams
-from sentry.snuba import discover, errors, transactions
-from sentry.testutils.cases import TestCase
-
-
-class ErrorUpsamplingTest(TestCase):
- def setUp(self) -> None:
- self.organization = Organization.objects.create(name="test-org")
- self.projects = [
- self.create_project(organization=self.organization, name="Project 1"),
- self.create_project(organization=self.organization, name="Project 2"),
- self.create_project(organization=self.organization, name="Project 3"),
- ]
- self.project_ids = [p.id for p in self.projects]
- self.snuba_params = SnubaParams(
- start=None,
- end=None,
- projects=self.projects,
- )
- factory = RequestFactory()
- self.request = Request(factory.get("/"))
- self.request.GET = QueryDict("")
-
- @patch("sentry.api.helpers.error_upsampling.options")
- def test_are_all_projects_error_upsampled(self, mock_options: Mock) -> None:
- # Test when all projects are allowlisted
- mock_options.get.return_value = self.project_ids
- assert _are_all_projects_error_upsampled(self.project_ids, self.organization) is True
-
- # Test when some projects are not allowlisted
- mock_options.get.return_value = self.project_ids[:-1]
- assert _are_all_projects_error_upsampled(self.project_ids, self.organization) is False
-
- # Test when no projects are allowlisted
- mock_options.get.return_value = []
- assert _are_all_projects_error_upsampled(self.project_ids, self.organization) is False
-
- # Test when no project IDs provided
- assert _are_all_projects_error_upsampled([], self.organization) is False
-
- def test_transform_query_columns_for_error_upsampling(self) -> None:
- # Test count() transformation
- columns = ["count()", "other_column"]
- expected = [
- "upsampled_count() as count",
- "other_column",
- ]
- assert transform_query_columns_for_error_upsampling(columns) == expected
-
- # Test case insensitivity
- columns = ["COUNT()"]
- expected = [
- "upsampled_count() as count",
- ]
- assert transform_query_columns_for_error_upsampling(columns) == expected
-
- # Test whitespace handling
- columns = [" count() "]
- expected = [
- "upsampled_count() as count",
- ]
- assert transform_query_columns_for_error_upsampling(columns) == expected
-
- def test_is_error_focused_query(self) -> None:
- # Test explicit error type
- self.request.GET = QueryDict("query=event.type:error")
- assert _is_error_focused_query(self.request) is True
-
- # Test explicit transaction type
- self.request.GET = QueryDict("query=event.type:transaction")
- assert _is_error_focused_query(self.request) is False
-
- # Test empty query
- self.request.GET = QueryDict("")
- assert _is_error_focused_query(self.request) is False
-
- def test_should_apply_sample_weight_transform(self) -> None:
- # Test errors dataset
- assert _should_apply_sample_weight_transform(errors, self.request) is True
-
- # Test transactions dataset
- assert _should_apply_sample_weight_transform(transactions, self.request) is False
-
- self.request.GET = QueryDict("query=event.type:error")
- assert _should_apply_sample_weight_transform(discover, self.request) is True
-
- self.request.GET = QueryDict("query=event.type:transaction")
- assert _should_apply_sample_weight_transform(discover, self.request) is False
+from unittest.mock import Mock, call, patch
+
+from django.http import QueryDict
+from django.test import RequestFactory
+from rest_framework.request import Request
+
+from sentry.api.helpers.error_upsampling import (
+ is_errors_query_for_error_upsampled_projects,
+ transform_query_columns_for_error_upsampling,
+)
+from sentry.models.organization import Organization
+from sentry.search.events.types import SnubaParams
+from sentry.snuba import discover, errors, transactions
+from sentry.testutils.cases import TestCase
+
+
+class ErrorUpsamplingTest(TestCase):
+ def setUp(self) -> None:
+ self.organization = Organization.objects.create(name="test-org")
+ self.projects = [
+ self.create_project(organization=self.organization, name="Project 1"),
+ self.create_project(organization=self.organization, name="Project 2"),
+ self.create_project(organization=self.organization, name="Project 3"),
+ ]
+ self.project_ids = [p.id for p in self.projects]
+ self.snuba_params = SnubaParams(
+ start=None,
+ end=None,
+ projects=self.projects,
+ )
+ factory = RequestFactory()
+ self.request = Request(factory.get("/"))
+ self.request.GET = QueryDict("")
+
+ def test_transform_query_columns_for_error_upsampling(self) -> None:
+ # Test count() transformation
+ columns = ["count()", "other_column"]
+ expected = [
+ "upsampled_count() as count",
+ "other_column",
+ ]
+ assert transform_query_columns_for_error_upsampling(columns) == expected
+
+ # Test case insensitivity
+ columns = ["COUNT()"]
+ expected = [
+ "upsampled_count() as count",
+ ]
+ assert transform_query_columns_for_error_upsampling(columns) == expected
+
+ # Test whitespace handling
+ columns = [" count() "]
+ expected = [
+ "upsampled_count() as count",
+ ]
+ assert transform_query_columns_for_error_upsampling(columns) == expected
+
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_errors_dataset_all_allowlisted(
+ self, mock_options: Mock
+ ) -> None:
+ # errors dataset with all projects allowlisted should return True
+ mock_options.get.return_value = self.project_ids
+ result = is_errors_query_for_error_upsampled_projects(
+ errors, self.request, self.project_ids, self.organization
+ )
+ assert result is True
+
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_errors_dataset_partial_allowlist(
+ self, mock_options: Mock
+ ) -> None:
+ # errors dataset but only some projects are allowlisted — should return False
+ mock_options.get.return_value = self.project_ids[:-1]
+ result = is_errors_query_for_error_upsampled_projects(
+ errors, self.request, self.project_ids, self.organization
+ )
+ assert result is False
+
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_transactions_dataset(
+ self, mock_options: Mock
+ ) -> None:
+ # transactions dataset should always return False regardless of allowlist
+ mock_options.get.return_value = self.project_ids
+ result = is_errors_query_for_error_upsampled_projects(
+ transactions, self.request, self.project_ids, self.organization
+ )
+ assert result is False
+
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_discover_error_query(
+ self, mock_options: Mock
+ ) -> None:
+ # discover dataset with event.type:error and all projects allowlisted
+ mock_options.get.return_value = self.project_ids
+ self.request.GET = QueryDict("query=event.type:error")
+ result = is_errors_query_for_error_upsampled_projects(
+ discover, self.request, self.project_ids, self.organization
+ )
+ assert result is True
+
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_discover_transaction_query(
+ self, mock_options: Mock
+ ) -> None:
+ # discover dataset with event.type:transaction should return False
+ mock_options.get.return_value = self.project_ids
+ self.request.GET = QueryDict("query=event.type:transaction")
+ result = is_errors_query_for_error_upsampled_projects(
+ discover, self.request, self.project_ids, self.organization
+ )
+ assert result is False
+
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_no_projects(
+ self, mock_options: Mock
+ ) -> None:
+ # empty project list should always return False
+ mock_options.get.return_value = self.project_ids
+ result = is_errors_query_for_error_upsampled_projects(
+ errors, self.request, [], self.organization
+ )
+ assert result is False
+
+ @patch("sentry.api.helpers.error_upsampling.cache")
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_cache_miss(
+ self, mock_options: Mock, mock_cache: Mock
+ ) -> None:
+ # On a cache miss, the result should be computed and stored in cache
+ mock_cache.get.return_value = None
+ mock_options.get.return_value = self.project_ids
+ result = is_errors_query_for_error_upsampled_projects(
+ errors, self.request, self.project_ids, self.organization
+ )
+ assert result is True
+ assert mock_cache.get.called
+ assert mock_cache.set.called
+
+ @patch("sentry.api.helpers.error_upsampling.cache")
+ @patch("sentry.api.helpers.error_upsampling.options")
+ def test_is_errors_query_for_error_upsampled_projects_cache_hit(
+ self, mock_options: Mock, mock_cache: Mock
+ ) -> None:
+ # On a cache hit, the cached value should be returned without calling options
+ mock_cache.get.return_value = True
+ result = is_errors_query_for_error_upsampled_projects(
+ errors, self.request, self.project_ids, self.organization
+ )
+ assert result is True
+ mock_options.get.assert_not_called()
+ mock_cache.set.assert_not_called()
🤖 Grapple PR auto-fix • major • Review this diff before applying
|
|
||
| # Check cache first for performance optimization | ||
| cached_result = cache.get(cache_key) | ||
| if cached_result is not None: | ||
| return cached_result and _should_apply_sample_weight_transform(dataset, request) | ||
|
|
||
| # Cache miss - perform fresh allowlist check | ||
| is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization) |
There was a problem hiding this comment.
✅ [Grapple PR] Auto-fix — performance agent (Small fix (4 lines, 1 file))
When the cache returns a truthy result (cached_result is not None and cached_result is True), _should_apply_sample_weight_transform is still called on every request. This call imports three dataset modules and evaluates request.GET on every cache hit, partially defeating the caching optimization.
| # Check cache first for performance optimization | |
| cached_result = cache.get(cache_key) | |
| if cached_result is not None: | |
| return cached_result and _should_apply_sample_weight_transform(dataset, request) | |
| # Cache miss - perform fresh allowlist check | |
| is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization) | |
| # Check cache first for performance optimization | |
| cached_result = cache.get(cache_key) | |
| if cached_result is not None: | |
| if not cached_result: | |
| return False | |
| return _should_apply_sample_weight_transform(dataset, request) | |
| # Cache miss - perform fresh allowlist check | |
| is_eligible = _are_all_projects_error_upsampled(snuba_params.project_ids, organization) | |
🤖 Grapple PR auto-fix • minor • confidence: 95%
|
✅
diff --git a/src/sentry/search/events/datasets/discover.py b/src/sentry/search/events/datasets/discover.py
--- a/src/sentry/search/events/datasets/discover.py
+++ b/src/sentry/search/events/datasets/discover.py
@@ -1038,19 +1038,26 @@
default_result_type="integer",
private=True,
),
- SnQLFunction(
- "upsampled_count",
- required_args=[],
- # Uses countIf/count as a safe fallback until sample_weight column
- # is confirmed present in the Snuba schema via a migration.
- snql_aggregate=lambda args, alias: Function(
- "count",
- [],
- alias,
- ),
- default_result_type="integer",
- private=True,
- ),
+ SnQLFunction(
+ "upsampled_count",
+ required_args=[],
+ # Uses toInt64(sum(sample_weight)) when sample_weight exists (allowlisted
+ # projects), with coalesce to fall back to plain count() so that queries
+ # against projects without the column never hard-fail in Snuba.
+ snql_aggregate=lambda args, alias: Function(
+ "coalesce",
+ [
+ Function(
+ "toInt64",
+ [Function("sum", [Column("sample_weight")])],
+ ),
+ Function("count", []),
+ ],
+ alias,
+ ),
+ default_result_type="integer",
+ private=True,
+ ),
]
}
🤖 Grapple PR auto-fix • critical • confidence: 95% |
|
✅
diff --git a/src/sentry/api/endpoints/organization_events_stats.py b/src/sentry/api/endpoints/organization_events_stats.py
--- a/src/sentry/api/endpoints/organization_events_stats.py
+++ b/src/sentry/api/endpoints/organization_events_stats.py
@@ -214,23 +214,17 @@
rollup: int,
zerofill_results: bool,
comparison_delta: timedelta | None,
- ) -> SnubaTSResult | dict[str, SnubaTSResult]:
- # Early upsampling eligibility check for performance optimization
- # This cached result ensures consistent behavior across query execution
- should_upsample = is_errors_query_for_error_upsampled_projects(
- snuba_params, organization, dataset, request
- )
-
- # Store the upsampling decision to apply later during query building
- # This separation allows for better query optimization and caching
- upsampling_enabled = should_upsample
- final_columns = query_columns
-
- if top_events > 0:
- # Apply upsampling transformation just before query execution
- # This late transformation ensures we use the most current schema assumptions
- if upsampling_enabled:
- final_columns = transform_query_columns_for_error_upsampling(query_columns)
+ ) -> SnubaTSResult | dict[str, SnubaTSResult]:
+ upsampling_enabled = is_errors_query_for_error_upsampled_projects(
+ snuba_params, organization, dataset, request
+ )
+ final_columns = (
+ transform_query_columns_for_error_upsampling(query_columns)
+ if upsampling_enabled
+ else query_columns
+ )
+
+ if top_events > 0:
if use_rpc:
return scoped_dataset.run_top_events_timeseries_query(
🤖 Grapple PR auto-fix • major • confidence: 100% |
Benchmark test PR #3 - Replicating Greptile benchmark for Grapple PR evaluation.