Skip to content

Replays Self-Serve Bulk Delete System#6

Open
Frawatson wants to merge 32 commits into
replays-delete-vulnerablefrom
replays-delete-stable
Open

Replays Self-Serve Bulk Delete System#6
Frawatson wants to merge 32 commits into
replays-delete-vulnerablefrom
replays-delete-stable

Conversation

@Frawatson

Copy link
Copy Markdown
Owner

Benchmark test PR #5 - Replicating Greptile benchmark for Grapple PR evaluation.

armenzg and others added 30 commits June 20, 2025 12:49
…o 'low' (#93927)"

This reverts commit 8d04522.

Co-authored-by: roaga <47861399+roaga@users.noreply.github.com>
Missed in the initial commit, leading to some relevant logs being
unannotated.
We have had a few tasks get killed at 10% rollout.
Also add a test, so that this doesn't happen again
Fixes DE-129 and DE-156

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
These transitions should be matching
…` (#93946)

Use `project_id` on the replay record instead of the URL (where it does
not always exist).

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Also fixed `replay.view_html` -> `replay.view-html`

---------

Co-authored-by: Michelle Zhang <56095982+michellewzhang@users.noreply.github.com>
…948)

gets `npx @typescript/native-preview` passing again
The conditions associated with a DCG can change over time, and it's good
if we can be completely confident that they're consistent within a given
task execution.
This is unused and most regex experiments have required broader changes
to ensure that regexes are evaluated in a specific order (ex:
traceparent). Removing this for now to simplify the code and very
slightly improve runtime performance.
From some testing (on feedback lists of all different lengths), this
prompt seems to work better. It doesn't write overly long sentences and
also does a better job at "summarizing" versus just mentioning a few
specific topics and leaving out others.
Just remove a couple custom Flex* classes in favor of the Flex primitive
This has been killed a few times.

Refs SENTRY-42M7
…n table (#93892)

<!-- Describe your PR here. -->

[ticket](https://linear.app/getsentry/issue/ID-156/grouping-info-remove-type-field-from-ui)
The Type field in the Grouping Info section of the issue details page
was redundant.
This removes the Type row from all variant types while keeping the
underlying data structure intact.

before
![Screenshot 2025-06-20 at 12 00
54 PM](https://github.com/user-attachments/assets/97ca72da-0a52-4446-9825-cd4fcb505adf)

after
![Screenshot 2025-06-20 at 11 59
29 AM](https://github.com/user-attachments/assets/a4284d2b-c9f5-442f-b010-7fe72a598e39)
### Changes
Related to this PR: getsentry/sentry#93810. This
is part 1 of the change, which is pulling out the new component and just
adding it to the repo. Also includes some simplification of the logic in
the base component.

Part 2 will be replacing tables in widgets.

### Before/After

There is no UI change as the table is not being used yet. There is a new
story page for the component.
…93943)

to prevent this issue from becoming too noisy, add a noise config
Unfortunately, 'event_data' went from being the variable for current
event context to being the complete parsed data from Redis, and we
continued logging it per group.
That's more data than we should be logging even arguably once, let alone
per group.
Co-authored-by: Abdullah Khan <abdullahkhan@PG9Y57YDXQ.local>
Adds some simple analytics to our endpoint so we can begin building a
dashboard in Amplitude.
Previously, explore supported multiple y axis per chart, so each
visualize supported multiple y axis. That functionality has since been
removed for simplicity so update the types here to match. Keep in mind
that saved queries still store them as an array so when
serializing/deserializing, we still need to treat it as an array.
We'll need the `useGetTraceItemAttributeKeys` hook in other places so
refactoring it so that it can exported.
return ""

@staticmethod
def get_merged_pr_single_issue_template(title: str, url: str, environment: str) -> str:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — performance agent (Larger fix (42 lines, 1 file) — review recommended)

get_environment_info calls issue.get_recommended_event() which likely triggers a database or cache lookup on every PR comment generation, once per issue in the issue list. For PRs with many associated issues, this results in N sequential calls to get_recommended_event().

--- a/src/sentry/integrations/source_code_management/commit_context.py
+++ b/src/sentry/integrations/source_code_management/commit_context.py
@@ -584,15 +584,37 @@ class PRCommentWorkflow(ABC):
 
-    def get_environment_info(self, issue: Group) -> str:
+    @staticmethod
+    def _get_environment_name_from_event(event: object | None) -> str:
+        """Extract environment name string from a pre-fetched recommended event.
+
+        Separated from get_environment_info so callers can batch-fetch events
+        outside of a loop and pass them in, avoiding N sequential fetches.
+        """
         try:
-            recommended_event = issue.get_recommended_event()
-            if recommended_event:
-                environment = recommended_event.get_environment()
+            if event:
+                environment = event.get_environment()
                 if environment and environment.name:
                     return f" in `{environment.name}`"
         except Exception as e:
-            # If anything goes wrong, just continue without environment info
+            # If anything goes wrong, continue without environment info.
+            # NOTE: this silently degrades; callers should log at the fetch site.
             logger.info(
                 "get_environment_info.no-environment",
-                extra={"issue_id": issue.id, "error": e},
+                extra={"error": e},
             )
         return ""
+
+    def get_environment_info(self, issue: Group, recommended_event: object | None = _UNSET) -> str:
+        """Return environment string for an issue.
+
+        Prefer passing a pre-fetched `recommended_event` to avoid a per-issue
+        DB/cache lookup when this is called inside a loop:
+
+            events = {issue.id: issue.get_recommended_event() for issue in issues}
+            env_infos = [self.get_environment_info(issue, events[issue.id]) for issue in issues]
+
+        When called without `recommended_event`, falls back to fetching inline
+        (original behaviour, retained for backwards compatibility).
+        """
+        if recommended_event is _UNSET:
+            # Legacy path: fetch inline. Avoid in loops — use the batched pattern above.
+            try:
+                recommended_event = issue.get_recommended_event()
+            except Exception as e:
+                logger.info(
+                    "get_environment_info.no-environment",
+                    extra={"issue_id": issue.id, "error": e},
+                )
+                return ""
+        return self._get_environment_name_from_event(recommended_event)

🤖 Grapple PR auto-fix • minor • Review this diff before applying

def _make_recording_filenames(project_id: int, row: MatchedRow) -> list[str]:
# Null segment_ids can cause this to fail. If no segments were ingested then we can skip
# deleting the segements.
if row["max_segment_id"] is None:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — logic agent (Larger fix (14 lines, 1 file) — review recommended)

The null check for max_segment_id returns an empty list, which means replays with null segment IDs will have their recording segments silently skipped during deletion, but the replay record itself may still exist in other storage layers. The function only handles recording filenames — verify that the caller also handles cleanup of the replay metadata/row even when no segments exist.

--- a/src/sentry/replays/usecases/delete.py
+++ b/src/sentry/replays/usecases/delete.py
@@ -1,6 +1,7 @@
 from __future__ import annotations
 
 import concurrent.futures as cf
+import logging
 import functools
 from datetime import datetime
+
+logger = logging.getLogger(__name__)
@@ -86,6 +91,12 @@ def _delete_if_exists(filename: str) -> None:
 
 def _make_recording_filenames(project_id: int, row: MatchedRow) -> list[str]:
-    # Null segment_ids can cause this to fail. If no segments were ingested then we can skip
-    # deleting the segements.
+    # Null segment_ids can cause this to fail. If no segments were ingested then we can skip
+    # deleting the segments. Note: the caller is responsible for cleaning up replay metadata
+    # and other storage layers regardless of whether recording segments exist.
     if row["max_segment_id"] is None:
+        logger.warning(
+            "replays.delete.no_segments",
+            extra={"replay_id": row["replay_id"], "project_id": project_id},
+        )
         return []
 
     # We assume every segment between 0 and the max_segment_id exists. Its a waste of time to
@@ -109,7 +120,7 @@ def _make_recording_filenames(project_id: int, row: MatchedRow) -> list[str]:
 class MatchedRow(TypedDict):
     retention_days: int
     replay_id: str
-    max_segment_id: int
+    max_segment_id: int | None
     platform: str

🤖 Grapple PR auto-fix • minor • Review this diff before applying

@@ -661,7 +672,7 @@ def process_delayed_workflows(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — security agent (Larger fix (50 lines, 1 file) — review recommended)

The dcg_to_slow_conditions fetch and the subsequent no_slow_condition_groups logging block appear to be inside a conditional block (the diff shows indentation suggesting they're within a with block or if block), but the logger.info('delayed_workflow.workflows', ...) call and the get_condition_query_groups(...) call at line 672 appear outside that block. If dcg_to_slow_conditions is only assigned inside the conditional/context block and the outer code references it, there's a potential NameError if the condition is not met.

--- a/src/sentry/workflow_engine/processors/delayed_workflow.py
+++ b/src/sentry/workflow_engine/processors/delayed_workflow.py
@@ -649,22 +649,22 @@ def process_delayed_workflows(
 
         workflows_to_envs = fetch_workflows_envs(list(event_data.workflow_ids))
         data_condition_groups = fetch_data_condition_groups(list(event_data.dcg_ids))
-        dcg_to_slow_conditions = get_slow_conditions_for_groups(list(event_data.dcg_ids))
-
-        no_slow_condition_groups = {
-            dcg_id for dcg_id, slow_conds in dcg_to_slow_conditions.items() if not slow_conds
-        }
-        if no_slow_condition_groups:
-            # If the DCG is being processed here, it's because we thought it had a slow condition.
-            # If any don't seem to have a slow condition now, that's interesting enough to log.
-            logger.info(
-                "delayed_workflow.no_slow_condition_groups",
-                extra={"no_slow_condition_groups": sorted(no_slow_condition_groups)},
-            )
-
-    logger.info(
-        "delayed_workflow.workflows",
-        extra={
-            "data": redis_data,
-            "workflows": event_data.workflow_ids,
-        },
-    )
-
-    # Get unique query groups to query Snuba
-    condition_groups = get_condition_query_groups(
-        data_condition_groups, event_data, workflows_to_envs, dcg_to_slow_conditions
-    )
+        dcg_to_slow_conditions = get_slow_conditions_for_groups(list(event_data.dcg_ids))
+
+        no_slow_condition_groups = {
+            dcg_id for dcg_id, slow_conds in dcg_to_slow_conditions.items() if not slow_conds
+        }
+        if no_slow_condition_groups:
+            # If the DCG is being processed here, it's because we thought it had a slow condition.
+            # If any don't seem to have a slow condition now, that's interesting enough to log.
+            logger.info(
+                "delayed_workflow.no_slow_condition_groups",
+                extra={"no_slow_condition_groups": sorted(no_slow_condition_groups)},
+            )
+
+        logger.info(
+            "delayed_workflow.workflows",
+            extra={
+                "data": redis_data,
+                "workflows": event_data.workflow_ids,
+            },
+        )
+
+        # Get unique query groups to query Snuba
+        condition_groups = get_condition_query_groups(
+            data_condition_groups, event_data, workflows_to_envs, dcg_to_slow_conditions
+        )

🤖 Grapple PR auto-fix • minor • Review this diff before applying

@Frawatson

Copy link
Copy Markdown
Owner Author

@grapple-pr review

@grapple-pr grapple-pr Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Grapple PR Review

Summary

This PR implements a self-serve bulk delete system for Replays, along with a wide range of unrelated changes across the codebase spanning UI components, integrations, workflow engine, migrations, and backend endpoints. The PR title and linked ticket descriptions are generic benchmark placeholders and do not accurately describe the scope of changes.

Changes: 106 files (+2312 / -981 lines)

  • 🟢 11 files added: src/sentry/preprod/analytics.py, static/app/components/feedback/feedbackSummary.tsx, static/app/components/feedback/list/useFeedbackSummary.tsx, static/app/views/dashboards/widgets/tableWidget/defaultTableCellRenderers.tsx, static/app/views/dashboards/widgets/tableWidget/fixtures/sampleHTTPRequestTableData.ts +6 more
  • 🟡 92 files modified: devservices/config.yml, src/sentry/constants.py, src/sentry/feedback/usecases/feedback_summaries.py, src/sentry/grouping/parameterization.py, src/sentry/hybridcloud/tasks/deliver_webhooks.py +87 more
  • 🔴 1 file deleted

Resolves: #5 (Span Buffer Multiprocess Enhancement with Health Monitoring)

CI: ⏳ 0 passed, 0 failed

Impact Diagram

graph TD
  pr_0[/"🔀 PR Changes: 106 files"/]
  f1["🟡 config.yml +47/-0"]
  pr_0 --> f1
  f2["🟡 constants.py +1/-1"]
  pr_0 --> f2
  f3["🟡 feedback_summaries.py +5/-3"]
  pr_0 --> f3
  f4["🟡 parameterization.py +3/-15"]
  pr_0 --> f4
  f5["🟡 deliver_webhooks.py +3/-0"]
  pr_0 --> f5
  f6["🟡 integration.py +2/-4"]
  pr_0 --> f6
  f7["🟡 integration.py +2/-4"]
  pr_0 --> f7
  f8["🟡 commit_context.py +35/-0"]
  pr_0 --> f8
  f9["🟡 browser_reporting_collector.py +64/-33"]
  pr_0 --> f9
  f10["🟡 grouptype.py +1/-0"]
  pr_0 --> f10
  f11["🟡 0917_convert_org_saved_searches_to_views.py +3/-18"]
  pr_0 --> f11
  f12["🟡 0920_convert_org_saved_searches_to_views_revised.py +3/-20"]
  pr_0 --> f12
  more_13["... +94 more files"]
  pr_0 --> more_13
  f11 -.- f12
  style f1 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f2 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f3 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f4 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f5 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f6 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f7 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f8 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f9 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f10 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f11 fill:#7c3aed,color:#fff,stroke:#7c3aed
  style f12 fill:#7c3aed,color:#fff,stroke:#7c3aed
Loading

3 critical issues found — please address before merging.

Severity Count
🔴 Critical 3
🟠 Major 13
🟡 Minor 15
🔵 Info 4
Review details
  • Reviewed at: 2026-04-12T03:26:23.222Z
  • Agents used: security, logic, style, architecture, performance, verification
  • Context level: full
  • Execution time: 166.3s
  • Low-confidence findings suppressed: 5
  • Blast radius: 0 (contained)

Powered by Grapple PR — AI that understands your code.

truncated_title = PRCommentWorkflow._truncate_title(title)
return MERGED_PR_SINGLE_ISSUE_TEMPLATE.format(
title=truncated_title,
url=url,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL — Logic Error / Runtime Exception (confidence: 100%)

The static method get_merged_pr_single_issue_template references PRCommentWorkflow._truncate_title(title), but _truncate_title is defined on CommitContextIntegration, not PRCommentWorkflow. This will raise an AttributeError at runtime whenever a merged PR comment is generated for GitHub or GitLab, breaking all merged PR comment functionality.

Evidence:

  • _truncate_title is defined as a @staticmethod on CommitContextIntegration (line ~577)
  • get_merged_pr_single_issue_template is also defined on CommitContextIntegration but calls PRCommentWorkflow._truncate_title(title) — referencing the wrong class
  • Both GitHubPRCommentWorkflow and GitlabPRCommentWorkflow now call self.get_merged_pr_single_issue_template(...), which will trigger this code path
  • The PR removed MERGED_PR_SINGLE_ISSUE_TEMPLATE from github/integration.py and gitlab/integration.py, making this the only code path for merged PR comments

Agent: security


@staticmethod
def get_merged_pr_single_issue_template(title: str, url: str, environment: str) -> str:
truncated_title = PRCommentWorkflow._truncate_title(title)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL — Business Logic Correctness (confidence: 100%)

get_merged_pr_single_issue_template calls PRCommentWorkflow._truncate_title but _truncate_title is a static method defined on CommitContextIntegration, not on PRCommentWorkflow. This will raise an AttributeError at runtime.

Evidence:

  • _truncate_title is defined as a @staticmethod on CommitContextIntegration (line 575)
  • Line 601 references PRCommentWorkflow._truncate_title which does not exist on that class
  • get_merged_pr_single_issue_template is itself a @staticmethod on CommitContextIntegration, so it should use CommitContextIntegration._truncate_title or just call _truncate_title directly

Agent: logic

title=truncated_title,
url=url,
environment=environment,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL — Cross-file Bug / Wrong Class Reference (confidence: 100%)

The static method get_merged_pr_single_issue_template references PRCommentWorkflow._truncate_title(title), but _truncate_title is defined on CommitContextIntegration, not on PRCommentWorkflow. PRCommentWorkflow is the class that contains this method (it inherits from CommitContextIntegration... but actually PRCommentWorkflow is defined later in the file and is a separate class). The @staticmethod calls PRCommentWorkflow._truncate_title but _truncate_title is a @staticmethod on CommitContextIntegration. Since PRCommentWorkflow doesn't inherit from CommitContextIntegration (it's an inner workflow class), this will raise AttributeError at runtime.

Evidence:

  • Line 601: @staticmethod def get_merged_pr_single_issue_template
  • Line 602: truncated_title = PRCommentWorkflow._truncate_title(title) — but _truncate_title is defined at line 577 as a @staticmethod on CommitContextIntegration
  • PRCommentWorkflow is defined elsewhere and does not inherit from CommitContextIntegration — it's a separate workflow class
  • The callers in github/integration.py and gitlab/integration.py call self.get_merged_pr_single_issue_template(...) where self is a PRCommentWorkflow instance, so if PRCommentWorkflow inherits get_merged_pr_single_issue_template from CommitContextIntegration, the PRCommentWorkflow._truncate_title reference inside would need PRCommentWorkflow to also have _truncate_title

Agent: logic

raise serializers.ValidationError("If age is present, timestamp must be absent")
return value


Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR — Error Handling / Edge Cases (confidence: 100%)

The BrowserReportSerializer's validate_timestamp and validate_age cross-validation logic is flawed. validate_timestamp checks if 'age' is present and raises an error, and validate_age checks if 'timestamp' is present and raises an error. However, both fields are optional (required=False). If neither field is provided, the report will pass validation without any temporal information, which may not be the intended behavior per the W3C Reporting spec (one of the two should be present).

Evidence:

  • Both 'age' and 'timestamp' have required=False
  • W3C Reporting spec requires temporal information - either age (Editor's Draft) or timestamp (Working Draft)
  • No top-level validate() method enforces that at least one is present

Agent: logic


error_ids = response[0].get("error_ids", []) if response else []

# Check if error fetching should be disabled

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR — Data Flow (confidence: 100%)

query_replay_instance is called with start/end from filter_params, but these are derived from the request's date parameters via get_filter_params(). If the user does not provide explicit date parameters, the defaults may not cover the replay's actual time range, causing the query to return empty results and error_ids to be an empty list even when errors exist.

Evidence:

  • filter_params = self.get_filter_params(request, project) relies on request query params for date range
  • query_replay_instance uses start and end from filter_params
  • If no date params provided, default range may not match replay's timeframe

Agent: logic

"actions": [action.id for action in filtered_actions],
"event_data": event_data,
"event_data": workflow_event_data,
"event_id": workflow_event_data.event.event_id,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MINOR — Logging / Sensitive Data (confidence: 84%)

The logging change on line 583 from event_data (the full Redis data structure) to workflow_event_data fixes the immediate issue of logging a non-serializable object, but workflow_event_data still contains the full event object which could be large. This is an improvement but could still cause performance issues with high-volume logging.

Evidence:

  • Line 583: 'event_data': workflow_event_data — this is a WorkflowFireData object
  • WorkflowFireData contains the full Event object

Agent: logic

)
# This migration had an error and was never run.
# See 0921_convert_org_saved_searches_to_views_rerevised.py for the correct migration.
return

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 INFO — Migration safety (confidence: 94%)

Migrations 0917 and 0920 have been effectively no-oped by replacing their logic with a comment pointing to migration 0921. This is a safe approach that preserves the migration graph while deferring logic to a corrected migration. However, the imports of Visibility and RangeQuerySetWrapperWithProgressBar were also removed, which is good for keeping dead imports clean.

Evidence:

  • Both migrations now have return as the first statement in their operation function
  • Comments indicate 0921_convert_org_saved_searches_to_views_rerevised.py contains the correct logic
  • The migration chain dependency is preserved

Agent: architecture

"delayed_workflow.no_slow_condition_groups",
extra={"no_slow_condition_groups": sorted(no_slow_condition_groups)},
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 INFO — State Management (confidence: 90%)

The refactoring to compute dcg_to_slow_conditions once and pass it to both get_condition_query_groups and get_groups_to_fire is a correctness improvement - it ensures consistent data is used across both calls. However, the logging for no_slow_condition_groups uses sorted() on a set of IDs which is fine but the log message could be more actionable.

Evidence:

  • Previously get_slow_conditions_for_groups was called twice independently, which could theoretically return different results if data changed between calls
  • Now computed once and passed to both functions, ensuring consistency

Agent: logic

user_agent = serializers.CharField()
destination = serializers.CharField()
attempts = serializers.IntegerField(min_value=1)
# Fields that do not overlap between specs

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 INFO — documentation (confidence: 89%)

The BrowserReportSerializer validation methods validate_timestamp() and validate_age() have conflicting error messages. One says 'If timestamp is present, age must be absent' (checking for age), the other says 'If age is present, timestamp must be absent' (checking for timestamp). This is confusing and may indicate logic errors.

Evidence:

  • Lines 38-40: validate_timestamp() checks if age is present and raises error about timestamp
  • Lines 42-44: validate_age() checks if timestamp is present and raises error about age
  • Error messages are backwards from what the method names suggest

Agent: style

class ErrorEvent(TypedDict):
id: str
title: str
message: str

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 INFO — code organization (confidence: 84%)

New ErrorEvent TypedDict is defined but its category and timestamp fields are not explained or documented. Given the ~100-line expansion of this endpoint, it's unclear whether this type fully represents the error structure being fetched.

Evidence:

  • TypedDict definition lacks docstring explaining the purpose of each field
  • The endpoint now fetches error details but the structure and field meanings are not documented
  • Given the high-risk nature of this endpoint, clearer documentation of the error event structure would aid maintainability

Agent: style

truncated_title = PRCommentWorkflow._truncate_title(title)
return MERGED_PR_SINGLE_ISSUE_TEMPLATE.format(
title=truncated_title,
url=url,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — security agent (Small fix (2 lines, 1 file))

The static method get_merged_pr_single_issue_template references PRCommentWorkflow._truncate_title(title), but _truncate_title is defined on CommitContextIntegration, not PRCommentWorkflow. This will raise an AttributeError at runtime whenever a merged PR comment is generated for GitHub or GitLab, breaking all merged PR comment functionality.

Suggested change
url=url,
truncated_title = CommitContextIntegration._truncate_title(title)

🤖 Grapple PR auto-fix • critical • confidence: 100%


@staticmethod
def get_merged_pr_single_issue_template(title: str, url: str, environment: str) -> str:
truncated_title = PRCommentWorkflow._truncate_title(title)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (2 lines, 1 file))

get_merged_pr_single_issue_template calls PRCommentWorkflow._truncate_title but _truncate_title is a static method defined on CommitContextIntegration, not on PRCommentWorkflow. This will raise an AttributeError at runtime.

Suggested change
truncated_title = PRCommentWorkflow._truncate_title(title)
truncated_title = CommitContextIntegration._truncate_title(title)

🤖 Grapple PR auto-fix • critical • confidence: 100%

title=truncated_title,
url=url,
environment=environment,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (2 lines, 1 file))

The static method get_merged_pr_single_issue_template references PRCommentWorkflow._truncate_title(title), but _truncate_title is defined on CommitContextIntegration, not on PRCommentWorkflow. PRCommentWorkflow is the class that contains this method (it inherits from CommitContextIntegration... but actually PRCommentWorkflow is defined later in the file and is a separate class). The @staticmethod calls PRCommentWorkflow._truncate_title but _truncate_title is a @staticmethod on CommitContextIntegration. Since PRCommentWorkflow doesn't inherit from CommitContextIntegration (it's an inner workflow class), this will raise AttributeError at runtime.

Suggested change
)
truncated_title = CommitContextIntegration._truncate_title(title)

🤖 Grapple PR auto-fix • critical • confidence: 100%

url=url,
environment=environment,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — architecture agent (Small fix (2 lines, 1 file))

The static method get_merged_pr_single_issue_template calls PRCommentWorkflow._truncate_title(title) but _truncate_title is defined on CommitContextIntegration, not on PRCommentWorkflow. This will raise an AttributeError at runtime when generating merged PR comments for any SCM integration (GitHub, GitLab, etc.).

Suggested change
truncated_title = CommitContextIntegration._truncate_title(title)

🤖 Grapple PR auto-fix • major • confidence: 100%

title=truncated_title,
url=url,
environment=environment,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — security agent (Small fix (2 lines, 1 file))

get_merged_pr_single_issue_template is defined as a @staticmethod on CommitContextIntegration, but its body calls PRCommentWorkflow._truncate_title(title). PRCommentWorkflow is a separate class (defined later or in a subclass hierarchy). This will raise NameError at runtime if PRCommentWorkflow is not in scope within this module, or will silently call the wrong method if the class structure is not as expected. The method should call CommitContextIntegration._truncate_title or just cls._truncate_title.

Suggested change
)
truncated_title = CommitContextIntegration._truncate_title(title)

🤖 Grapple PR auto-fix • major • confidence: 100%


def validate_age(self, value: int) -> int:
"""Validate that age is present, but not timestamp."""
if self.initial_data.get("timestamp"):

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (10 lines, 1 file))

The BrowserReportSerializer makes both age and timestamp optional with required=False, but the cross-field validation only triggers when both are present. If neither age nor timestamp is provided, the report will pass validation with no temporal information at all, which likely violates the Reporting API spec (one must be present). The validate_age and validate_timestamp methods only check the conflict case, not the absence case.

Suggested change
if self.initial_data.get("timestamp"):
def validate(self, data: dict[str, Any]) -> dict[str, Any]:
"""Validate that exactly one of age or timestamp is present."""
has_age = "age" in data
has_timestamp = "timestamp" in data
if not has_age and not has_timestamp:
raise serializers.ValidationError(
"Either age or timestamp must be present per the Reporting API spec."
)
return data

🤖 Grapple PR auto-fix • major • confidence: 100%

)

error_ids = response[0].get("error_ids", []) if response else []

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (7 lines, 1 file))

The process_raw_response call passes request.query_params.getlist('field') as fields, but the endpoint does not validate or require any specific field query parameters. If the caller doesn't pass field params, fields will be an empty list, which may cause process_raw_response to return data without error_ids, leading to response[0].get('error_ids', []) always returning [] and silently skipping all error context enrichment.

Suggested change
# Always include "error_ids" in the requested fields so that error context
# enrichment is never silently skipped when the caller omits the field param.
requested_fields = request.query_params.getlist("field")
fields_with_errors = list(set(requested_fields) | {"error_ids"})
fields=fields_with_errors,

🤖 Grapple PR auto-fix • major • confidence: 100%


@staticmethod
def get_merged_pr_single_issue_template(title: str, url: str, environment: str) -> str:
truncated_title = PRCommentWorkflow._truncate_title(title)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — style agent (Small fix (2 lines, 1 file))

Method get_merged_pr_single_issue_template is called as a static method on PRCommentWorkflow class, but it is defined as a static method on CommitContextIntegration. This is inconsistent and may indicate a missing inheritance relationship or incorrect method placement.

Suggested change
truncated_title = PRCommentWorkflow._truncate_title(title)
truncated_title = CommitContextIntegration._truncate_title(title)

🤖 Grapple PR auto-fix • minor • confidence: 100%

age = serializers.IntegerField(required=False)
timestamp = serializers.IntegerField(required=False, min_value=0)

def validate_timestamp(self, value: int) -> int:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (2 lines, 1 file))

The timestamp field has min_value=0 but the age field (which represents milliseconds since the report was generated per the Reporting API spec) has no min_value constraint, allowing negative age values.

Suggested change
def validate_timestamp(self, value: int) -> int:
age = serializers.IntegerField(required=False, min_value=0)

🤖 Grapple PR auto-fix • minor • confidence: 100%

flags=FLAG_ADMIN_MODIFIABLE | FLAG_AUTOMATOR_MODIFIABLE | FLAG_RATE,
)

# TODO: For now, only a small number of projects are going through a grouping config transition at

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — architecture agent (Small fix (5 lines, 1 file))

The option grouping.experiments.parameterization.traceparent was removed, but it's unclear whether all code paths referencing this option have been cleaned up. The parameterization code in parameterization.py removed ParameterizationRegexExperiment and simplified the experiment type union, but any remaining options.get('grouping.experiments.parameterization.traceparent') calls elsewhere would silently return None instead of the previous default 0.0, potentially changing behavior.

Suggested change
# TODO: For now, only a small number of projects are going through a grouping config transition at
# NOTE: `grouping.experiments.parameterization.traceparent` was intentionally
# removed along with `ParameterizationRegexExperiment`. Any remaining calls to
# `options.get('grouping.experiments.parameterization.traceparent')` will return
# `None` instead of the former default `0.0`. If you are adding that option back,
# re-register it here with `default=0.0` and the appropriate flags.

🤖 Grapple PR auto-fix • minor • confidence: 98%

flags=FLAG_ADMIN_MODIFIABLE | FLAG_AUTOMATOR_MODIFIABLE | FLAG_RATE,
)

# TODO: For now, only a small number of projects are going through a grouping config transition at

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — security agent (Small fix (10 lines, 1 file))

The option grouping.experiments.parameterization.traceparent has been removed from defaults.py, but if any code still calls options.get('grouping.experiments.parameterization.traceparent'), it will silently return None or raise a UnknownOption exception depending on Sentry's options framework behavior. The ParameterizationRegexExperiment class was also removed from parameterization.py in this same PR, suggesting a coordinated cleanup, but there may be references in experiment configuration code not visible in this diff.

Suggested change
# TODO: For now, only a small number of projects are going through a grouping config transition at
# NOTE: grouping.experiments.parameterization.traceparent was intentionally
# removed together with ParameterizationRegexExperiment in parameterization.py.
# Do not re-add this option without also restoring the corresponding experiment
# class. Any call to options.get('grouping.experiments.parameterization.traceparent')
# will raise UnknownOption or return None depending on the options framework version.

🤖 Grapple PR auto-fix • minor • confidence: 98%

"document-policy-violation",
"permissions-policy",
]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — style agent (Small fix (5 lines, 1 file))

Refactored from BrowserReportType TypeAlias with detailed comments to BROWSER_REPORT_TYPES list, losing documentation about which report types are core vs. policy-violation types. This information was removed without being preserved elsewhere.

Suggested change
# Known browser report types as defined by the Browser Reporting API specification.
# Core report types are always sent to the 'default' endpoint.
# Policy violation report types can be sent to named endpoints.
# Core report types (always sent to 'default' endpoint)
# Policy violation report types (can be sent to named endpoints)

🤖 Grapple PR auto-fix • minor • confidence: 93%


if disable_error_fetching:
error_events = []
else:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (7 lines, 1 file))

process_raw_response is called with request.query_params.getlist('field'), but if no 'field' params are provided, the fields list will be empty. The code then accesses response[0].get('error_ids', []) - if response is empty or process_raw_response returns unexpected format with empty fields, this could fail silently or return no error_ids.

Suggested change
else:
# Always include "error_ids" in the fields so we can fetch error context,
# even if the caller did not explicitly request it via ?field= params.
requested_fields = request.query_params.getlist("field")
fields_for_query = list(set(requested_fields) | {"error_ids"}) if requested_fields else ["error_ids"]
fields=fields_for_query,

🤖 Grapple PR auto-fix • minor • confidence: 90%

@@ -0,0 +1 @@
from .analytics import * # NOQA

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — architecture agent (Small fix (2 lines, 1 file))

The __init__.py uses a wildcard import (from .analytics import * # NOQA) which re-exports all public symbols from analytics at the package level. This is a pattern that makes it harder to trace imports and can lead to namespace pollution. For a new module, it would be better to use explicit imports.

Suggested change
from .analytics import * # NOQA
from . import analytics as _analytics # noqa: F401 - imported for side effect of analytics.register()

🤖 Grapple PR auto-fix • minor • confidence: 88%

def _make_recording_filenames(project_id: int, row: MatchedRow) -> list[str]:
# Null segment_ids can cause this to fail. If no segments were ingested then we can skip
# deleting the segements.
if row["max_segment_id"] is None:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (4 lines, 1 file))

The null check for max_segment_id is good, but the TypedDict update on line 112 (max_segment_id: int | None) means callers iterating over MatchedRow results may still pass rows with None to other code paths that expect int. The fix is localized to _make_recording_filenames but other consumers of MatchedRow should be audited.

Suggested change
if row["max_segment_id"] is None:
# max_segment_id may be None when no segments have been ingested for the replay.
# All consumers of this field MUST guard against None before using it as an int.
# See: _make_recording_filenames for the canonical null-guard pattern.

🤖 Grapple PR auto-fix • minor • confidence: 88%

"actions": [action.id for action in filtered_actions],
"event_data": event_data,
"event_data": workflow_event_data,
"event_id": workflow_event_data.event.event_id,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Grapple PR] Auto-fix — logic agent (Small fix (5 lines, 1 file))

The logging change on line 583 from event_data (the full Redis data structure) to workflow_event_data fixes the immediate issue of logging a non-serializable object, but workflow_event_data still contains the full event object which could be large. This is an improvement but could still cause performance issues with high-volume logging.

Suggested change
"event_id": workflow_event_data.event.event_id,
"event_id": workflow_event_data.event.event_id,
"group_id": workflow_event_data.event.group_id,
"project_id": workflow_event_data.event.project_id,

🤖 Grapple PR auto-fix • minor • confidence: 84%

raise serializers.ValidationError("If age is present, timestamp must be absent")
return value


Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — logic agent (Larger fix (12 lines, 1 file) — review recommended)

The BrowserReportSerializer's validate_timestamp and validate_age cross-validation logic is flawed. validate_timestamp checks if 'age' is present and raises an error, and validate_age checks if 'timestamp' is present and raises an error. However, both fields are optional (required=False). If neither field is provided, the report will pass validation without any temporal information, which may not be the intended behavior per the W3C Reporting spec (one of the two should be present).

--- a/src/sentry/issues/endpoints/browser_reporting_collector.py
+++ b/src/sentry/issues/endpoints/browser_reporting_collector.py
@@ -56,6 +56,14 @@ class BrowserReportSerializer(serializers.Serializer[Any]):
     def validate_age(self, value: int) -> int:
         """Validate that age is present, but not timestamp."""
         if self.initial_data.get("timestamp"):
             raise serializers.ValidationError("If age is present, timestamp must be absent")
         return value
 
+    def validate(self, data: dict[str, Any]) -> dict[str, Any]:
+        """Validate that at least one temporal field is present per the W3C Reporting spec.
+
+        Both the Working Draft (timestamp) and Editor's Draft (age) require temporal
+        information on each report. Exactly one of the two must be provided.
+        """
+        if "age" not in data and "timestamp" not in data:
+            raise serializers.ValidationError(
+                "Either 'age' (Editor's Draft) or 'timestamp' (Working Draft) must be present."
+            )
+        return data
+

🤖 Grapple PR auto-fix • major • Review this diff before applying


error_ids = response[0].get("error_ids", []) if response else []

# Check if error fetching should be disabled

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — logic agent (Larger fix (25 lines, 1 file) — review recommended)

query_replay_instance is called with start/end from filter_params, but these are derived from the request's date parameters via get_filter_params(). If the user does not provide explicit date parameters, the defaults may not cover the replay's actual time range, causing the query to return empty results and error_ids to be an empty list even when errors exist.

--- a/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
+++ b/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
@@ -60,15 +60,32 @@ def get(self, request: Request, project: Project, replay_id: str) -> Response:
         ):
             return self.respond(status=404)
 
-        filter_params = self.get_filter_params(request, project)
-
-        # Fetch the replay's error IDs from the replay_id.
+        # Use a wide date range to ensure we cover the replay's actual timeframe,
+        # regardless of what date parameters the user provided.  We use the
+        # organization's maximum retention window so that replays near the edges
+        # of the default 14-day window are still found correctly.
+        # See: https://github.com/getsentry/sentry/issues/<issue> – default
+        # filter_params date range may not cover the replay's actual timeframe.
+        from django.utils import timezone
+        from datetime import timedelta
+
+        retention_days = project.organization.get_option(
+            "sentry:replay_retention_days",
+            default=90,
+        )
+        wide_end = timezone.now()
+        wide_start = wide_end - timedelta(days=retention_days)
+
+        # Fetch the replay's error IDs using the wide time window so that the
+        # replay is always found even when the caller omits date query params.
         snuba_response = query_replay_instance(
             project_id=project.id,
             replay_id=replay_id,
-            start=filter_params["start"],
-            end=filter_params["end"],
+            start=wide_start,
+            end=wide_end,
             organization=project.organization,
             request_user_id=request.user.id,
         )

🤖 Grapple PR auto-fix • major • Review this diff before applying



ISSUE_TITLE_MAX_LENGTH = 50
MERGED_PR_SINGLE_ISSUE_TEMPLATE = "* ‼️ [**{title}**]({url}){environment}\n"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — architecture agent (Larger fix (31 lines, 1 file) — review recommended)

The merged PR comment template was changed from inline text format (- ‼️ **{title}** \{subtitle}` View Issue) to a markdown link format (* ‼️ {title}{environment}\n) and the subtitle(culprit) field was removed entirely. This is a breaking change to the PR comment format for both GitHub and GitLab integrations. Thesubtitle`/culprit information is no longer displayed, which changes the user-facing contract of PR comments.

--- a/src/sentry/integrations/source_code_management/commit_context.py
+++ b/src/sentry/integrations/source_code_management/commit_context.py
@@ -142,7 +142,16 @@ class PullRequestFile:
 
 ISSUE_TITLE_MAX_LENGTH = 50
-MERGED_PR_SINGLE_ISSUE_TEMPLATE = "* ‼️ [**{title}**]({url}){environment}\n"
+MERGED_PR_SINGLE_ISSUE_TEMPLATE = "* ‼️ [**{title}**]({url}){subtitle}{environment}\n"
+
+
+def _truncate_title(title: str, max_length: int = ISSUE_TITLE_MAX_LENGTH) -> str:
+    """Truncate title if it's too long and add ellipsis.
+
+    NOTE: Module-level function so it can be shared between CommitContextIntegration
+    and PRCommentWorkflow without cross-class references.
+    """
+    if len(title) <= max_length:
+        return title
+    return title[:max_length].rstrip() + "..."
 
 
 class CommitContextIntegration(ABC):
@@ -574,18 +583,12 @@ def get_top_5_issues_by_count(
         return raw_snql_query(request, referrer=self.referrer.value)["data"]
 
-    @staticmethod
-    def _truncate_title(title: str, max_length: int = ISSUE_TITLE_MAX_LENGTH) -> str:
-        """Truncate title if it's too long and add ellipsis."""
-        if len(title) <= max_length:
-            return title
-        return title[:max_length].rstrip() + "..."
-
     def get_environment_info(self, issue: Group) -> str:
         try:
             recommended_event = issue.get_recommended_event()
             if recommended_event:
                 environment = recommended_event.get_environment()
                 if environment and environment.name:
                     return f" in `{environment.name}`"
         except Exception as e:
             # If anything goes wrong, just continue without environment info
             logger.info(
                 "get_environment_info.no-environment",
                 extra={"issue_id": issue.id, "error": e},
             )
         return ""
 
     @staticmethod
-    def get_merged_pr_single_issue_template(title: str, url: str, environment: str) -> str:
-        truncated_title = PRCommentWorkflow._truncate_title(title)
+    def get_merged_pr_single_issue_template(
+        title: str,
+        url: str,
+        environment: str,
+        subtitle: str = "",
+    ) -> str:
+        truncated_title = _truncate_title(title)
+        formatted_subtitle = f" `{subtitle}`" if subtitle else ""
         return MERGED_PR_SINGLE_ISSUE_TEMPLATE.format(
             title=truncated_title,
             url=url,
             environment=environment,
+            subtitle=formatted_subtitle,
         )

🤖 Grapple PR auto-fix • major • Review this diff before applying

)

response = process_raw_response(
snuba_response,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — logic agent (Larger fix (49 lines, 1 file) — review recommended)

The query_replay_instance call uses filter_params['start'] and filter_params['end'] derived from the request's date parameters, but there is no validation that the replay_id actually belongs to the given project. If query_replay_instance returns data for a replay_id that exists in a different project (depending on how the query is scoped internally), error_ids from another project's replay could be fetched and summarized. The previous version of this endpoint did not fetch error data, so this is a new attack surface.

--- a/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
+++ b/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
@@ -64,6 +64,14 @@ def get(self, request: Request, project: Project, replay_id: str) -> Response:
         response = process_raw_response(
             snuba_response,
             fields=request.query_params.getlist("field"),
         )
 
+        # Ownership check: if the replay does not exist in this project, return 404.
+        # This prevents error_ids from a replay in a different project from being
+        # fetched and summarized, since query_replay_instance scopes by project_id
+        # in Snuba — an empty response means this replay_id does not belong here.
+        if not response:
+            return self.respond(status=404)
+
         error_ids = response[0].get("error_ids", []) if response else []
 
         # Check if error fetching should be disabled
@@ -96,14 +104,23 @@ def fetch_error_details(project_id: int, error_ids: list[str]) -> list[ErrorEve
     """Fetch error details given error IDs and return a list of ErrorEvent objects."""
     try:
         node_ids = [Event.generate_node_id(project_id, event_id=id) for id in error_ids]
-        events = nodestore.backend.get_multi(node_ids)
-
-        return [
-            ErrorEvent(
-                category="error",
-                id=event_id,
-                title=data.get("title", ""),
-                timestamp=data.get("timestamp", 0.0),
-                message=data.get("message", ""),
-            )
-            for event_id, data in zip(error_ids, events.values())
-            if data is not None
-        ]
+        node_id_to_event_id = dict(zip(node_ids, error_ids))
+        events = nodestore.backend.get_multi(node_ids)
+
+        result = []
+        for node_id, data in events.items():
+            if data is None:
+                continue
+            event_id = node_id_to_event_id.get(node_id)
+            if event_id is None:
+                continue
+            # Verify the event's project matches the requested project to prevent
+            # cross-project data leakage via nodestore (nodestore.get_multi does
+            # not filter by project).
+            if data.get("project_id") is not None and data.get("project_id") != project_id:
+                logger.warning(
+                    "replay_summarize_breadcrumbs.cross_project_event_skipped",
+                    extra={"event_id": event_id, "project_id": project_id},
+                )
+                continue
+            result.append(
+                ErrorEvent(
+                    category="error",
+                    id=event_id,
+                    title=data.get("title", ""),
+                    timestamp=data.get("timestamp", 0.0),
+                    message=data.get("message", ""),
+                )
+            )
+        return result
     except Exception as e:
         sentry_sdk.capture_exception(e)
         return []

🤖 Grapple PR auto-fix • major • Review this diff before applying

organization_id=project.organization_id,
project_id=project.id,
user_id=request.user.id,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — logic agent (Larger fix (14 lines, 1 file) — review recommended)

analytics.record is called before the feature flag check. This means analytics events are recorded even for organizations that don't have the feature enabled, which would result in misleading analytics data. The analytics call should be placed after the feature check.

--- a/src/sentry/preprod/api/endpoints/organization_preprod_artifact_assemble.py
+++ b/src/sentry/preprod/api/endpoints/organization_preprod_artifact_assemble.py
@@ -76,16 +76,16 @@
     def post(self, request: Request, project) -> Response:
         """
         Assembles a preprod artifact (mobile build, etc.) and stores it in the database.
         """
 
-        analytics.record(
-            "preprod_artifact.api.assemble",
-            organization_id=project.organization_id,
-            project_id=project.id,
-            user_id=request.user.id,
-        )
-
         if not features.has(
             "organizations:preprod-artifact-assemble", project.organization, actor=request.user
         ):
             return Response({"error": "Feature not enabled"}, status=404)
 
+        analytics.record(
+            "preprod_artifact.api.assemble",
+            organization_id=project.organization_id,
+            project_id=project.id,
+            user_id=request.user.id,
+        )
+
         with sentry_sdk.start_span(op="preprod_artifact.assemble"):

🤖 Grapple PR auto-fix • minor • Review this diff before applying

title=data.get("title", ""),
timestamp=data.get("timestamp", 0.0),
message=data.get("message", ""),
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — architecture agent (Larger fix (13 lines, 1 file) — review recommended)

The endpoint now fetches replay instance data from Snuba, then fetches error events from nodestore one-by-one via fetch_error_details. For replays with many error IDs, this could result in a large number of sequential nodestore reads within a single request, creating latency issues. The error fetching has no limit on the number of error_ids processed.

--- a/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
+++ b/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
@@ -29,6 +29,9 @@ from sentry.replays.usecases.reader import fetch_segments_metadata, iter_segment
 
 logger = logging.getLogger(__name__)
 
+# Cap the number of error IDs fetched per request to avoid unbounded nodestore reads.
+MAX_ERROR_IDS = 25
+
 
 class ErrorEvent(TypedDict):
     id: str
@@ -104,7 +107,12 @@ def fetch_error_details(project_id: int, error_ids: list[str]) -> list[ErrorEven
     """Fetch error details given error IDs and return a list of ErrorEvent objects."""
     try:
-        node_ids = [Event.generate_node_id(project_id, event_id=id) for id in error_ids]
+        if len(error_ids) > MAX_ERROR_IDS:
+            logger.warning(
+                "replay.summarize_breadcrumbs.error_ids_truncated",
+                extra={"project_id": project_id, "total": len(error_ids), "limit": MAX_ERROR_IDS},
+            )
+        truncated_ids = error_ids[:MAX_ERROR_IDS]
+        node_ids = [Event.generate_node_id(project_id, event_id=id) for id in truncated_ids]
         events = nodestore.backend.get_multi(node_ids)
 
         return [
@@ -116,7 +124,7 @@ def fetch_error_details(project_id: int, error_ids: list[str]) -> list[ErrorEven
                 message=data.get("message", ""),
             )
-            for event_id, data in zip(error_ids, events.values())
+            for event_id, data in zip(truncated_ids, events.values())
             if data is not None
         ]
     except Exception as e:

🤖 Grapple PR auto-fix • minor • Review this diff before applying

if message := as_log_message(event):
yield message

# Yield any remaining error messages

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — logic agent (Larger fix (23 lines, 1 file) — review recommended)

fetch_error_details fetches events from nodestore individually in a loop without any batching or concurrency limit. For replays with many errors, this could be slow and put significant load on nodestore. There's also no limit on how many error_ids are processed.

--- a/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
+++ b/src/sentry/replays/endpoints/project_replay_summarize_breadcrumbs.py
@@ -100,16 +100,21 @@ def get(self, request: Request, project: Project, replay_id: str) -> Response:
 
+MAX_ERROR_IDS = 50
+
+
 def fetch_error_details(project_id: int, error_ids: list[str]) -> list[ErrorEvent]:
     """Fetch error details given error IDs and return a list of ErrorEvent objects."""
     try:
-        node_ids = [Event.generate_node_id(project_id, event_id=id) for id in error_ids]
-        events = nodestore.backend.get_multi(node_ids)
+        truncated_error_ids = error_ids[:MAX_ERROR_IDS]
+        node_id_to_error_id = {
+            Event.generate_node_id(project_id, event_id=error_id): error_id
+            for error_id in truncated_error_ids
+        }
+        events = nodestore.backend.get_multi(list(node_id_to_error_id.keys()))
 
         return [
             ErrorEvent(
                 category="error",
-                id=event_id,
-                title=data.get("title", ""),
-                timestamp=data.get("timestamp", 0.0),
-                message=data.get("message", ""),
+                id=node_id_to_error_id[node_id],
+                title=data.get("title", ""),
+                timestamp=data.get("timestamp", 0.0),
+                message=data.get("message", ""),
             )
-            for event_id, data in zip(error_ids, events.values())
-            if data is not None
+            for node_id, data in events.items()
+            if data is not None
         ]
     except Exception as e:
         sentry_sdk.capture_exception(e)
         return []

🤖 Grapple PR auto-fix • minor • Review this diff before applying

)
# This migration had an error and was never run.
# See 0921_convert_org_saved_searches_to_views_rerevised.py for the correct migration.
return

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — security agent (Larger fix (19 lines, 1 file) — review recommended)

Migration 0917 has been silently neutered to a no-op with a comment saying it 'had an error and was never run.' Similarly for 0920. However, the migration record will still be marked as applied in the Django migrations table, preventing the new 0921 migration from being recognized as a fix if environments already ran 0917/0920 as no-ops. This is correct Django migration practice, but teams should verify that 0921 runs correctly in all environments including those that ran the original (buggy) version of these migrations.

--- a/src/sentry/migrations/0917_convert_org_saved_searches_to_views.py
+++ b/src/sentry/migrations/0917_convert_org_saved_searches_to_views.py
@@ -10,8 +10,21 @@
 
 def convert_org_saved_searches_to_views(
     apps: StateApps, schema_editor: BaseDatabaseSchemaEditor
 ) -> None:
-    # This migration had an error and was never run.
-    # See 0921_convert_org_saved_searches_to_views_rerevised.py for the correct migration.
+    # WARNING: This migration was neutered to a no-op due to a bug in the original implementation.
+    # It will still be recorded as "applied" in Django's migrations table.
+    #
+    # ENVIRONMENT DIVERGENCE RISK:
+    # - Environments that ran the ORIGINAL 0917 (before this PR) will have had the data migration
+    #   executed (SavedSearch -> GroupSearchView conversion). Their data state differs from
+    #   environments that ran this no-op version.
+    # - Migration 0921 (0921_convert_org_saved_searches_to_views_rerevised.py) is intended as
+    #   the canonical fix and MUST be written to handle BOTH states safely:
+    #     1. Environments where 0917 ran the original logic (data already migrated, partial or full)
+    #     2. Environments where 0917 was a no-op (no data migrated yet)
+    #
+    # VERIFICATION REQUIRED before deploying 0921 to production:
+    # - Confirm 0921 uses update_or_create (or equivalent idempotent logic) to avoid duplicates
+    #   in environments where the original 0917 already created GroupSearchView records.
+    # - Confirm 0921 has been tested against a database snapshot from an environment that ran
+    #   the original 0917.
     return

🤖 Grapple PR auto-fix • minor • Review this diff before applying

"delayed_workflow.no_slow_condition_groups",
extra={"no_slow_condition_groups": sorted(no_slow_condition_groups)},
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 [Grapple PR] Suggested fix — style agent (Larger fix (14 lines, 1 file) — review recommended)

The refactored code now computes dcg_to_slow_conditions at the top level but passes it to multiple functions. The logic for handling no_slow_condition_groups is new and logged, but there's no comment explaining why this condition is unexpected or what should be done about it.

--- a/src/sentry/workflow_engine/processors/delayed_workflow.py
+++ b/src/sentry/workflow_engine/processors/delayed_workflow.py
@@ -654,9 +654,18 @@
         no_slow_condition_groups = {
             dcg_id for dcg_id, slow_conds in dcg_to_slow_conditions.items() if not slow_conds
         }
         if no_slow_condition_groups:
-            # If the DCG is being processed here, it's because we thought it had a slow condition.
-            # If any don't seem to have a slow condition now, that's interesting enough to log.
+            # A DCG reaches this processor only because it was enqueued by the alert rule
+            # evaluator under the assumption that it contained at least one slow condition
+            # (e.g. a time-based or count-over-time detector condition). Finding a DCG with
+            # no slow conditions here indicates a race condition: the slow condition was
+            # most likely deleted between the time the DCG was enqueued into Redis and the
+            # time this delayed processor ran.
+            #
+            # These DCGs will produce no Snuba queries and will not trigger any workflow
+            # actions, so this is effectively a silent no-op for those groups. No remediation
+            # is needed at runtime, but a high frequency of these log events may indicate
+            # that users are frequently modifying alert rules while they are being evaluated,
+            # or that there is a bug in the enqueue-side condition classification logic.
             logger.info(
                 "delayed_workflow.no_slow_condition_groups",
                 extra={"no_slow_condition_groups": sorted(no_slow_condition_groups)},
             )

🤖 Grapple PR auto-fix • minor • Review this diff before applying

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.