ref(crons): Reorganize incident creation / issue occurrence logic#8
ref(crons): Reorganize incident creation / issue occurrence logic#8everettbu wants to merge 1 commit into
Conversation
…0528) Since we'll be doing more with issue occurrences split out the concept of incidents into it's own logic module, as well as incident_occurrence into it's own module Part of GH-80527
WalkthroughThis change refactors the incident and failure handling logic for monitor check-ins. It introduces new modules to encapsulate incident occurrence creation and threshold logic, moves helper functions and types, and simplifies the Changes
Sequence Diagram(s)sequenceDiagram
participant MonitorCheckIn
participant mark_failed
participant try_incident_threshold
participant incident_occurrence
participant Kafka
MonitorCheckIn->>mark_failed: Failed check-in triggers mark_failed
mark_failed->>try_incident_threshold: Delegate incident/issue logic
try_incident_threshold->>incident_occurrence: If incident needed, create occurrence(s)
incident_occurrence->>Kafka: Send IssueOccurrence event
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~18 minutes Poem
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/sentry/monitors/logic/incidents.py (1)
37-45: Consider adding type validation for the QuerySet values cast.The cast to
list[SimpleCheckIn]on line 37 assumes Django'svalues()returns the exact TypedDict structure. While this typically works, consider adding runtime validation or using a more explicit conversion to ensure type safety.src/sentry/monitors/logic/incident_occurrence.py (2)
133-133: Fix typo in docstring.- Builds a humam readible string from a list of failed check-ins. + Builds a human readable string from a list of failed check-ins.
83-86: Apply style improvements suggested by static analysis.The static analysis correctly identified two style improvements that would make the code more Pythonic.
- if failed_checkin.trace_id: - trace_id = failed_checkin.trace_id.hex - else: - trace_id = None + trace_id = failed_checkin.trace_id.hex if failed_checkin.trace_id else None- if checkin["status"] in HUMAN_FAILURE_STATUS_MAP.keys() + if checkin["status"] in HUMAN_FAILURE_STATUS_MAPAlso applies to: 142-142
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/sentry/monitors/logic/incident_occurrence.py(1 hunks)src/sentry/monitors/logic/incidents.py(1 hunks)src/sentry/monitors/logic/mark_failed.py(2 hunks)src/sentry/monitors/types.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
src/sentry/monitors/logic/incident_occurrence.py
83-86: Use ternary operator trace_id = failed_checkin.trace_id.hex if failed_checkin.trace_id else None instead of if-else-block
Replace if-else-block with trace_id = failed_checkin.trace_id.hex if failed_checkin.trace_id else None
(SIM108)
142-142: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
🔇 Additional comments (2)
src/sentry/monitors/types.py (1)
105-113: LGTM! Clean type definition for simplified check-in representation.The
SimpleCheckInTypedDict provides a well-structured, minimal representation of check-in data needed for incident processing. The field types are appropriate and the docstring clearly indicates its purpose.src/sentry/monitors/logic/mark_failed.py (1)
1-79: Excellent refactoring! Clean separation of concerns.The
mark_failedfunction has been simplified to focus solely on updating monitor environment state, while delegating incident threshold logic to the newtry_incident_thresholdfunction. This improves modularity and makes the code more maintainable.The conditional update pattern (lines 52-73) correctly prevents race conditions by ensuring newer check-ins aren't overwritten.
| def get_monitor_environment_context(monitor_environment: MonitorEnvironment): | ||
| config = monitor_environment.monitor.config.copy() | ||
| if "schedule_type" in config: | ||
| config["schedule_type"] = monitor_environment.monitor.get_schedule_type_display() | ||
|
|
||
| return { | ||
| "id": str(monitor_environment.monitor.guid), | ||
| "slug": str(monitor_environment.monitor.slug), | ||
| "name": monitor_environment.monitor.name, | ||
| "config": monitor_environment.monitor.config, | ||
| "status": monitor_environment.get_status_display(), | ||
| "type": monitor_environment.monitor.get_type_display(), | ||
| } |
There was a problem hiding this comment.
Use the modified config variable instead of the original.
The function modifies the config variable to include display values but then returns the original monitor.config instead of the modified version.
def get_monitor_environment_context(monitor_environment: MonitorEnvironment):
config = monitor_environment.monitor.config.copy()
if "schedule_type" in config:
config["schedule_type"] = monitor_environment.monitor.get_schedule_type_display()
return {
"id": str(monitor_environment.monitor.guid),
"slug": str(monitor_environment.monitor.slug),
"name": monitor_environment.monitor.name,
- "config": monitor_environment.monitor.config,
+ "config": config,
"status": monitor_environment.get_status_display(),
"type": monitor_environment.monitor.get_type_display(),
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def get_monitor_environment_context(monitor_environment: MonitorEnvironment): | |
| config = monitor_environment.monitor.config.copy() | |
| if "schedule_type" in config: | |
| config["schedule_type"] = monitor_environment.monitor.get_schedule_type_display() | |
| return { | |
| "id": str(monitor_environment.monitor.guid), | |
| "slug": str(monitor_environment.monitor.slug), | |
| "name": monitor_environment.monitor.name, | |
| "config": monitor_environment.monitor.config, | |
| "status": monitor_environment.get_status_display(), | |
| "type": monitor_environment.monitor.get_type_display(), | |
| } | |
| def get_monitor_environment_context(monitor_environment: MonitorEnvironment): | |
| config = monitor_environment.monitor.config.copy() | |
| if "schedule_type" in config: | |
| config["schedule_type"] = monitor_environment.monitor.get_schedule_type_display() | |
| return { | |
| "id": str(monitor_environment.monitor.guid), | |
| "slug": str(monitor_environment.monitor.slug), | |
| "name": monitor_environment.monitor.name, | |
| "config": config, | |
| "status": monitor_environment.get_status_display(), | |
| "type": monitor_environment.monitor.get_type_display(), | |
| } |
🤖 Prompt for AI Agents
In src/sentry/monitors/logic/incident_occurrence.py around lines 159 to 171, the
function modifies the local variable config but returns the original
monitor_environment.monitor.config instead of the updated config. Fix this by
returning the modified config variable in the returned dictionary under the
"config" key.
| checkins = MonitorCheckIn.objects.filter(id__in=[c["id"] for c in previous_checkins]) | ||
| for checkin in checkins: | ||
| create_incident_occurrence( | ||
| previous_checkins, | ||
| checkin, | ||
| incident, | ||
| received=received, | ||
| ) |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Optimize database queries by avoiding redundant fetches.
The code fetches MonitorCheckIn objects by ID when the required data already exists in previous_checkins. This creates an unnecessary database query.
Consider passing the check-in data directly to create_incident_occurrence or modifying it to accept SimpleCheckIn objects:
- checkins = MonitorCheckIn.objects.filter(id__in=[c["id"] for c in previous_checkins])
- for checkin in checkins:
- create_incident_occurrence(
- previous_checkins,
- checkin,
- incident,
- received=received,
- )
+ for checkin_data in previous_checkins:
+ # If create_incident_occurrence needs the full MonitorCheckIn object,
+ # consider fetching only the required fields or modifying the function
+ # to work with SimpleCheckIn data
+ if checkin_data["id"] == failed_checkin.id:
+ create_incident_occurrence(
+ previous_checkins,
+ failed_checkin,
+ incident,
+ received=received,
+ )
+ else:
+ # Handle other check-ins if needed
+ passCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/sentry/monitors/logic/incidents.py around lines 93 to 100, the code
fetches MonitorCheckIn objects from the database using IDs already present in
previous_checkins, causing an unnecessary query. To fix this, refactor the code
to pass the existing check-in data from previous_checkins directly to
create_incident_occurrence, or update create_incident_occurrence to accept
SimpleCheckIn objects, thereby eliminating the redundant database fetch.
Test 8
Summary by CodeRabbit
New Features
Refactor