SMS workflow reminder retry count tracking #4
* add retry count to workflow reminder
* add logic for retry count

Co-authored-by: CarinaWolli <wollencarina@gmail.com>
Co-authored-by: Udit Takkar <53316345+Udit-takkar@users.noreply.github.com>
@CodeAnt-AI: review

CodeAnt AI is running the review.

Nitpicks 🔍
        id: reminder.id,
      },
      data: {
        retryCount: reminder.retryCount + 1,
Suggestion: Incrementing the retry counter using retryCount: reminder.retryCount + 1 is a non-atomic read-modify-write that can lose increments when multiple cron executions overlap, so this should use Prisma's atomic increment operation to be concurrency-safe. [race condition]
Severity Level: Major ⚠️
- ⚠️ Lost increments under concurrent cron executions.
- ❌ SMS retry limit enforcement becomes inaccurate.
- ⚠️ Affects POST /api/cron/workflows/scheduleSMSReminders endpoint.

Suggested change:

    - retryCount: reminder.retryCount + 1,
    + retryCount: {
    +   increment: 1,
    + },
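
To make the lost update concrete, here is a minimal TypeScript sketch. It does not use Prisma: `ReminderTable` and its methods are hypothetical in-memory stand-ins, with `setRetryCount` playing the role of `retryCount: reminder.retryCount + 1` and `incrementRetryCount` playing the role of `retryCount: { increment: 1 }`. The interleaving (both runs read before either writes) is an assumption that matches the overlapping-cron scenario described in this comment.

```typescript
// Hypothetical in-memory stand-in for the workflowReminder table (illustration only).
type Reminder = { id: number; retryCount: number };

class ReminderTable {
  private rows = new Map<number, Reminder>();

  insert(row: Reminder): void {
    this.rows.set(row.id, { ...row });
  }

  // Returns a snapshot copy, like the objects a findMany call hands back.
  find(id: number): Reminder {
    const row = this.rows.get(id);
    if (!row) throw new Error(`reminder ${id} not found`);
    return { ...row };
  }

  // Overwrites the column with a value the caller computed from a stale snapshot.
  setRetryCount(id: number, value: number): void {
    const row = this.rows.get(id);
    if (!row) throw new Error(`reminder ${id} not found`);
    row.retryCount = value;
  }

  // Applies the delta against the current stored value, not the caller's snapshot.
  incrementRetryCount(id: number, by = 1): void {
    const row = this.rows.get(id);
    if (!row) throw new Error(`reminder ${id} not found`);
    row.retryCount += by;
  }
}

function simulateOverlappingRuns(atomic: boolean): number {
  const table = new ReminderTable();
  table.insert({ id: 1, retryCount: 0 });

  // Assumed interleaving: both cron runs load the reminder before either writes.
  const seenByRunA = table.find(1);
  const seenByRunB = table.find(1);

  for (const snapshot of [seenByRunA, seenByRunB]) {
    if (atomic) {
      table.incrementRetryCount(snapshot.id);
    } else {
      table.setRetryCount(snapshot.id, snapshot.retryCount + 1);
    }
  }
  return table.find(1).retryCount;
}

const lostUpdateResult = simulateOverlappingRuns(false);
const atomicResult = simulateOverlappingRuns(true);
```

Under that assumed interleaving, the overwrite variant records one failure for two failed attempts, while the increment variant records both.
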
Steps of Reproduction ✅
1. The cron endpoint `/api/cron/workflows/scheduleSMSReminders` is implemented by
`handler` in `packages/features/ee/workflows/api/scheduleSMSReminders.ts:21-202` and
exported via `apps/web/pages/api/cron/workflows/scheduleSMSReminders.ts:1`.
2. On each POST, the handler loads unscheduled SMS reminders via
`prisma.workflowReminder.findMany` at `scheduleSMSReminders.ts:47-60` (filtering by
`method: WorkflowMethods.SMS`, `scheduled: false`, and `scheduledDate`), then loops over
them and calls `twilio.scheduleSMS` at lines 158-166.
3. When `twilio.scheduleSMS` resolves to a falsy value without throwing (e.g. transient
failure where the provider returns no SID), execution enters the `else` block at
`scheduleSMSReminders.ts:168-187`, which updates the row using `data: { retryCount:
reminder.retryCount + 1 }`.
4. If the cron endpoint is invoked concurrently on multiple app instances (or re-invoked
before the previous run finishes) so that two handlers load the same reminder (both
reading `retryCount = 0` from the `findMany` at lines 47-60) and both experience the same
non-throwing failure, then both `update` calls in the `else` block race: each computes
`retryCount: 0 + 1` and writes `1`, resulting in only a single increment being persisted
instead of two. Because cleanup logic at `scheduleSMSReminders.ts:28-44` deletes reminders
only when `retryCount > 1`, this lost increment can cause more send attempts than intended
before the reminder is cleaned up.

Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** packages/features/ee/workflows/api/scheduleSMSReminders.ts
**Line:** 184:184
**Comment:**
*Race Condition: Incrementing the retry counter using `retryCount: reminder.retryCount + 1` is a non-atomic read-modify-write that can lose increments when multiple cron executions overlap, so this should use Prisma's atomic `increment` operation to be concurrency-safe.*
Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

        id: reminder.id,
      },
      data: {
        retryCount: reminder.retryCount + 1,
Suggestion: The retry counter increment in the error handler also uses retryCount: reminder.retryCount + 1, which can drop increments under concurrent runs of the cron job, so it should likewise use an atomic increment to ensure retries are counted correctly. [race condition]
Severity Level: Major ⚠️
- ⚠️ Retry counts inaccurate when provider errors occur.
- ❌ SMS reminders may exceed intended maximum retry attempts.
- ⚠️ Affects POST /api/cron/workflows/scheduleSMSReminders endpoint.

Suggested change:

    - retryCount: reminder.retryCount + 1,
    + retryCount: {
    +   increment: 1,
    + },
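
Because the `else` branch and the `catch` block need the same bookkeeping, one option is to route both failure paths through a single helper. The following is only a sketch under stated assumptions: `RetryStore`, `InMemoryStore`, `processReminder`, `markScheduled`, and `shouldCleanUp` are hypothetical names, the `send` callback stands in for `twilio.scheduleSMS`, and it is synchronous here purely to keep the example short. In the real handler the increment would presumably be the Prisma update with `retryCount: { increment: 1 }`, and `shouldCleanUp` mirrors the `retryCount > 1` cleanup condition described in this review.

```typescript
// Hypothetical minimal interface; the real handler talks to Prisma instead.
interface RetryStore {
  // Would map to: data: { retryCount: { increment: 1 } }
  incrementRetryCount(id: number): void;
  // Hypothetical success bookkeeping, not taken from the PR.
  markScheduled(id: number, referenceId: string): void;
}

type SendResult = { sid: string } | null;
type Outcome = { status: "scheduled" } | { status: "failed"; reason: string };

class InMemoryStore implements RetryStore {
  retryCounts = new Map<number, number>();
  scheduled = new Map<number, string>();

  incrementRetryCount(id: number): void {
    this.retryCounts.set(id, (this.retryCounts.get(id) ?? 0) + 1);
  }

  markScheduled(id: number, referenceId: string): void {
    this.scheduled.set(id, referenceId);
  }
}

// Counts a failure the same way whether the provider throws or returns nothing.
function processReminder(id: number, send: () => SendResult, store: RetryStore): Outcome {
  let result: SendResult;
  try {
    result = send();
  } catch (error) {
    // Throwing failure, e.g. a network error.
    store.incrementRetryCount(id);
    return { status: "failed", reason: String(error) };
  }
  if (!result) {
    // Non-throwing failure: the provider returned no SID.
    store.incrementRetryCount(id);
    return { status: "failed", reason: "provider returned no SID" };
  }
  store.markScheduled(id, result.sid);
  return { status: "scheduled" };
}

// Mirrors the cleanup condition quoted in the review: delete once retryCount > 1.
function shouldCleanUp(retryCount: number): boolean {
  return retryCount > 1;
}

const store = new InMemoryStore();
const falsyOutcome = processReminder(1, () => null, store);
const thrownOutcome = processReminder(1, () => {
  throw new Error("network error");
}, store);
const okOutcome = processReminder(2, () => ({ sid: "SM-example" }), store);
const reminderOneDue = shouldCleanUp(store.retryCounts.get(1) ?? 0);
```

With one falsy result and one thrown error, reminder 1 accumulates two recorded failures and becomes eligible for cleanup, regardless of which path each failure took.
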
Steps of Reproduction ✅
1. The same cron handler in
`packages/features/ee/workflows/api/scheduleSMSReminders.ts:21-202` iterates over
unscheduled SMS reminders and calls `twilio.scheduleSMS` (lines 158-166); if this call
throws (e.g. network error), execution jumps to the `catch` block at lines 189-198.
2. In the `catch` block, the reminder's `retryCount` is updated via
`prisma.workflowReminder.update` with `data: { retryCount: reminder.retryCount + 1 }` at
`scheduleSMSReminders.ts:190-197`, using the `retryCount` value that was originally read
in the `findMany` at lines 47-60.
3. If two instances of the cron endpoint `/api/cron/workflows/scheduleSMSReminders` run
concurrently and both load the same reminder (each seeing the same `retryCount` value from
`findMany`) and both encounter a thrown error from `twilio.scheduleSMS`, they will each
execute the `catch` block and issue concurrent `update` statements writing `retryCount` to
the same computed value (e.g. both write `1` instead of incrementing from `0` to `2`).
4. Because cleanup at `scheduleSMSReminders.ts:28-44` relies on `retryCount > 1` to stop
retrying and delete reminders, these lost increments from the non-atomic `retryCount:
reminder.retryCount + 1` in the `catch` block can allow additional unintended retry
attempts or leave reminders in the system longer than the intended maximum number of
failures.

Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** packages/features/ee/workflows/api/scheduleSMSReminders.ts
**Line:** 195:195
**Comment:**
*Race Condition: The retry counter increment in the error handler also uses `retryCount: reminder.retryCount + 1`, which can drop increments under concurrent runs of the cron job, so it should likewise use an atomic `increment` to ensure retries are counted correctly.*
Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

CodeAnt AI finished running the review.
User description
Test 9
Summary by CodeRabbit
New Features
Bug Fixes

---
*Replicated from [ai-code-review-evaluation/cal.com-coderabbit#9](https://github.com/ai-code-review-evaluation/cal.com-coderabbit/pull/9)*

CodeAnt-AI Description
Add retry count to SMS reminders and stop retrying after two attempts
What Changed
Impact
- ✅ Fewer duplicate SMS attempts
- ✅ Fewer stuck reminders lingering indefinitely
- ✅ Clearer retry tracking for failed SMS