fix(idempotency): gate side-effecting handlers with Valkey claim to prevent redelivery duplicates#212

Merged
chrisleekr merged 2 commits into main from fix/issue-202
Jun 3, 2026

Conversation

@chrisleekr chrisleekr commented Jun 3, 2026

Owner

Summary

Production webhook handlers bypassed the documented idempotency router, so GitHub redeliveries (same X-GitHub-Delivery header, replayed up to 3 days on auto-retry or operator redelivery) re-ran the full dispatch path, producing duplicate LLM charges, duplicate workflow_runs rows, and duplicate GitHub replies.

This PR introduces a durable Valkey-backed claim at the top of each side-effecting handler. claimDelivery(deliveryId, log) issues a SET <key> 1 NX EX 259200 (3-day TTL matching GitHub's redelivery window) and returns true exactly once per delivery ID. Redeliveries get false and the handler returns immediately. The gate is fail-open: when Valkey is unconfigured, configured-but-disconnected (gated on isValkeyHealthy() so a down connection skips the SET rather than blocking on Bun's default offline queue), or throws, it returns true and processing continues, degrading to at-least-once rather than dropping webhooks. The idx_workflow_runs_inflight partial-unique index remains the durable backstop when the Valkey claim is skipped.
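The claim logic described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in src/webhook/idempotency.ts: the `ValkeyLike` and `Logger` interfaces, the `makeClaimDelivery` factory, and the injected `getClient` / `isValkeyHealthy` callbacks are hypothetical and exist only to keep the block self-contained. The `idemp:webhook:` key prefix is the one quoted in the CodeRabbit walkthrough.

```typescript
// Hypothetical minimal client surface (an assumption, not the project's real client type).
interface ValkeyLike {
  // Resolves to null when NX refuses the write because the key already exists.
  set(key: string, value: string, ...args: string[]): Promise<string | null>;
}

// Hypothetical logger shape, used only for the fail-open warning.
interface Logger {
  warn(fields: Record<string, unknown>, msg: string): void;
}

const CLAIM_TTL_SECONDS = 3 * 24 * 60 * 60; // 259200, the 3-day redelivery window

function makeClaimDelivery(
  getClient: () => ValkeyLike | null,
  isValkeyHealthy: () => boolean,
): (deliveryId: string, log: Logger) => Promise<boolean> {
  return async (deliveryId, log) => {
    const client = getClient();
    // Fail-open: an unconfigured or disconnected Valkey must not drop the webhook.
    if (client === null || !isValkeyHealthy()) return true;
    try {
      const reply = await client.set(
        `idemp:webhook:${deliveryId}`,
        "1",
        "NX",
        "EX",
        String(CLAIM_TTL_SECONDS),
      );
      // A non-null reply means this call created the key, so it owns the delivery.
      return reply !== null;
    } catch (err) {
      // Fail-open on any thrown error as well.
      log.warn({ err, deliveryId }, "claimDelivery failed, processing anyway");
      return true;
    }
  };
}
```

Checking health before issuing the SET is what keeps a down connection from queueing the command: the function answers immediately and the handler proceeds with at-least-once semantics.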

Flow

flowchart TD
    classDef old fill:#c0392b,color:#ffffff
    classDef added fill:#1a5276,color:#ffffff
    classDef gate fill:#1e8449,color:#ffffff
    classDef neutral fill:#ecf0f1,color:#2c3e50

    WHK["Webhook arrives<br/>X-GitHub-Delivery D"]:::neutral
    OWN["Owner allowlist check"]:::neutral
    DSP_OLD["BEFORE: dispatch, LLM call,<br/>GitHub write on every delivery"]:::old
    CLM["claimDelivery<br/>SET D NX EX 259200"]:::gate
    DRP["Redelivery: return early,<br/>no side effects"]:::added
    DSP_NEW["First delivery or fail-open:<br/>dispatch, LLM call, GitHub write"]:::added
    IDX["idx_workflow_runs_inflight<br/>partial-unique index<br/>durable backstop"]:::neutral

    WHK --> OWN
    OWN -.->|before, no gate| DSP_OLD
    OWN --> CLM
    CLM -->|false| DRP
    CLM -->|true| DSP_NEW
    DSP_NEW --> IDX

Changes

  • src/webhook/idempotency.ts (new): claimDelivery(deliveryId, log) wraps isValkeyHealthy() + Valkey SET NX EX 259200. Fail-open on null client, disconnected client, and any thrown error.
  • src/webhook/events/issue-comment.ts, events/review-comment.ts, label branches of events/issues.ts and events/pull-request.ts: if (!(await claimDelivery(deliveryId, log))) return; inserted after the owner-allowlist gate, before any LLM or GitHub write.
  • src/webhook/events/review.ts: intentionally ungated. Added a doc comment explaining the exemption (fires only an idempotent reactor wake, no dispatch or write).
  • test/webhook/idempotency.test.ts (new): 6 unit cases covering claim-once, reject-redelivery, independent IDs, NX + EX 259200 assertion, fail-open on null client, fail-open on throw, and fail-open without issuing SET when configured-but-disconnected.
  • test/webhook/events/issues.test.ts: added a flushMicrotasks drain to push past the new async Valkey gate, preventing the now-deferred dispatch from leaking across tests.
  • CLAUDE.md + docs/build/architecture.md: rewrote the Idempotency section to document the claimDelivery Valkey gate, the review.ts exemption, fail-open semantics, and the idx_workflow_runs_inflight durable backstop. Retired the stale in-memory Map / isAlreadyProcessed description (it lives on the dead router.ts processRequest path that production handlers bypass).
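The placement of the gate inside a handler can be illustrated with a small sketch. Everything named here is hypothetical: `HandlerDeps`, `handleIssueComment`, and the string results exist only to make the ordering observable. The real handlers return nothing and call claimDelivery(deliveryId, log) directly.

```typescript
// Hypothetical dependency bundle so the ordering can be exercised without real services.
interface HandlerDeps {
  isAllowedOwner(owner: string): boolean;
  claimDelivery(deliveryId: string): Promise<boolean>;
  dispatch(deliveryId: string): Promise<void>; // stands in for LLM call + GitHub write
}

type Outcome = "rejected" | "duplicate" | "dispatched";

async function handleIssueComment(
  deps: HandlerDeps,
  owner: string,
  deliveryId: string,
): Promise<Outcome> {
  // 1. Owner allowlist gate runs first, as before this PR.
  if (!deps.isAllowedOwner(owner)) return "rejected";

  // 2. Delivery claim: a redelivery gets false and returns with no side effects.
  if (!(await deps.claimDelivery(deliveryId))) return "duplicate";

  // 3. Only the first delivery (or a fail-open claim) reaches the side effects.
  await deps.dispatch(deliveryId);
  return "dispatched";
}
```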

Out of scope, tracked in issue #211: chat-thread DB re-check, markRedelivery catch rewrite, retiring the dead processRequest Map.

Test plan

  • test/webhook/idempotency.test.ts (6 cases, 100% coverage of claimDelivery)
  • test/webhook/events/issues.test.ts microtask-drain update; all webhook tests pass in isolation (CI runs each file in its own process)
  • typecheck, lint, format, docs-citations, docs-versions, mkdocs strict build all green
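Why the existing tests needed a drain can be seen in a small sketch. It assumes a handler shaped like the fire-and-forget async IIFE mentioned in the review walkthrough. `labelHandler` is a hypothetical stand-in, and this `flushMicrotasks` (a zero-delay timer hop) is one possible implementation, not necessarily the helper added in test/webhook/events/issues.test.ts.

```typescript
// One way to drain pending microtasks: timers fire only after the microtask
// queue is empty, so awaiting a 0 ms timer lets queued promise callbacks run first.
const flushMicrotasks = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

// Hypothetical handler: returns synchronously, does its work in a detached async IIFE.
function labelHandler(
  claimDelivery: () => Promise<boolean>,
  dispatch: () => void,
): void {
  void (async () => {
    // The await defers everything below to a later microtask, even with a mocked claim.
    if (!(await claimDelivery())) return;
    dispatch();
  })();
}

async function demo(): Promise<number[]> {
  let dispatched = 0;
  labelHandler(async () => true, () => {
    dispatched += 1;
  });
  const beforeDrain = dispatched; // dispatch has not run yet
  await flushMicrotasks();
  const afterDrain = dispatched; // dispatch has now run
  return [beforeDrain, afterDrain];
}
```

Without the drain, an assertion placed right after the handler call would observe the pre-dispatch state, and the deferred dispatch could fire during the next test.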

Closes #202

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced webhook delivery processing with improved duplicate handling using fast-path claim verification
    • Database-level safeguards prevent duplicate in-flight processing
    • Graceful fallback ensures webhook processing continues when caching infrastructure is unavailable
  • Documentation

    • Updated architecture and development documentation to reflect current webhook handling semantics and delivery mechanisms

…revent redelivery duplicates

Closes #202

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 3, 2026 09:32
coderabbitai Bot commented Jun 3, 2026


Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9f7eef2f-1460-4871-b2ad-0ecd548404e0

📥 Commits

Reviewing files that changed from the base of the PR and between 2c58ec1 and 8735a97.

📒 Files selected for processing (10)
  • CLAUDE.md
  • docs/build/architecture.md
  • src/webhook/events/issue-comment.ts
  • src/webhook/events/issues.ts
  • src/webhook/events/pull-request.ts
  • src/webhook/events/review-comment.ts
  • src/webhook/events/review.ts
  • src/webhook/idempotency.ts
  • test/webhook/events/issues.test.ts
  • test/webhook/idempotency.test.ts

📝 Walkthrough


This PR implements webhook delivery idempotency using Valkey. A new claimDelivery function gates webhook handlers with a 3-day TTL claim; fail-open behavior preserves processing when Valkey is unavailable. Four production handlers now claim delivery before dispatch, with the review handler intentionally ungated. Tests are updated for async behavior, and documentation clarifies the new idempotency contract.

Changes

Webhook Delivery Idempotency

| Layer | File(s) | Summary |
|---|---|---|
| Core claimDelivery implementation and unit tests | src/webhook/idempotency.ts, test/webhook/idempotency.test.ts | New claimDelivery function performs Valkey SET idemp:webhook:<deliveryId> 1 NX EX 259200 with fail-open behavior when Valkey is unavailable, unconfigured, or unhealthy. Test suite verifies single claim per delivery, independent IDs, exact SET arguments, and three fail-open scenarios. |
| Handler idempotency gating (4 production handlers) | src/webhook/events/issue-comment.ts, src/webhook/events/issues.ts, src/webhook/events/pull-request.ts, src/webhook/events/review-comment.ts | Each handler imports claimDelivery and calls it at async entry before any dispatch work, returning early when delivery is already claimed to prevent redeliveries from executing canonical and legacy routing. |
| Review handler exemption documentation | src/webhook/events/review.ts | Expanded comments clarify that pull_request_review.submitted events intentionally skip claimDelivery and only trigger an idempotent reactor wake, since redeliveries are harmless due to self-deduping. |
| Test async behavior fixes for deferred dispatch | test/webhook/events/issues.test.ts | Added flushMicrotasks() helper to drain microtask queue; applied to bot label and T014 duplicate-label tests to ensure deferred dispatch callbacks complete before assertions, preventing cross-test leakage from fire-and-forget async IIFE that awaits claimDelivery. |
| Documentation updates | CLAUDE.md, docs/build/architecture.md | Updated idempotency contract: Valkey claim at handler entry (fail-open), exemption for review handler, durable idx_workflow_runs_inflight backstop, and clarification that legacy tracking-comment scan runs only on router path. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #211: Follow-up work on the same idempotency flow will need chat-thread re-check, markRedelivery dispatcher behavior revision, and retirement of router.ts in-memory Map that this PR leaves in place.

Possibly related PRs

  • chrisleekr/github-app-playground#61: Both PRs modify webhook handler entry points (issue-comment.ts, review-comment.ts) in the dispatch setup—this PR adds early claimDelivery gating while the other adds trigger-metadata wiring and reactions—so changes overlap at the same code-level entrypoints.

Suggested labels

type: docs 📋

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Linked Issues check | ⚠️ Warning | The PR implements most core requirements from #202 but leaves several unfinished: chat-thread durable recheck, dispatcher UNIQUE-violation handling, and legacy router dedup cleanup remain incomplete. | Complete the remaining objectives: add tracking-comment durable recheck in chat-thread flow, replace dispatcher UNIQUE-violation handling to call markRedelivery, and reconcile or retire legacy router dedup. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a Valkey-backed claim mechanism to gate side-effecting handlers against redelivery duplicates. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing the Valkey-backed idempotency claim for webhook handlers as specified in issue #202; no unrelated changes detected. |


Copilot AI left a comment

Pull request overview

This PR adds a Valkey-backed, delivery-ID–keyed idempotency claim (SET ... NX EX 259200) to prevent GitHub webhook redeliveries from re-running side-effecting webhook handlers (LLM calls, workflow dispatch, and GitHub writes), while remaining fail-open when Valkey is unavailable.

Changes:

  • Introduces claimDelivery(deliveryId, log) and uses it to early-return on redeliveries in the side-effecting webhook handlers.
  • Updates webhook tests to accommodate the new async gate and adds focused unit tests for claimDelivery.
  • Updates docs (CLAUDE.md + architecture docs) to describe the new idempotency mechanism and the review.ts exemption.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
|---|---|
| src/webhook/idempotency.ts | New Valkey-based delivery claim helper for webhook idempotency with fail-open semantics. |
| src/webhook/events/issue-comment.ts | Adds claimDelivery gate before dispatch/side effects on issue comment triggers. |
| src/webhook/events/review-comment.ts | Adds claimDelivery gate before dispatch/side effects on review comment triggers. |
| src/webhook/events/issues.ts | Adds claimDelivery gate before label-trigger dispatch path for issues. |
| src/webhook/events/pull-request.ts | Adds claimDelivery gate before label-trigger dispatch path for pull requests. |
| src/webhook/events/review.ts | Documents intentional exemption from delivery-claim gating (reactor wake only). |
| test/webhook/idempotency.test.ts | New unit tests covering claim-once, TTL args, and fail-open behaviors. |
| test/webhook/events/issues.test.ts | Adds microtask draining helper to stabilize assertions with the new async gate. |
| docs/build/architecture.md | Updates architecture docs to reflect handler-level delivery claiming. |
| CLAUDE.md | Updates project docs describing idempotency behavior and fail-open semantics. |

Comment thread src/webhook/idempotency.ts Outdated
Comment thread test/webhook/idempotency.test.ts Outdated
…from review

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@chrisleekr chrisleekr merged commit 68dacdb into main Jun 3, 2026
22 checks passed
@chrisleekr chrisleekr deleted the fix/issue-202 branch June 3, 2026 10:08
Development

Successfully merging this pull request may close these issues.

fix(idempotency): production webhook handlers bypass router dedup, redeliveries unguarded

2 participants