Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ Single HTTP server (`src/app.ts`) using `octokit` App class. Webhook events arri
## Key Concepts

- **Async processing**: Webhook must respond within 10 seconds. All heavy work runs asynchronously after 200 OK.
- **Idempotency**: Two-layer guard. Fast path: in-memory `Map` keyed by `X-GitHub-Delivery` header (lost on restart). Durable: `isAlreadyProcessed()` checks GitHub for an existing tracking comment, survives pod restarts and OOM kills.
- **Idempotency**: GitHub webhooks are at-least-once, a delivery (auto-retry or operator redelivery) replays with the SAME `X-GitHub-Delivery` header for up to 3 days. The four side-effecting event handlers (`events/issue-comment.ts`, `events/review-comment.ts`, and the label branches of `events/issues.ts` + `events/pull-request.ts`) call `claimDelivery(deliveryId)` (`src/webhook/idempotency.ts`) at the very top of their dispatch path, before any LLM call, `workflow_runs` insert, or GitHub write. `claimDelivery` is a Valkey `SET key 1 NX EX 259200` claim: it returns `true` exactly once per `deliveryId` within the 3-day window (the redelivery gets `false` and the handler returns early). It is **fail-OPEN**, when Valkey is unconfigured or errors it returns `true`, degrading to at-least-once rather than dropping webhooks. `events/review.ts` is intentionally NOT gated: it fires only an idempotent reactor wake (no dispatch/write). Durable backstop behind the best-effort Valkey layer: the `idx_workflow_runs_inflight` partial-unique index, which makes the dispatcher reject a second in-flight run for the same workflow+target even if the Valkey claim was skipped (fail-open). `claimDelivery` also fails open when Valkey is configured-but-disconnected (gated on `isValkeyHealthy()` so a down connection skips the SET rather than blocking on Bun's offline queue). (The legacy in-memory `Map` + `isAlreadyProcessed()` tracking-comment scan lives on the `router.ts` `processRequest` path, which the production handlers bypass, issue #202.)
- **Repo checkout**: Each request clones the repo to a unique temp dir. Claude operates on local files via `cwd`.
- **MCP servers**: Comment updates, inline reviews, and Context7 for library docs. Git changes are made via git CLI (Bash tool) on the cloned repo.
- **Scheduled actions**: a repo may ship a `.github-app.yaml` at its default-branch root declaring prompt-based actions on a cron schedule. The internal scheduler (`src/scheduler/`, gated by `SCHEDULER_ENABLED` + `DATABASE_URL` + non-empty `ALLOWED_OWNERS`) enqueues a `scheduled-action` job, a new job kind on the scoped-job rail, that the daemon runs as one agent session via `src/daemon/scheduled-action-executor.ts`. Missed cron slots are skipped, not backfilled. The prompt is owner-trusted config. Cron parsing uses the `cron-parser` dependency. The bot-provided `merge_readiness` MCP tool is exposed only when `SCHEDULER_ALLOW_AUTO_MERGE` env AND per-action `auto_merge` are both true; `allowed_tools` is owner-trusted config, so an action granted a merge-capable Bash tool can still merge regardless. `resolve.ts` FR-017 is untouched.
Expand Down
4 changes: 2 additions & 2 deletions docs/build/architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ flowchart TD
GH["GitHub webhook<br/>POST /api/github/webhooks"]:::entry
VERIFY["Verify HMAC-SHA256"]:::guard
ACK["200 OK within 10 seconds"]:::ack
ROUTE["Router<br/>idempotency + allowlist + concurrency"]:::guard
ROUTE["Handler dispatch<br/>delivery claim + allowlist + concurrency"]:::guard
TR["Haiku triage<br/>binary heavy classifier"]:::decide
QUEUE["Orchestrator job queue<br/>Valkey list"]:::store
SCALE{{"Scale-up decision<br/>heavy OR queue >= threshold<br/>AND no persistent slots<br/>AND cooldown elapsed"}}:::fork
Expand Down Expand Up @@ -44,7 +44,7 @@ flowchart TD
## Key concepts

- **Async processing.** The webhook handler responds within ten seconds, so the router fires `processRequest` with fire-and-forget semantics after the 200 OK is queued. Every box downstream of `ACK` runs after the HTTP response is on the wire.
- **Two-layer idempotency.** The fast path is an in-memory `Map` keyed by `X-GitHub-Delivery`. The durable path (`isAlreadyProcessed` in `src/core/tracking-comment.ts`) scans GitHub issue/PR comments for the hidden delivery marker the bot embeds in the tracking comment, so duplicate deliveries are detected across pod restarts, OOM kills, and crash loops: this works **without** `DATABASE_URL`. `DATABASE_URL` is required only to persist execution / dispatch history across restarts.
- **Webhook delivery idempotency (issue #202).** GitHub is at-least-once: a delivery (auto-retry or operator redelivery) replays with the same `X-GitHub-Delivery` for up to 3 days. The four side-effecting handlers (`events/issue-comment.ts`, `events/review-comment.ts`, the label branches of `events/issues.ts` + `events/pull-request.ts`) call `claimDelivery(deliveryId)` (`src/webhook/idempotency.ts`) at the top of their dispatch path, before any LLM call, `workflow_runs` insert, or GitHub write. It is a Valkey `SET key 1 NX EX 259200` claim: `true` exactly once per delivery, `false` (and an early return) on a redelivery. It is **fail-open**, when Valkey is unconfigured or disconnected (gated on `isValkeyHealthy()`) it returns `true`, degrading to at-least-once rather than dropping or blocking webhooks. `events/review.ts` is exempt (idempotent reactor wake only). The durable backstop behind the best-effort Valkey layer is the `idx_workflow_runs_inflight` partial-unique index: the dispatcher rejects a second in-flight run for the same workflow+target even when the Valkey claim was skipped. The legacy in-memory `Map` + `isAlreadyProcessed` tracking-comment scan (`src/core/tracking-comment.ts`) now runs only on the `router.ts processRequest` path, which production handlers bypass. `DATABASE_URL` is required to persist execution / dispatch history and the in-flight guard across restarts.
- **One request, one clone.** Each delivery clones the repo into a unique temp directory under `CLONE_BASE_DIR` **on the daemon host**. Claude operates on local files via `cwd`. On PR events the checkout supplementally fetches `origin/<baseBranch>` (when it differs from the head ref) so the agent's `git diff origin/<baseBranch>...HEAD` and `git rebase origin/<baseBranch>` directives resolve first try. A sibling `${workDir}-artifacts` directory is created outside the checkout and exposed to the agent as `BOT_ARTIFACT_DIR`: workflow summary files (IMPLEMENT.md / REVIEW.md / RESOLVE.md) are written there so they can never be picked up by a `git add` inside the clone. Both directories are removed in the pipeline's `finally` block regardless of outcome.
- **GitHub credential resolution.** `src/core/github-token.ts:resolveGithubToken()` is the single source of the GitHub credential the daemon uses. Default is an App installation token minted just-in-time from the cached `App` singleton in `src/orchestrator/connection-handler.ts`. When `GITHUB_PERSONAL_ACCESS_TOKEN` is set, the helper short-circuits and returns the PAT instead, API/git authentication runs as the PAT owner. Commit author/committer metadata is **not** affected; `src/core/checkout.ts` hard-pins git `user.name`/`user.email` to `chrisleekr-bot[bot]` so commit objects still carry the bot identity. The git credential helper, executor `GH_TOKEN`/`GITHUB_TOKEN` env vars, and MCP server env all consume the resolved string without caring about its source.
- **The webhook server never runs the pipeline.** Only daemons execute `runPipeline`. The webhook server is the orchestrator: it enqueues jobs and optionally spawns ephemeral daemons.
Expand Down
6 changes: 6 additions & 0 deletions src/webhook/events/issue-comment.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ import { addReaction } from "../../utils/reactions";
import { dispatchByIntent } from "../../workflows/dispatcher";
import { dispatchCommentSurface } from "../../workflows/ship/command-dispatch";
import { isOwnerAllowed } from "../authorize";
import { claimDelivery } from "../idempotency";

/**
* Handler for issue_comment.created events.
Expand Down Expand Up @@ -84,6 +85,11 @@ export function handleIssueComment(
// only on PR comments. Canonical wins; legacy `dispatchByIntent`
// runs only when canonical produced no command.
void (async (): Promise<void> => {
// Idempotency gate (issue #202): GitHub redelivers with the same
// deliveryId, so a redelivery would re-run both the canonical NL classifier
// and the legacy intent LLM call (and any chat-thread turn). Claim the
// delivery before any dispatch; a redelivery skips. Fail-open in claimDelivery.
if (!(await claimDelivery(deliveryId, log))) return;
const dispatchLog = log.child({ event_surface: eventSurface });
let canonicalHandled = false;
try {
Expand Down
3 changes: 3 additions & 0 deletions src/webhook/events/issues.ts
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ import { dispatchByLabel } from "../../workflows/dispatcher";
import { dispatchCanonicalCommand } from "../../workflows/ship/command-dispatch";
import { routeTrigger } from "../../workflows/ship/trigger-router";
import { isOwnerAllowed } from "../authorize";
import { claimDelivery } from "../idempotency";

// Permits hyphenated verbs (e.g. `bot:open-pr`, `bot:fix-thread`); the verb
// must start with a letter and may contain `-`-separated lowercase segments.
Expand Down Expand Up @@ -97,6 +98,8 @@ export function handleIssues(octokit: Octokit, payload: IssuesEvent, deliveryId:
const installationId = payload.installation?.id;

void (async (): Promise<void> => {
// Idempotency gate (issue #202): skip a redelivery before any dispatch.
if (!(await claimDelivery(deliveryId, log))) return;
if (installationId !== undefined) {
try {
const command = await routeTrigger({
Expand Down
3 changes: 3 additions & 0 deletions src/webhook/events/pull-request.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ import { dispatchCanonicalCommand } from "../../workflows/ship/command-dispatch"
import { fireReactor } from "../../workflows/ship/reactor-bridge";
import { routeTrigger } from "../../workflows/ship/trigger-router";
import { isOwnerAllowed } from "../authorize";
import { claimDelivery } from "../idempotency";

// Permits the documented label shapes:
// bot:ship, bot:abort-ship, bot:fix-thread, bot:investigate, ...
Expand Down Expand Up @@ -179,6 +180,8 @@ function handlePullRequestLabeled(
const installationId = payload.installation?.id;

void (async (): Promise<void> => {
// Idempotency gate (issue #202): skip a redelivery before any dispatch.
if (!(await claimDelivery(deliveryId, log))) return;
if (installationId !== undefined) {
try {
const command = await routeTrigger({
Expand Down
3 changes: 3 additions & 0 deletions src/webhook/events/review-comment.ts
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ import { dispatchByIntent } from "../../workflows/dispatcher";
import { dispatchCommentSurface } from "../../workflows/ship/command-dispatch";
import { fireReactor } from "../../workflows/ship/reactor-bridge";
import { isOwnerAllowed } from "../authorize";
import { claimDelivery } from "../idempotency";

/**
* Handler for pull_request_review_comment.{created,edited,deleted} events.
Expand Down Expand Up @@ -112,6 +113,8 @@ export function handleReviewComment(
const threadId = String(topLevelCommentId);

void (async (): Promise<void> => {
// Idempotency gate (issue #202): skip a redelivery before any LLM dispatch.
if (!(await claimDelivery(deliveryId, log))) return;
const dispatchLog = log.child({ thread_id: threadId, event_surface: "review-comment" });
let canonicalHandled = false;
try {
Expand Down
6 changes: 6 additions & 0 deletions src/webhook/events/review.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,12 @@ import { fireReactor } from "../../workflows/ship/reactor-bridge";
*
* Fires the ship reactor (T024) so any active intent on this PR wakes early
* to inspect the new review state.
*
* No `claimDelivery` idempotency gate (issue #202): unlike the comment/label
* handlers, this fires only an idempotent reactor wake (no LLM dispatch, no
* workflow_runs row, no GitHub write). A redelivery just re-pokes an already-
* awake intent, which is harmless and self-deduping, so the dedup claim would
* add a Valkey round trip with nothing to protect.
*/
export function handleReview(
_octokit: Octokit,
Expand Down
74 changes: 74 additions & 0 deletions src/webhook/idempotency.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
import type { Logger } from "pino";

import { getValkeyClient, isValkeyHealthy } from "../orchestrator/valkey";

/**
* Webhook delivery idempotency (issue #202).
*
* GitHub replays a delivery (automatic retry or operator-driven manual
* redelivery) with the SAME `X-GitHub-Delivery` header for up to 3 days, so
* webhooks are at-least-once. Without a dedup gate, a redelivery re-runs the
* full handler, including the triage/intent LLM calls and any chat-thread turn,
* double-billing and posting duplicate replies. `claimDelivery` is the
* canonical SET-NX-with-TTL idempotency claim, evaluated at the top of each
* handler's dispatch path before any side-effect.
*/

// 3 days, matching GitHub's redelivery window.
const TTL_SECONDS = 259_200;
const KEY_PREFIX = "idemp:webhook:";

/**
* Claim a webhook delivery for processing.
*
* Returns `true` exactly once per `deliveryId` within the TTL window (the first
* caller proceeds); a redelivery gets `false` and must skip.
*
* Fail-OPEN: if Valkey is unavailable or errors, returns `true` so an outage
* degrades to at-least-once processing rather than dropping every webhook. The
* `idx_workflow_runs_inflight` partial-unique index remains the durable backstop
* against duplicate work when this best-effort layer is skipped: the dispatcher
* rejects a second in-flight run for the same workflow+target. (The
* tracking-comment marker scan via `isAlreadyProcessed` is NOT a backstop here:
* it runs only on the legacy `router.ts processRequest` path that production
* handlers bypass, issue #202.)
*/
export async function claimDelivery(deliveryId: string, log: Logger): Promise<boolean> {
const client = getValkeyClient();
// `getValkeyClient()` returns a non-null client even while the TCP connection
// is down (it is null only when VALKEY_URL is unset). Bun's RedisClient
// defaults to `enableOfflineQueue: true`, so issuing SET against a
// disconnected client would QUEUE and block (up to the 10s connectionTimeout)
// instead of failing open. Gate on `isValkeyHealthy()` (the same liveness
// signal `router.ts` dispatch guards use) so a configured-but-down Valkey
// takes the immediate fail-open path, leaving the durable backstops
// (`idx_workflow_runs_inflight` + tracking-comment marker scan) to dedup.
if (client === null || !isValkeyHealthy()) {
log.warn({ deliveryId }, "claimDelivery: Valkey unavailable, proceeding (fail-open)");
return true;
}
try {
// SET key 1 NX EX <ttl>: returns "OK" iff the key did not exist (we won the
// claim); returns null when it already exists (a redelivery).
// `RedisClient.send` is typed `Promise<any>`; SET-NX returns "OK" or null.
const res = (await client.send("SET", [
`${KEY_PREFIX}${deliveryId}`,
"1",
"NX",
"EX",
String(TTL_SECONDS),
])) as string | null;
if (res === "OK") return true;
log.info(
{ deliveryId, event: "dedup-skip" },
"claimDelivery: duplicate webhook delivery, skipping",
);
return false;
} catch (err) {
log.warn(
{ deliveryId, err: err instanceof Error ? err.message : String(err) },
"claimDelivery: Valkey error, proceeding (fail-open)",
);
return true;
}
}
29 changes: 24 additions & 5 deletions test/webhook/events/issues.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,12 @@
* to the dispatcher, which relies on the partial unique index at the
* runs-store layer. The handler is invoked twice → dispatcher is invoked
* twice → second call returns {status:"refused", reason:"in-flight…"}.
*
* Dispatch is fire-and-forget inside a `void (async () => {...})()` IIFE whose
* first statement is `await claimDelivery(...)` (issue #202). That await defers
* the `dispatchByLabel` call past the synchronous test body, so the positive
* cases must drain the microtask queue (`flushMicrotasks`) before asserting,
* otherwise the deferred call also leaks into the next test after `mockClear`.
*/

import type { IssuesEvent } from "@octokit/webhooks-types";
Expand Down Expand Up @@ -47,6 +53,13 @@ const { handleIssues } = await import("../../../src/webhook/events/issues");

const fakeOctokit = {} as unknown as Octokit;

// Drain the microtask queue so the fire-and-forget dispatch IIFE runs past its
// leading `await claimDelivery(...)` gate (#202) and the `await dispatchByLabel`
// call lands before assertions / the next test's `mockClear`.
async function flushMicrotasks(): Promise<void> {
for (let i = 0; i < 5; i++) await Promise.resolve();
}

function issueLabeledPayload(overrides?: {
labelName?: string;
senderLogin?: string;
Expand Down Expand Up @@ -76,8 +89,9 @@ describe("handleIssues", () => {
);
});

it("dispatches for bot:triage on open issue from allowed sender (T012)", () => {
it("dispatches for bot:triage on open issue from allowed sender (T012)", async () => {
handleIssues(fakeOctokit, issueLabeledPayload({ labelName: "bot:triage" }), "delivery-1");
await flushMicrotasks();

expect(mockDispatchByLabel).toHaveBeenCalledTimes(1);
const call = mockDispatchByLabel.mock.calls[0] as unknown as [
Expand Down Expand Up @@ -116,8 +130,13 @@ describe("handleIssues", () => {
});

it("T014: duplicate label events delegate to dispatcher, second invocation is refused by idempotency guard", async () => {
// First call: normal dispatch
// First call: normal dispatch. Drain BEFORE arming the once-impl so this
// dispatch resolves on the default "dispatched" path. Dispatch is deferred
// past the leading `await claimDelivery(...)` gate (#202), so without this
// drain the once-impl below would be consumed by THIS (first) call instead
// of the second one.
handleIssues(fakeOctokit, issueLabeledPayload({ labelName: "bot:triage" }), "delivery-dup-1");
await flushMicrotasks();

// Second call with the same (target, workflow): dispatcher reports the
// partial unique index rejection surfaced by runs-store.
Expand All @@ -130,9 +149,9 @@ describe("handleIssues", () => {
);
handleIssues(fakeOctokit, issueLabeledPayload({ labelName: "bot:triage" }), "delivery-dup-2");

// Let the micro-task queue drain so the fire-and-forget dispatch resolves.
await Promise.resolve();
await Promise.resolve();
// Let the micro-task queue drain so the second fire-and-forget dispatch
// resolves.
await flushMicrotasks();

expect(mockDispatchByLabel).toHaveBeenCalledTimes(2);
// Second call's resolved outcome is the "in-flight" refusal, the handler
Expand Down
Loading
Loading