feat(webhooks): outbound webhooks on issue status change by lml2468 · Pull Request #34 · Mininglamp-OSS/multica

lml2468 · 2026-06-16T01:54:49Z

Summary

Adds GitHub-style outbound webhooks: when an issue's status changes, Multica POSTs a signed JSON payload to subscribed external URLs. The reverse direction of the existing inbound autopilot webhooks.

Workspace-level webhooks (GitHub "org" webhook) — fire for every issue in the workspace.
Project-level webhooks (GitHub "repo" webhook) — fire only for issues in that project.
v1 event: issue.status_changed. Delivery is async fire-and-forget with bounded concurrency, short retry backoff, and HMAC-SHA256 signing (X-Multica-Signature-256, GitHub-compatible).

Backend

Migration 121 + sqlc webhook_subscription table (nullable project_id; cleartext signing secret, consistent with autopilot_trigger).
outwebhook.Dispatcher — subscription selection, event filtering, detached delivery off the event-bus/request path, semaphore-bounded concurrency.
webhooksign — shared HMAC sign/verify; inbound verifier now delegates to it.
CRUD at /api/webhook-subscriptions, owner/admin-gated, with an SSRF guard rejecting loopback/link-local/private/metadata targets.
Fix: batch issue update now publishes prev_status (was omitted).

Frontend

Settings → Webhooks tab (workspace) and a collapsible Webhook section in the project detail right panel (below Resources).
Shared useWebhookSection hook + WebhookDialogs consumed by both surfaces; secret shown once on create.
core types, lenient zod schemas (parseWithFallback), api client, query/mutation hooks; en/ja/ko/zh-Hans i18n.

Testing

Go: outwebhook (selection/filtering/HMAC/retry), webhooksign, handler validation incl. SSRF cases — all pass; go build/vet/gofmt clean.
TS: full pnpm typecheck, pnpm lint (0 errors), webhook component tests + core schema tests, i18n parity.
Manual (Dockerized self-host): verified end-to-end delivery with byte-exact HMAC, SSRF rejection (400), batch prev_status, and both UI surfaces.

Notes / deferred

No delivery-history persistence in v1 (async fire-and-forget) — asymmetric with the inbound webhook_delivery table; revisit if observability/replay is needed.
SSRF guard checks IP literals + localhost; does not resolve DNS (DNS-rebinding out of scope for v1).
Signing secret stored cleartext (matches existing autopilot_trigger.signing_secret).

Add GitHub-style outbound webhooks: when an issue's status changes, Multica POSTs a signed JSON payload to subscribed external URLs. The reverse direction of the existing inbound autopilot webhooks. - Migration 121 + sqlc: webhook_subscription table (nullable project_id distinguishes workspace-level "org" webhooks from project-level "repo" webhooks); cleartext signing secret, consistent with autopilot_trigger. - outwebhook.Dispatcher: subscription selection (workspace- vs project-scoped), event filtering, async fire-and-forget delivery with bounded concurrency (semaphore) and short retry backoff. All work is detached off the event-bus/request path. - webhooksign: shared GitHub-compatible HMAC-SHA256 sign/verify; the inbound verifier now delegates to it so the two can't drift. - webhook_listeners: subscribe to issue:updated (status_changed) and dispatch; SupportedEventTypes is the single source of truth tying the allow-list to the listener. - Handler CRUD at /api/webhook-subscriptions, owner/admin-gated, with SSRF guard rejecting loopback/link-local/private/metadata targets. - Fix batch issue update to publish prev_status (was omitted, delivering empty previous_status for batch transitions). - Dedup: shared generateCredential + secretHint helpers.

Surface outbound webhook management on two shared surfaces: - Workspace level: a new Settings → Webhooks tab. - Project level: a collapsible section in the project detail right panel, directly below Resources. - core: WebhookSubscription types, lenient zod schemas via parseWithFallback, api client methods, and TanStack Query hooks (keyed on wsId + projectId). - views: shared useWebhookSection hook + WebhookDialogs (secret-once + delete-confirm) consumed by both surfaces — list/create/toggle/delete and the signing secret shown once on create, permission-gated to owner/admin. - i18n: en/ja/ko/zh-Hans strings; parity test passes.

mochashanyao

[Octo-Q · automated review]

Verdict: Approve — no blocking findings; notes below (data-flow traced).

PR#34 Review Report — feat(webhooks): outbound webhooks on issue status change

Reviewer: Octo-Q (automated review)
Head SHA: 003d5342bbe1e70dcb9e3fd43bc8f503e6e7cb08
Repo: Mininglamp-OSS/Octo-Q
Routing: complexity=security_sensitive (automated review)
Scope: 38 files, +2777/−33 — migration, dispatcher, CRUD handlers, SSRF guard, HMAC signing, frontend UI, i18n

1. Verification Summary

Area	Status	Evidence
Authorization (CRUD)	✅	All 4 handlers gate on `requireWorkspaceRole(..., "owner", "admin")` — `webhook_subscription.go:185,222,248,280`
Workspace scoping	✅	`loadWebhookSubscription` scopes by `workspace_id` via `GetWebhookSubscriptionInWorkspace` — `webhook_subscription.go:296-316`
Project ownership	✅	Create validates `GetProjectInWorkspace(ID, WorkspaceID)` — `webhook_subscription.go:237-247`
SSRF guard (IP literals)	✅	Rejects loopback, link-local, private, unspecified + `localhost` — `webhook_subscription.go:119-143`
HMAC signing	✅	`webhooksign.Sign/Verify` — constant-time `hmac.Equal`, GitHub-compatible `sha256=<hex>` — `webhooksign.go:27-44`
Secret entropy	✅	32 bytes crypto/rand, `whsec_` prefix, URL-safe base64 — `autopilot_webhook.go:42-48`, `webhook_subscription.go:97`
Async dispatch	✅	`DispatchIssueStatusChanged` → `go d.dispatch(ev)` — never blocks event bus — `dispatcher.go:121-123`
Concurrency bound	✅	Semaphore `chan struct{}` capped at 16 — `dispatcher.go:57,155-158`
prev_status fix (batch)	✅	`prevIssue.Status` published in batch update payload — `issue.go:2919` (matches single-update path at `issue.go:2417`)
Data flow: listener → dispatcher	✅	`webhook_listeners.go` extracts `status_changed`, `issue`, `prev_status` from event payload → `DispatchIssueStatusChanged`
Frontend admin gate	✅	`canManage = role === "owner"
Secret shown once	✅	`SecretOnce` set only in create handler response — `webhook_subscription.go:244`
Migration	✅	Proper FK cascades, partial index on `enabled`, project-scoped index — `121_webhook_subscription.up.sql`
Tests	✅	Handler validation (URL/events/secret), dispatcher (matching/filtering/retry/HMAC), webhooksign roundtrip, frontend component tests

2. Findings

F1 — SSRF via HTTP redirect (P2)

Diff-scope: new (introduced by this PR)

validateWebhookURL (webhook_subscription.go:119-143) checks the initial URL for internal targets (loopback, link-local, private, localhost). However, the delivery HTTP client (dispatcher.go:89: &http.Client{Timeout: deliveryTimeout}) uses Go's default redirect policy — it follows up to 10 redirects (301/302/303/307/308).

Attack path: An admin creates a webhook pointing at https://attacker.example.com/redirect which returns 307 Temporary Redirect → http://169.254.169.254/latest/meta-data/. Go's HTTP client follows the 307, preserving POST method and body. The signed webhook payload (including issue data) is delivered to the cloud metadata endpoint.

Go's client strips Authorization headers on cross-domain redirects but does NOT strip custom headers like X-Octo-Q-Signature-256. The body (full issue data) is also preserved on 307.

Mitigation: Admin-gated creation; DNS-rebinding already documented as v1 OOS. This redirect-based SSRF is the same class of gap.

Recommendation: Add a CheckRedirect policy that either rejects redirects to internal IPs or caps redirect count to 0 for webhook delivery:

client := &http.Client{
    Timeout: deliveryTimeout,
    CheckRedirect: func(req *http.Request, via []*http.Request) error {
        return http.ErrUseLastResponse // don't follow redirects
    },
}

F2 — Dispatch goroutine accumulation under burst (P2)

Diff-scope: new (introduced by this PR)

dispatch() (dispatcher.go:127) runs in a detached goroutine. Inside, it blocks on d.sem <- struct{}{} (line 155) when all 16 delivery slots are full. Each delivery can take up to ~38s (30s timeout × 1 initial + 2 retries with 1s+4s backoff).

Under a burst of N status changes with M matching subscriptions: N dispatch goroutines accumulate, each holding the marshaled payload (~size of IssueResponse). If N×M >> 16, dispatch goroutines queue on the semaphore.

Practical impact: Low for v1 workloads. With 16 slots and ~38s worst-case per delivery, throughput is ~25 deliveries/minute. A burst of 100 status changes × 5 webhooks = 500 deliveries would take ~20 minutes to drain, with 500 goroutines queued.

Recommendation: Consider a bounded work queue (e.g., discard events when queue depth exceeds a threshold) or a context timeout on semaphore acquisition.

F3 — Signing secret stored cleartext (P2 — pre-existing pattern)

Diff-scope: pre-existing (matches autopilot_trigger.signing_secret from migration 093)

secret TEXT NOT NULL in webhook_subscription stores the HMAC key in cleartext. The PR body and migration comment acknowledge this and reference MUL-2334 for a future secrets-at-rest infrastructure.

Recommendation: No action needed for v1 — consistent with existing pattern. Track in MUL-2334.

F4 — Full IssueResponse in outbound payload (P2 — design note)

Diff-scope: new (introduced by this PR)

outboundPayload.Issue (dispatcher.go:108) serializes the complete handler.IssueResponse into the webhook body. This includes all issue fields (description, internal metadata, assignee details, etc.).

This is by design (webhook subscribers need issue context), but workspace admins should be aware that all issue data is sent to external endpoints.

Recommendation: Consider documenting the payload schema for operators, or offering a field-level filter in v2.

3. Suggestions

Redirect policy (addresses F1): Set CheckRedirect to reject or cap redirects in the delivery HTTP client. This closes the redirect-based SSRF vector without requiring DNS resolution.
Bounded dispatch queue (addresses F2): Replace the bare semaphore with a buffered channel + discard-on-full pattern, or add a context.WithTimeout on semaphore acquisition to prevent unbounded goroutine accumulation.
Payload schema documentation (addresses F4): Document the outbound payload shape in the PR body or API docs so operators know what data is sent to subscribers.

4. Additional Findings

No delivery-history persistence: v1 is fire-and-forget with no delivery log (asymmetric with inbound webhook_delivery table). Acknowledged in PR body. Not blocking — observability can be added in v2.
No secret rotation via API: UpdateWebhookSubscription cannot rotate the signing secret. Operators must delete + recreate. Acceptable for v1.
ListWebhookSubscriptionsByProject doesn't verify project ownership: The query filters by (workspace_id, project_id) which is safe — returns empty if project doesn't belong to workspace. No information leakage.

5. Data-Flow Trace

Consumed Data	Upstream Source	Flows Correctly?
`payload["status_changed"]`	`BatchUpdateIssues` / `UpdateIssue`: `prevIssue.Status != issue.Status` → `issue.go:2912,2408`	✅ — boolean, set when status differs
`payload["issue"]` (IssueResponse)	`issueToResponse(issue, prefix)` from DB-updated row	✅ — full post-update issue
`payload["prev_status"]`	`prevIssue.Status` — loaded before update via `GetIssueForUpdate`	✅ — batch path fixed by this PR (`issue.go:2919`), single path already had it (`issue.go:2417`)
`e.WorkspaceID`	`events.Event.WorkspaceID` set by `h.publish()`	✅ — scoped to request workspace
`e.ActorType / e.ActorID`	`h.resolveActor(r, userID, workspaceID)` — resolves real actor	✅ — not observer, actual trigger
`issue.ProjectID`	`IssueResponse.ProjectID` (nullable `*string`)	✅ — listener handles nil (`webhook_listeners.go:42-44`)
`sub.Secret` (for signing)	`webhook_subscription.secret` column, loaded by `ListEnabledWebhookSubscriptionsForDispatch`	✅ — cleartext from DB, used directly for HMAC
`sub.Events` (for matching)	`webhook_subscription.events` JSONB column, unmarshaled in `subscribedToEvent`	✅ — malformed JSON handled gracefully (skip, not panic)

6. Blind-Point Checklist (R5 — security_sensitive)

C1 — Dual-path parity:

Subscribe ↔ unsubscribe: N/A — webhooks are independent CRUD operations, no paired subscribe/unsubscribe paths.
Create-gate ↔ execute-path: ✅ — Create is admin-gated; dispatch reads from DB (no re-authorization needed for system-driven delivery).
Parallel implementations: ✅ — No duplicate webhook dispatch paths.

C2 — Control-flow ordering / nesting reuse:

webhooksign.Sign/Verify shared by inbound + outbound: ✅ — single implementation, no double-signing or ordering issue.
generateCredential shared by webhook tokens + signing secrets: ✅ — pure function, no side effects.
SSRF guard non-canonical forms: ✅ — url.Parse handles encoded hosts; net.ParseIP handles IPv4/IPv6; strings.EqualFold handles LOCALHOST/Localhost. No bypass via case or encoding.

C3 — Authorization boundary ≠ capability boundary:

✅ — All /api/webhook-subscriptions endpoints are behind authenticated session + requireWorkspaceRole("owner", "admin"). No unauthenticated or member-level access path.
✅ — loadWebhookSubscription scopes by workspace_id, preventing cross-workspace access via UUID enumeration.
✅ — Dispatch runs system-internally (no user-facing endpoint), triggered only by event bus.

Verdict

No P0 or P1 findings. All identified issues are P2 (SSRF redirect gap, burst goroutine accumulation, cleartext secret, payload exposure) — none make a working path unavailable, produce user-visible incorrect data, or make the system worse than before this PR.

The authorization model is sound (admin-gated, workspace-scoped), the signing implementation is correct (constant-time HMAC), the data flow is properly traced from issue update through event bus to delivery, and the SSRF guard covers the primary attack vectors (IP literals + localhost).

[Octo-Q] verdict: APPROVE — P2 suggestions are non-blocking improvements for v2 consideration.

Jerry-Xin

In scope for Multica, but the outbound webhook SSRF guard is bypassable at delivery time.

🔴 Blocking

🔴 Critical: The SSRF protection only validates the originally configured URL, but delivery uses a default http.Client, which follows redirects and performs unrestricted DNS resolution. A subscriber can configure an allowed public URL that returns a redirect to http://127.0.0.1, http://169.254.169.254, or another private target, and d.client.Do(req) will follow it. Likewise, a hostname can resolve or later rebind to a private IP after creation. This defeats the stated guard for loopback/link-local/private/metadata targets. Fix this in the delivery path, not only create/update validation: disable redirects or validate every redirect, and use a custom Transport.DialContext that rejects resolved private/link-local/loopback/unspecified IPs for every connection. References: webhook_subscription.go, dispatcher.go, dispatcher.go.

💬 Non-blocking

🟡 Warning: Delivery concurrency is bounded, but event goroutines are still unbounded under load. DispatchIssueStatusChanged starts one goroutine per event, and each dispatch goroutine can block on d.sem <- struct{}{} while prior deliveries sleep/retry. A large batch or burst can therefore accumulate many blocked dispatch goroutines despite the 16-delivery cap. Consider a bounded worker queue or non-blocking drop/log policy for fire-and-forget v1. Reference: dispatcher.go, dispatcher.go.
🔵 Suggestion: Add handler-level CRUD/RBAC tests for the new API routes. The pure URL/event tests are useful, but they do not exercise owner/admin enforcement, project-in-workspace validation, or the “secret returned only once” API behavior. Reference: webhook_subscription_test.go.

✅ Highlights

The feature fits the repository and existing issue/workspace model.
The event listener correctly filters issue:updated to status changes and includes prev_status.
The shared HMAC signing package is a good consolidation of inbound verification and outbound signing.
Targeted verification passed: go test ./internal/handler ./internal/integrations/outwebhook ./internal/webhooksign and go test ./cmd/server -run 'TestNonExistent'.

yujiawei

Code Review — PR #34 (multica)

Reviewed against head 003d5342bbe1e70dcb9e3fd43bc8f503e6e7cb08 (merge-base ac1ba3b). This is a security-sensitive change (outbound webhooks: SSRF surface, HMAC signing, secret handling, detached delivery), so it got an extra-careful pass plus an independent second-opinion engine.

Verdict: Changes requested — one blocking SSRF gap. Everything else is non-blocking. The crypto, multi-tenancy scoping, and the entire frontend contract layer are solid.

Blocking

B1 — SSRF: the create-time URL guard is bypassed by HTTP redirects

server/internal/integrations/outwebhook/dispatcher.go:80

client: &http.Client{Timeout: deliveryTimeout},

The delivery client has no CheckRedirect policy, so it follows 3xx redirects with Go's default behavior (up to 10 hops). validateWebhookURL (server/internal/handler/webhook_subscription.go:108) runs only at create/update and only on the literal URL the operator typed. A subscriber registered at an allowed public URL can simply respond 302 Location: http://169.254.169.254/... (cloud metadata) or http://10.x / 127.0.0.1 (internal subnet), and d.post will follow it and POST the issue payload there. The SSRF guard the PR deliberately built is silently defeated.

The package comment at webhook_subscription.go:100-107 states the intent precisely: "an admin shouldn't be able to turn the server into a probe of its own metadata service or internal subnet." Redirect-following breaks exactly that property. It is blind SSRF (the response body is drained to io.Discard, so it's not reflected to the caller), and creating a subscription is owner/admin-gated — but in a multi-tenant deployment a workspace admin is a customer, not a server operator, so a customer reaching host infrastructure is a real cross-boundary issue.

Fix (closes the whole class, not just redirects): validate at the transport/dial layer so every connection — initial and every redirect hop, after DNS resolution — is checked against isInternalIP. A custom http.Transport with a DialContext that rejects internal resolved IPs is the robust form. The minimal stopgap is a CheckRedirect that re-runs the host check on each hop (or refuses redirects via http.ErrUseLastResponse), but that alone does not address B2.

Closely related (same root, fold into the same fix)

DNS-resolves-to-internal / DNS rebinding — webhook_subscription.go:104-107 documents that DNS is not resolved and rebinding is "out of scope for v1." That deferral is a defensible call on its own, but the create-time-only validation model is what makes both this and B1 possible: nothing re-checks the destination at delivery time. A transport-level DialContext guard that inspects the resolved IP on every dial closes B1 and the rebinding gap in one change, and turns the "out of scope" note into "actually mitigated." Strongly recommended to do both together rather than ship the redirect fix alone.

Non-blocking (P2 — address now or fast-follow, your call)

Subscriber URL logged on failure — dispatcher.go:195,202. Delivery failures log "url", sub.Url at WARN/ERROR. Subscriber webhook URLs frequently carry tokens in the path/query; this writes customer secrets to server logs. Log subscription ID + parsed host only, or a redacted URL.
PATCH {"events":[]} silently resets to the default — webhook_subscription.go:142-145 + :312-318. Omitting events keeps the old value (COALESCE), but sending an explicit empty array runs through validateWebhookEvents, which defaults [] to ["issue.status_changed"]. Omitted vs explicit-empty diverge, and a client can't intentionally clear events. Harmless in v1 (single event type), but it's latent design debt the moment a second event ships. Cleanest fix: drop the empty-default on the update path (default only on create), or reject empty arrays.
Dispatch goroutine fan-out is unbounded per event — dispatcher.go:120,167. maxConcurrentDeliveries bounds in-flight POSTs, but each event still spawns a dispatch() goroutine that does the DB list + marshal before hitting the semaphore. A burst of status changes can pile up dispatch() goroutines blocked on d.sem. Bounded for normal load; consider a worker queue if status-change storms are plausible.
No composite FK enforcing project_id ∈ workspace_id — migrations/121_webhook_subscription.up.sql. The handler validates this via GetProjectInWorkspace, so it's defense-in-depth only, but a (workspace_id, project_id) composite FK would make the invariant a database guarantee.
Authorization runs after input validation on create — webhook_subscription.go:223-235. URL/events validation happens before requireWorkspaceRole. Minor info-leak/fingerprinting surface; move the role check up to right after body decode.
Optional hardening: delivery goroutines have no recover() (dispatcher.go:120,168) — verified there are no currently-reachable panic vectors, so this is defensive-only, not a bug. In-flight deliveries use context.Background() and aren't cancelled on shutdown — acceptable for fire-and-forget. A constant-time-from-the-start tweak to webhooksign.Verify (compute HMAC unconditionally) closes a negligible format-vs-content timing distinction.

What's solid (verified, not assumed)

HMAC — webhooksign.go signs the exact marshaled body once and reuses it across retries; Verify uses hmac.Equal (constant-time compare). Extracting the shared package and routing inbound verifyHubSignature through it is a good de-dup that prevents inbound/outbound drift.
Secret handling — full secret returned once on create (SecretOnce), never echoed by list/get/update; only the 4-char secretHint is exposed. Not logged in the handler. Cleartext-at-rest is a documented, pre-existing pattern (mirrors autopilot_trigger, tracked as MUL-2334).
prev_status batch fix — issue.go:2919 correctly brings the batch path in line with the single-update path; both now publish prev_status, which the listener depends on. Without it, batch status changes would have fired webhooks with an empty previous_status.
Multi-tenancy — every read/write is scoped by workspace_id; updates/deletes resolve through the loader and use row.ID/row.WorkspaceID, consistent with the repo's UUID-parsing convention. Project-scope create validates project membership.
Frontend — the new API client methods all run responses through parseWithFallback + zod with explicit fallbacks (no bare as), there's a malformed-response schema test (missing/null/wrong-type cases), the create-once secret stays in React state only and is rendered as text (no XSS), i18n keys are at parity across en/ja/ko/zh-Hans, and no package-boundary violations. This fully satisfies the repo's API Response Compatibility rules.

Coverage / what I did not verify

Static review only — I did not run make test / the Go + TS suites against this checkout.
TLS certificate validation on delivery, and any reverse-proxy / egress filtering in front of the server (infra-level defense-in-depth), were not assessed.
An independent long-context second leg was attempted but timed out and is absent from this review — so cross-family corroboration here is single-engine plus this reviewer's own analysis, not two. The blocking finding (B1) was independently confirmed by direct code inspection and an adversarial verification pass, so confidence in it is high regardless.

…dening Blocking (SSRF bypass at delivery time): - New internal/netguard package: NewRestrictedHTTPClient resolves the host and refuses to dial loopback/link-local/private/unspecified addresses on EVERY dial — including each redirect hop and across DNS rebinding — then dials the exact verified IP (closes the resolve→dial TOCTOU). The dispatcher uses it, so a subscriber can no longer register a public URL that 302s to 169.254.169.254 / 127.0.0.1 / a private host, nor use a hostname that resolves to internal. - validateWebhookURL keeps the create-time literal/localhost fast-reject (now via netguard.IsBlockedIP); the transport is the authoritative enforcement. Hardening (non-blocking review items): - Bounded delivery: replace the per-event goroutine + blocking semaphore with a fixed worker pool draining a buffered queue; enqueue is non-blocking (drop + log when full) so dispatch goroutines can't pile up under a status-change storm. Workers recover() so one bad delivery can't kill a worker. - Redact logs: log subscription_id + host only, never the full subscriber URL (URLs often carry tokens). - events: validateWebhookEvents now rejects an empty list; create applies the default, update does not — so PATCH {"events":[]} is a 400 instead of a silent reset to the default. - Create authorizes (requireWorkspaceRole) before validating input. Tests: - netguard: IsBlockedIP table + restricted client refuses loopback / redirect-to- loopback. - dispatcher: production constructor blocks internal delivery; delivery-mechanism tests inject a permissive client. - handler: RBAC 403 for non-admin, secret-shown-once, project-must-be-in- workspace (404), PATCH empty-events (400).

lml2468 · 2026-06-16T03:42:05Z

Thanks for the careful review. Pushed 1eda2ae addressing the blocking SSRF finding and the non-blocking items.

🔴 Blocking — SSRF bypass at delivery time (fixed at the transport layer)

The create-time URL check was insufficient because the delivery client followed redirects and did unrestricted DNS. Fixed where it belongs — the delivery transport:

New internal/netguard package. NewRestrictedHTTPClient installs a DialContext that resolves the host, rejects if any resolved IP is loopback/link-local/private/unspecified, and then dials the exact verified IP (closes the resolve→dial TOCTOU). Because every redirect hop re-dials through this transport, redirect-based SSRF and DNS-rebinding / hostname-resolves-to-internal are all closed in one place.
The dispatcher now uses this client; validateWebhookURL keeps the create-time literal/localhost fast-reject (via netguard.IsBlockedIP) purely as a UX nicety.
Tests: netguard refuses loopback + redirect-to-loopback; a dispatcher test proves the production constructor blocks delivery to a loopback endpoint.

This directly addresses @Jerry-Xin's 🔴 and @yujiawei's B1 (incl. the folded-in DNS-rebinding note).

💬 Non-blocking — also addressed

Subscriber URL in logs (@yujiawei) → now log subscription_id + host only, never the full URL.
PATCH {"events":[]} silently resets to default (@yujiawei) → validateWebhookEvents now rejects empty; create applies the default, update returns 400.
Unbounded dispatch goroutines (@Jerry-Xin / Octo-Q F2) → replaced the per-event goroutine + blocking semaphore with a fixed worker pool + bounded queue; enqueue is non-blocking (drop+log on full) so dispatch goroutines can't accumulate. Workers recover().
Handler RBAC/behavior tests (@Jerry-Xin) → added: non-admin → 403, secret-shown-once, project-must-be-in-workspace → 404, empty-events → 400.
Auth before input validation (@yujiawei) → requireWorkspaceRole now runs immediately after body decode.

Deliberately deferred (with rationale)

Composite FK project_id ∈ workspace_id: project has no UNIQUE(id, workspace_id), so this needs altering a core table for enforcement the handler already does via GetProjectInWorkspace. Defense-in-depth only — left out.
Cleartext secret at rest: pre-existing pattern (autopilot_trigger.signing_secret), tracked in MUL-2334.
webhooksign.Verify unconditional HMAC: hmac.Equal is already constant-time; the format-vs-content timing distinction is negligible (per @yujiawei).

Verification: go build/vet/gofmt clean; full go test ./internal/handler ./internal/integrations/outwebhook ./internal/webhooksign ./internal/netguard ./cmd/server green against a local Postgres (incl. the 4 new RBAC tests).

mochashanyao

[Octo-Q · automated review]

Verdict: Approve — no blocking findings; notes below (data-flow traced).

PR#34 Review Report — feat(webhooks): outbound webhooks on issue status change

Reviewer: Octo-Q (automated review)
Head SHA: 1eda2ae1619217ebd184ee668058afbaa25ef6c3
Repo: Mininglamp-OSS/Octo-Q
Routing: complexity=security_sensitive (automated review)

1. Verification Summary

Area	Status	Evidence
SSRF guard (netguard)	✅	`server/internal/netguard/netguard.go:22-28` — blocks loopback, link-local (incl. 169.254.169.254), private, unspecified. Dial-time enforcement with resolve→check→dial-exact-IP pattern closes TOCTOU and DNS rebinding.
SSRF redirect-hop coverage	✅	`netguard.go:38-42` — transport uses restricted dialer; Go default redirect policy re-dials through same transport, so each hop is re-checked.
HMAC signing (webhooksign)	✅	`server/internal/webhooksign/webhooksign.go:28-42` — HMAC-SHA256 + `hmac.Equal` constant-time comparison. Shared package for inbound verify + outbound sign prevents drift.
CRUD authorization	✅	`handler/webhook_subscription.go` — all 4 handlers call `requireWorkspaceRole(..., "owner", "admin")`. Create checks auth BEFORE validation (prevents probing). `loadWebhookSubscription` scopes by `(id, workspace_id)`.
Project scope validation	✅	`webhook_subscription.go:228-240` — CreateWebhookSubscription verifies project belongs to workspace via `GetProjectInWorkspace`.
Secret management	✅	`generateWebhookSecret()` → `generateCredential("whsec_")` → 32 bytes crypto/rand, URL-safe base64 (256 bits). Secret returned ONLY in create response (`SecretOnce`), never echoed on list/get.
Dispatcher async isolation	✅	`dispatcher.go:148-150` — `DispatchIssueStatusChanged` spawns goroutine, returns immediately. Never blocks the event bus or HTTP request path.
Worker pool bounding	✅	`dispatcher.go:56-60` — 16 fixed workers, 1024 queue capacity, non-blocking enqueue (drops on full with log).
Retry logic	✅	`dispatcher.go:186-207` — 3 attempts, retries on network error + 5xx, stops on 4xx. Backoff: 1s, 4s.
Batch prev_status fix	✅	`handler/issue.go:2919` — batch update now publishes `prev_status: prevIssue.Status`, matching the single-update path at line 2417.
Migration 121	✅	Proper FK with ON DELETE CASCADE on both workspace_id and project_id. Partial index on `enabled` for dispatch hot path.
Frontend RBAC	✅	`use-webhook-section.ts:63-64` — `canManage` gated on owner/admin. Queries gated on `canManage`.
Schema leniency	✅	`schemas.ts` — `.loose()` on zod schemas, `parseWithFallback` for graceful degradation.

2. Findings

No P0 or P1 issues found.

P2 — PR description understates SSRF protection (DNS rebinding)

Diff-scope: pre-existing (in the PR body text, not in code).

The PR body states: "SSRF guard checks IP literals + localhost; does not resolve DNS (DNS-rebinding out of scope for v1)."

However, the actual netguard.restrictedDialContext implementation (netguard.go:55-72) does resolve DNS (net.DefaultResolver.LookupIPAddr), checks ALL resolved IPs, and dials the exact verified IP — which does defend against DNS rebinding. The code comment at netguard.go:5-10 correctly describes this. The PR body should be corrected to match the implementation.

Impact: Documentation-only. Could mislead future security reviewers into thinking DNS rebinding is unmitigated when it actually is.

P2 — Dispatcher worker pool has no graceful shutdown

Diff-scope: new (introduced by this PR).

outwebhook.New() starts 16 worker goroutines (dispatcher.go:107-109) that run for job := range d.jobs with no stop signal. The Dispatcher has no Close()/Stop() method. On server shutdown, in-flight deliveries are abandoned and workers are killed with the process.

Impact: Consistent with the rest of the server (no graceful shutdown pattern). Acceptable for v1 fire-and-forget. Consider adding a context.Context-based shutdown for v2 if delivery persistence is added.

P2 — Silent delivery drop on queue-full

Diff-scope: new (introduced by this PR).

dispatcher.go:175-179 — when the 1024-slot delivery queue is full, new deliveries are dropped with only a slog.Warn. No metric/counter is incremented.

Impact: Under extreme load, operators have no alerting signal beyond log scraping. Consider adding a Prometheus counter or similar observable for dropped deliveries.

3. Suggestions

Correct PR body DNS-rebinding claim to match the actual implementation (it's better than stated).
Future: Add a Close() method to Dispatcher that cancels workers and drains the queue, especially before adding delivery-history persistence.
Future: Add a counter metric for queue-full drops so operators can alert on delivery loss.

4. Additional Observations

Signing secret at rest: Stored cleartext, matching existing autopilot_trigger.signing_secret convention. Acknowledged in PR notes. No general secrets-at-rest infra exists yet (MUL-2334). Not a regression.
No delivery-history persistence: v1 is fire-and-forget with log-only outcomes. Asymmetric with inbound webhook_delivery table. Acknowledged and deferred.
No secret rotation endpoint: Users must delete + recreate to rotate. Acceptable for v1.
generateCredential refactor: Clean extraction — generateWebhookToken and generateWebhookSecret now share one entropy source. Good.
signingSecretHint → secretHint dedup: Correctly eliminates the duplicate between autopilot and webhook handlers.

5. Data-Flow Tracing

Path A: Batch issue update → outbound webhook delivery

BatchUpdateIssues (issue.go:2919) publishes EventIssueUpdated with status_changed: true + prev_status: prevIssue.Status ← this PR fixes the missing prev_status
registerWebhookListeners (webhook_listeners.go:24-26) checks payload["status_changed"].(bool) → true
Extracts payload["issue"].(handler.IssueResponse) (line 29) and payload["prev_status"].(string) (line 36)
Calls d.DispatchIssueStatusChanged(...) (line 43-50) with typed struct
DispatchIssueStatusChanged (dispatcher.go:148) spawns goroutine → dispatch()
dispatch() queries ListEnabledWebhookSubscriptionsForDispatch(ctx, wsUUID) (line 160) — SQL: WHERE workspace_id = $1 AND enabled = true
Filters by subscriptionMatches(sub, ev.ProjectID) (workspace-level matches all; project-level matches own project) AND subscribedToEvent(sub, EventIssueStatusChanged) (line 166-168)
Marshals outboundPayload with event, workspace_id, actor, issue, previous_status, delivered_at (line 173-180)
Enqueues deliverJob per matched subscription (line 184-189)
Worker picks up → deliver() signs with webhooksign.Sign(sub.Secret, body) → post() via SSRF-restricted HTTP client

Verdict: Data flows correctly from issue update through event bus to signed HTTP delivery. The prev_status fix closes the batch-update gap.

Path B: Webhook subscription CRUD

POST /api/webhook-subscriptions → auth middleware → CreateWebhookSubscription
requireWorkspaceRole(owner/admin) → validateWebhookURL (SSRF fast-check) → validateWebhookEvents (allow-list) → verify project in workspace → generateWebhookSecret() → DB insert → response with SecretOnce
Subsequent GET/list: webhookSubscriptionToResponse() never populates SecretOnce (omitempty hides empty string)

Verdict: Secret lifecycle is correct — generated, returned once, never re-exposed.

6. R5 Blind-Spot Checklist (security_sensitive)

C1 — Dual-path parity: CLEAR

Create ↔ Delete: Both gated on owner/admin, both scoped by workspace_id. ✅
Create ↔ Update: Both gated, both validate URL/events. Update cannot change project_id (by design). ✅
List (workspace) ↔ List (project): Both gated, both scoped. ✅
Single issue update ↔ Batch issue update: Both now publish prev_status. This PR fixes the batch path parity. ✅

C2 — Control-flow ordering / nested reuse: CLEAR

webhooksign.Sign and webhooksign.Verify are leaf functions with no nested call chains. ✅
netguard.restrictedDialContext is called once per connection (including redirect hops). No double-application risk. ✅
validateWebhookURL is called at create-time and update-time (when URL changes). No ordering dependency with other validators. ✅
No security control is applied in a "check-then-act" pattern that could be bypassed by reordering. ✅

C3 — Authorization boundary ≠ capability boundary: CLEAR

All CRUD endpoints are behind auth middleware + workspace role check. ✅
The dispatcher runs server-side only — no user-facing endpoint to trigger delivery directly. ✅
loadWebhookSubscription uses (id, workspace_id) composite lookup — cross-workspace access impossible. ✅
The webhook listener on the event bus is registered server-side — no external caller can inject events. ✅

[Octo-Q] verdict: APPROVE

No P0 or P1 issues found. The security architecture is well-designed: SSRF guard at dial-time with DNS rebinding defense, HMAC signing in a shared package, owner/admin-gated CRUD with workspace scoping, and async delivery that never blocks the request path. Three P2 observations (PR body documentation mismatch, no graceful shutdown, silent queue-full drops) are non-blocking.

Jerry-Xin

Summary: The PR is relevant to Multica, but the outbound webhook SSRF protection has a proxy bypass that must be fixed before merge.

🔴 Blocking

🔴 Critical: server/internal/netguard/netguard.go:38-40 uses http.ProxyFromEnvironment in the “restricted” webhook client. When HTTP_PROXY/HTTPS_PROXY is set, Go dials the proxy and the custom DialContext validates only the proxy address, not the user-supplied webhook target. That bypasses the DNS/internal-IP guard for hostnames accepted by validateWebhookURL at server/internal/handler/webhook_subscription.go:121-126, letting a webhook target that resolves to private/link-local infrastructure be reached through the proxy. For this security boundary, either disable proxies for webhook delivery (Proxy: nil) or add target-host validation that runs independently for every request and redirect before proxy use. Add a regression test with a proxy configured and an internal-resolving target.

💬 Non-blocking

🟡 Warning: server/internal/integrations/outwebhook/dispatcher.go:160-175 starts an unbounded goroutine per status-change event, and each goroutine can hold a DB query for up to 10 seconds. The worker pool bounds HTTP delivery, but not subscription lookup pressure. Consider routing dispatch lookup through a bounded queue or semaphore if status changes can spike.

✅ Highlights

The PR keeps subscription CRUD owner/admin-gated and scopes writes by workspace_id.
HMAC signing is centralized in webhooksign, reducing inbound/outbound drift.
Targeted Go tests for netguard, outwebhook, and webhooksign pass locally.

Superseded by re-review on new head 1eda2ae (review #4503139076). Original SSRF redirect/transport blocker is fixed via netguard dial-time IP guard; new CR is for a remaining proxy-path bypass.

yujiawei

Code Review — PR #34 (multica)

Reviewed against head 1eda2ae1619217ebd184ee668058afbaa25ef6c3 (merge-base ac1ba3b). This is a security-sensitive change (outbound webhooks: SSRF surface, HMAC signing, secret handling, detached delivery), so it received an extra-careful pass plus an independent second-opinion engine and a fresh adversarial verification of the SSRF guard at this head.

This is a re-review. The prior round's blocking finding — the create-time-only SSRF guard being bypassed by HTTP redirects / DNS rebinding — is now fixed by the new netguard package: the delivery client resolves the host, rejects any blocked resolved IP, and dials the exact verified IP on every hop (initial + each redirect). That closes the redirect and rebinding gaps cleanly, and netguard_test.go proves it. Nice work.

Verdict: Changes requested. One residual SSRF bypass in that same new guard (proxy-enabled deployments) keeps it from being airtight. Everything else below is non-blocking. The crypto, multi-tenancy scoping, RBAC, and the entire frontend contract layer are solid and independently verified.

Blocking

B1 — SSRF guard is bypassed when an egress proxy is configured

server/internal/netguard/netguard.go:39

transport := &http.Transport{
    Proxy:                 http.ProxyFromEnvironment,
    DialContext:           restrictedDialContext(dialer),
    ...
}

The whole security guarantee of this package is enforced in restrictedDialContext — it resolves the destination host and refuses to dial a blocked internal IP. But with Proxy: http.ProxyFromEnvironment, when HTTP_PROXY / HTTPS_PROXY / ALL_PROXY is set in the server's environment, Go dials the proxy address, not the target. restrictedDialContext then only validates the proxy's IP; the user-configured webhook host is sent to the proxy as a CONNECT/absolute-URI target and never passes through IsBlockedIP. A subscriber can register http://169.254.169.254/... (cloud metadata) or an internal host, and the proxy will happily fetch it — the dial-time guard the package promises is silently a no-op for the destination.

This is narrower than the redirect bug that was just fixed (it requires a proxy to be configured), but it matters here specifically because:

The package docstring states the guarantee is enforced "at dial time … on every dial," which is not true under a proxy.
The PR's own testing context is a Dockerized self-host deployment, which is exactly where an egress/forward proxy is commonly present.
It re-opens the precise SSRF class (loopback / link-local / metadata / private) the prior round blocked on.
netguard is used only by this webhook delivery path, so disabling the proxy here has no collateral impact.

Fix (one line): for the restricted client, set Proxy: nil so connections always go direct and always flow through restrictedDialContext. If proxied egress is a real requirement for this deployment, the host check must instead run against the target host before delegating to the proxy (and that path needs its own test). Given v1 scope, Proxy: nil is the clean choice. A regression test that sets HTTP_PROXY and asserts an internal target is still refused would lock this down.

Non-blocking (P2 — address now or fast-follow, your call)

One-time signing secret can leak into logs on schema drift — packages/core/api/client.ts:2108-2120 + packages/core/api/schema.ts:46-54. The create response carries the full signing secret (returned once). It is parsed via parseWithFallback, which on any validation failure logs received: data — i.e. the entire response body, including the secret — to the schema logger / telemetry. A future shape drift (e.g. a malformed created_at) would write the live secret to logs, and the user would also silently get the empty-fallback object instead of their secret. Consider a redacted parse path for this secret-bearing endpoint (strip secret before logging, or parse without raw-body logging here). Low likelihood, but the blast radius is a live credential.
Per-event dispatch goroutine is unbounded before backpressure — server/internal/integrations/outwebhook/dispatcher.go:160-179. The delivery queue (numWorkers=16, queueCapacity=1024, non-blocking enqueue) is well designed and bounds in-flight POSTs. But DispatchIssueStatusChanged still does go d.dispatch(ev) per event, and each dispatch goroutine runs the subscription DB query + marshal before reaching the bounded queue. Under a status-change storm (e.g. a large batch update), this can spawn many concurrent dispatch goroutines each issuing a DB query for up to listTimeout. Bounded for normal load; consider funneling the lookup/enqueue step through a bounded worker too if storms are plausible. (Raised in the prior round as well — still open.)
No composite FK enforcing project_id ∈ workspace_id — server/migrations/121_webhook_subscription.up.sql. The handler validates this at create via GetProjectInWorkspace, so it's defense-in-depth only; a (workspace_id, project_id) composite FK would make the cross-tenant invariant a database guarantee rather than an application one.

What's solid (verified, not assumed)

SSRF guard core (the fix) — netguard.go: resolves host, rejects any blocked resolved IP, dials the exact verified IP, and re-runs on every redirect hop via Go's default redirect policy re-dialing the transport. TLS verification is left intact (not disabled). Redirect-to-loopback and split-horizon DNS are covered by tests. This is the right shape; only the proxy path (B1) escapes it.
HMAC — webhooksign.go signs the exact marshaled body once and reuses it across retries; Verify uses hmac.Equal (constant-time). Consolidating inbound verifyHubSignature onto the shared package removes inbound/outbound drift risk — a good de-dup.
Secret handling — full secret returned once on create (SecretOnce), never echoed by list/get/update; only the 4-char secret_hint is exposed. Not logged server-side. Cleartext-at-rest is a documented, pre-existing pattern (mirrors autopilot_trigger, tracked separately).
RBAC / multi-tenancy — all CRUD is owner/admin-gated; reads/writes scoped by workspace_id; update/delete resolve through the loader and use row.ID / row.WorkspaceID per the repo's UUID convention; project-scoped create validates project membership. The handler test covers the member-forbidden, secret-shown-once, foreign-project-404, and empty-events-400 cases — the RBAC/CRUD gap flagged last round is now closed.
prev_status batch fix — issue.go:2919 brings the batch path in line with the single-update path; both publish prev_status, which the listener depends on. Without it, batch status changes would have fired webhooks with an empty previous_status.
Event wiring — listener correctly filters issue:updated on status_changed, the dispatcher's SupportedEventTypes is the single source of truth feeding the handler allow-list, and a malformed events column fails closed (treated as "no events", never a panic). Delivery worker has a recover() so one bad delivery can't kill a worker.
Frontend — new API client methods all run responses through parseWithFallback + lenient zod with explicit fallbacks (no bare as); there's a malformed-response schema test (missing / null / wrong-type); the once-shown secret stays in React state and is rendered as text (no XSS); i18n keys are at parity across en/ja/ko/zh-Hans; no package-boundary violations. Satisfies the repo's API Response Compatibility rules.

Coverage / what I did not verify

Targeted Go suites pass on this checkout: go test ./internal/handler ./internal/integrations/outwebhook ./internal/netguard ./internal/webhooksign — all green; go build / go vet / gofmt clean on the new files. I did not run the full make check (TS unit + E2E).
TLS pinning and any reverse-proxy / egress filtering in front of the server (infra-level defense) were not assessed beyond confirming cert validation isn't disabled in code.
A second independent long-context review leg was attempted but timed out with no output and is therefore absent from this review — cross-family corroboration here is single-engine plus this reviewer's own analysis, not two. B1 was independently confirmed by direct code inspection regardless, so confidence in it is high.

lml2468 added 2 commits June 16, 2026 09:53

yujiawei added the needs-human-review label Jun 16, 2026

mochashanyao approved these changes Jun 16, 2026

View reviewed changes

Jerry-Xin previously requested changes Jun 16, 2026

View reviewed changes

yujiawei requested changes Jun 16, 2026

View reviewed changes

lml2468 requested review from Jerry-Xin and yujiawei June 16, 2026 03:42

mochashanyao approved these changes Jun 16, 2026

View reviewed changes

Jerry-Xin requested changes Jun 16, 2026

View reviewed changes

yujiawei requested changes Jun 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(webhooks): outbound webhooks on issue status change#34

feat(webhooks): outbound webhooks on issue status change#34
lml2468 wants to merge 3 commits into
mainfrom
feat/outbound-webhooks

lml2468 commented Jun 16, 2026

Uh oh!

mochashanyao left a comment

Uh oh!

Jerry-Xin left a comment

Uh oh!

yujiawei left a comment

Uh oh!

lml2468 commented Jun 16, 2026

Uh oh!

mochashanyao left a comment

Uh oh!

Jerry-Xin left a comment

Uh oh!

yujiawei left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

lml2468 commented Jun 16, 2026

Summary

Backend

Frontend

Testing

Notes / deferred

Uh oh!

mochashanyao left a comment

Choose a reason for hiding this comment

PR#34 Review Report — feat(webhooks): outbound webhooks on issue status change

1. Verification Summary

2. Findings

F1 — SSRF via HTTP redirect (P2)

F2 — Dispatch goroutine accumulation under burst (P2)

F3 — Signing secret stored cleartext (P2 — pre-existing pattern)

F4 — Full IssueResponse in outbound payload (P2 — design note)

3. Suggestions

4. Additional Findings

5. Data-Flow Trace

6. Blind-Point Checklist (R5 — security_sensitive)

Verdict

Uh oh!

Jerry-Xin left a comment

Choose a reason for hiding this comment

Uh oh!

yujiawei left a comment

Choose a reason for hiding this comment

Code Review — PR #34 (multica)

Blocking

B1 — SSRF: the create-time URL guard is bypassed by HTTP redirects

Closely related (same root, fold into the same fix)

Non-blocking (P2 — address now or fast-follow, your call)

What's solid (verified, not assumed)

Coverage / what I did not verify

Uh oh!

lml2468 commented Jun 16, 2026

🔴 Blocking — SSRF bypass at delivery time (fixed at the transport layer)

💬 Non-blocking — also addressed

Deliberately deferred (with rationale)

Uh oh!

mochashanyao left a comment

Choose a reason for hiding this comment

PR#34 Review Report — feat(webhooks): outbound webhooks on issue status change

1. Verification Summary

2. Findings

No P0 or P1 issues found.

P2 — PR description understates SSRF protection (DNS rebinding)

P2 — Dispatcher worker pool has no graceful shutdown

P2 — Silent delivery drop on queue-full

3. Suggestions

4. Additional Observations

5. Data-Flow Tracing

Path A: Batch issue update → outbound webhook delivery

Path B: Webhook subscription CRUD

6. R5 Blind-Spot Checklist (security_sensitive)

C1 — Dual-path parity: CLEAR

C2 — Control-flow ordering / nested reuse: CLEAR

C3 — Authorization boundary ≠ capability boundary: CLEAR

Uh oh!

Jerry-Xin left a comment

Choose a reason for hiding this comment

Uh oh!

yujiawei left a comment

Choose a reason for hiding this comment

Code Review — PR #34 (multica)

Blocking

B1 — SSRF guard is bypassed when an egress proxy is configured

Non-blocking (P2 — address now or fast-follow, your call)

What's solid (verified, not assumed)

Coverage / what I did not verify

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants