feat(webhooks): outbound webhooks on issue status change#34
Conversation
Add GitHub-style outbound webhooks: when an issue's status changes, Multica POSTs a signed JSON payload to subscribed external URLs. The reverse direction of the existing inbound autopilot webhooks. - Migration 121 + sqlc: webhook_subscription table (nullable project_id distinguishes workspace-level "org" webhooks from project-level "repo" webhooks); cleartext signing secret, consistent with autopilot_trigger. - outwebhook.Dispatcher: subscription selection (workspace- vs project-scoped), event filtering, async fire-and-forget delivery with bounded concurrency (semaphore) and short retry backoff. All work is detached off the event-bus/request path. - webhooksign: shared GitHub-compatible HMAC-SHA256 sign/verify; the inbound verifier now delegates to it so the two can't drift. - webhook_listeners: subscribe to issue:updated (status_changed) and dispatch; SupportedEventTypes is the single source of truth tying the allow-list to the listener. - Handler CRUD at /api/webhook-subscriptions, owner/admin-gated, with SSRF guard rejecting loopback/link-local/private/metadata targets. - Fix batch issue update to publish prev_status (was omitted, delivering empty previous_status for batch transitions). - Dedup: shared generateCredential + secretHint helpers.
Surface outbound webhook management on two shared surfaces: - Workspace level: a new Settings → Webhooks tab. - Project level: a collapsible section in the project detail right panel, directly below Resources. - core: WebhookSubscription types, lenient zod schemas via parseWithFallback, api client methods, and TanStack Query hooks (keyed on wsId + projectId). - views: shared useWebhookSection hook + WebhookDialogs (secret-once + delete-confirm) consumed by both surfaces — list/create/toggle/delete and the signing secret shown once on create, permission-gated to owner/admin. - i18n: en/ja/ko/zh-Hans strings; parity test passes.
mochashanyao
left a comment
There was a problem hiding this comment.
[Octo-Q · automated review]
Verdict: Approve — no blocking findings; notes below (data-flow traced).
PR#34 Review Report — feat(webhooks): outbound webhooks on issue status change
Reviewer: Octo-Q (automated review)
Head SHA: 003d5342bbe1e70dcb9e3fd43bc8f503e6e7cb08
Repo: Mininglamp-OSS/Octo-Q
Routing: complexity=security_sensitive (automated review)
Scope: 38 files, +2777/−33 — migration, dispatcher, CRUD handlers, SSRF guard, HMAC signing, frontend UI, i18n
1. Verification Summary
| Area | Status | Evidence |
|---|---|---|
| Authorization (CRUD) | ✅ | All 4 handlers gate on requireWorkspaceRole(..., "owner", "admin") — webhook_subscription.go:185,222,248,280 |
| Workspace scoping | ✅ | loadWebhookSubscription scopes by workspace_id via GetWebhookSubscriptionInWorkspace — webhook_subscription.go:296-316 |
| Project ownership | ✅ | Create validates GetProjectInWorkspace(ID, WorkspaceID) — webhook_subscription.go:237-247 |
| SSRF guard (IP literals) | ✅ | Rejects loopback, link-local, private, unspecified + localhost — webhook_subscription.go:119-143 |
| HMAC signing | ✅ | webhooksign.Sign/Verify — constant-time hmac.Equal, GitHub-compatible sha256=<hex> — webhooksign.go:27-44 |
| Secret entropy | ✅ | 32 bytes crypto/rand, whsec_ prefix, URL-safe base64 — autopilot_webhook.go:42-48, webhook_subscription.go:97 |
| Async dispatch | ✅ | DispatchIssueStatusChanged → go d.dispatch(ev) — never blocks event bus — dispatcher.go:121-123 |
| Concurrency bound | ✅ | Semaphore chan struct{} capped at 16 — dispatcher.go:57,155-158 |
| prev_status fix (batch) | ✅ | prevIssue.Status published in batch update payload — issue.go:2919 (matches single-update path at issue.go:2417) |
| Data flow: listener → dispatcher | ✅ | webhook_listeners.go extracts status_changed, issue, prev_status from event payload → DispatchIssueStatusChanged |
| Frontend admin gate | ✅ | `canManage = role === "owner" |
| Secret shown once | ✅ | SecretOnce set only in create handler response — webhook_subscription.go:244 |
| Migration | ✅ | Proper FK cascades, partial index on enabled, project-scoped index — 121_webhook_subscription.up.sql |
| Tests | ✅ | Handler validation (URL/events/secret), dispatcher (matching/filtering/retry/HMAC), webhooksign roundtrip, frontend component tests |
2. Findings
F1 — SSRF via HTTP redirect (P2)
Diff-scope: new (introduced by this PR)
validateWebhookURL (webhook_subscription.go:119-143) checks the initial URL for internal targets (loopback, link-local, private, localhost). However, the delivery HTTP client (dispatcher.go:89: &http.Client{Timeout: deliveryTimeout}) uses Go's default redirect policy — it follows up to 10 redirects (301/302/303/307/308).
Attack path: An admin creates a webhook pointing at https://attacker.example.com/redirect which returns 307 Temporary Redirect → http://169.254.169.254/latest/meta-data/. Go's HTTP client follows the 307, preserving POST method and body. The signed webhook payload (including issue data) is delivered to the cloud metadata endpoint.
Go's client strips Authorization headers on cross-domain redirects but does NOT strip custom headers like X-Octo-Q-Signature-256. The body (full issue data) is also preserved on 307.
Mitigation: Admin-gated creation; DNS-rebinding already documented as v1 OOS. This redirect-based SSRF is the same class of gap.
Recommendation: Add a CheckRedirect policy that either rejects redirects to internal IPs or caps redirect count to 0 for webhook delivery:
client := &http.Client{
Timeout: deliveryTimeout,
CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse // don't follow redirects
},
}F2 — Dispatch goroutine accumulation under burst (P2)
Diff-scope: new (introduced by this PR)
dispatch() (dispatcher.go:127) runs in a detached goroutine. Inside, it blocks on d.sem <- struct{}{} (line 155) when all 16 delivery slots are full. Each delivery can take up to ~38s (30s timeout × 1 initial + 2 retries with 1s+4s backoff).
Under a burst of N status changes with M matching subscriptions: N dispatch goroutines accumulate, each holding the marshaled payload (~size of IssueResponse). If N×M >> 16, dispatch goroutines queue on the semaphore.
Practical impact: Low for v1 workloads. With 16 slots and ~38s worst-case per delivery, throughput is ~25 deliveries/minute. A burst of 100 status changes × 5 webhooks = 500 deliveries would take ~20 minutes to drain, with 500 goroutines queued.
Recommendation: Consider a bounded work queue (e.g., discard events when queue depth exceeds a threshold) or a context timeout on semaphore acquisition.
F3 — Signing secret stored cleartext (P2 — pre-existing pattern)
Diff-scope: pre-existing (matches autopilot_trigger.signing_secret from migration 093)
secret TEXT NOT NULL in webhook_subscription stores the HMAC key in cleartext. The PR body and migration comment acknowledge this and reference MUL-2334 for a future secrets-at-rest infrastructure.
Recommendation: No action needed for v1 — consistent with existing pattern. Track in MUL-2334.
F4 — Full IssueResponse in outbound payload (P2 — design note)
Diff-scope: new (introduced by this PR)
outboundPayload.Issue (dispatcher.go:108) serializes the complete handler.IssueResponse into the webhook body. This includes all issue fields (description, internal metadata, assignee details, etc.).
This is by design (webhook subscribers need issue context), but workspace admins should be aware that all issue data is sent to external endpoints.
Recommendation: Consider documenting the payload schema for operators, or offering a field-level filter in v2.
3. Suggestions
-
Redirect policy (addresses F1): Set
CheckRedirectto reject or cap redirects in the delivery HTTP client. This closes the redirect-based SSRF vector without requiring DNS resolution. -
Bounded dispatch queue (addresses F2): Replace the bare semaphore with a buffered channel + discard-on-full pattern, or add a
context.WithTimeouton semaphore acquisition to prevent unbounded goroutine accumulation. -
Payload schema documentation (addresses F4): Document the outbound payload shape in the PR body or API docs so operators know what data is sent to subscribers.
4. Additional Findings
- No delivery-history persistence: v1 is fire-and-forget with no delivery log (asymmetric with inbound
webhook_deliverytable). Acknowledged in PR body. Not blocking — observability can be added in v2. - No secret rotation via API:
UpdateWebhookSubscriptioncannot rotate the signing secret. Operators must delete + recreate. Acceptable for v1. ListWebhookSubscriptionsByProjectdoesn't verify project ownership: The query filters by(workspace_id, project_id)which is safe — returns empty if project doesn't belong to workspace. No information leakage.
5. Data-Flow Trace
| Consumed Data | Upstream Source | Flows Correctly? |
|---|---|---|
payload["status_changed"] |
BatchUpdateIssues / UpdateIssue: prevIssue.Status != issue.Status → issue.go:2912,2408 |
✅ — boolean, set when status differs |
payload["issue"] (IssueResponse) |
issueToResponse(issue, prefix) from DB-updated row |
✅ — full post-update issue |
payload["prev_status"] |
prevIssue.Status — loaded before update via GetIssueForUpdate |
✅ — batch path fixed by this PR (issue.go:2919), single path already had it (issue.go:2417) |
e.WorkspaceID |
events.Event.WorkspaceID set by h.publish() |
✅ — scoped to request workspace |
e.ActorType / e.ActorID |
h.resolveActor(r, userID, workspaceID) — resolves real actor |
✅ — not observer, actual trigger |
issue.ProjectID |
IssueResponse.ProjectID (nullable *string) |
✅ — listener handles nil (webhook_listeners.go:42-44) |
sub.Secret (for signing) |
webhook_subscription.secret column, loaded by ListEnabledWebhookSubscriptionsForDispatch |
✅ — cleartext from DB, used directly for HMAC |
sub.Events (for matching) |
webhook_subscription.events JSONB column, unmarshaled in subscribedToEvent |
✅ — malformed JSON handled gracefully (skip, not panic) |
6. Blind-Point Checklist (R5 — security_sensitive)
C1 — Dual-path parity:
- Subscribe ↔ unsubscribe: N/A — webhooks are independent CRUD operations, no paired subscribe/unsubscribe paths.
- Create-gate ↔ execute-path: ✅ — Create is admin-gated; dispatch reads from DB (no re-authorization needed for system-driven delivery).
- Parallel implementations: ✅ — No duplicate webhook dispatch paths.
C2 — Control-flow ordering / nesting reuse:
webhooksign.Sign/Verifyshared by inbound + outbound: ✅ — single implementation, no double-signing or ordering issue.generateCredentialshared by webhook tokens + signing secrets: ✅ — pure function, no side effects.- SSRF guard non-canonical forms: ✅ —
url.Parsehandles encoded hosts;net.ParseIPhandles IPv4/IPv6;strings.EqualFoldhandlesLOCALHOST/Localhost. No bypass via case or encoding.
C3 — Authorization boundary ≠ capability boundary:
- ✅ — All
/api/webhook-subscriptionsendpoints are behind authenticated session +requireWorkspaceRole("owner", "admin"). No unauthenticated or member-level access path. - ✅ —
loadWebhookSubscriptionscopes byworkspace_id, preventing cross-workspace access via UUID enumeration. - ✅ — Dispatch runs system-internally (no user-facing endpoint), triggered only by event bus.
Verdict
No P0 or P1 findings. All identified issues are P2 (SSRF redirect gap, burst goroutine accumulation, cleartext secret, payload exposure) — none make a working path unavailable, produce user-visible incorrect data, or make the system worse than before this PR.
The authorization model is sound (admin-gated, workspace-scoped), the signing implementation is correct (constant-time HMAC), the data flow is properly traced from issue update through event bus to delivery, and the SSRF guard covers the primary attack vectors (IP literals + localhost).
[Octo-Q] verdict: APPROVE — P2 suggestions are non-blocking improvements for v2 consideration.
Jerry-Xin
left a comment
There was a problem hiding this comment.
In scope for Multica, but the outbound webhook SSRF guard is bypassable at delivery time.
🔴 Blocking
- 🔴 Critical: The SSRF protection only validates the originally configured URL, but delivery uses a default
http.Client, which follows redirects and performs unrestricted DNS resolution. A subscriber can configure an allowed public URL that returns a redirect tohttp://127.0.0.1,http://169.254.169.254, or another private target, andd.client.Do(req)will follow it. Likewise, a hostname can resolve or later rebind to a private IP after creation. This defeats the stated guard for loopback/link-local/private/metadata targets. Fix this in the delivery path, not only create/update validation: disable redirects or validate every redirect, and use a customTransport.DialContextthat rejects resolved private/link-local/loopback/unspecified IPs for every connection. References: webhook_subscription.go, dispatcher.go, dispatcher.go.
💬 Non-blocking
- 🟡 Warning: Delivery concurrency is bounded, but event goroutines are still unbounded under load.
DispatchIssueStatusChangedstarts one goroutine per event, and eachdispatchgoroutine can block ond.sem <- struct{}{}while prior deliveries sleep/retry. A large batch or burst can therefore accumulate many blocked dispatch goroutines despite the 16-delivery cap. Consider a bounded worker queue or non-blocking drop/log policy for fire-and-forget v1. Reference: dispatcher.go, dispatcher.go. - 🔵 Suggestion: Add handler-level CRUD/RBAC tests for the new API routes. The pure URL/event tests are useful, but they do not exercise owner/admin enforcement, project-in-workspace validation, or the “secret returned only once” API behavior. Reference: webhook_subscription_test.go.
✅ Highlights
- The feature fits the repository and existing issue/workspace model.
- The event listener correctly filters
issue:updatedto status changes and includesprev_status. - The shared HMAC signing package is a good consolidation of inbound verification and outbound signing.
- Targeted verification passed:
go test ./internal/handler ./internal/integrations/outwebhook ./internal/webhooksignandgo test ./cmd/server -run 'TestNonExistent'.
yujiawei
left a comment
There was a problem hiding this comment.
Code Review — PR #34 (multica)
Reviewed against head 003d5342bbe1e70dcb9e3fd43bc8f503e6e7cb08 (merge-base ac1ba3b). This is a security-sensitive change (outbound webhooks: SSRF surface, HMAC signing, secret handling, detached delivery), so it got an extra-careful pass plus an independent second-opinion engine.
Verdict: Changes requested — one blocking SSRF gap. Everything else is non-blocking. The crypto, multi-tenancy scoping, and the entire frontend contract layer are solid.
Blocking
B1 — SSRF: the create-time URL guard is bypassed by HTTP redirects
server/internal/integrations/outwebhook/dispatcher.go:80
client: &http.Client{Timeout: deliveryTimeout},The delivery client has no CheckRedirect policy, so it follows 3xx redirects with Go's default behavior (up to 10 hops). validateWebhookURL (server/internal/handler/webhook_subscription.go:108) runs only at create/update and only on the literal URL the operator typed. A subscriber registered at an allowed public URL can simply respond 302 Location: http://169.254.169.254/... (cloud metadata) or http://10.x / 127.0.0.1 (internal subnet), and d.post will follow it and POST the issue payload there. The SSRF guard the PR deliberately built is silently defeated.
The package comment at webhook_subscription.go:100-107 states the intent precisely: "an admin shouldn't be able to turn the server into a probe of its own metadata service or internal subnet." Redirect-following breaks exactly that property. It is blind SSRF (the response body is drained to io.Discard, so it's not reflected to the caller), and creating a subscription is owner/admin-gated — but in a multi-tenant deployment a workspace admin is a customer, not a server operator, so a customer reaching host infrastructure is a real cross-boundary issue.
Fix (closes the whole class, not just redirects): validate at the transport/dial layer so every connection — initial and every redirect hop, after DNS resolution — is checked against isInternalIP. A custom http.Transport with a DialContext that rejects internal resolved IPs is the robust form. The minimal stopgap is a CheckRedirect that re-runs the host check on each hop (or refuses redirects via http.ErrUseLastResponse), but that alone does not address B2.
Closely related (same root, fold into the same fix)
DNS-resolves-to-internal / DNS rebinding — webhook_subscription.go:104-107 documents that DNS is not resolved and rebinding is "out of scope for v1." That deferral is a defensible call on its own, but the create-time-only validation model is what makes both this and B1 possible: nothing re-checks the destination at delivery time. A transport-level DialContext guard that inspects the resolved IP on every dial closes B1 and the rebinding gap in one change, and turns the "out of scope" note into "actually mitigated." Strongly recommended to do both together rather than ship the redirect fix alone.
Non-blocking (P2 — address now or fast-follow, your call)
-
Subscriber URL logged on failure —
dispatcher.go:195,202. Delivery failures log"url", sub.Urlat WARN/ERROR. Subscriber webhook URLs frequently carry tokens in the path/query; this writes customer secrets to server logs. Log subscription ID + parsed host only, or a redacted URL. -
PATCH {"events":[]}silently resets to the default —webhook_subscription.go:142-145+:312-318. Omittingeventskeeps the old value (COALESCE), but sending an explicit empty array runs throughvalidateWebhookEvents, which defaults[]to["issue.status_changed"]. Omitted vs explicit-empty diverge, and a client can't intentionally clear events. Harmless in v1 (single event type), but it's latent design debt the moment a second event ships. Cleanest fix: drop the empty-default on the update path (default only on create), or reject empty arrays. -
Dispatch goroutine fan-out is unbounded per event —
dispatcher.go:120,167.maxConcurrentDeliveriesbounds in-flight POSTs, but each event still spawns adispatch()goroutine that does the DB list + marshal before hitting the semaphore. A burst of status changes can pile updispatch()goroutines blocked ond.sem. Bounded for normal load; consider a worker queue if status-change storms are plausible. -
No composite FK enforcing
project_id∈workspace_id—migrations/121_webhook_subscription.up.sql. The handler validates this viaGetProjectInWorkspace, so it's defense-in-depth only, but a(workspace_id, project_id)composite FK would make the invariant a database guarantee. -
Authorization runs after input validation on create —
webhook_subscription.go:223-235. URL/events validation happens beforerequireWorkspaceRole. Minor info-leak/fingerprinting surface; move the role check up to right after body decode. -
Optional hardening: delivery goroutines have no
recover()(dispatcher.go:120,168) — verified there are no currently-reachable panic vectors, so this is defensive-only, not a bug. In-flight deliveries usecontext.Background()and aren't cancelled on shutdown — acceptable for fire-and-forget. A constant-time-from-the-start tweak towebhooksign.Verify(compute HMAC unconditionally) closes a negligible format-vs-content timing distinction.
What's solid (verified, not assumed)
- HMAC —
webhooksign.gosigns the exact marshaled body once and reuses it across retries;Verifyuseshmac.Equal(constant-time compare). Extracting the shared package and routing inboundverifyHubSignaturethrough it is a good de-dup that prevents inbound/outbound drift. - Secret handling — full secret returned once on create (
SecretOnce), never echoed by list/get/update; only the 4-charsecretHintis exposed. Not logged in the handler. Cleartext-at-rest is a documented, pre-existing pattern (mirrorsautopilot_trigger, tracked as MUL-2334). prev_statusbatch fix —issue.go:2919correctly brings the batch path in line with the single-update path; both now publishprev_status, which the listener depends on. Without it, batch status changes would have fired webhooks with an emptyprevious_status.- Multi-tenancy — every read/write is scoped by
workspace_id; updates/deletes resolve through the loader and userow.ID/row.WorkspaceID, consistent with the repo's UUID-parsing convention. Project-scope create validates project membership. - Frontend — the new API client methods all run responses through
parseWithFallback+ zod with explicit fallbacks (no bareas), there's a malformed-response schema test (missing/null/wrong-type cases), the create-once secret stays in React state only and is rendered as text (no XSS), i18n keys are at parity across en/ja/ko/zh-Hans, and no package-boundary violations. This fully satisfies the repo's API Response Compatibility rules.
Coverage / what I did not verify
- Static review only — I did not run
make test/ the Go + TS suites against this checkout. - TLS certificate validation on delivery, and any reverse-proxy / egress filtering in front of the server (infra-level defense-in-depth), were not assessed.
- An independent long-context second leg was attempted but timed out and is absent from this review — so cross-family corroboration here is single-engine plus this reviewer's own analysis, not two. The blocking finding (B1) was independently confirmed by direct code inspection and an adversarial verification pass, so confidence in it is high regardless.
…dening
Blocking (SSRF bypass at delivery time):
- New internal/netguard package: NewRestrictedHTTPClient resolves the host and
refuses to dial loopback/link-local/private/unspecified addresses on EVERY
dial — including each redirect hop and across DNS rebinding — then dials the
exact verified IP (closes the resolve→dial TOCTOU). The dispatcher uses it, so
a subscriber can no longer register a public URL that 302s to 169.254.169.254
/ 127.0.0.1 / a private host, nor use a hostname that resolves to internal.
- validateWebhookURL keeps the create-time literal/localhost fast-reject (now via
netguard.IsBlockedIP); the transport is the authoritative enforcement.
Hardening (non-blocking review items):
- Bounded delivery: replace the per-event goroutine + blocking semaphore with a
fixed worker pool draining a buffered queue; enqueue is non-blocking (drop +
log when full) so dispatch goroutines can't pile up under a status-change
storm. Workers recover() so one bad delivery can't kill a worker.
- Redact logs: log subscription_id + host only, never the full subscriber URL
(URLs often carry tokens).
- events: validateWebhookEvents now rejects an empty list; create applies the
default, update does not — so PATCH {"events":[]} is a 400 instead of a silent
reset to the default.
- Create authorizes (requireWorkspaceRole) before validating input.
Tests:
- netguard: IsBlockedIP table + restricted client refuses loopback / redirect-to-
loopback.
- dispatcher: production constructor blocks internal delivery; delivery-mechanism
tests inject a permissive client.
- handler: RBAC 403 for non-admin, secret-shown-once, project-must-be-in-
workspace (404), PATCH empty-events (400).
|
Thanks for the careful review. Pushed 🔴 Blocking — SSRF bypass at delivery time (fixed at the transport layer)The create-time URL check was insufficient because the delivery client followed redirects and did unrestricted DNS. Fixed where it belongs — the delivery transport:
This directly addresses @Jerry-Xin's 🔴 and @yujiawei's B1 (incl. the folded-in DNS-rebinding note). 💬 Non-blocking — also addressed
Deliberately deferred (with rationale)
Verification: |
mochashanyao
left a comment
There was a problem hiding this comment.
[Octo-Q · automated review]
Verdict: Approve — no blocking findings; notes below (data-flow traced).
PR#34 Review Report — feat(webhooks): outbound webhooks on issue status change
Reviewer: Octo-Q (automated review)
Head SHA: 1eda2ae1619217ebd184ee668058afbaa25ef6c3
Repo: Mininglamp-OSS/Octo-Q
Routing: complexity=security_sensitive (automated review)
1. Verification Summary
| Area | Status | Evidence |
|---|---|---|
| SSRF guard (netguard) | ✅ | server/internal/netguard/netguard.go:22-28 — blocks loopback, link-local (incl. 169.254.169.254), private, unspecified. Dial-time enforcement with resolve→check→dial-exact-IP pattern closes TOCTOU and DNS rebinding. |
| SSRF redirect-hop coverage | ✅ | netguard.go:38-42 — transport uses restricted dialer; Go default redirect policy re-dials through same transport, so each hop is re-checked. |
| HMAC signing (webhooksign) | ✅ | server/internal/webhooksign/webhooksign.go:28-42 — HMAC-SHA256 + hmac.Equal constant-time comparison. Shared package for inbound verify + outbound sign prevents drift. |
| CRUD authorization | ✅ | handler/webhook_subscription.go — all 4 handlers call requireWorkspaceRole(..., "owner", "admin"). Create checks auth BEFORE validation (prevents probing). loadWebhookSubscription scopes by (id, workspace_id). |
| Project scope validation | ✅ | webhook_subscription.go:228-240 — CreateWebhookSubscription verifies project belongs to workspace via GetProjectInWorkspace. |
| Secret management | ✅ | generateWebhookSecret() → generateCredential("whsec_") → 32 bytes crypto/rand, URL-safe base64 (256 bits). Secret returned ONLY in create response (SecretOnce), never echoed on list/get. |
| Dispatcher async isolation | ✅ | dispatcher.go:148-150 — DispatchIssueStatusChanged spawns goroutine, returns immediately. Never blocks the event bus or HTTP request path. |
| Worker pool bounding | ✅ | dispatcher.go:56-60 — 16 fixed workers, 1024 queue capacity, non-blocking enqueue (drops on full with log). |
| Retry logic | ✅ | dispatcher.go:186-207 — 3 attempts, retries on network error + 5xx, stops on 4xx. Backoff: 1s, 4s. |
| Batch prev_status fix | ✅ | handler/issue.go:2919 — batch update now publishes prev_status: prevIssue.Status, matching the single-update path at line 2417. |
| Migration 121 | ✅ | Proper FK with ON DELETE CASCADE on both workspace_id and project_id. Partial index on enabled for dispatch hot path. |
| Frontend RBAC | ✅ | use-webhook-section.ts:63-64 — canManage gated on owner/admin. Queries gated on canManage. |
| Schema leniency | ✅ | schemas.ts — .loose() on zod schemas, parseWithFallback for graceful degradation. |
2. Findings
No P0 or P1 issues found.
P2 — PR description understates SSRF protection (DNS rebinding)
Diff-scope: pre-existing (in the PR body text, not in code).
The PR body states: "SSRF guard checks IP literals + localhost; does not resolve DNS (DNS-rebinding out of scope for v1)."
However, the actual netguard.restrictedDialContext implementation (netguard.go:55-72) does resolve DNS (net.DefaultResolver.LookupIPAddr), checks ALL resolved IPs, and dials the exact verified IP — which does defend against DNS rebinding. The code comment at netguard.go:5-10 correctly describes this. The PR body should be corrected to match the implementation.
Impact: Documentation-only. Could mislead future security reviewers into thinking DNS rebinding is unmitigated when it actually is.
P2 — Dispatcher worker pool has no graceful shutdown
Diff-scope: new (introduced by this PR).
outwebhook.New() starts 16 worker goroutines (dispatcher.go:107-109) that run for job := range d.jobs with no stop signal. The Dispatcher has no Close()/Stop() method. On server shutdown, in-flight deliveries are abandoned and workers are killed with the process.
Impact: Consistent with the rest of the server (no graceful shutdown pattern). Acceptable for v1 fire-and-forget. Consider adding a context.Context-based shutdown for v2 if delivery persistence is added.
P2 — Silent delivery drop on queue-full
Diff-scope: new (introduced by this PR).
dispatcher.go:175-179 — when the 1024-slot delivery queue is full, new deliveries are dropped with only a slog.Warn. No metric/counter is incremented.
Impact: Under extreme load, operators have no alerting signal beyond log scraping. Consider adding a Prometheus counter or similar observable for dropped deliveries.
3. Suggestions
- Correct PR body DNS-rebinding claim to match the actual implementation (it's better than stated).
- Future: Add a
Close()method toDispatcherthat cancels workers and drains the queue, especially before adding delivery-history persistence. - Future: Add a counter metric for queue-full drops so operators can alert on delivery loss.
4. Additional Observations
- Signing secret at rest: Stored cleartext, matching existing
autopilot_trigger.signing_secretconvention. Acknowledged in PR notes. No general secrets-at-rest infra exists yet (MUL-2334). Not a regression. - No delivery-history persistence: v1 is fire-and-forget with log-only outcomes. Asymmetric with inbound
webhook_deliverytable. Acknowledged and deferred. - No secret rotation endpoint: Users must delete + recreate to rotate. Acceptable for v1.
generateCredentialrefactor: Clean extraction —generateWebhookTokenandgenerateWebhookSecretnow share one entropy source. Good.signingSecretHint→secretHintdedup: Correctly eliminates the duplicate between autopilot and webhook handlers.
5. Data-Flow Tracing
Path A: Batch issue update → outbound webhook delivery
BatchUpdateIssues(issue.go:2919) publishesEventIssueUpdatedwithstatus_changed: true+prev_status: prevIssue.Status← this PR fixes the missing prev_statusregisterWebhookListeners(webhook_listeners.go:24-26) checkspayload["status_changed"].(bool)→ true- Extracts
payload["issue"].(handler.IssueResponse)(line 29) andpayload["prev_status"].(string)(line 36) - Calls
d.DispatchIssueStatusChanged(...)(line 43-50) with typed struct DispatchIssueStatusChanged(dispatcher.go:148) spawns goroutine →dispatch()dispatch()queriesListEnabledWebhookSubscriptionsForDispatch(ctx, wsUUID)(line 160) — SQL:WHERE workspace_id = $1 AND enabled = true- Filters by
subscriptionMatches(sub, ev.ProjectID)(workspace-level matches all; project-level matches own project) ANDsubscribedToEvent(sub, EventIssueStatusChanged)(line 166-168) - Marshals
outboundPayloadwith event, workspace_id, actor, issue, previous_status, delivered_at (line 173-180) - Enqueues
deliverJobper matched subscription (line 184-189) - Worker picks up →
deliver()signs withwebhooksign.Sign(sub.Secret, body)→post()via SSRF-restricted HTTP client
Verdict: Data flows correctly from issue update through event bus to signed HTTP delivery. The prev_status fix closes the batch-update gap.
Path B: Webhook subscription CRUD
- POST
/api/webhook-subscriptions→ auth middleware →CreateWebhookSubscription requireWorkspaceRole(owner/admin)→validateWebhookURL(SSRF fast-check) →validateWebhookEvents(allow-list) → verify project in workspace →generateWebhookSecret()→ DB insert → response withSecretOnce- Subsequent GET/list:
webhookSubscriptionToResponse()never populatesSecretOnce(omitempty hides empty string)
Verdict: Secret lifecycle is correct — generated, returned once, never re-exposed.
6. R5 Blind-Spot Checklist (security_sensitive)
C1 — Dual-path parity: CLEAR
- Create ↔ Delete: Both gated on owner/admin, both scoped by workspace_id. ✅
- Create ↔ Update: Both gated, both validate URL/events. Update cannot change project_id (by design). ✅
- List (workspace) ↔ List (project): Both gated, both scoped. ✅
- Single issue update ↔ Batch issue update: Both now publish
prev_status. This PR fixes the batch path parity. ✅
C2 — Control-flow ordering / nested reuse: CLEAR
webhooksign.Signandwebhooksign.Verifyare leaf functions with no nested call chains. ✅netguard.restrictedDialContextis called once per connection (including redirect hops). No double-application risk. ✅validateWebhookURLis called at create-time and update-time (when URL changes). No ordering dependency with other validators. ✅- No security control is applied in a "check-then-act" pattern that could be bypassed by reordering. ✅
C3 — Authorization boundary ≠ capability boundary: CLEAR
- All CRUD endpoints are behind auth middleware + workspace role check. ✅
- The dispatcher runs server-side only — no user-facing endpoint to trigger delivery directly. ✅
loadWebhookSubscriptionuses(id, workspace_id)composite lookup — cross-workspace access impossible. ✅- The webhook listener on the event bus is registered server-side — no external caller can inject events. ✅
[Octo-Q] verdict: APPROVE
No P0 or P1 issues found. The security architecture is well-designed: SSRF guard at dial-time with DNS rebinding defense, HMAC signing in a shared package, owner/admin-gated CRUD with workspace scoping, and async delivery that never blocks the request path. Three P2 observations (PR body documentation mismatch, no graceful shutdown, silent queue-full drops) are non-blocking.
Jerry-Xin
left a comment
There was a problem hiding this comment.
Summary: The PR is relevant to Multica, but the outbound webhook SSRF protection has a proxy bypass that must be fixed before merge.
🔴 Blocking
- 🔴 Critical:
server/internal/netguard/netguard.go:38-40useshttp.ProxyFromEnvironmentin the “restricted” webhook client. WhenHTTP_PROXY/HTTPS_PROXYis set, Go dials the proxy and the customDialContextvalidates only the proxy address, not the user-supplied webhook target. That bypasses the DNS/internal-IP guard for hostnames accepted byvalidateWebhookURLatserver/internal/handler/webhook_subscription.go:121-126, letting a webhook target that resolves to private/link-local infrastructure be reached through the proxy. For this security boundary, either disable proxies for webhook delivery (Proxy: nil) or add target-host validation that runs independently for every request and redirect before proxy use. Add a regression test with a proxy configured and an internal-resolving target.
💬 Non-blocking
- 🟡 Warning:
server/internal/integrations/outwebhook/dispatcher.go:160-175starts an unbounded goroutine per status-change event, and each goroutine can hold a DB query for up to 10 seconds. The worker pool bounds HTTP delivery, but not subscription lookup pressure. Consider routing dispatch lookup through a bounded queue or semaphore if status changes can spike.
✅ Highlights
- The PR keeps subscription CRUD owner/admin-gated and scopes writes by
workspace_id. - HMAC signing is centralized in
webhooksign, reducing inbound/outbound drift. - Targeted Go tests for
netguard,outwebhook, andwebhooksignpass locally.
Superseded by re-review on new head 1eda2ae (review #4503139076). Original SSRF redirect/transport blocker is fixed via netguard dial-time IP guard; new CR is for a remaining proxy-path bypass.
yujiawei
left a comment
There was a problem hiding this comment.
Code Review — PR #34 (multica)
Reviewed against head 1eda2ae1619217ebd184ee668058afbaa25ef6c3 (merge-base ac1ba3b). This is a security-sensitive change (outbound webhooks: SSRF surface, HMAC signing, secret handling, detached delivery), so it received an extra-careful pass plus an independent second-opinion engine and a fresh adversarial verification of the SSRF guard at this head.
This is a re-review. The prior round's blocking finding — the create-time-only SSRF guard being bypassed by HTTP redirects / DNS rebinding — is now fixed by the new netguard package: the delivery client resolves the host, rejects any blocked resolved IP, and dials the exact verified IP on every hop (initial + each redirect). That closes the redirect and rebinding gaps cleanly, and netguard_test.go proves it. Nice work.
Verdict: Changes requested. One residual SSRF bypass in that same new guard (proxy-enabled deployments) keeps it from being airtight. Everything else below is non-blocking. The crypto, multi-tenancy scoping, RBAC, and the entire frontend contract layer are solid and independently verified.
Blocking
B1 — SSRF guard is bypassed when an egress proxy is configured
server/internal/netguard/netguard.go:39
transport := &http.Transport{
Proxy: http.ProxyFromEnvironment,
DialContext: restrictedDialContext(dialer),
...
}The whole security guarantee of this package is enforced in restrictedDialContext — it resolves the destination host and refuses to dial a blocked internal IP. But with Proxy: http.ProxyFromEnvironment, when HTTP_PROXY / HTTPS_PROXY / ALL_PROXY is set in the server's environment, Go dials the proxy address, not the target. restrictedDialContext then only validates the proxy's IP; the user-configured webhook host is sent to the proxy as a CONNECT/absolute-URI target and never passes through IsBlockedIP. A subscriber can register http://169.254.169.254/... (cloud metadata) or an internal host, and the proxy will happily fetch it — the dial-time guard the package promises is silently a no-op for the destination.
This is narrower than the redirect bug that was just fixed (it requires a proxy to be configured), but it matters here specifically because:
- The package docstring states the guarantee is enforced "at dial time … on every dial," which is not true under a proxy.
- The PR's own testing context is a Dockerized self-host deployment, which is exactly where an egress/forward proxy is commonly present.
- It re-opens the precise SSRF class (loopback / link-local / metadata / private) the prior round blocked on.
netguardis used only by this webhook delivery path, so disabling the proxy here has no collateral impact.
Fix (one line): for the restricted client, set Proxy: nil so connections always go direct and always flow through restrictedDialContext. If proxied egress is a real requirement for this deployment, the host check must instead run against the target host before delegating to the proxy (and that path needs its own test). Given v1 scope, Proxy: nil is the clean choice. A regression test that sets HTTP_PROXY and asserts an internal target is still refused would lock this down.
Non-blocking (P2 — address now or fast-follow, your call)
-
One-time signing secret can leak into logs on schema drift —
packages/core/api/client.ts:2108-2120+packages/core/api/schema.ts:46-54. The create response carries the full signingsecret(returned once). It is parsed viaparseWithFallback, which on any validation failure logsreceived: data— i.e. the entire response body, including the secret — to the schema logger / telemetry. A future shape drift (e.g. a malformedcreated_at) would write the live secret to logs, and the user would also silently get the empty-fallback object instead of their secret. Consider a redacted parse path for this secret-bearing endpoint (stripsecretbefore logging, or parse without raw-body logging here). Low likelihood, but the blast radius is a live credential. -
Per-event dispatch goroutine is unbounded before backpressure —
server/internal/integrations/outwebhook/dispatcher.go:160-179. The delivery queue (numWorkers=16,queueCapacity=1024, non-blocking enqueue) is well designed and bounds in-flight POSTs. ButDispatchIssueStatusChangedstill doesgo d.dispatch(ev)per event, and eachdispatchgoroutine runs the subscription DB query + marshal before reaching the bounded queue. Under a status-change storm (e.g. a large batch update), this can spawn many concurrentdispatchgoroutines each issuing a DB query for up tolistTimeout. Bounded for normal load; consider funneling the lookup/enqueue step through a bounded worker too if storms are plausible. (Raised in the prior round as well — still open.) -
No composite FK enforcing
project_id∈workspace_id—server/migrations/121_webhook_subscription.up.sql. The handler validates this at create viaGetProjectInWorkspace, so it's defense-in-depth only; a(workspace_id, project_id)composite FK would make the cross-tenant invariant a database guarantee rather than an application one.
What's solid (verified, not assumed)
- SSRF guard core (the fix) —
netguard.go: resolves host, rejects any blocked resolved IP, dials the exact verified IP, and re-runs on every redirect hop via Go's default redirect policy re-dialing the transport. TLS verification is left intact (not disabled). Redirect-to-loopback and split-horizon DNS are covered by tests. This is the right shape; only the proxy path (B1) escapes it. - HMAC —
webhooksign.gosigns the exact marshaled body once and reuses it across retries;Verifyuseshmac.Equal(constant-time). Consolidating inboundverifyHubSignatureonto the shared package removes inbound/outbound drift risk — a good de-dup. - Secret handling — full secret returned once on create (
SecretOnce), never echoed by list/get/update; only the 4-charsecret_hintis exposed. Not logged server-side. Cleartext-at-rest is a documented, pre-existing pattern (mirrorsautopilot_trigger, tracked separately). - RBAC / multi-tenancy — all CRUD is owner/admin-gated; reads/writes scoped by
workspace_id; update/delete resolve through the loader and userow.ID/row.WorkspaceIDper the repo's UUID convention; project-scoped create validates project membership. The handler test covers the member-forbidden, secret-shown-once, foreign-project-404, and empty-events-400 cases — the RBAC/CRUD gap flagged last round is now closed. prev_statusbatch fix —issue.go:2919brings the batch path in line with the single-update path; both publishprev_status, which the listener depends on. Without it, batch status changes would have fired webhooks with an emptyprevious_status.- Event wiring — listener correctly filters
issue:updatedonstatus_changed, the dispatcher'sSupportedEventTypesis the single source of truth feeding the handler allow-list, and a malformedeventscolumn fails closed (treated as "no events", never a panic). Delivery worker has arecover()so one bad delivery can't kill a worker. - Frontend — new API client methods all run responses through
parseWithFallback+ lenient zod with explicit fallbacks (no bareas); there's a malformed-response schema test (missing / null / wrong-type); the once-shown secret stays in React state and is rendered as text (no XSS); i18n keys are at parity across en/ja/ko/zh-Hans; no package-boundary violations. Satisfies the repo's API Response Compatibility rules.
Coverage / what I did not verify
- Targeted Go suites pass on this checkout:
go test ./internal/handler ./internal/integrations/outwebhook ./internal/netguard ./internal/webhooksign— all green;go build/go vet/gofmtclean on the new files. I did not run the fullmake check(TS unit + E2E). - TLS pinning and any reverse-proxy / egress filtering in front of the server (infra-level defense) were not assessed beyond confirming cert validation isn't disabled in code.
- A second independent long-context review leg was attempted but timed out with no output and is therefore absent from this review — cross-family corroboration here is single-engine plus this reviewer's own analysis, not two. B1 was independently confirmed by direct code inspection regardless, so confidence in it is high.
Summary
Adds GitHub-style outbound webhooks: when an issue's status changes, Multica POSTs a signed JSON payload to subscribed external URLs. The reverse direction of the existing inbound autopilot webhooks.
issue.status_changed. Delivery is async fire-and-forget with bounded concurrency, short retry backoff, and HMAC-SHA256 signing (X-Multica-Signature-256, GitHub-compatible).Backend
webhook_subscriptiontable (nullableproject_id; cleartext signing secret, consistent withautopilot_trigger).outwebhook.Dispatcher— subscription selection, event filtering, detached delivery off the event-bus/request path, semaphore-bounded concurrency.webhooksign— shared HMAC sign/verify; inbound verifier now delegates to it./api/webhook-subscriptions, owner/admin-gated, with an SSRF guard rejecting loopback/link-local/private/metadata targets.prev_status(was omitted).Frontend
useWebhookSectionhook +WebhookDialogsconsumed by both surfaces; secret shown once on create.parseWithFallback), api client, query/mutation hooks; en/ja/ko/zh-Hans i18n.Testing
outwebhook(selection/filtering/HMAC/retry),webhooksign, handler validation incl. SSRF cases — all pass;go build/vet/gofmtclean.pnpm typecheck,pnpm lint(0 errors), webhook component tests + core schema tests, i18n parity.prev_status, and both UI surfaces.Notes / deferred
webhook_deliverytable; revisit if observability/replay is needed.localhost; does not resolve DNS (DNS-rebinding out of scope for v1).autopilot_trigger.signing_secret).