Feat/webhooks dlq postgres #369

Merged
Baskarayelu merged 2 commits into CredenceOrg:main from Vivian-04:feat/webhooks-dlq-postgres
May 29, 2026

@Vivian-04
Contributor

Issue #324: [Backend] Webhooks DLQ: persist dead-letter entries in Postgres instead of MemoryDlqStore

Summary

Replaces the in-process MemoryDlqStore (a plain Map) with a durable PostgresDlqStore backed by a webhook_dlq Postgres table. Previously, any server restart would silently erase all dead-lettered webhook deliveries, making it impossible to inspect or replay permanently failed events. With this change, failed webhook deliveries survive deploys, are queryable via SQL, and integrate with the existing replay tooling.

Motivation

  • Data loss on restart — The MemoryDlqStore lived entirely in process memory. A deploy, crash, or scaling event wiped the DLQ with no recovery path.
  • No observability — Operators had no way to query how many entries were in the DLQ or inspect failure details after the fact.
  • Replay reliability — The existing replayDeadLetters flow only worked within a single process lifetime, defeating its purpose for durable retry.

Changes

New Files

| File | Purpose |
| --- | --- |
| src/migrations/008_create_webhook_dlq.ts | Creates the webhook_dlq table with columns for payload (JSONB), attempts, status code, error text, response snippet, and replay timestamp. Adds indexes on failed_at and webhook_id. |
| src/services/webhooks/postgresDlqStore.ts | PostgresDlqStore class implementing the DlqStore interface (push, list, get, markReplayed). Serializes payloads as JSONB, updates the DLQ size gauge on every push. |
| src/services/webhooks/postgresDlqStore.test.ts | Unit tests using pg-mem for an in-memory Postgres simulation. Covers all four interface methods plus the metrics integration. 100% line coverage. |
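To make the store's shape concrete, here is a minimal sketch of how the four DlqStore methods could map onto the webhook_dlq table. It is not the code from this PR: `DlqEntry`, `Queryable`, `rowToEntry`, `SketchPostgresDlqStore`, and the `onSize` callback are hypothetical names chosen for illustration, `Queryable` is only a structural stand-in for a Postgres connection pool, and counting every row for the gauge is an assumption.

```typescript
// Hypothetical entry shape, inferred from the columns listed in this PR.
interface DlqEntry {
  id: string;
  webhookId: string;
  payload: unknown;
  failedAt: Date;
  attempts: number;
  lastStatusCode: number | null;
  lastError: string | null;
  responseBodySnippet: string | null;
  replayedAt: Date | null;
}

// Structural stand-in for the one pool method the sketch needs (an assumption).
interface Queryable {
  query(text: string, params?: unknown[]): Promise<{ rows: any[] }>;
}

// Map a snake_case table row to the camelCase entry.
function rowToEntry(row: any): DlqEntry {
  return {
    id: row.id,
    webhookId: row.webhook_id,
    payload: row.payload,
    failedAt: row.failed_at,
    attempts: row.attempts,
    lastStatusCode: row.last_status_code ?? null,
    lastError: row.last_error ?? null,
    responseBodySnippet: row.response_body_snippet ?? null,
    replayedAt: row.replayed_at ?? null,
  };
}

class SketchPostgresDlqStore {
  private readonly db: Queryable;
  private readonly onSize: (size: number) => void;

  constructor(db: Queryable, onSize: (size: number) => void = () => {}) {
    this.db = db;
    this.onSize = onSize;
  }

  // Insert one dead-lettered delivery, then report the new queue size.
  async push(entry: DlqEntry): Promise<void> {
    await this.db.query(
      `INSERT INTO webhook_dlq
         (id, webhook_id, payload, failed_at, attempts,
          last_status_code, last_error, response_body_snippet)
       VALUES ($1, $2, $3::jsonb, $4, $5, $6, $7, $8)`,
      [entry.id, entry.webhookId, JSON.stringify(entry.payload), entry.failedAt,
       entry.attempts, entry.lastStatusCode, entry.lastError, entry.responseBodySnippet],
    );
    // Assumption: the size counts every row, replayed or not.
    const { rows } = await this.db.query("SELECT COUNT(*)::int AS count FROM webhook_dlq");
    this.onSize(rows[0].count);
  }

  // Oldest failures first.
  async list(): Promise<DlqEntry[]> {
    const { rows } = await this.db.query("SELECT * FROM webhook_dlq ORDER BY failed_at ASC");
    return rows.map(rowToEntry);
  }

  async get(id: string): Promise<DlqEntry | undefined> {
    const { rows } = await this.db.query("SELECT * FROM webhook_dlq WHERE id = $1", [id]);
    return rows.length > 0 ? rowToEntry(rows[0]) : undefined;
  }

  async markReplayed(id: string, at: Date = new Date()): Promise<void> {
    await this.db.query("UPDATE webhook_dlq SET replayed_at = $2 WHERE id = $1", [id, at]);
  }
}
```

Passing the size callback in through the constructor keeps the sketch free of any metrics dependency; the PR itself describes calling recordWebhookDlqSize() from push().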

Modified Files

| File | What changed |
| --- | --- |
| src/jobs/outbox.ts | Instantiates PostgresDlqStore(pool) and passes it into WebhookService instead of the default in-memory store. |
| src/middleware/metrics.ts | Adds a webhook_dlq_size Prometheus gauge and a recordWebhookDlqSize() helper, keeping metrics centralised. |
| docs/webhooks.md | Documents the DLQ table schema, how entries are stored, and the replay flow. |
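The wiring change in outbox.ts is constructor injection: the service keeps a default in-memory store, and the job passes a durable one instead. The sketch below illustrates why that matters across restarts. Every name in it is hypothetical (`Entry`, `TableBackedDlqStore`, `SketchWebhookService`, `demoRestart`), the `DlqStore` interface is trimmed to two methods, and a shared array stands in for the Postgres table, so treat it as an illustration of the pattern, not as the repository's code.

```typescript
type Entry = { id: string };

// Trimmed, hypothetical version of the DlqStore interface.
interface DlqStore {
  push(entry: Entry): Promise<void>;
  list(): Promise<Entry[]>;
}

// In-memory store: state lives and dies with the instance.
class MemoryDlqStore implements DlqStore {
  private readonly entries = new Map<string, Entry>();
  async push(entry: Entry): Promise<void> {
    this.entries.set(entry.id, entry);
  }
  async list(): Promise<Entry[]> {
    return [...this.entries.values()];
  }
}

// Stand-in for a durable store: `table` plays the role of the
// webhook_dlq table, which outlives any single process.
class TableBackedDlqStore implements DlqStore {
  private readonly table: Entry[];
  constructor(table: Entry[]) {
    this.table = table;
  }
  async push(entry: Entry): Promise<void> {
    this.table.push(entry);
  }
  async list(): Promise<Entry[]> {
    return [...this.table];
  }
}

// Hypothetical service: falls back to the in-memory store when none is injected.
class SketchWebhookService {
  private readonly dlq: DlqStore;
  constructor(dlq: DlqStore = new MemoryDlqStore()) {
    this.dlq = dlq;
  }
  async deadLetter(entry: Entry): Promise<void> {
    await this.dlq.push(entry);
  }
  async deadLetters(): Promise<Entry[]> {
    return this.dlq.list();
  }
}

// Simulate two process lifetimes: each "boot" constructs a fresh service.
async function demoRestart(): Promise<{ memory: number; durable: number }> {
  const table: Entry[] = [];

  const memoryBefore = new SketchWebhookService();
  await memoryBefore.deadLetter({ id: "evt-1" });
  const memoryAfter = new SketchWebhookService(); // restart: new empty Map

  const durableBefore = new SketchWebhookService(new TableBackedDlqStore(table));
  await durableBefore.deadLetter({ id: "evt-1" });
  const durableAfter = new SketchWebhookService(new TableBackedDlqStore(table));

  return {
    memory: (await memoryAfter.deadLetters()).length,
    durable: (await durableAfter.deadLetters()).length,
  };
}
```

In this simulation the in-memory service sees zero entries after the "restart" while the table-backed one still sees the dead-lettered event, which is the behaviour the PR is after.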

Test Fixes (pre-existing issues, not related to DLQ)

| File | Fix |
| --- | --- |
| tests/routes/webhooks.test.ts | Updated audit.getLogs() calls to use the new async tenant-scoped signature (await audit.getLogs(undefined, {}, 100, 0, { allowSuperScope: true })). |
| tests/routes/apiKeys.test.ts | Same getLogs signature fix for audit assertions. |
| tests/routes/members.test.ts | Added tenantId: 'tenant-1' to mock user, updated assertions to match the tenant-scoped MemberService signature. |
| tests/routes/rateLimit.test.ts | Replaced fragile vi.doMock + dynamic re-import pattern with vi.spyOn for fail-open/fail-closed Redis tests. |
| src/routes/admin/member.ts | Added { mergeParams: true } to the members Router so :orgId propagates from the parent route. |

Database Migration

-- 008_create_webhook_dlq (up)
CREATE TABLE webhook_dlq (
  id                     VARCHAR(255) PRIMARY KEY,
  webhook_id             VARCHAR(255) NOT NULL,
  payload                JSONB        NOT NULL,
  failed_at              TIMESTAMPTZ  NOT NULL,
  attempts               INTEGER      NOT NULL,
  last_status_code       INTEGER,
  last_error             TEXT,
  response_body_snippet  TEXT,
  replayed_at            TIMESTAMPTZ
);

CREATE INDEX ON webhook_dlq (failed_at);
CREATE INDEX ON webhook_dlq (webhook_id);
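Because the entries now live in a table, operators can ask questions of the DLQ directly. As an illustration only, the helper below counts entries that have not been replayed, grouped by webhook. It assumes that a NULL replayed_at means "not yet replayed"; `Queryable`, `PendingCount`, `PENDING_BY_WEBHOOK_SQL`, and `pendingByWebhook` are hypothetical names and not part of this PR.

```typescript
// Structural stand-in for the one pool method the sketch needs (an assumption).
interface Queryable {
  query(text: string, params?: unknown[]): Promise<{ rows: any[] }>;
}

interface PendingCount {
  webhookId: string;
  pending: number;
  oldestFailedAt: Date;
}

// Assumption: replayed_at IS NULL marks an entry that still needs a replay.
const PENDING_BY_WEBHOOK_SQL = `
  SELECT webhook_id,
         COUNT(*)::int  AS pending,
         MIN(failed_at) AS oldest_failed_at
  FROM webhook_dlq
  WHERE replayed_at IS NULL
  GROUP BY webhook_id
  ORDER BY pending DESC`;

// Return one summary row per webhook that has unreplayed entries.
async function pendingByWebhook(db: Queryable): Promise<PendingCount[]> {
  const { rows } = await db.query(PENDING_BY_WEBHOOK_SQL);
  return rows.map((row) => ({
    webhookId: row.webhook_id,
    pending: row.pending,
    oldestFailedAt: row.oldest_failed_at,
  }));
}
```

The same statement can be pasted into psql as-is; the TypeScript wrapper only renames the columns to camelCase.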

Metrics

A new Prometheus gauge is exposed:

webhook_dlq_size{} — Current number of entries in the webhook dead-letter queue

The gauge is updated automatically on every push() call via recordWebhookDlqSize().
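For readers unfamiliar with the helper pattern, a recordWebhookDlqSize()-style function can be as small as a guarded call to the gauge's set method. The sketch below is an assumption about the shape, not the code in src/middleware/metrics.ts: `GaugeLike` and `makeRecordWebhookDlqSize` are hypothetical names, and `GaugeLike` is a structural stand-in for the gauge (prom-client's Gauge exposes a compatible set(value) method).

```typescript
// Structural stand-in for the gauge, so the sketch needs no dependencies.
interface GaugeLike {
  set(value: number): void;
}

// Build a recorder bound to one gauge. Invalid sizes are ignored so that a
// bad count never overwrites the last good reading.
function makeRecordWebhookDlqSize(gauge: GaugeLike): (size: number) => void {
  return (size: number): void => {
    if (!Number.isFinite(size) || size < 0) {
      return;
    }
    gauge.set(Math.floor(size));
  };
}
```

The PR only mentions updating the gauge from push(). If the value should also drop after a replay, the same recorder could be called from markReplayed(), but that would be a design decision beyond what is described here.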

Testing

  • New tests: postgresDlqStore.test.ts — 100% line coverage of the store using pg-mem.
  • Existing test suite: All pre-existing test failures in webhooks.test.ts, apiKeys.test.ts, members.test.ts, and rateLimit.test.ts have been fixed as part of this PR.

How to Verify

# Run the full test suite
npm run test

# Run only the DLQ store tests
npm run test -- src/services/webhooks/postgresDlqStore.test.ts

# Run with coverage
npm run test -- --coverage src/services/webhooks/postgresDlqStore.test.ts

Checklist

  • webhook_dlq migration added
  • PostgresDlqStore implements DlqStore interface
  • buildDlqEntry redaction logic preserved (unchanged)
  • Stores: payload, attempts, lastStatusCode, lastError, responseBodySnippet, replayedAt
  • Wired into publisher path (outbox.ts → WebhookService)
  • DLQ size exposed as Prometheus gauge metric
  • Unit tests with 100% coverage
  • Documentation updated

Closes #324

Vivian-04 added 2 commits May 27, 2026 18:58
…tore

- Add 008_create_webhook_dlq migration with webhook_dlq table
- Implement PostgresDlqStore with full DlqStore interface
- Wire PostgresDlqStore into outbox job and WebhookService
- Add webhook_dlq_size gauge metric via prom-client
- Add unit tests with 100% coverage for PostgresDlqStore
- Update webhooks documentation
- Fix pre-existing test issues (audit tenant scoping, getLogs signatures)

Closes CredenceOrg#324
@drips-wave

drips-wave Bot commented May 27, 2026

@Vivian-04 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

@Vivian-04
Contributor Author

Hi, please kindly note that the Integration Tests were already failing before I started working on both issues. Please kindly check. Thank youuu!

@Baskarayelu Baskarayelu merged commit 4d82196 into CredenceOrg:main May 29, 2026
0 of 2 checks passed

Development

Successfully merging this pull request may close these issues.

[Backend] Webhooks DLQ: persist dead-letter entries in Postgres instead of MemoryDlqStore
