
Breach Scanning

Melvin PETIT edited this page Jun 16, 2026 · 1 revision

The scan engine checks every employee email against every configured breach-intelligence provider, persists matches, and fires notifications. It lives in src/lib/scan/.

Providers

A provider implements one contract (src/lib/scan/types.ts):

interface BreachProvider {
  id: ApiProvider          // HIBP | DEHASHED | LEAKCHECK | INTELX | SNUSBASE
  source: BreachSource     // HIBP | MANUAL | DARK_WEB
  lookup(email: string, apiKey: string): Promise<Finding[]>
}

A Finding is the normalized result shared across providers:

interface Finding {
  name: string        // breach identifier, used as the Breach unique key
  breachDate: Date    // epoch (1970) when the provider does not expose it
  dataTypes: string[] // exposed data types, normalized to snake_case
}
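To illustrate the contract, here is a minimal sketch of a provider that maps a raw upstream hit into a Finding. The RawHit shape, the toSnakeCase and toFinding helpers, the canned data, and the use of string unions in place of the real enums are all assumptions made for this example; no real provider API is called.

```typescript
// String unions stand in for the project's enums (assumption for this sketch).
type ApiProvider = "HIBP" | "DEHASHED" | "LEAKCHECK" | "INTELX" | "SNUSBASE";
type BreachSource = "HIBP" | "MANUAL" | "DARK_WEB";

interface Finding {
  name: string;
  breachDate: Date;
  dataTypes: string[];
}

interface BreachProvider {
  id: ApiProvider;
  source: BreachSource;
  lookup(email: string, apiKey: string): Promise<Finding[]>;
}

// Hypothetical upstream response shape; real providers each differ.
interface RawHit {
  title: string;
  date?: string;
  fields: string[];
}

// Normalizes space- or punctuation-separated labels ("Credit Card") to snake_case.
// camelCase input is not split; it is only lowercased.
function toSnakeCase(label: string): string {
  return label
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

// Falls back to the Unix epoch when the date is missing or unparseable.
function toFinding(hit: RawHit): Finding {
  const parsed = hit.date ? new Date(hit.date) : new Date(0);
  return {
    name: hit.title,
    breachDate: Number.isNaN(parsed.getTime()) ? new Date(0) : parsed,
    dataTypes: hit.fields.map(toSnakeCase),
  };
}

// Hypothetical provider backed by canned data instead of an HTTP call.
const exampleProvider: BreachProvider = {
  id: "LEAKCHECK",
  source: "DARK_WEB",
  async lookup(email: string, _apiKey: string): Promise<Finding[]> {
    const canned: Record<string, RawHit[]> = {
      "alice@example.com": [
        { title: "ExampleForum2021", fields: ["Email addresses", "Hashed Password"] },
      ],
    };
    return (canned[email] ?? []).map(toFinding);
  },
};
```

Keeping normalization in small pure helpers means every provider can share it, and lookup() only has to fetch and map.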

Wired providers are registered in src/lib/scan/registry.ts:

| Provider | ApiProvider id | Source |
| --- | --- | --- |
| Have I Been Pwned | HIBP | HIBP |
| LeakCheck | LEAKCHECK | dark web |
| DeHashed | DEHASHED | dark web |
| Intelligence X | INTELX | dark web |
| Snusbase | SNUSBASE | dark web |

To add a source, implement BreachProvider, add the implementation to the PROVIDERS array in the registry, and add a value to the ApiProvider enum.

How a scan runs

runScan(companyId, providers) in src/lib/scan/runner.ts:

  1. Loads all employees for the company with their existing breachRecords.
  2. Resolves alert recipients (company admins, only if email is enabled) and active webhooks once, up front.
  3. For each employee, for each active provider, calls lookup(). Provider errors are isolated (caught and skipped) so one failing provider never aborts the scan.
  4. Each new finding is persisted by persistFinding, which upserts the Breach and skips the finding if the employee is already linked to it. Otherwise it creates the BreachRecord and Alert, then sends email and dispatches webhooks.
  5. Sleeps RATE_LIMIT_MS (1500 ms) between employees to stay within provider rate limits.

Returns { scanned, newRecords, newAlerts }.
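The control flow of the steps above can be sketched in memory as follows. This is not the code in runner.ts: the Employee and Provider shapes are simplified stand-ins, persistence is reduced to a Set of breach names, notifications are omitted, and skipping the sleep after the last employee is an assumption.

```typescript
interface Finding {
  name: string;
  breachDate: Date;
  dataTypes: string[];
}

// Simplified stand-ins for the real models (assumptions for this sketch).
interface Provider {
  id: string;
  lookup(email: string): Promise<Finding[]>;
}

interface Employee {
  email: string;
  breachNames: Set<string>; // breaches this employee is already linked to
}

interface ScanResult {
  scanned: number;
  newRecords: number;
  newAlerts: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runScanSketch(
  employees: Employee[],
  providers: Provider[],
  rateLimitMs: number = 1500,
): Promise<ScanResult> {
  const result: ScanResult = { scanned: 0, newRecords: 0, newAlerts: 0 };

  for (let i = 0; i < employees.length; i++) {
    const employee = employees[i];

    for (const provider of providers) {
      let findings: Finding[];
      try {
        findings = await provider.lookup(employee.email);
      } catch {
        // Isolate the failure: skip this provider, keep scanning.
        continue;
      }

      for (const finding of findings) {
        // Already linked: nothing new to record.
        if (employee.breachNames.has(finding.name)) continue;
        employee.breachNames.add(finding.name);
        // One record and one alert per new link, as in step 4.
        result.newRecords += 1;
        result.newAlerts += 1;
      }
    }

    result.scanned += 1;
    // Pause between employees; no pause after the last one (assumption).
    if (i < employees.length - 1) await sleep(rateLimitMs);
  }

  return result;
}
```

Wrapping only the lookup() call in try/catch keeps the isolation boundary narrow: a provider outage is skipped, while an error in persistence would still surface.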

Active providers are loaded by loadActiveProviders, which decrypts each stored API key server-side and stamps lastUsedAt.

Severity scoring

Severity is derived from exposed data types (severityFor in the runner). The critical set is:

password, hashed_password, credit_card, ssn, bank_account

| Critical types in the finding | Severity |
| --- | --- |
| 2 or more | CRITICAL |
| exactly 1 | HIGH |
| 0 | MEDIUM |
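The scoring rule can be expressed as a small pure function. The sketch below is an illustration of the rule, not the severityFor source; the name severityForSketch and the choice to count each distinct critical type once are assumptions.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM";

const CRITICAL_TYPES = new Set<string>([
  "password",
  "hashed_password",
  "credit_card",
  "ssn",
  "bank_account",
]);

function severityForSketch(dataTypes: string[]): Severity {
  // De-duplicate first so a repeated type is not counted twice (assumption).
  const distinct = Array.from(new Set(dataTypes));
  const criticalCount = distinct.filter((t) => CRITICAL_TYPES.has(t)).length;

  if (criticalCount >= 2) return "CRITICAL";
  if (criticalCount === 1) return "HIGH";
  return "MEDIUM";
}
```

For example, a finding exposing email and password scores HIGH, while one exposing password and ssn scores CRITICAL.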

Triggering a scan

Send POST /api/employees/scan to start a scan; any authenticated user may call it. The route enforces three guards:

  • Rate limit: 5 scans per company per minute, else 429.
  • No provider configured: 503 with a prompt to add a key in Data API.
  • Concurrency: one running scan per company at a time, else 409.
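The three guards can be modeled as one pure check that returns the first rejection, or null when the scan may start. This is a sketch under stated assumptions: the GuardInput shape, the checkScanGuards name, the messages, and the evaluation order (taken from the list order above) are hypothetical, and the real route may count the rate-limit window differently.

```typescript
// Hypothetical snapshot of the state the route would look up per request.
interface GuardInput {
  scansInLastMinute: number; // scans already started by this company in the window
  activeProviders: number;   // providers with a configured API key
  scanRunning: boolean;      // whether a scan is currently in progress
}

interface GuardRejection {
  status: 429 | 503 | 409;
  message: string;
}

const MAX_SCANS_PER_MINUTE = 5;

function checkScanGuards(input: GuardInput): GuardRejection | null {
  if (input.scansInLastMinute >= MAX_SCANS_PER_MINUTE) {
    return { status: 429, message: "Too many scans, try again in a minute." };
  }
  if (input.activeProviders === 0) {
    return { status: 503, message: "No provider configured. Add a key in Data API." };
  }
  if (input.scanRunning) {
    return { status: 409, message: "A scan is already running for this company." };
  }
  return null; // all guards passed
}
```

A caller can treat 429 and 409 as retryable after a delay, whereas 503 needs a configuration change before retrying.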

See API Reference for the full endpoint list and Configuration for where keys come from.
