[Feature]: typed WaId identity value object + a lid <-> phone resolution table #349

@tobiasstrebitzer

Pre-flight

  • I searched existing issues and this isn't already requested

Problem / motivation

Every WhatsApp identity in the app is a raw JID string today (from, to, chatId, contact/chat
id, stored values, send DTOs). Two problems fall out of that:

  • The neutral @c.us convention is enforced only by discipline plus the wa-id helpers from #342
    (feat(engine): engine-neutral WhatsApp identities (Baileys inbound conformance)); nothing
    actually stops a string in the wrong dialect from slipping through.
  • The hard case, a lid whose phone we don't know, isn't visible in the type. To the code, 123@lid
    and 628...@c.us are interchangeable strings, so a phone-based match against a lid contact quietly
    fails and nothing signals that resolution was even needed. Concretely: a webhook from-filter that
    matches a DM today silently misses the same person's already-resolved group message.

Note this is a behavior gap, not just a typing nicety: response shapes don't change, but filter and
matching behavior does (misses become hits once a lid is resolved).

Proposed solution

A small value object that makes the kind, the (maybe-unknown) phone, and the lid explicit, so the type
carries the rules instead of every caller re-deriving them.

type WaIdKind = 'user' | 'group' | 'lid' | 'status' | 'newsletter' | 'broadcast';

declare class WaId {   // API shape only; method bodies omitted
  readonly kind: WaIdKind;
  readonly phone?: string;    // E.164 digits, when known
  readonly lid?: string;      // lid number, when this id is/carries a lid
  readonly groupId?: string;  // for groups
  readonly raw: string;       // original engine jid; debug-only, excluded from matching

  toNeutral(): string;
  refersToSamePerson(other: WaId): boolean;   // relational, NOT a hashable key

  static fromEngineJid(jid: string, resolvePhone?: (jid: string) => string | null): WaId;
  static fromUserInput(value: string): WaId;
}
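
To make the kind field concrete, here is a rough sketch of how an engine JID could be classified by its server part. classifyJid and toNeutralString are hypothetical names for illustration, not the wa-id.ts API from #342, and the suffix table (@c.us, @s.whatsapp.net, @g.us, @lid, @newsletter, @broadcast) is my assumption about the dialects involved.

```typescript
type JidKind = 'user' | 'group' | 'lid' | 'status' | 'newsletter' | 'broadcast';

interface ParsedJid {
  kind: JidKind;
  local: string; // part before '@', with any ':<device>' suffix dropped
}

// Hypothetical helper, not the wa-id.ts API: classify a JID by its server part.
function classifyJid(jid: string): ParsedJid {
  const at = jid.lastIndexOf('@');
  if (at <= 0) throw new Error(`not a JID: ${jid}`);
  const rawLocal = jid.slice(0, at);
  const colon = rawLocal.indexOf(':');
  const local = colon < 0 ? rawLocal : rawLocal.slice(0, colon);
  const server = jid.slice(at + 1);

  if (server === 'broadcast') {
    // status@broadcast is the status feed; anything else is a broadcast list.
    return { kind: local === 'status' ? 'status' : 'broadcast', local };
  }
  switch (server) {
    case 'c.us':           // assumed whatsapp-web.js dialect
    case 's.whatsapp.net': // assumed Baileys dialect
      return { kind: 'user', local };
    case 'g.us':
      return { kind: 'group', local };
    case 'lid':
      return { kind: 'lid', local };
    case 'newsletter':
      return { kind: 'newsletter', local };
    default:
      throw new Error(`unknown JID server: ${server}`);
  }
}

// Users normalize to the neutral @c.us form; other kinds pass through unchanged here.
function toNeutralString(jid: string): string {
  const parsed = classifyJid(jid);
  return parsed.kind === 'user' ? `${parsed.local}@c.us` : jid;
}
```

The point of the sketch is only that the dialect decision lives in one place; the real WaId would also carry phone/lid and the optional resolver.
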

Key properties:

  • WaId is in-memory only. It is never persisted as rows and introduces no table of identities.
    The wire/storage format stays the neutral string. The only new persistent artifact is the
    resolution table below.
  • Matching is relational, not structural equality: two WaIds refer to the same person if they share a
    known phone or share a lid. It's not a clean equivalence relation while lids are unresolved
    ({phone:1, lid:X} matches both {phone:1} and {lid:X}, but those two don't match each other), so
    there's a deliberate third outcome: matched / didn't match / couldn't tell (phone unknown). raw
    is excluded from matching so the same person seen via two engines doesn't split.
  • Adapters build WaId at the engine boundary; the contract documented as prose on IWhatsAppEngine
    in #342 (feat(engine): engine-neutral WhatsApp identities (Baileys inbound conformance)) becomes
    type-enforced. The wire stays a neutral string in both directions, so REST/webhook output is
    unchanged.
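
A minimal sketch of the three-outcome matching described above, assuming a plain {phone, lid} shape rather than the full WaId class. MatchResult, PersonRef and the free function are hypothetical; the proposal's refersToSamePerson returns boolean, so how "couldn't tell" is surfaced is still open.

```typescript
type MatchResult = 'match' | 'no-match' | 'unknown';

// Hypothetical stand-in for the person-identifying part of a WaId.
interface PersonRef {
  phone?: string;
  lid?: string;
}

function refersToSamePerson(a: PersonRef, b: PersonRef): MatchResult {
  // A shared known phone or a shared lid is a positive match.
  if (a.phone !== undefined && a.phone === b.phone) return 'match';
  if (a.lid !== undefined && a.lid === b.lid) return 'match';

  // If both sides expose the same kind of field and it differs, that is a real miss.
  const phonesComparable = a.phone !== undefined && b.phone !== undefined;
  const lidsComparable = a.lid !== undefined && b.lid !== undefined;
  if (phonesComparable || lidsComparable) return 'no-match';

  // Nothing comparable (e.g. phone-only vs unresolved lid): resolution is needed.
  return 'unknown';
}
```

With this shape, {phone:1, lid:X} matches both {phone:1} and {lid:X}, while {phone:1} against {lid:X} comes back 'unknown' instead of a silent miss, which is the signal a from-filter could use to trigger resolution.
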

The lid <-> phone resolution table

WaId resolution only helps if lid -> phone lookup is synchronous (filters/dispatch can't await a
network call). Resolution reads from an in-memory map loaded from a persisted table on boot and
written through on every new mapping. Two write sources with opposite costs:

  • Passive (free): mappings WhatsApp pushes to us - history sync (lidPnMappings),
    contacts.upsert, and message senderPn/participantPn. Captured eagerly, batched writes.
  • Active (rate-limited): getContactLidAndPhone (whatsapp-web.js) / the Baileys lid-mapping store,
    only when something needs a phone we don't have. Done lazily and written through immediately
    (including negative results) so it's one network call per unknown lid, ever.

Today this is scattered and mostly ephemeral (Baileys' in-memory lidToPn, the session service's
per-session lidPhoneCache gated by RESOLVE_LID_TO_PHONE, whatsapp-web.js's inline lookup). The
proposal consolidates them into one persisted, cross-session table on the data connection (so a
TypeORM migration under src/database/migrations/). A stored mapping is "best known, not forever"
(numbers get recycled), so it's treated as a cache WhatsApp can correct.
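
Roughly what I have in mind for the read/write path, as a sketch only. LidPhoneStore is a hypothetical persistence port (the real one would sit on the TypeORM data connection), and the rule that a negative result never overwrites a known phone is my assumption, not something settled above.

```typescript
// Hypothetical persistence port; null phone = "asked, WhatsApp had no mapping".
interface LidPhoneStore {
  loadAll(): Promise<Array<[string, string | null]>>;
  save(lid: string, phone: string | null): Promise<void>;
}

type Lookup =
  | { status: 'resolved'; phone: string }
  | { status: 'no-phone' }   // negative result cached, do not ask again
  | { status: 'unseen' };    // never resolved, active lookup may be worth it

class LidPhoneResolver {
  private readonly map = new Map<string, string | null>();
  private readonly store: LidPhoneStore;

  constructor(store: LidPhoneStore) {
    this.store = store;
  }

  // Load the persisted table once on boot.
  async boot(): Promise<void> {
    for (const [lid, phone] of await this.store.loadAll()) this.map.set(lid, phone);
  }

  // Synchronous read, safe to call from filters/dispatch.
  lookup(lid: string): Lookup {
    if (!this.map.has(lid)) return { status: 'unseen' };
    const phone = this.map.get(lid) ?? null;
    return phone === null ? { status: 'no-phone' } : { status: 'resolved', phone };
  }

  // Write-through: memory is updated first, then the mapping is persisted.
  async record(lid: string, phone: string | null): Promise<void> {
    const known = this.map.get(lid);
    if (this.map.has(lid) && known === phone) return;     // nothing new
    if (phone === null && typeof known === 'string') return; // assumed: keep known phone
    this.map.set(lid, phone);
    await this.store.save(lid, phone);
  }
}
```

Passive capture would call record() in batches; the lazy active path would call it once per 'unseen' lid, including with null, so a later lookup() answers from memory either way.
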

Open questions (where I'd like input before building)

  1. Wire/storage format: keep the neutral JID string (<phone>@c.us) as the serialized form (least
    churn, human-readable, matches existing DTO examples and stored values), or store bare digits? I'm
    leaning toward keeping the neutral string - WaId is the in-memory type, the string is the boundary
    format.
  2. Active resolution on Baileys: the eager-passive + lazy-active write-through model feels settled.
    The open bit is whether the installed Baileys exposes a lid-mapping lookup
    (signalRepository.lidMapping) to rely on for active resolution, since resolveContactPhone is
    cache-only today.
  3. RESOLVE_LID_TO_PHONE semantics: today it gates senderPhone. Should WaId resolution respect
    it (privacy), or always resolve internally into the table and just gate what's exposed?

Scope

Staged so nothing observable breaks; the wire stays a neutral string throughout.

  1. Land wa-id.ts parse/format primitives - done in #342 (feat(engine): engine-neutral WhatsApp identities (Baileys inbound conformance)).
  2. Add WaId as a thin wrapper over those primitives, keeping the public string fields.
  3. Move consumers over one module at a time (id-comparison spots first: contact matching, then
    groups/messages).
  4. Parse DTOs / API inputs into WaId at the edge, keeping the wire format neutral.
  5. Switch IncomingMessage (and friends) to typed WaId fields internally; delete scattered string
    normalization. REST/webhook output keeps emitting toNeutral(), so nothing observable changes.
  6. Fold isLidSender / senderPhone into WaId and deprecate them.

WaId type + tests is small (~1 file, builds on wa-id.ts). Consumer migration is medium and spread
out (~50 files mention a dialect literal, about half specs/fixtures), done module-by-module behind the
existing string fields so each step is reviewable and reversible. Worth starting only once the #342
contract has settled in main.

Alternatives considered

  1. Status quo - raw strings + a mapping side-table + dual webhook fields (the WAHA/Evolution shape).
    Rejected as the primary model: it pushes "is this the same person" onto every consumer and every
    internal call site, which is the discipline-dependent setup that produces the silent-miss bug here
    and the recurring lid-leak bugs in both peers. We reuse the parts that work (a persisted lid<->pn
    table, eager senderPn/*Alt capture) but put them behind one typed identity instead of exposing
    raw strings.
  2. Variant-set + substring matching (Evolution's jidOptions). Rejected: brittle and only works for
    variants already expanded into the set.
  3. Lean entirely on the engine's mapping store, no neutral type. Rejected: two different mechanisms
    across engines (Baileys' persistent store vs wwebjs' per-call query), the helpers won't unify forms
    anyway, and raw differs per engine - so identity would split when switching engines.
  4. Chat-level merge toggle (WAHA's merge). Solves duplicate threads, not filter/matching
    correctness - orthogonal, and a symptom-level fix rather than an identity model.
  5. Keep raw strings + lean harder on the wa-id helpers. Rejected: the unresolved-lid case stays
    invisible to the type, so the silent-miss bug remains a matter of caller discipline.

Net: WaId + a unified resolution table takes the ecosystem's converged building blocks (a lid<->pn
table, capture of pushed mappings) and exposes them as one neutral typed identity with an explicit
"couldn't tell" state, rather than leaving "same person" to each consumer, which is where both
peers keep getting bitten.

(Two narrower in-design variants, separate from the above: a discriminated union on kind vs the bag of
optionals, and storing bare digits vs the neutral string - see open question 1.)

Scope

  • I understand some features are limited by the underlying WhatsApp engine (e.g. interactive Buttons/List messages are not supported on whatsapp-web.js).

Metadata

Labels: enhancement (New feature or request)
