Skip to content

feat(engine): engine-neutral WhatsApp identities (Baileys inbound conformance)#342

Merged
rmyndharis merged 2 commits into
rmyndharis:mainfrom
tobiasstrebitzer:baileys-jid-canonicalization
Jun 20, 2026
Merged

feat(engine): engine-neutral WhatsApp identities (Baileys inbound conformance)#342
rmyndharis merged 2 commits into
rmyndharis:mainfrom
tobiasstrebitzer:baileys-jid-canonicalization

Conversation

@tobiasstrebitzer

@tobiasstrebitzer tobiasstrebitzer commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Why

WhatsApp addresses the same entity through several dialects:

  • <phone>@c.us and <phone>@s.whatsapp.net - the same user, two dialects (whatsapp-web.js's vs
    the raw protocol's). Pure cosmetic difference.
  • <lid>@lid - a user addressed by a privacy id (LID). The number is not a phone; WhatsApp is
    migrating to LIDs so phone numbers aren't exposed in groups/communities. There's a lid → phone
    mapping that is sometimes known and sometimes not.
  • <id>@g.us (group), status@broadcast, <id>@newsletter, plus :device suffixes.

Because whatsapp-web.js was the original engine, its @c.us dialect became the de-facto neutral
format
the whole application layer assumes (send DTOs, the dashboard, group/contact/message modules,
plugins) - but it was never declared, just inherited. The Baileys adapter is the new engine that
doesn't conform: it leaks @s.whatsapp.net / @lid into neutral payloads, so under Baileys the same
contact surfaces under a different id than under whatsapp-web.js, @lid contacts can't be resolved to
a phone for display, and any consumer that compares a message's sender against a known contact id sees
a mismatch.

What this PR does

Turns that accidental convention into a declared identity contract at the engine boundary (an
anti-corruption layer), and makes the Baileys adapter conform to it on the inbound read path.

  • engine/identity/wa-id.ts - the shared, documented home of the contract:
    • parseWaId(jid) classifies any JID into { kind, userPart, device, raw }
      (user | group | lid | status | newsletter | broadcast | unknown).
    • toNeutralJid(jid, resolvePhone?) reduces it to the neutral dialect: <phone>@c.us, <id>@g.us,
      <lid>@lid, special channels - never @s.whatsapp.net, never a :device suffix. It prefers
      @c.us
      (resolves a lid to its phone when the engine knows the mapping) and keeps an
      unresolved lid as @lid
      - a phone-unknown privacy id is a first-class state, not faked into a
      phone.
  • IWhatsAppEngine now documents this contract as the rule every adapter's emitted ids must follow.
  • Baileys adapter applies it to inbound message from / to / chatId / author, plus the
    revoked and reaction events. BaileysSessionStore.toNeutralJid delegates to the shared module,
    passing its session's lid → pn resolver.

Net change

  • 9 files, +307 / −44. No migration, no API change, no new dependencies.
  • Behavior change: Baileys inbound message / message.revoked / reaction payloads now carry the
    neutral dialect (@c.us / @g.us / resolved-or-raw @lid) instead of @s.whatsapp.net / raw
    @lid, matching the whatsapp-web.js engine.

Scope - deliberately step 1 of an arc

This is the first installment of "engine-neutral identities," intentionally narrow so it can land on
its own. Left as follow-ups (same direction):

  • Outbound symmetry - normalize the chatId accepted by sendXxx() so the app can address a
    neutral id and each adapter de-normalizes (Baileys → @s.whatsapp.net). Outbound currently assumes
    @c.us works, which won't hold cleanly for lid-only contacts.
  • Contact/chat list ids through the same canonicalization, so lookups and display stay consistent
    with message ids.
  • whatsapp-web.js conformance audit - it already emits @c.us, but route its lid handling through
    the shared wa-id module so the contract is enforced symmetrically.
  • Typed WaId value object across the application layer (DTOs/services) - the endgame; its own
    initiative. See the design note.
  • (Minor) fold baileys-group-mapper's ad-hoc user-part helper into wa-id.

toNeutralJid's lid resolution depends on the lid → pn map being populated (history sync); this
pairs with the Baileys session-store persistence branch so the map survives restarts, but does not
depend on it.

Testing / verification

  • npm run build, npm run lint, full unit suite (742) - all green.
  • New coverage:
    • wa-id: parseWaId per dialect (incl. @s.whatsapp.net/@c.us folding into one user kind,
      device suffix, group/lid/status/newsletter/unknown); toNeutralJid for dialect folding,
      idempotence, lid resolved vs. kept-raw, status/empty/unknown passthrough.
    • mapper: the supplied normalizer is applied to from / to / chatId / author.
    • adapter integration: a plain @s.whatsapp.net → @c.us inbound message, and an @lid sender
      resolved to <phone>@c.us via a history-sync lidPnMappings event.

Notes for reviewers

  • This brings the Baileys engine to id-parity with the whatsapp-web.js engine across the neutral
    payloads.
  • The default-identity normalizer on buildIncomingMessageFromBaileys() means its existing behavior
    is unchanged unless the adapter opts in - canonicalization is entirely driven by the normalizer the
    adapter supplies.
  • Shares two files (baileys-session-store.ts, baileys.adapter.ts) with the session-store
    persistence branch, so whichever merges second needs a trivial rebase.

@tobiasstrebitzer tobiasstrebitzer force-pushed the baileys-jid-canonicalization branch from 0cdae12 to 4cd9ab6 Compare June 19, 2026 10:23
@tobiasstrebitzer tobiasstrebitzer marked this pull request as ready for review June 19, 2026 10:26
@tobiasstrebitzer tobiasstrebitzer force-pushed the baileys-jid-canonicalization branch from 4cd9ab6 to d05e219 Compare June 19, 2026 10:41
@tobiasstrebitzer tobiasstrebitzer marked this pull request as draft June 19, 2026 10:45
@tobiasstrebitzer tobiasstrebitzer marked this pull request as ready for review June 19, 2026 10:45

@rmyndharis rmyndharis left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @tobiasstrebitzer — declaring the neutral-identity contract as an anti-corruption layer is exactly the right direction, and the wa-id module is clean. CI is green and it's conflict-free against the rest of the open work. A deep review did surface two behavioral regressions that both stem from the same root cause — the inbound author is canonicalized (lid→phone, @c.us) but the other side of each comparison stays in the old dialect — so I'd like to hold merge until these are addressed.

Blocking

1. Group admin/controller recognition breaks (one-sided canonicalization).
After this PR, in a LID-addressed group a resolved author becomes <phone>@c.us, but the group participant list + owner stay raw @lid (mapBaileysGroupInfo is untouched). The translation coordinator decides admin/controller via widEquals(adminId, msg.author), which compares user-parts and deliberately does not bridge lid↔phone. So widEquals('111@lid', '628111@c.us')111 vs 628111no match → the legitimate group admin/owner is no longer recognized and /tr admin commands silently deny them. Pre-PR both sides were @lid and matched.
→ Canonicalize the Baileys group participant + owner ids through the same sessionStore.toNeutralJid (in mapBaileysGroupInfo/mapBaileysGroup, via the adapter that holds the store) so both sides of widEquals share the dialect. Please add a coordinator test with a lid author + a lid-resolved-to-phone admin list.

2. senderPhone regresses to null for resolved-lid senders (RESOLVE_LID_TO_PHONE).
The PR rewrites author/from to <phone>@c.us but keeps isLidSender=true; session.service then calls resolveSenderPhone('<phone>@c.us'), and resolvePhone() only handles @s.whatsapp.net/@lid — it returns null for @c.us. So the dedicated senderPhone field is now null for exactly the case the feature exists to surface (the phone is still in from, so it's recoverable, but the field silently regresses).
→ Have resolvePhone return the user-part for the @c.us branch (it already is a phone), or derive senderPhone from the resolved from/author. Please add a test for the RESOLVE_LID_TO_PHONE + resolved-lid path.

Please also address

3. This is a value-level contract change for Baileys webhook/WS payloads — not purely internal. message.received / revoked / reaction ids flip @s.whatsapp.net (and resolved @lid) → @c.us. That's the intended fix, but a consumer that stored/compared the old ids will see a different value. Please call it out explicitly in the CHANGELOG/PR (it's currently described as an internal fix) and confirm the OpenWA-n8n node tolerates @c.us ids from Baileys.

Nice-to-have (tests)

A few uncovered branches that would be cheap to lock in: an unresolved lid staying @lid end-to-end through the adapter; a :device-suffixed lid (it's currently passed raw to resolvePhone, so a device-suffixed lid misses a known mapping — worth a parseWaId→strip-device before lookup); the broadcast/newsletter build branches; and the lowercasing behavior in parseWaId.

Happy to pair on the group-participant canonicalization if useful — once 1 & 2 land with tests, this is a merge for me.

…formance)

WhatsApp addresses the same entity through several dialects (@c.us, @s.whatsapp.net,
@lid, plus group/status/channel and :device variants). The application layer and the
whatsapp-web.js engine assume the @c.us dialect, but the Baileys adapter leaked its
native @s.whatsapp.net / @lid into neutral payloads - so under Baileys the same contact
surfaces under a different id than under whatsapp-web.js, @lid contacts can't be resolved
to a phone for display, and any consumer that compares a message's sender against a known
contact id sees a mismatch.

Establish a single neutral identity contract at the engine boundary and make the Baileys
adapter conform on the inbound read path:
- engine/identity/wa-id.ts: the shared, documented home of the contract. parseWaId()
  classifies any JID; toNeutralJid() reduces it to the neutral dialect (@c.us / @g.us /
  @lid + special channels), resolving a lid to its phone when the mapping is known and
  keeping an unresolved lid as a first-class @lid (never pretending it's a phone).
- IWhatsAppEngine documents the contract every adapter must emit.
- The Baileys adapter applies it to inbound message from/to/chatId/author + revoked +
  reaction; BaileysSessionStore.toNeutralJid delegates to the shared module with its
  lid->pn resolver.

Scope: inbound read path only. Contact/chat list ids, the outbound send path (which
needs neutral -> engine de-normalization), and a whatsapp-web.js conformance audit are
follow-ups in the same arc. Pairs with the session-store persistence branch so the
lid->pn map survives restarts.
…rmyndharis#342 review)

Addresses two one-sided-canonicalization regressions surfaced in review of the
engine-neutral identity work: the inbound author was canonicalized while the
other side of each comparison stayed in the old dialect.

1. Group admin/owner recognition. mapBaileysGroup/mapBaileysGroupInfo now take a
   normalizer and the adapter passes its session-store canonicalizer, so group
   participant + owner ids share the neutral dialect of message authors. Fixes
   widEquals(adminId, author) silently denying a legit admin/owner (e.g. the /tr
   translation commands) in a lid-addressed group.

2. senderPhone for resolved-lid senders. BaileysSessionStore.resolvePhone now
   returns the user-part for an already-neutral @c.us id (a resolved lid arrives
   as <phone>@c.us once authors are canonicalized) and strips a :device suffix
   before the lid lookup. Fixes senderPhone regressing to null under
   RESOLVE_LID_TO_PHONE for exactly the case the field exists to surface.

Also folds the group-mapper's ad-hoc user-part helper into wa-id, and adds
coverage: coordinator admin recognition across the dialect split, the resolved-
lid senderPhone path, @c.us/device-suffixed resolvePhone, an unresolved lid kept
end-to-end, and the broadcast/newsletter + lowercasing branches in wa-id.
@tobiasstrebitzer tobiasstrebitzer force-pushed the baileys-jid-canonicalization branch from d05e219 to 47a4269 Compare June 19, 2026 23:50
@tobiasstrebitzer

Copy link
Copy Markdown
Contributor Author

Thanks for the review @rmyndharis . Both blockers are fixed and the branch is rebased onto main (v0.4.4); pushed as a separate review-fix commit on top of the original.

Blocker 1 (group admin recognition): mapBaileysGroup/mapBaileysGroupInfo now take a normalizer and the adapter passes its sessionStore.toNeutralJid, so group participant + owner ids share the dialect of the canonicalized author. Both sides of widEquals now fold together (both resolve to @c.us, or both stay @lid when unresolved). Added a coordinator test (lid author + lid-resolved-to-phone admin list) plus a regression guard, and mapper/adapter coverage.

Blocker 2 (senderPhone -> null): resolvePhone now returns the user-part for the @c.us branch (it already is a phone) and strips a :device suffix before the lid lookup. Added a store unit test and a session.service test for the RESOLVE_LID_TO_PHONE + resolved-lid path wired to a real store, so the @c.us branch is genuinely exercised rather than mocked.

Also addressed 3 (consumer-visible): CHANGELOG reworded to flag that Baileys message.received / revoked / reaction ids flip @s.whatsapp.net (and resolved @lid) -> @c.us. n8n confirmation is in progress (static read of the OpenWA-n8n nodes looks compatible: the trigger emits payloads verbatim, the action node already documents @c.us as the chatId format; live end-to-end test still to run).

Nice-to-haves all added: unresolved lid kept @lid end-to-end, :device-suffixed lid, broadcast/ newsletter build branches, parseWaId lowercasing. Also folded the group-mapper's user-part helper into wa-id. Full unit suite (928) + lint + build green.

I've ran an additional consumer audit. Since both blockers were one-sided-canonicalization, I traced the other consumers of the changed ids. No further hard regressions: webhook dispatch filters by event name (not JID, so the @c.us flip actually lets Baileys JID filters fire where they previously didn't), the events gateway is session-scoped, persistence/group controllers pass ids through, and the outbound round-trip is safe - Baileys' jidNormalizedUser folds c.us -> s.whatsapp.net, so replying to a now-@c.us sender still delivers.

One known limitation (not blocking): lid resolution is non-deterministic over time - a lid sender can be unresolved (@lid) early and resolved (@c.us) later once the lid -> pn map populates from history sync. Any consumer that keys persistent state on author/senderId can therefore see the key flip for lid users and split a record - specifically the translation participant map, the delegatedControllers list, and the in-memory reaction map. Low severity (lid-only, self-healing, and whatsapp-web.js never had it), and it's subsumed by the typed-WaId follow-up.

@rmyndharis

Copy link
Copy Markdown
Owner

Thank you for the thorough follow-up, @tobiasstrebitzer — this is excellent work, and the way you traced the other consumers after the two one-sided-canonicalization blockers (instead of just patching the reported two) is exactly the rigor I was hoping for.

I re-checked the fix commit against source and both blockers are genuinely closed:

  • Blocker 1: mapBaileysGroup/mapBaileysGroupInfo now take the normalizer and the adapter passes sessionStore.toNeutralJid, so getGroupAdmins (which reads from getGroupInfo().participants + owner) and the canonicalized author land on the same @c.us dialect. The coordinator test plus the regression guard for the raw-@lid-admin-list case make the intent unambiguous.
  • Blocker 2: resolvePhone returning the user-part for the user kind, plus the :device strip before the lid lookup, fixes the @c.us regression — and wiring the session.service test to a real store rather than a mock is the right call; that branch is now actually exercised.
  • The CHANGELOG now clearly flags the consumer-visible @s.whatsapp.net / resolved-@lid@c.us payload flip, and all the nice-to-have branches (unresolved-lid end-to-end, :device lid, broadcast/newsletter, parseWaId lowercasing) are covered. Folding the group-mapper's ad-hoc userPart into wa-id is a nice cleanup too.

I ran lint, build, and the touched suites locally on the branch — all green, and CI is clean/mergeable against main.

The one item worth recording is the limitation you flagged: lid resolution being non-deterministic over time means a consumer keying persistent state on author/senderId can split a record for lid-only users. I agree it's low severity (lid-only, self-healing, and wwjs never had it) and out of scope here — I'll track it against the typed-WaId initiative (#349) so it isn't lost. The n8n live end-to-end can also land after merge; the static read looks compatible.

Marking this ready to merge on my end. Thanks again for landing a clean, well-documented anti-corruption layer — this is a strong foundation for the rest of the identity arc.

@rmyndharis rmyndharis dismissed their stale review June 20, 2026 01:33

Dismissing my earlier review — the two one-sided-canonicalization blockers and all nice-to-have items were addressed in the fix commit (verified against source; lint/build/touched suites green locally, CI clean). Re-reviewed and approving.

@rmyndharis rmyndharis merged commit 8edecdf into rmyndharis:main Jun 20, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants