Skip to content

fix(session): reconcile late acks, serialize concurrent reactions, harden hook + ack error paths#348

Closed
rmyndharis wants to merge 1 commit into
mainfrom
fix/message-ack-reaction-hooks
Closed

fix(session): reconcile late acks, serialize concurrent reactions, harden hook + ack error paths#348
rmyndharis wants to merge 1 commit into
mainfrom
fix/message-ack-reaction-hooks

Conversation

@rmyndharis

Copy link
Copy Markdown
Owner

Four small reliability fixes on the engine→session event path (delivery status, reactions, hooks). No wire-format change.

1. A delivery/read receipt that arrives before the send is recorded is no longer lost

A send stores the row optimistically (PENDING, no waMessageId) and only writes waMessageId + SENT in a second save after the engine returns. The ack callbacks fire from independent engine event handlers, not gated on that send promise — so a fast delivered/read/failed ack could run the status UPDATE while the row's waMessageId was still null, matching 0 rows. Each ack is one-shot, so the message stayed stuck at "sent" forever. onMessageAck now retries the forward-only, session-scoped advance once after a short delay, reconciling the status. The transition guard is unchanged, so the retry can't downgrade a higher status.

2. Concurrent reactions on the same message no longer overwrite each other

onMessageReaction did a read-modify-write of the single metadata.reactions JSON column with an await between the read and the full-row save. Two reactions on the same message interleaved across that await: both read the same snapshot, both saved the whole map, last writer won — silently dropping a reaction. Reaction writes are now serialized per (sessionId, waMessageId) through a small promise-chain map that cleans up its own entries (no leak), so every reaction is applied on top of the previous one.

3. The ack DB write no longer escapes to the global unhandled-rejection backstop

The ack status UPDATE was the only message.* handler without a .catch — a connection-level rejection (pool exhaustion, SQLITE_BUSY) surfaced through the global backstop with no per-message context. It now logs locally like its sibling handlers.

4. A plugin hook that reports an error no longer has its partial output applied

HookResult allows returning data and error together. The pipeline applied currentData = result.data before checking result.error, so a handler that mutated the payload and signalled failure had its (possibly half-transformed) data propagated to persistence, webhooks, and the WebSocket as if it succeeded. The data assignment is now gated on result.error === undefined — an errored handler's mutation is discarded (on both the continue and the stop path). Error isolation (catch-and-continue) is unchanged.

Tests

New unit tests: ack retry fires once on affected:0 and not when a row is advanced (fake timers); a rejected ack update produces no unhandled rejection; concurrent reactions from two senders both survive (DB-aliasing simulated with per-read snapshots); the reaction remove/delete branch; a failed reaction write doesn't block a later one; the serialization map is emptied after draining; the hook gate discards an errored mutation on both the continue and continue:false paths. Full backend 737/737; dashboard build + lint clean.

An adversarial review of this diff found no must-fix issues and mutation-verified that the regression tests genuinely fail without each fix; the added remove-branch / chain-after-failure / map-cleanup / continue:false tests close the coverage gaps it surfaced.

Deferred (not in this PR)

Persisting phone-composed self-messages (sends typed on the linked phone) into the local messages table is intentionally left out: it requires a UNIQUE(sessionId, waMessageId) index migration, a boot-time dedup of legacy rows, and an upsert across two write paths (a naive insert double-persists API sends). That's a higher-risk change better done as its own PR with the migration.

…rden hook + ack error paths

- A fast delivered/read/failed ack could match 0 rows when it arrived before the send's 2nd save committed waMessageId, leaving the message stuck at SENT. onMessageAck now retries the forward-only, session-scoped status advance once after a short delay (ACK_RECONCILE_DELAY_MS) to close that race.

- Two reactions on the same message both read the same stored snapshot and the last full-row save won, dropping a reaction. Reaction writes are now serialized per (sessionId, waMessageId) via a self-cleaning promise-chain map, preserving every reaction.

- The ack status UPDATE was the one message handler without a .catch, so a connection-level rejection escaped to the global unhandled-rejection backstop without per-message context; it now logs locally.

- A plugin hook returning both mutated data and an error had the (possibly partial/corrupted) data applied before the error was thrown, propagating it to persistence/webhooks/WS as success. The data mutation of an errored hook is now discarded.
rmyndharis added a commit that referenced this pull request Jun 19, 2026
* fix(webhook): deliver session lifecycle events and key webhook hardening (#335)

* fix(security): pin outbound webhook and media fetches to validated IP (#338)

* fix(plugins): persist plugin enable/config and restore (#339)

* fix(message): persist bulk-sent messages, sanitize SSRF (#340)

* fix(engine): return the real id for forwarded messages (#341)

* fix(security): harden outbound requests, IPv6 SSRF (#344)

* fix(security): secret-file perms, key pepper, allowedIps (#345)

* fix(storage): bound tar imports, contain storage keys (#346)

* fix(session): reconcile late acks, serialize reactions (#348)

* fix(contract): webhook timeout, bounded shutdown, 501 (#350)

* feat(session): force-kill a stuck session (#352)

* merge #343

* merge #351

* chore(release): v0.4.3 — CHANGELOG, version bump (package.json/dashboard/swagger), README + docs
@rmyndharis

Copy link
Copy Markdown
Owner Author

Shipped in v0.4.3 (integrated via the release PR #354 and tagged v0.4.3). Thanks! 🎉

@rmyndharis rmyndharis closed this Jun 19, 2026
@rmyndharis rmyndharis deleted the fix/message-ack-reaction-hooks branch June 19, 2026 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant