Fix re-INVITE dialog matching using RFC 3261 compliant To tag lookup #582

Open
briankwest wants to merge 8 commits into livekit:main from briankwest:fix/reinvite-dialog-matching

@briankwest

Fixes a critical bug where mid-dialog re-INVITEs from the remote party were incorrectly processed as new inbound calls instead of being recognized as part of an existing dialog.

Problem

When a SIP provider (e.g., SignalWire) sends a re-INVITE during an active call, LiveKit was treating it as a new call, causing:

  • Authentication challenges (401/407)
  • "No trunk found" errors (404)
  • Call failures and disconnections

This primarily affected outbound calls where From/To headers are "swapped" from LiveKit's perspective (remote is UAC for the re-INVITE).

Root Cause

The existing dialog matching ran too late (after processInvite had already started treating the request as a new call) and only checked byCallID, which did not reliably match outbound calls. The code never checked the definitive signal: a To tag in the request that equals a local tag we generated.

Per RFC 3261 Section 12, a dialog is identified by:

  • Call-ID
  • Local tag (in To header for UAS)
  • Remote tag (in From header for UAS)
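Which tag counts as "local" depends on the role a side plays for a given request. The following is a minimal sketch, assuming a hypothetical `DialogID` type and helper names that are not taken from the PR; it only illustrates that the same dialog yields the same ID whether LiveKit acted as UAC (initial outbound INVITE) or as UAS (the provider's re-INVITE).

```go
package main

import "fmt"

// DialogID is a hypothetical type for this sketch: Call-ID plus both tags,
// always stored from our own point of view.
type DialogID struct {
	CallID    string
	LocalTag  string
	RemoteTag string
}

// dialogIDForUAC: we sent the request, so the From tag is ours.
func dialogIDForUAC(callID, fromTag, toTag string) DialogID {
	return DialogID{CallID: callID, LocalTag: fromTag, RemoteTag: toTag}
}

// dialogIDForUAS: we received the request, so the To tag is ours.
func dialogIDForUAS(callID, fromTag, toTag string) DialogID {
	return DialogID{CallID: callID, LocalTag: toTag, RemoteTag: fromTag}
}

func main() {
	// Outbound call established by us (From tag SCL_xyz, To tag gK69j24).
	established := dialogIDForUAC("abc123", "SCL_xyz", "gK69j24")
	// Provider's re-INVITE on that call: From and To are swapped.
	reinvite := dialogIDForUAS("abc123", "gK69j24", "SCL_xyz")
	fmt.Println(established == reinvite)
}
```

The two IDs compare equal, which is why the swapped From/To headers on a provider-initiated re-INVITE still point at the original dialog.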

Solution

Add early dialog matching in onInvite() before processing as new call:

  1. Extract To tag from incoming INVITE
  2. Check if tag exists in byLocalTag map
  3. If match found, delegate to existing call's handleReinvite()
  4. Otherwise, proceed with normal new call processing
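The four steps above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `server`, `call`, and `route` are simplified stand-ins, and only the `byLocalTag` map and the `LocalTag` type name come from the description.

```go
package main

import (
	"fmt"
	"sync"
)

// LocalTag is the tag we generated for our side of a dialog.
type LocalTag string

// call and server are hypothetical, stripped-down stand-ins for this sketch.
type call struct{ id string }

type server struct {
	mu         sync.RWMutex
	byLocalTag map[LocalTag]*call
}

// route returns the existing call when the To tag is one of our local tags,
// or nil when the INVITE should go through normal new-call processing.
func (s *server) route(toTag string) *call {
	if toTag == "" {
		return nil // initial INVITE: the To header carries no tag yet
	}
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.byLocalTag[LocalTag(toTag)]
}

func main() {
	s := &server{byLocalTag: map[LocalTag]*call{"SCL_xyz": {id: "call1"}}}
	for _, tag := range []string{"SCL_xyz", "", "unknown"} {
		if c := s.route(tag); c != nil {
			fmt.Printf("To tag %q: re-INVITE for %s\n", tag, c.id)
		} else {
			fmt.Printf("To tag %q: process as new call\n", tag)
		}
	}
}
```

The read lock is held only for the map lookup, so the early check adds little contention to the new-call path.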

New handleReinvite() method on inboundCall:

  • Validates CSeq (detect retransmissions vs new re-INVITEs)
  • Responds with current SDP for session refresh
  • Handles edge cases (old CSeq, missing headers)
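The CSeq validation can be pictured as a three-way comparison against the CSeq of the last INVITE seen on the dialog. The sketch below is an assumption about the shape of that logic; `reinviteAction` and `classifyCSeq` are hypothetical names, and the real handleReinvite() also builds and sends the SIP responses.

```go
package main

import "fmt"

// reinviteAction is a hypothetical enum used only in this sketch.
type reinviteAction int

const (
	ignoreStale reinviteAction = iota // CSeq lower than the last INVITE: out of order
	resendOK                          // same CSeq: retransmission, resend the 200 OK
	acceptNew                         // higher CSeq: new re-INVITE, answer with current SDP
)

// classifyCSeq compares the incoming INVITE CSeq with the last one handled.
func classifyCSeq(incoming, lastInvite uint32) reinviteAction {
	switch {
	case incoming < lastInvite:
		return ignoreStale
	case incoming == lastInvite:
		return resendOK
	default:
		return acceptNew
	}
}

func main() {
	const last = 2
	for _, cseq := range []uint32{1, 2, 3} {
		fmt.Println(cseq, classifyCSeq(cseq, last))
	}
}
```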

Impact

✅ Fixes outbound call re-INVITEs (broken → working)
✅ Improves inbound call re-INVITEs (performance + reliability)
✅ RFC 3261 compliant dialog identification
✅ Reduced CPU usage (early exit avoids processInvite)
✅ Better logging for debugging

Testing

Tested with SignalWire re-INVITEs on both inbound and outbound calls. Verified proper 200 OK responses and continued call operation.

@briankwest briankwest requested a review from a team as a code owner February 6, 2026 22:33
@CLAassistant
CLAassistant commented Feb 6, 2026

CLA assistant check
All committers have signed the CLA.

@briankwest
Author

briankwest commented Feb 6, 2026

Re-INVITE Flow: Before vs After Fix

🔴 BEFORE FIX (Broken)

┌─────────────────────────────────────────────────────────────────┐
│ SignalWire sends re-INVITE on established outbound call         │
│                                                                 │
│  From: <sip:+16048158957@sw.com>;tag=gK69j24Mj789F              │
│  To: <sip:+14373654058@lk.com>;tag=SCL_4upGMjQw3oCb  ← OUR TAG  │
│  Call-ID: abc123                                                │
│  CSeq: 2 INVITE                                                 │
└─────────────────────────────────────────────────────────────────┘
                            │
                            ▼
                    ┌──────────────┐
                    │  onInvite()  │
                    └──────┬───────┘
                           │
                           ▼
                ┌──────────────────────┐
                │  processInvite()     │  ← Treats as NEW call!
                │  (line 282)          │
                └──────────┬───────────┘
                           │
                           ▼
        ┌──────────────────────────────────────┐
        │ Creates new sipInbound context       │
        │ Generates NEW call ID: xyz789        │
        │ Assigns NEW local tag                │
        └──────────┬───────────────────────────┘
                   │
                   ▼
        ┌──────────────────────────────────────┐
        │ ValidateInvite()                     │
        └──────────┬───────────────────────────┘
                   │
                   ▼
        ┌──────────────────────────────────────┐
        │ ? Check byCallID (line 347)          │  ← Too late!
        │                                      │
        │ existing := s.byCallID["abc123"]     │
        │ if existing != nil {                 │
        │   // This MIGHT work for inbound     │
        │   // but FAILS for outbound!         │
        │ }                                    │
        └──────────┬───────────────────────────┘
                   │
                   ├─── IF MATCHED (lucky) ────┐
                   │                           │
                   │                           ▼
                   │                   ┌─────────────────┐
                   │                   │ Accept as       │
                   │                   │ keep-alive      │
                   │                   └─────────────────┘
                   │
                   └─── IF NOT MATCHED (bug!) ──┐
                                                │
                                                ▼
                        ┌──────────────────────────────────┐
                        │ X GetAuthCredentials()           │
                        │ X Checks trunks/dispatch rules   │
                        │ X Sends 401/407 challenge        │
                        │ X Or 404 No Trunk Found          │
                        └──────────────────────────────────┘
                                     │
                                     ▼
                        ┌──────────────────────────────────┐
                        │ * RE-INVITE FAILS                │
                        │ * Call may drop                  │
                        │ * Provider confused              │
                        └──────────────────────────────────┘

🟢 AFTER FIX (Correct)

┌─────────────────────────────────────────────────────────────────┐
│ SignalWire sends re-INVITE on established outbound call         │
│                                                                 │
│  From: <sip:+16048158957@sw.com>;tag=gK69j24Mj789F              │
│  To: <sip:+14373654058@lk.com>;tag=SCL_4upGMjQw3oCb  ← OUR TAG  │
│  Call-ID: abc123                                                │
│  CSeq: 2 INVITE                                                 │
└─────────────────────────────────────────────────────────────────┘
                            │
                            ▼
                    ┌──────────────────┐
                    │  onInvite()      │  ← NEW: Early dialog check
                    └──────┬───────────┘
                           │
                           ▼
        ┌──────────────────────────────────────────────┐
        │ * Extract To tag: "SCL_4upGMjQw3oCb"         │
        │ s.cmu.RLock()                                │
        │ existing := s.byLocalTag[LocalTag(toTag)]    │
        │ s.cmu.RUnlock()                              │
        └──────────┬───────────────────────────────────┘
                   │
                   ├─── To tag NOT in byLocalTag ───┐
                   │    (new call)                  │
                   │                                ▼
                   │                    ┌─────────────────────┐
                   │                    │ processInvite()     │
                   │                    │ (normal flow)       │
                   │                    └─────────────────────┘
                   │
                   └─── To tag FOUND in byLocalTag ──┐
                        (existing dialog!)           │
                                                     ▼
                        ┌──────────────────────────────────────┐
                        │ * existing.handleReinvite(req, tx)   │
                        │                                      │
                        │ LOG: "received mid-dialog re-INVITE" │
                        │      localTag=SCL_4upGMjQw3oCb       │
                        └──────────┬───────────────────────────┘
                                   │
                                   ▼
                        ┌──────────────────────────────────────┐
                        │ Validate CSeq                        │
                        │                                      │
                        │ cseq := req.CSeq()                   │
                        │ inviteCSeq := c.cc.InviteCSeq()      │
                        └──────────┬───────────────────────────┘
                                   │
                ┌──────────────────┼──────────────────┐
                │                  │                  │
                ▼                  ▼                  ▼
        ┌────────────┐   ┌─────────────────┐   ┌──────────────┐
        │ CSeq < old │   │ CSeq == old     │   │ CSeq > old   │
        │ (ignore)   │   │ (retransmit)    │   │ (re-INVITE)  │
        └────────────┘   └────────┬────────┘   └──────┬───────┘
                                  │                   │
                                  ▼                   ▼
                        ┌──────────────────┐   ┌──────────────────┐
                        │ Resend 200 OK    │   │ Accept with      │
                        │ with same SDP    │   │ current SDP      │
                        └──────────────────┘   └──────┬───────────┘
                                                      │
                                                      ▼
                        ┌──────────────────────────────────────┐
                        │ * 200 OK sent                        │
                        │ * Call continues normally            │
                        │ * No new call created                │
                        │ * No authentication challenge        │
                        └──────────────────────────────────────┘

🔑 Key Differences

Detection Point

| Before | After |
|---|---|
| Line 347 (after processing) | Line 277 (immediately in onInvite) |
| Inside processInvite() | Before processInvite() |

Matching Strategy

| Before | After |
|---|---|
| Only Call-ID + CSeq | To tag (definitive proof) |
| Unreliable for outbound calls | Works for all calls |

Performance

| Before | After |
|---|---|
| Full INVITE processing overhead | Early exit on tag match |
| Creates temporary objects | Zero overhead |

Correctness

| Before | After |
|---|---|
| ❌ Fails for outbound re-INVITEs | ✅ RFC 3261 compliant |
| ❌ Sends auth challenges | ✅ Immediate 200 OK |

📊 Dialog State Diagram

                        ┌─────────────────────┐
                        │   Call Established  │
                        │                     │
                        │  byLocalTag:        │
                        │  [SCL_xyz] → call   │
                        │                     │
                        │  byCallID:          │
                        │  [abc123] → call    │
                        │                     │
                        │  byRemoteTag:       │
                        │  [gK69...] → call   │
                        └──────────┬──────────┘
                                   │
                    ┌──────────────┼──────────────┐
                    │              │              │
         ┌──────────▼────────┐     │    ┌─────────▼─────────┐
         │  New INVITE       │     │    │  re-INVITE        │
         │  (no To tag)      │     │    │  (has To tag)     │
         └──────────┬────────┘     │    └─────────┬─────────┘
                    │              │              │
                    ▼              │              ▼
         ┌──────────────────────┐  │    ┌────────────────────┐
         │ processInvite()      │  │    │ handleReinvite()   │
         │ - New dialog         │  │    │ - Same dialog      │
         │ - New tags           │  │    │ - Same tags        │
         │ - New CallID         │  │    │ - Same CallID      │
         └──────────────────────┘  │    └────────────────────┘
                                   │
                                   ▼
                        ┌─────────────────────┐
                        │   Multiple Dialogs  │
                        │                     │
                        │  byLocalTag:        │
                        │  [SCL_xyz] → call1  │
                        │  [SCL_abc] → call2  │
                        │                     │
                        └─────────────────────┘

🎭 From/To Direction Examples

Outbound Call Initial INVITE (LiveKit → Provider)

┌────────────────────────────────────────────────┐
│ INVITE sip:+16048158957@provider.com SIP/2.0   │
│                                                │
│ From: <sip:+14373654058@lk>;tag=SCL_xyz   ←────┼── LiveKit (UAC)
│ To: <sip:+16048158957@provider>           ←────┼── Provider (UAS, no tag yet)
│ Call-ID: abc123                                │
│ CSeq: 1 INVITE                                 │
└────────────────────────────────────────────────┘

Provider's 200 OK Response

┌────────────────────────────────────────────────┐
│ SIP/2.0 200 OK                                 │
│                                                │
│ From: <sip:+14373654058@lk>;tag=SCL_xyz   ←────┼── Same as INVITE
│ To: <sip:+16048158957@p>;tag=gK69j24      ←────┼── Provider adds its tag
│ Call-ID: abc123                                │
│ CSeq: 1 INVITE                                 │
└────────────────────────────────────────────────┘

Dialog now established:
  - Call-ID: abc123
  - Local tag (LiveKit): SCL_xyz
  - Remote tag (Provider): gK69j24

Provider's Mid-Dialog re-INVITE

┌────────────────────────────────────────────────┐
│ INVITE sip:+14373654058@lk SIP/2.0             │
│                                                │
│ From: <sip:+16048158957@p>;tag=gK69j24    ←────┼── Provider (UAC for re-INVITE!)
│ To: <sip:+14373654058@lk>;tag=SCL_xyz     ←────┼── LiveKit (UAS for re-INVITE!)
│ Call-ID: abc123                           ←────┼── Same Call-ID
│ CSeq: 2 INVITE                                 │
└────────────────────────────────────────────────┘

Key observation:
  * To tag = SCL_xyz = LiveKit's tag = PROVES this is our dialog!
  * From/To "swapped" because Provider is now the UAC
  * This is CORRECT per RFC 3261
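Reading the tag off a From or To header value can be sketched as below. This is a simplified illustration with a hypothetical `tagParam` helper based on plain string splitting; production code would normally rely on the header parameters already parsed by the SIP library rather than re-parsing the raw value.

```go
package main

import (
	"fmt"
	"strings"
)

// tagParam returns the "tag" header parameter of a From/To value, or "" if
// there is none. Simplified: parameters are taken from the text after the
// closing '>' of the name-addr form, or from the whole value if no '>' exists.
func tagParam(headerValue string) string {
	params := headerValue
	if i := strings.LastIndex(headerValue, ">"); i >= 0 {
		params = headerValue[i+1:]
	}
	for _, p := range strings.Split(params, ";") {
		name, value, ok := strings.Cut(strings.TrimSpace(p), "=")
		if ok && strings.EqualFold(name, "tag") {
			return value
		}
	}
	return ""
}

func main() {
	// Header values taken from the re-INVITE and initial INVITE examples above.
	fmt.Printf("%q\n", tagParam("<sip:+14373654058@lk>;tag=SCL_xyz"))
	fmt.Printf("%q\n", tagParam("<sip:+16048158957@provider>"))
}
```

An empty result corresponds to an initial INVITE, where the To header has no tag yet; a non-empty result is the value looked up in byLocalTag.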

🧩 Why byLocalTag Works

Server State:
  byLocalTag = {
    "SCL_xyz": call1,  ← Outbound call to Provider
    "LK_abc":  call2,  ← Inbound call from User
    "SCL_def": call3,  ← Another outbound call
  }

Incoming re-INVITE:
  To: <...>;tag=SCL_xyz

Lookup:
  byLocalTag["SCL_xyz"] → call1 * MATCH!

Conclusion:
  This re-INVITE belongs to call1!
  Handle it within that call's context.

🎯 Summary

The fix is simple but profound:

Instead of asking: "Does this Call-ID exist?" (ambiguous)
We now ask: "Is this To tag mine?" (definitive)

The To tag is like a session ID that we generated. If we see our own session ID in an incoming request, we KNOW it's for an existing session, not a new one.

This is faster, simpler, and RFC-compliant.

The nonce was using only time.Now().UnixMicro() which could produce
identical values for calls arriving in the same microsecond, causing
the TestDigestAuthSimultaneousCalls test to fail.

Now includes the Call-ID in the nonce to ensure uniqueness even when
multiple authentication challenges are generated simultaneously.

This is a defensive fix to prevent race conditions in digest auth.
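The idea of the nonce change can be sketched as follows. `sketchNonce` is a hypothetical function written for this illustration; the PR's actual helper may use a different signature and output format. The point is only that combining the timestamp with the Call-ID keeps nonces distinct when two challenges are issued in the same microsecond.

```go
package main

import (
	"fmt"
	"time"
)

// sketchNonce combines a microsecond timestamp with the Call-ID so that two
// challenges generated in the same microsecond for different calls differ.
// Hypothetical format; not the format used by the PR.
func sketchNonce(callID string, unixMicro int64) string {
	return fmt.Sprintf("%d-%s", unixMicro, callID)
}

func main() {
	now := time.Now().UnixMicro()
	a := sketchNonce("call-a", now)
	b := sketchNonce("call-b", now)
	fmt.Println(a != b) // same timestamp, different Call-IDs
}
```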
@codecov

codecov bot commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 52.30769% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.22%. Comparing base (0460b40) to head (12f7b49).
⚠️ Report is 219 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/sip/inbound.go | 52.30% | 20 Missing and 11 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #582      +/-   ##
==========================================
- Coverage   65.25%   64.22%   -1.03%     
==========================================
  Files          51       34      -17     
  Lines        6588     6617      +29     
==========================================
- Hits         4299     4250      -49     
- Misses       1915     1935      +20     
- Partials      374      432      +58     

Adds comprehensive test coverage for the new handleReinvite() method:
- Missing CSeq header (400 Bad Request)
- Retransmission detection (same CSeq → resend 200 OK)
- Out-of-order INVITE (lower CSeq → ignore)
- New re-INVITE (higher CSeq → accept with current SDP)
- No SDP available error handling (500 Internal Server Error)
- Nonce uniqueness verification

These tests cover the main code paths added by the re-INVITE fix,
improving code coverage for the new functionality.
@briankwest briankwest force-pushed the fix/reinvite-dialog-matching branch from 5547a81 to 1f3bbee Compare February 6, 2026 23:20
Contributor

@dennwc dennwc left a comment

Thank you @briankwest! The changes look good, just a few minor comments:

briankwest and others added 5 commits February 7, 2026 10:23
- Extract nonce generation into generateNonce() helper function
- Add nil check and logging when SDP unavailable during retransmission
- Use AcceptAsKeepAlive() helper to ensure proper SIP headers
- Remove redundant Call-ID based re-INVITE fallback in processInvite()
- Improve TestNonceUniqueness to test actual helper function

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove unused tr variable
- Extract To tag from re-INVITE request as local tag for newInbound
- Fixes type mismatch (RemoteTag vs LocalTag)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Instead of creating a temporary sipInbound (which requires server
infrastructure and fails in unit tests), directly create the response
following the same pattern as AcceptAsKeepAlive/respondWithData:
- Add Content-Type header
- Add Allow header with supported methods
- Add Contact header
- Add extra headers via addExtraHeaders

This satisfies the intent of using proper SIP headers while working
in both production and test scenarios.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add defensive nil checks before accessing contact and calling
addExtraHeaders to prevent panics in test scenarios where the
full server infrastructure isn't initialized.

- Check c.cc.contact != nil before appending Contact header
- Check c.s != nil && c.s.conf != nil before calling addExtraHeaders

This allows unit tests to work with minimal inboundCall setup while
maintaining full functionality in production.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@dennwc dennwc force-pushed the fix/reinvite-dialog-matching branch from 7f0452e to 12f7b49 Compare February 10, 2026 12:33
3 participants