fix: handle outbound re-INVITE routing, SDP echo, and direction negotiation#603
Closed
rtnpro wants to merge 18 commits intolivekit:mainfrom
Closed
fix: handle outbound re-INVITE routing, SDP echo, and direction negotiation#603rtnpro wants to merge 18 commits intolivekit:mainfrom
rtnpro wants to merge 18 commits intolivekit:mainfrom
Conversation
Fixes a critical bug where mid-dialog re-INVITEs from the remote party were incorrectly processed as new inbound calls instead of being recognized as part of an existing dialog. ## Problem When a SIP provider (e.g., SignalWire) sends a re-INVITE during an active call, LiveKit was treating it as a new call, causing: - Authentication challenges (401/407) - "No trunk found" errors (404) - Call failures and disconnections This primarily affected outbound calls where From/To headers are "swapped" from LiveKit's perspective (remote is UAC for the re-INVITE). ## Root Cause The existing dialog matching happened too late (after processInvite) and only checked byCallID, which is unreliable. The code missed the definitive proof: the To tag in the request matches our local tag. Per RFC 3261 Section 12.2, a dialog is identified by: - Call-ID - Local tag (in To header for UAS) - Remote tag (in From header for UAS) ## Solution Add early dialog matching in onInvite() before processing as new call: 1. Extract To tag from incoming INVITE 2. Check if tag exists in byLocalTag map 3. If match found, delegate to existing call's handleReinvite() 4. Otherwise, proceed with normal new call processing New handleReinvite() method on inboundCall: - Validates CSeq (detect retransmissions vs new re-INVITEs) - Responds with current SDP for session refresh - Handles edge cases (old CSeq, missing headers) ## Impact ✅ Fixes outbound call re-INVITEs (broken → working) ✅ Improves inbound call re-INVITEs (performance + reliability) ✅ RFC 3261 compliant dialog identification ✅ Reduced CPU usage (early exit avoids processInvite) ✅ Better logging for debugging ## Testing Tested with SignalWire re-INVITEs on both inbound and outbound calls. Verified proper 200 OK responses and continued call operation.
The nonce was using only time.Now().UnixMicro() which could produce identical values for calls arriving in the same microsecond, causing the TestDigestAuthSimultaneousCalls test to fail. Now includes the Call-ID in the nonce to ensure uniqueness even when multiple authentication challenges are generated simultaneously. This is a defensive fix to prevent race conditions in digest auth.
Adds comprehensive test coverage for the new handleReinvite() method: - Missing CSeq header (400 Bad Request) - Retransmission detection (same CSeq → resend 200 OK) - Out-of-order INVITE (lower CSeq → ignore) - New re-INVITE (higher CSeq → accept with current SDP) - No SDP available error handling (500 Internal Server Error) - Nonce uniqueness verification These tests cover the main code paths added by the re-INVITE fix, improving code coverage for the new functionality.
- Extract nonce generation into generateNonce() helper function - Add nil check and logging when SDP unavailable during retransmission - Use AcceptAsKeepAlive() helper to ensure proper SIP headers - Remove redundant Call-ID based re-INVITE fallback in processInvite() - Improve TestNonceUniqueness to test actual helper function Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove unused tr variable - Extract To tag from re-INVITE request as local tag for newInbound - Fixes type mismatch (RemoteTag vs LocalTag) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Instead of creating a temporary sipInbound (which requires server infrastructure and fails in unit tests), directly create the response following the same pattern as AcceptAsKeepAlive/respondWithData: - Add Content-Type header - Add Allow header with supported methods - Add Contact header - Add extra headers via addExtraHeaders This satisfies the intent of using proper SIP headers while working in both production and test scenarios. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add defensive nil checks before accessing contact and calling addExtraHeaders to prevent panics in test scenarios where the full server infrastructure isn't initialized. - Check c.cc.contact != nil before appending Contact header - Check c.s != nil && c.s.conf != nil before calling addExtraHeaders This allows unit tests to work with minimal inboundCall setup while maintaining full functionality in production. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Introduce a graceful shutdown mechanism for the SIP service. - Add `StartDrain` methods to `sip.Client`, `sip.Server`, and `sip.Service`. These methods stop accepting new inbound SIP INVITEs and new outbound `CreateSIPParticipant` requests, respectively, while allowing existing calls to complete. - Integrate the `StartDrain` call into the service shutdown sequence in `service.Run()` to immediately prevent new call establishments. - Add a `DisableOutboundCalls` configuration option to prevent new outbound SIP calls. When enabled, `CreateSIPParticipant` requests will be rejected, and affinity will be zeroed out. - Enhance the health check endpoint: - Expose `/healthz` in addition to `/`. - Implement a 30-second grace period after service stop before exiting, allowing load balancers to detect unhealthy status. - Update `.gitignore` to include `.claude`. Changes: - .gitignore (2 lines, 7 characters) - cmd/livekit-sip/main.go (2 lines, 219 characters) - pkg/config/config.go (4 lines, 274 characters) - pkg/service/service.go (26 lines, 1142 characters) - pkg/sip/client.go (6 lines, 213 characters) - pkg/sip/inbound.go (7 lines, 429 characters) - pkg/sip/server.go (6 lines, 220 characters) - pkg/sip/service.go (14 lines, 522 characters) - test/cloud/service.go (2 lines, 263 characters)
…matching Fix re-INVITE dialog matching using RFC 3261 compliant To tag lookup
Add TestSIPOutboundReInvite that proves mid-dialog re-INVITEs on outbound calls are misidentified as new inbound calls and rejected as "flood" (status 486), killing the active call. Also fix pre-existing compilation error in sip_test.go where service.NewService was missing the StartDrain parameter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…NVITE Add TestSIPOutboundReInviteCallDrop that reproduces the production scenario where a mid-dialog re-INVITE on an outbound call gets 486 Rejected and the call is torn down with BYE. This happens when an inbound trunk matches the reversed call direction (common with SBC session refresh). Also refactor shared UAS setup and re-INVITE helpers into reusable functions (setupUAS, sendReInvite, waitForFinalResponse). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Re-INVITEs for outbound calls were misidentified as new inbound calls because Server.onInvite() only checked its own byLocalTag map (inbound calls) and never delegated to the Client. This adds the missing link: Server delegates unmatched To-tag INVITEs to sipUnhandled (Client), Client looks up outbound calls by SIP Call-ID, and responds with 200 OK. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add SendReInvite() to siptest.Client for sending mid-dialog re-INVITEs within an established call, and TestSIPReInvite integration test that verifies re-INVITEs are accepted with 200 OK and the call stays active. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…E check The second ExpectRoomWithParticipants call after re-INVITE was missing the SIP participant's Metadata, Name, and full Attributes, causing the strict equality check in compareParticipants to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…P in re-INVITE response The AcceptReInvite handler used c.invite.Body() which returns a reference to the SIP request's internal buffer. This buffer can become stale after the transaction terminates, causing the 200 OK to echo the carrier's SDP back instead of our own — telling the carrier to loop audio to itself. Store our SDP offer as an independent byte copy (ownSDP) on the sipOutbound struct, mirroring the pattern used by sipInbound.lastSDP. Add debug logging of both request and response SDP for production troubleshooting. Add unit and integration tests verifying the response SDP contains LiveKit's session (s=LiveKit), not the carrier's. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Handle RFC 3264 direction attributes in re-INVITE responses to support call hold/resume/transfer scenarios. When a carrier sends a=sendonly (hold), we now respond with a=recvonly instead of always using a=sendrecv. Also updates the RTP destination when the carrier's media address changes after a call transfer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Biswajit Pain seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #603 +/- ##
==========================================
- Coverage 65.25% 63.63% -1.62%
==========================================
Files 51 36 -15
Lines 6588 7156 +568
==========================================
+ Hits 4299 4554 +255
- Misses 1915 2162 +247
- Partials 374 440 +66 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Test plan