[BUG]: Downstream spans are lost when nginx returns 499 (client closed connection) #310
Open
Labels
bug: Something isn't working
Description
Module Version(s)
1.12.0
Bug Report
Description
When using nginx-datadog for head-based sampling on an nginx ingress, downstream spans are silently dropped after nginx records a 499 (client closed connection) status. The trace becomes incomplete even though downstream services actually processed the request (confirmed by correlated logs).
Environment
- Architecture: Kubernetes with nginx Ingress Controller + nginx-datadog module
- Sampling: Head-based sampling at the nginx (ingress) level
- Tracing propagation: Trace context is correctly propagated through the full call chain
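With head-based sampling, the keep/drop decision is made once at the ingress and travels downstream with the trace context. The sketch below illustrates that idea, assuming Datadog-style propagation headers (`x-datadog-trace-id`, `x-datadog-parent-id`, `x-datadog-sampling-priority`). The report does not state which propagation style is configured, and the function names are hypothetical, not part of nginx-datadog.

```python
def inject_context(trace_id, span_id, sampling_priority):
    """Build the headers an upstream hop would attach to an outgoing request.

    Hypothetical helper for illustration only.
    """
    return {
        "x-datadog-trace-id": str(trace_id),
        "x-datadog-parent-id": str(span_id),
        "x-datadog-sampling-priority": str(sampling_priority),
    }


def downstream_should_keep(headers):
    """Return True if the propagated head-based decision says 'keep'.

    Priorities above 0 mean keep; 0 and below mean drop.
    """
    priority = headers.get("x-datadog-sampling-priority")
    return priority is not None and int(priority) > 0
```

If the headers arriving at `ingress-private` and `service-B` carry a keep priority, the missing spans are unlikely to be explained by a downstream sampling decision.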
Call chain
ingress-public (nginx) → service-A → ingress-private (nginx) → service-B
Reproduction Code
Steps to reproduce
1. service-A sends an HTTP request through ingress-private to service-B
2. service-A has an HTTP client timeout of 800ms
3. service-B takes longer than 800ms to respond
4. service-A closes the connection after its timeout
5. ingress-private (nginx) records a 499 status
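The timing side of these steps can be sketched locally with the Python standard library. This is a minimal stand-in, not the reporter's actual services: `SlowHandler` plays service-B, `call_with_timeout` plays service-A, and the 1.2 s delay is an assumed value (anything above 800 ms works). There is no nginx in this sketch, so no 499 is logged here; to observe the 499, the client would need to go through ingress-private instead of calling the slow server directly.

```python
import http.client
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

CLIENT_TIMEOUT_S = 0.8  # service-A's HTTP client timeout, from the report
UPSTREAM_DELAY_S = 1.2  # assumed; any delay above 800 ms triggers the timeout


class SlowHandler(BaseHTTPRequestHandler):
    """Stand-in for service-B: responds only after UPSTREAM_DELAY_S."""

    def do_GET(self):
        time.sleep(UPSTREAM_DELAY_S)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"done")
        except OSError:
            # The caller has already closed the connection.
            pass

    def log_message(self, *args):
        # Silence the default per-request stderr logging.
        pass


def call_with_timeout(host, port):
    """Stand-in for service-A: return True if the call timed out."""
    conn = http.client.HTTPConnection(host, port, timeout=CLIENT_TIMEOUT_S)
    try:
        conn.request("GET", "/")
        conn.getresponse()
        return False
    except (socket.timeout, TimeoutError):
        return True
    finally:
        # Closing here is the client disconnect that a proxy would see.
        conn.close()


def run_once():
    """Start the slow server on a free port and make one timed-out call."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return call_with_timeout("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_once()` returns `True` when the client gives up before the slow handler answers, which is the condition that leads to step 5 once an nginx proxy sits between the two.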
Observed behavior
- The trace only contains spans up to the 499 event on ingress-public (nginx) and service-A
- No spans are reported after the 499 timestamp (i.e., spans from ingress-private (nginx) and service-B are missing)
- However, logs with the same trace_id exist for ingress-private (nginx) and service-B, proving the request was received and processed downstream
- The trace appears truncated/incomplete in Datadog APM
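The log-versus-span comparison described above can be expressed as a small check. This is a sketch under assumptions: the record shape (dicts with `service` and `trace_id` keys) and the function name are hypothetical, and the sample values in the usage note are made up for illustration, not taken from the affected system.

```python
def services_missing_spans(log_records, span_records, trace_id):
    """Return the services that logged trace_id but reported no span for it.

    Both inputs are iterables of dicts with "service" and "trace_id" keys
    (an assumed shape for exported logs and spans).
    """
    logged = {r["service"] for r in log_records if r["trace_id"] == trace_id}
    traced = {s["service"] for s in span_records if s["trace_id"] == trace_id}
    return sorted(logged - traced)
```

For example, with illustrative log records for `ingress-public`, `service-A`, `ingress-private`, and `service-B` all sharing one `trace_id`, and span records only for the first two, the function returns `['ingress-private', 'service-B']`, matching the truncation pattern reported here.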
Expected behavior
All spans from the full call chain should be reported to Datadog, regardless of whether the originating client closed the connection. The 499 on nginx should not prevent downstream spans from being flushed.
Hypothesis
When nginx handles a 499, it appears that:
- nginx aborts the upstream connection to service-B
- The nginx-datadog module may stop propagating or flushing spans for that request upon client disconnection
- Even if service-B continues to run and generates spans, the trace context chain is broken at the nginx level
Questions
- Does the nginx-datadog module explicitly handle the 499 case? Does it flush its own span before aborting?
- Is there a way to ensure that spans generated by downstream services (after the 499) are still correctly associated with the original trace?
- Could the module be enhanced to always flush/finalize spans on client disconnection rather than silently dropping them?
Error Logs
No response
Operating System
No response