fix(llmobs): respect configured writer timeout in _send_payload #18575
Open
jeong-hasang wants to merge 2 commits into
56fef3b to ba97162
BaseLLMObsWriter._send_payload called get_connection(self._intake) without a timeout, so it ignored the writer's configured self._timeout (_DD_LLMOBS_WRITER_TIMEOUT, default 5s) and fell back to the 2s connection default (DEFAULT_TIMEOUT). On high-latency links the 2s socket timeout was exceeded at the tail, all retries hit the same ceiling, and the payload was dropped.
Every other get_connection call in the module passes timeout= explicitly; this was the only one that did not. Pass timeout=self._timeout to match sibling code.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ba97162 to 192d87d
Description
BaseLLMObsWriter._send_payload opens its connection with get_connection(self._intake), passing no timeout argument. As a result, it ignores the writer's own self._timeout and falls back to get_connection's DEFAULT_TIMEOUT of 2s.
On a high-latency connection to the agent/intake, the slowest (tail-latency) requests exceed the 2s socket timeout. All retries hit the same 2s ceiling, so the batch is dropped (failed to send N LLMObs span events).
No env var works around this. _DD_LLMOBS_WRITER_TIMEOUT only sets self._timeout, which this path ignores. DD_TRACE_AGENT_TIMEOUT_SECONDS is not consulted by the LLMObs writer. In production, the only effective mitigation was monkeypatching get_connection to raise its default timeout, after which the timeouts disappeared, confirming the 2s default was the cause.
The fix passes timeout=self._timeout, restoring the 5s default that #9438 intended for the writer.
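The difference between the two call shapes can be illustrated with a minimal, self-contained sketch. Everything here is a stand-in: the `get_connection` below is a hypothetical simplification built on `http.client` (the real ddtrace helper is not reproduced), `SketchWriter` is an illustrative class, and the intake URL is made up. Only the names `_intake`, `_timeout`, `DEFAULT_TIMEOUT`, and the 2s/5s values come from this PR.

```python
from http.client import HTTPConnection
from urllib.parse import urlparse

DEFAULT_TIMEOUT = 2.0  # stand-in for the 2s connection default named in this PR


def get_connection(url, timeout=DEFAULT_TIMEOUT):
    """Hypothetical stand-in for the real helper (assumption, not ddtrace code)."""
    parsed = urlparse(url)
    # HTTPConnection only records the timeout here; no socket is opened yet.
    return HTTPConnection(parsed.hostname, parsed.port or 80, timeout=timeout)


class SketchWriter:
    """Illustrative writer holding the two attributes the PR mentions."""

    def __init__(self, intake, timeout=5.0):
        self._intake = intake
        self._timeout = timeout  # the configured writer timeout (5s default)

    def connect_before_fix(self):
        # No timeout argument: the helper's 2s default silently wins.
        return get_connection(self._intake)

    def connect_after_fix(self):
        # The configured writer timeout is passed through explicitly.
        return get_connection(self._intake, timeout=self._timeout)


writer = SketchWriter("http://localhost:8126")  # made-up intake URL
before = writer.connect_before_fix()
after = writer.connect_after_fix()
print(before.timeout, after.timeout)  # prints "2.0 5.0"
```

In this sketch, changing the `timeout` passed to `SketchWriter` has no effect on `connect_before_fix`, which mirrors why setting the writer timeout did not help on the unfixed path.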
Testing
Added tests/llmobs/test_llmobs_span_agentless_writer.py::test_send_payload_uses_configured_timeout, which mocks get_connection and asserts that _send_payload opens the connection with the writer's configured timeout (not DEFAULT_TIMEOUT).
Ran the llmobs suite locally; the new test passes and no related tests regress.
Risks
Low. One-line change in a background flush thread; it does not block the request path. Behavior change: the effective LLMObs socket timeout goes from 2s to 5s (the intended _DD_LLMOBS_WRITER_TIMEOUT default). No public API change.
Additional Notes
Reproduced on 4.10.3; the bug is unchanged on main. Sibling timeout work for reference: #9438 (writer default 2s→5s).
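For reference, the shape of the regression test described under Testing can be sketched roughly as follows. This is not the test from the PR: `connmod`, `SketchWriter`, and `check_send_payload_uses_configured_timeout` are hypothetical stand-ins so the example runs without ddtrace, and the 7.5s value is chosen only because it differs from both the 2s and 5s defaults.

```python
import types
from unittest import mock

# Hypothetical stand-in for the module that provides get_connection.
connmod = types.SimpleNamespace(DEFAULT_TIMEOUT=2.0)
connmod.get_connection = lambda url, timeout=connmod.DEFAULT_TIMEOUT: None


class SketchWriter:
    """Illustrative writer; only _intake and _timeout mirror names in the PR."""

    def __init__(self, intake, timeout):
        self._intake = intake
        self._timeout = timeout

    def _send_payload(self, payload):
        # The behavior under test: the configured timeout is passed explicitly.
        conn = connmod.get_connection(self._intake, timeout=self._timeout)
        try:
            conn.request("POST", "/", payload)
            return conn.getresponse().status
        finally:
            conn.close()


def check_send_payload_uses_configured_timeout():
    # 7.5 is distinct from DEFAULT_TIMEOUT, so a fallback would be detected.
    writer = SketchWriter("http://localhost:8126", timeout=7.5)
    with mock.patch.object(connmod, "get_connection") as fake:
        writer._send_payload(b"{}")
    # Fails if get_connection was called without timeout= or with another value.
    fake.assert_called_once_with("http://localhost:8126", timeout=7.5)
    return fake.call_args.kwargs["timeout"]


observed = check_send_payload_uses_configured_timeout()
print(observed)  # prints 7.5
```

Using a non-default timeout in the assertion is a deliberate choice in this sketch: a test that configured the writer with 2s would pass even on the unfixed code path.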