fix(langchain): align LangChain/LangGraph tracing with the Python SDK#209

Merged
gustavocidornelas merged 1 commit into main from fix/langchain-trace-payload-size on Jun 11, 2026
@viniciusdsmello viniciusdsmello commented Jun 11, 2026

Contributor

Summary

LangGraph agent traces produced by the OpenlayerHandler callback routinely exceeded the platform's 10MB request limit and were rejected wholesale. Investigating the handler surfaced three distinct problem areas — payload bloat, trace corruption under concurrency, and poor dashboard rendering — all stemming from the same root cause: the TypeScript handler had drifted from the Python SDK's behavior, which is the reference implementation the platform is built around.

This PR aligns the TypeScript LangChain integration with langchain_callback.py. No consumer-side changes required — upgrading the SDK is enough.

1. Payload size

A representative agent run (50KB response, 15 bound tools, 8-message history) shrinks from 1604KB to 659KB (-59%); the gain grows with the number of LLM calls per run.

  • Filter LangGraph internal runs. LANGSMITH_HIDDEN_TAG was defined but never used. Internal plumbing runs (ChannelWrite, branch runnables — all tagged langsmith:hidden) each became a step carrying the full graph state as both inputs and output. They are now skipped, like in Python.
  • Compact rawOutput. It was JSON.stringify({generation, llmOutput, fullResponse}, null, 2): fullResponse duplicated generation + llmOutput, and the pretty-printed string was then escaped inside the outer JSON. It is now {generation, llmOutput}, serialized compactly.
  • Deduplicate invocation params. Bound tool schemas were serialized 4x per chat completion step (modelParameters, metadata.invocation_params, metadata.model_parameters, metadata.extra_params.invocation_params). Now only modelParameters.
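
The hidden-run filter in the first bullet can be sketched as a tag check made before a run is turned into a step. This is a minimal illustration, not the handler's actual code: `shouldSkipRun` and the sample runs are hypothetical, and only the `LANGSMITH_HIDDEN_TAG` name and the `langsmith:hidden` value come from the description above.

```typescript
// Tag value and constant name as given in the PR description.
const LANGSMITH_HIDDEN_TAG = "langsmith:hidden";

// Hypothetical helper: true when a run carries the hidden tag and should not become a step.
function shouldSkipRun(tags?: string[]): boolean {
  return tags !== undefined && tags.indexOf(LANGSMITH_HIDDEN_TAG) !== -1;
}

// Illustrative runs: one internal plumbing run and one user-visible node.
const runs = [
  { name: "ChannelWrite", tags: ["langsmith:hidden"] },
  { name: "agent", tags: ["graph:step:1"] },
];

// Only runs without the hidden tag would be recorded as steps.
const recorded = runs.filter((run) => !shouldSkipRun(run.tags)).map((run) => run.name);
console.log(recorded);
```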

No client-side truncation is performed, matching Python: size is addressed by not serializing redundant data in the first place.
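
To see why dropping fullResponse and the pretty-printing reduces size without truncating anything, the sketch below serializes both shapes and then embeds each as a string field in an outer JSON document, as the description says happens to rawOutput. The sample values and the shape of fullResponse are assumptions for illustration; real payloads are far larger.

```typescript
// Illustrative values only.
const generation = { text: "Hello", message: { role: "assistant", content: "Hello" } };
const llmOutput = { tokenUsage: { promptTokens: 12, completionTokens: 3 } };
// Assumed shape: fullResponse repeats the two fields above.
const fullResponse = { generation, llmOutput };

// Old shape as described: three fields, pretty-printed with 2-space indentation.
const oldRawOutput = JSON.stringify({ generation, llmOutput, fullResponse }, null, 2);
// New shape as described: two fields, compact.
const newRawOutput = JSON.stringify({ generation, llmOutput });

// rawOutput is a string, so it is escaped again inside the outer JSON:
// every quote and newline in it costs an extra character on the wire.
const oldWire = JSON.stringify({ rawOutput: oldRawOutput });
const newWire = JSON.stringify({ rawOutput: newRawOutput });

console.log(oldWire.length, newWire.length);
```

Both versions still decode to the same generation and llmOutput; only the redundant copy and the whitespace are gone.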

2. Concurrency isolation

The handler relied on the tracer's module-global step stack, so concurrent graph runs in the same process nested into one ever-growing merged trace. Steps are now assembled per run keyed by runId/parentRunId (mirroring Python's run_id/parent_run_id maps), so concurrent executions upload as separate traces. When an ambient trace context exists (e.g. a trace()-wrapped function), steps still nest under it as before.
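
A minimal sketch of run-scoped assembly, assuming a simplified step shape: each callback looks up its parent by parentRunId instead of pushing onto a shared stack, so interleaved runs cannot nest into each other. `RunScopedSteps` and `Step` are hypothetical names, and the ambient trace context case is not modeled here.

```typescript
// Hypothetical minimal step shape; the SDK's real step classes carry more fields.
interface Step {
  runId: string;
  name: string;
  children: Step[];
}

class RunScopedSteps {
  // Every in-flight step, keyed by its own runId.
  private steps = new Map<string, Step>();
  // Root steps only: one entry per independent trace.
  private roots = new Map<string, Step>();

  start(runId: string, name: string, parentRunId?: string): void {
    const step: Step = { runId, name, children: [] };
    this.steps.set(runId, step);
    const parent = parentRunId !== undefined ? this.steps.get(parentRunId) : undefined;
    if (parent) {
      parent.children.push(step); // nest under the parent from the same run tree
    } else {
      this.roots.set(runId, step); // no known parent: this starts a new trace
    }
  }

  // Returns the finished root so the caller can upload it; undefined for non-root steps.
  end(runId: string): Step | undefined {
    this.steps.delete(runId);
    const root = this.roots.get(runId);
    if (root) this.roots.delete(runId);
    return root;
  }
}

// Two graph runs whose callbacks interleave in the same process.
const tracker = new RunScopedSteps();
tracker.start("run-a", "graph A");
tracker.start("run-b", "graph B");
tracker.start("llm-a", "ChatModel", "run-a");
tracker.start("llm-b", "ChatModel", "run-b");
tracker.end("llm-a");
tracker.end("llm-b");
const traceA = tracker.end("run-a");
const traceB = tracker.end("run-b");
```

With a single module-global stack, "graph B" would have been pushed under "graph A"; keyed by run id, each root ends with exactly its own child.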

3. Trace shape & step types

  • Chain runs become USER_CALL steps named after the chain (no more "Handoffs: " prefix); the LangGraph root records inputs.prompt and surfaces the final message content as the trace output, so the dashboard shows the actual answer instead of a serialized state object.
  • Before upload, LangChain objects are converted recursively (Python's _convert_langchain_objects): messages render as {role, content} instead of raw lc/kwargs constructor JSON.
  • Agent steps use Python's Agent Tool: <tool> naming with structured {tool, tool_input, log} inputs; retriever inputs become {query}.
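
The recursive conversion in the second bullet can be sketched as a walk over a JSON-like value that rewrites any serialized message it finds. This is an illustration under assumptions, not a port of `_convert_langchain_objects`: the helper name, the role mapping, and the set of handled classes are hypothetical, and the input is assumed to be in the lc/kwargs constructor form mentioned above.

```typescript
// Assumed role mapping; the SDK may handle more classes or use different labels.
const ROLE_BY_CLASS: Record<string, string> = {
  HumanMessage: "user",
  AIMessage: "assistant",
  SystemMessage: "system",
  ToolMessage: "tool",
};

// Hypothetical helper: walks any JSON-like value and rewrites serialized messages.
function convertLangChainObjects(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(convertLangChainObjects);
  if (value === null || typeof value !== "object") return value;

  const obj = value as Record<string, unknown>;
  const id = obj["id"];
  const kwargs = obj["kwargs"];
  // Serialized constructor form: { lc, type: "constructor", id: [...path, ClassName], kwargs }
  if (obj["type"] === "constructor" && Array.isArray(id) && kwargs !== null && typeof kwargs === "object") {
    const role = ROLE_BY_CLASS[String(id[id.length - 1])];
    if (role !== undefined) {
      return { role, content: (kwargs as Record<string, unknown>)["content"] };
    }
  }
  // Any other object: convert each property recursively.
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) out[key] = convertLangChainObjects(obj[key]);
  return out;
}

// A graph state holding one serialized human message.
const state = {
  messages: [
    { lc: 1, type: "constructor", id: ["langchain_core", "messages", "HumanMessage"], kwargs: { content: "hi" } },
  ],
};
const converted = convertLangChainObjects(state);
console.log(JSON.stringify(converted));
```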

4. New step types: HANDOFF and GUARDRAIL

The Python SDK and the platform support handoff and guardrail step types that the TypeScript SDK could not emit:

  • StepType.HANDOFF / HandoffStep (fromComponent, toComponent, handoffData) and StepType.GUARDRAIL / GuardrailStep (action, reason, blocked/detected/redacted entities, confidenceThreshold, blockStrategy, dataType), serializing the same wire fields as Python's steps.py.
  • addHandoffStepToTrace now emits a real handoff step (it previously emitted a chain step with a name prefix); new addGuardrailStepToTrace helper.
  • The LangChain handler maps LangGraph multi-agent handoff tools (transfer_to_<agent> / transfer_back_to_<agent>) to HANDOFF steps with from/to components.
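
The tool-name mapping in the last bullet could look roughly like the following. The pattern covers the two naming forms listed above; `parseHandoffTool`, `HandoffInfo`, and the idea that the caller supplies the currently active agent as the source component are assumptions made for this sketch.

```typescript
// Hypothetical shape mirroring the fromComponent/toComponent fields named above.
interface HandoffInfo {
  fromComponent: string;
  toComponent: string;
}

// Matches transfer_to_<agent> and transfer_back_to_<agent>; captures <agent>.
const HANDOFF_TOOL_PATTERN = /^transfer_(?:back_)?to_(.+)$/;

// Hypothetical helper: handoff info for a handoff tool, or null for any other tool.
function parseHandoffTool(toolName: string, currentAgent: string): HandoffInfo | null {
  const match = HANDOFF_TOOL_PATTERN.exec(toolName);
  if (match === null) return null;
  const target = match[1];
  if (!target) return null;
  return { fromComponent: currentAgent, toComponent: target };
}

console.log(parseHandoffTool("transfer_to_researcher", "supervisor"));
console.log(parseHandoffTool("search_web", "researcher"));
```

Tools that do not match the pattern return null and would be traced as ordinary tool steps.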

5. Tracer

Upload logic extracted into a reusable processAndUploadTrace() (analogous to Python's _upload_and_publish_trace). Upload failures now log a compact summary — pipeline id, inference id, payload size — instead of dumping the entire pretty-printed trace into the logs.
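
A compact failure summary along these lines might be built as follows. The function, the context field names, and the message format are hypothetical; the sketch only shows the idea of logging identifiers and a size rather than the serialized trace itself.

```typescript
// Hypothetical inputs; field names are illustrative, not the SDK's.
interface UploadContext {
  inferencePipelineId: string;
  inferenceId: string;
}

function summarizeUploadFailure(ctx: UploadContext, payload: unknown, error: unknown): string {
  // Length in UTF-16 code units; this equals the byte count only for ASCII payloads.
  const size = (JSON.stringify(payload) ?? "").length;
  const reason = error instanceof Error ? error.message : String(error);
  return (
    `Trace upload failed: pipeline=${ctx.inferencePipelineId} ` +
    `inference=${ctx.inferenceId} payloadChars=${size} reason=${reason}`
  );
}

const summary = summarizeUploadFailure(
  { inferencePipelineId: "p1", inferenceId: "i1" },
  { a: 1 },
  new Error("413"),
);
console.log(summary);
```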

Validation

Test files are intentionally not part of this PR; validation was performed locally:

  • Unit/integration tests at the handler→upload boundary covering each fix, written test-first; the concurrency test reproduces the merged-trace bug against the old implementation
  • End-to-end tests running a real LangGraph StateGraph via graph.withConfig({callbacks}) — real callback events including LangGraph's hidden internal runs — with only the HTTP boundary mocked; these fail against the previous implementation and pass with this branch
  • Live validation: real LangGraph runs traced with this branch uploaded successfully to a real inference pipeline, confirming ingestion and the improved trace rendering; a probe trace confirmed the platform persists all step types used here, including handoff and guardrail
  • tests/integrations/claudeAgentSdk.test.ts (17 tests) green — it shares the tracer internals
  • tsc --noEmit, eslint and prettier clean; remaining jest failures are pre-existing on main (generated api-resources tests require the mock Steady server; openai-tracer.test.ts has one pre-existing failure)

Out of scope

  • Migrating tracedTool from function_call to tool steps (behavior change for existing users — needs a product decision)
  • Gzip compression of the data-stream POST (requires confirming backend support for Content-Encoding: gzip)

🤖 Generated with Claude Code

@viniciusdsmello viniciusdsmello force-pushed the fix/langchain-trace-payload-size branch from 4ad2caa to 27236a6 Compare June 11, 2026 15:21
@viniciusdsmello viniciusdsmello changed the title from "fix(langchain): shrink trace payloads and isolate concurrent runs" to "fix(langchain): align LangChain/LangGraph tracing with the Python SDK" Jun 11, 2026
LangGraph agent traces routinely exceeded the platform's 10MB request
limit and were rejected wholesale, concurrent runs corrupted each
other's traces, and steps rendered poorly in the dashboard. All of it
stemmed from the TypeScript handler drifting from langchain_callback.py,
the reference implementation. This aligns the two:

Payload size (-59% on a representative agent run):
- Skip LangGraph internal runs tagged langsmith:hidden (ChannelWrite,
  branch runnables), which carried the full graph state as both inputs
  and output. The constant existed but was never used.
- Compact rawOutput: drop the duplicated fullResponse blob and the
  pretty-printing that inflated the escaped string.
- Serialize invocation params (including bound tool schemas) once per
  chat completion step, in modelParameters, instead of four times.

Concurrency:
- Assemble steps per run keyed by runId/parentRunId, mirroring the
  Python handler's run_id/parent_run_id maps, instead of relying on the
  tracer's module-global step stack. Concurrent graph executions now
  upload as separate traces; ambient trace contexts still nest as
  before.

Trace shape:
- Chain runs become USER_CALL steps named after the chain; the LangGraph
  root records inputs.prompt and surfaces the final message content as
  the trace output.
- LangChain objects are converted recursively before upload, so messages
  render as {role, content} instead of raw lc/kwargs constructor JSON.
- Agent steps use Python's "Agent Tool: <tool>" naming with structured
  inputs; retriever inputs become {query}.

Step types:
- Add HANDOFF/GUARDRAIL step types with HandoffStep and GuardrailStep
  serializing the same wire fields as Python's steps.py.
- addHandoffStepToTrace now emits a real handoff step (previously a
  chain step with a name prefix); add addGuardrailStepToTrace.
- Map LangGraph multi-agent handoff tools (transfer_to_<agent>) to
  HANDOFF steps with from/to components.

Tracer:
- Extract upload logic into a reusable processAndUploadTrace(),
  analogous to Python's _upload_and_publish_trace.
- Log a compact summary on upload failure instead of dumping the entire
  pretty-printed trace into the logs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@viniciusdsmello viniciusdsmello force-pushed the fix/langchain-trace-payload-size branch from cd37a78 to 61704b1 Compare June 11, 2026 17:11
@gustavocidornelas gustavocidornelas merged commit 11d1039 into main Jun 11, 2026
5 checks passed
@gustavocidornelas gustavocidornelas deleted the fix/langchain-trace-payload-size branch June 11, 2026 18:00