feat!: Add per-execution runId, at-most-once tracking, and cross-process tracker resumption #133
jsonbailey wants to merge 28 commits into main
- Each tracker now carries a runId (UUIDv4) included in all emitted events, scoping every metric to a single execution
- At-most-once semantics: duplicate calls to track_duration, track_tokens, track_success/track_error, track_feedback, and track_time_to_first_token on the same tracker are dropped with a warning
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ess tracker resumption Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bdf7384 to 211ead4
…osure The run_id parameter on LDAIConfigTracker is now required (no default). UUID generation happens in the tracker_factory closure in client.py, keeping the tracker itself a plain data holder. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
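The arrangement this commit describes can be sketched roughly as below. `Tracker` and `make_tracker_factory` are hypothetical stand-ins, not the SDK's `LDAIConfigTracker` or the actual closure in client.py; the sketch only illustrates UUID generation living in the factory closure while the tracker remains a plain data holder.

```python
import uuid
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tracker:
    """Hypothetical stand-in for a tracker: it only holds data."""
    run_id: str       # required, no default
    config_key: str


def make_tracker_factory(config_key: str) -> Callable[[], Tracker]:
    """Return a closure that mints a fresh run_id on every call (assumed shape)."""
    def tracker_factory() -> Tracker:
        return Tracker(run_id=str(uuid.uuid4()), config_key=config_key)
    return tracker_factory


create_tracker = make_tracker_factory("my-config")
first, second = create_tracker(), create_tracker()
```

Each call to `create_tracker()` yields a tracker with its own `run_id`, so two invocations never share an execution scope.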
- Break long tuple lines in client.py to stay under 120 char limit
- Add required run_id parameter to LDAIConfigTracker calls in openai and langchain provider tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the redundant _tracked dict from LDAIConfigTracker. The summary already stores each metric with None as the unset sentinel, so the nil-check on summary properties serves as the at-most-once guard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
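A minimal sketch of the None-sentinel guard this commit describes, assuming a simplified summary with a single metric. `Summary`, `Tracker`, the `events` list, and the logger name are illustrative only and do not mirror the SDK's real classes.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

log = logging.getLogger("sketch")  # hypothetical logger, not the SDK's shared log


@dataclass
class Summary:
    """Hypothetical metric summary; None means 'not tracked yet'."""
    duration_ms: Optional[int] = None


class Tracker:
    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.summary = Summary()
        self.events: List[dict] = []  # stand-in for emitted events

    def track_duration(self, duration_ms: int) -> None:
        # The summary doubles as the at-most-once guard: a non-None value
        # means this metric was already recorded on this tracker.
        if self.summary.duration_ms is not None:
            log.warning("duration already tracked, dropping: %s", {"runId": self.run_id})
            return
        self.summary.duration_ms = duration_ms
        self.events.append({"runId": self.run_id, "duration": duration_ms})


tracker = Tracker(run_id="example-run")
tracker.track_duration(120)
tracker.track_duration(999)  # dropped with a warning
```

Checking `is not None` rather than truthiness matters here: a legitimately recorded value of 0 still counts as tracked.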
New order: ld_client, run_id, config_key, variation_key, version, model_name, provider_name, context, graph_key. All call sites converted to keyword arguments for resilience against future reorders. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…oken Reorder LDAIConfigTracker.__init__ to match updated spec: context now comes before model_name and provider_name. Also fix resumption_token to omit variationKey from the JSON when it is empty, and handle the absent key when reconstructing from a token. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All six at-most-once guard warnings in tracker.py now log the track data dict (runId, configKey, etc.) to aid debugging duplicate-track scenarios. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move the resumption token decoding logic from LDAIClient.create_tracker into a classmethod on LDAIConfigTracker per spec 1.1.20.2. The client method now delegates to LDAIConfigTracker.from_resumption_token. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Match the resumption token behavior: only include variationKey in the track data dict when it has a non-empty value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The create_tracker field on AIConfig is now always a callable that returns a working tracker, even when the config is disabled. The factory is always set to tracker_factory — callers use the enabled flag to decide whether to proceed, not the factory result. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BREAKING CHANGE: The `tracker` field has been removed from all config dataclasses (AICompletionConfig, AIJudgeConfig, AIAgentConfig). Users must now call `config.create_tracker()` to obtain a tracker instance. ManagedModel and ManagedAgent no longer accept a tracker constructor parameter — they call `create_tracker()` from the config on each invocation. The `__evaluate` return tuple no longer includes a pre-created tracker. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add graphKey to the resumption token following the spec key order: runId, configKey, variationKey (if set), version, graphKey (if set). The from_resumption_token classmethod now decodes and passes graphKey to the tracker constructor. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
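The token layout described here (URL-safe Base64 without padding over JSON, keys in the order runId, configKey, variationKey, version, graphKey) could be sketched as follows. `encode_token` and `decode_token` are hypothetical helpers, and the compact JSON separators are an assumption; the SDK's actual serialization may differ.

```python
import base64
import json


def encode_token(run_id: str, config_key: str, version: int,
                 variation_key: str = "", graph_key: str = "") -> str:
    """Build the payload in spec key order, omitting empty optional keys."""
    payload = {"runId": run_id, "configKey": config_key}
    if variation_key:
        payload["variationKey"] = variation_key
    payload["version"] = version
    if graph_key:
        payload["graphKey"] = graph_key
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # URL-safe alphabet, with the trailing '=' padding stripped.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_token(token: str) -> dict:
    """Restore the stripped padding, then decode and parse the JSON."""
    padded = token + "=" * (-len(token) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


token = encode_token("r1", "cfg", 3)
full = decode_token(encode_token("r1", "cfg", 3, variation_key="v", graph_key="g"))
```

Because Python dicts preserve insertion order, building the payload step by step is enough to keep the key order stable in the encoded JSON.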
Judge now calls self._ai_config.create_tracker() per evaluate() invocation instead of receiving a tracker at construction time. ManagedAgentGraph no longer stores or exposes a tracker. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace logging.getLogger(__name__) with the SDK's shared log instance (from ldai import log) for consistency with the rest of the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Migrate langchain and openai provider packages from config.tracker to config.create_tracker() and fix test signatures to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… factory Per AIGRAPH spec 1.4.3, AgentGraphDefinition now has a create_tracker callable that returns a new AIGraphTracker per invocation instead of storing a pre-created instance. Removes get_tracker() method entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_flush_final_segment and _track_tool_calls were each calling create_tracker() independently, generating new runIds that broke per-execution event correlation. Now build_node creates one tracker per node, cached in _node_trackers, and reused by all tracking methods. Adds test_same_run_id_across_token_success_and_tool_call_events to verify all node-level events for a single execution share one runId. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
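The caching fix can be illustrated with a small sketch. `NodeTracker`, `Runner`, and `tracker_for` are hypothetical names; only `build_node` and `_node_trackers` come from the commit message, and their real signatures are not shown in this PR page.

```python
import uuid
from typing import Callable, Dict


class NodeTracker:
    """Hypothetical per-node tracker carrying one run_id."""
    def __init__(self) -> None:
        self.run_id = str(uuid.uuid4())


class Runner:
    def __init__(self, factories: Dict[str, Callable[[], NodeTracker]]) -> None:
        self._factories = factories
        self._node_trackers: Dict[str, NodeTracker] = {}

    def build_node(self, node_key: str) -> None:
        # Call the factory exactly once per node and cache the result.
        self._node_trackers[node_key] = self._factories[node_key]()

    def tracker_for(self, node_key: str) -> NodeTracker:
        # All tracking paths read from the cache instead of calling the factory.
        return self._node_trackers[node_key]


runner = Runner({"node-a": NodeTracker})
runner.build_node("node-a")
token_tracker = runner.tracker_for("node-a")  # e.g. used when flushing tokens
tool_tracker = runner.tracker_for("node-a")   # e.g. used when tracking tool calls
```

Both lookups return the same instance, so every event emitted for the node in this run carries the same `run_id`.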
run() and _build_agents() each called create_tracker() on the graph, producing two tracker instances. Now run() creates the tracker once and passes it to _build_agents() so handoff callbacks and run-level tracking share the same instance. Tests now assert graph.create_tracker is called exactly once per run and node create_tracker is called exactly once per node. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
from_resumption_token and LDAIClient.create_tracker now return ldclient.Result instead of raising ValueError on invalid tokens, letting callers handle errors without try/except. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
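The shift from raising to returning a result value might look roughly like this. The `Result` class below is a local, hypothetical stand-in and not `ldclient.Result`, whose exact interface is not shown in this PR page; the set of required fields (`runId`, `configKey`, `version`) is likewise an assumption.

```python
import base64
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Result:
    """Hypothetical success-or-error container (not the SDK's type)."""
    value: Optional[Any] = None
    error: Optional[str] = None

    def is_success(self) -> bool:
        return self.error is None


def parse_token(token: str) -> Result:
    try:
        padded = token + "=" * (-len(token) % 4)
        data = json.loads(base64.urlsafe_b64decode(padded))
    except Exception as exc:  # any decode or parse failure becomes an error result
        return Result(error=f"invalid resumption token: {exc}")
    if not isinstance(data, dict):
        return Result(error="invalid resumption token: expected a JSON object")
    missing = [k for k in ("runId", "configKey", "version") if k not in data]
    if missing:
        return Result(error=f"invalid resumption token: missing {missing}")
    return Result(value=data)


bad = parse_token("not-a-token")
```

Callers can then branch on `is_success()` rather than wrapping the call in try/except.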
Change AgentGraphDefinition.create_tracker from Callable[[], AIGraphTracker] with default lambda: None to Optional[Callable[[], AIGraphTracker]] with default None. Guard call sites in both runners with `is not None` before invoking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The disabled() factory on AIConfigDefault and subclasses created configs without tracker factories, breaking the spec requirement. Replace with private module-level constants in client.py, matching how js-core handles disabled configs as an internal concern. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raise a clear RuntimeError if create_tracker returns None rather than letting it crash with AttributeError on track_metrics_of_async. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Cache node trackers in langgraph_callback_handler flush() to avoid creating multiple trackers per node with different runIds
- Read graph key directly from config instead of instantiating a tracker just for debug logging in langgraph_agent_graph_runner
- Simplify redundant except (json.JSONDecodeError, Exception) to except Exception in tracker.py from_resumption_token
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AIConfig.create_tracker is now a required field with no default value. The SDK client always injects a real tracker factory, so any direct construction of AIConfig subclasses must now provide one explicitly. This eliminates the entire class of null-safety issues around tracker factories. Reverts the RuntimeError guard in Judge.evaluate() since it is no longer needed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Convenience factory for the common fallback case. Added to AIConfigDefault, AICompletionConfigDefault, AIAgentConfigDefault, and AIJudgeConfigDefault. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace private _DISABLED_*_DEFAULT constants and inline AIXxxConfigDefault(enabled=False) calls with the new disabled() classmethod. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit d721142. Configure here.
    @classmethod
    def disabled(cls) -> 'AICompletionConfigDefault':
        return cls(enabled=False)
Redundant disabled classmethod overrides in config subclasses
Low Severity
AICompletionConfigDefault.disabled(), AIAgentConfigDefault.disabled(), and AIJudgeConfigDefault.disabled() all override the base AIConfigDefault.disabled() with identical logic (return cls(enabled=False)). The base class already uses cls, so calling disabled() on any subclass correctly returns an instance of that subclass. The overrides only add narrower return-type annotations, which could instead be achieved with typing.Self.
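A sketch of the `typing.Self` alternative the review mentions, using hypothetical `ConfigDefault` and `CompletionConfigDefault` classes in place of the SDK's. The import is guarded by `TYPE_CHECKING` and the annotation is a string so the example also runs on interpreters older than Python 3.11, where `typing.Self` is unavailable.

```python
from dataclasses import dataclass
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from typing import Self  # Python 3.11+; typing_extensions offers a backport


@dataclass
class ConfigDefault:
    """Hypothetical base class standing in for AIConfigDefault."""
    enabled: bool = True

    @classmethod
    def disabled(cls) -> "Self":
        # cls is the class the method was called on, so subclasses
        # get an instance of their own type without overriding.
        return cls(enabled=False)


@dataclass
class CompletionConfigDefault(ConfigDefault):
    """Hypothetical subclass with no disabled() override."""


config = CompletionConfigDefault.disabled()
```

With this shape a type checker infers `CompletionConfigDefault` for `config`, which is the narrowing the per-subclass overrides were providing.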
Additional Locations (2)


Summary
- `runId`: Every tracker now includes a unique `runId` (UUID) in all track event payloads, enabling billing isolation per execution
- `create_tracker()` factory on config objects: `AICompletionConfig`, `AIAgentConfig`, and `AIJudgeConfig` now carry an optional `create_tracker` callable that returns a fresh `LDAIConfigTracker` with a new `runId` each time it's called. Set to `None` when the config is disabled.
- `ManagedModel.invoke()`, `ManagedAgent.run()`, and `Judge.evaluate()` now call `create_tracker()` at the start of each invocation to get a fresh tracker, fixing the multi-turn tracking issue where at-most-once guards blocked metrics from second+ invocations
- `resumption_token` property on tracker: URL-safe Base64-encoded (no padding) JSON string containing `{runId, configKey, variationKey, version}` for cross-process tracker reconstruction
- `LDAIClient.create_tracker(token, context)`: Reconstructs a tracker from a resumption token for deferred feedback scenarios. Validates required fields and raises `ValueError` for invalid tokens.

Test plan
- `create_tracker` callable; disabled config has `None`
- `create_tracker()` call returns a new tracker with a distinct `runId`
- `ManagedAgent.run()` uses `create_tracker()` when available, falls back to stored tracker
- `create_tracker(token, context)` reconstructs tracker with original `runId` and empty model/provider
- `ValueError`

🤖 Generated with Claude Code
Note
High Risk
High risk because it refactors core tracking APIs across configs, managed wrappers, and both LangChain/OpenAI graph runners, changing tracker lifecycles and event payloads. Incorrect factory usage or caching could break metrics emission or correlation across runs.
Overview
Moves tracking from a stored `tracker` instance to a per-invocation `create_tracker()` factory across configs, managed wrappers (`ManagedModel`, `ManagedAgent`, `ManagedAgentGraph`), judges, and both LangChain/OpenAI agent-graph runners, ensuring each execution gets a fresh tracker.

Adds a per-execution `runId` to all `LDAIConfigTracker` events, introduces at-most-once guards for key metrics (duration, tokens, success/error, feedback, time to first token), and implements cross-process tracker resumption via `LDAIConfigTracker.resumption_token` + `LDAIClient.create_tracker(token, context)`.

Updates agent-graph tracking to cache per-node tracker instances during a run (so tool calls/durations/tokens share the same `runId`) and adjusts tests to validate factory call counts, `runId` consistency, and resumption-token behavior.

Reviewed by Cursor Bugbot for commit d721142. Bugbot is set up for automated code reviews on this repo.