Improve Parent Resolving in OpenTelemetry tracing and add support for configuring the Batch Span Processor (BSP) #2561

Open
KalinduGandara wants to merge 1 commit into wso2:master from KalinduGandara:improve_otel

Conversation

KalinduGandara commented May 15, 2026

Contributor

This pull request introduces enhancements to the OpenTelemetry tracing infrastructure in Synapse, focusing on improved parent span tracking, configurability, and context safety. The main changes include the introduction of a stack-based mechanism for tracking parent spans, making batch span processor parameters configurable, and ensuring correct cloning of tracing context.

It also adds support for configuring the Batch Span Processor (BSP) so that it can handle high-volume trace scenarios.

Tracing infrastructure improvements:

  • Introduced ParentSpanWrapperStackManager to maintain a stack of span wrapper IDs in the message context, allowing accurate tracking of open tracing spans and correct parent resolution for new spans. (ParentSpanWrapperStackManager.java, SynapseConstants.java)
  • Updated span creation and completion logic to push and pop span IDs on the stack, ensuring the stack accurately reflects the current span hierarchy. (SpanStore.java)
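
The push/pop/peek behaviour described above can be sketched as follows. This is a minimal illustration, not the PR's code: a plain `Map` stands in for the message context, an `ArrayDeque` stands in for the stack, and the class and method names are hypothetical. The property key `synapse.parent.stack` is the one named in the review summary of this PR.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-in: a plain Map plays the role of the message context. */
final class ParentStackSketch {

    // Key named in the review summary; everything else here is a hypothetical helper.
    static final String PARENT_STACK_KEY = "synapse.parent.stack";

    @SuppressWarnings("unchecked")
    private static Deque<String> stackOf(Map<String, Object> ctx, boolean createIfMissing) {
        Object existing = ctx.get(PARENT_STACK_KEY);
        if (existing instanceof Deque) {
            return (Deque<String>) existing;
        }
        if (!createIfMissing) {
            return null;
        }
        Deque<String> created = new ArrayDeque<>();
        ctx.put(PARENT_STACK_KEY, created);
        return created;
    }

    /** Called when a span starts: the new span becomes the current parent. */
    static void push(Map<String, Object> ctx, String spanWrapperId) {
        stackOf(ctx, true).push(spanWrapperId);
    }

    /** Called when a span finishes: returns the removed ID, or null if nothing was open. */
    static String pop(Map<String, Object> ctx) {
        Deque<String> stack = stackOf(ctx, false);
        return stack == null ? null : stack.pollFirst();
    }

    /** Returns the innermost open span ID without removing it, or null if none is open. */
    static String peek(Map<String, Object> ctx) {
        Deque<String> stack = stackOf(ctx, false);
        return stack == null ? null : stack.peekFirst();
    }

    public static void main(String[] args) {
        Map<String, Object> ctx = new HashMap<>();
        push(ctx, "proxy-span");
        push(ctx, "mediator-span");
        System.out.println(peek(ctx)); // mediator-span
        pop(ctx);
        System.out.println(peek(ctx)); // proxy-span
    }
}
```

The sketch always removes the top entry on `pop`; whether the actual implementation removes by ID or by position is not stated in this conversation.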

Parent span resolution:

  • Modified parent span resolution to use the top of the stack for determining the parent, falling back to the latest active span if the stack is empty. (LatestActiveParentResolver.java, ParentResolver.java)
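
A rough sketch of that resolution order, under stated assumptions: span wrappers are reduced to plain names, the active spans live in an insertion-ordered map keyed by span wrapper ID, and `resolveParent` is a hypothetical helper rather than the signature used in `LatestActiveParentResolver`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;

final class ParentResolutionSketch {

    /**
     * Prefers the span whose ID is on top of the stack; if the stack is empty
     * (or its top ID is unknown), falls back to the most recently added active span.
     */
    static String resolveParent(Deque<String> parentStack, LinkedHashMap<String, String> activeSpans) {
        String topId = parentStack == null ? null : parentStack.peekFirst();
        if (topId != null && activeSpans.containsKey(topId)) {
            return activeSpans.get(topId);
        }
        String latest = null;
        for (String spanName : activeSpans.values()) {
            latest = spanName; // last iteration wins: the most recently inserted entry
        }
        return latest;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> active = new LinkedHashMap<>();
        active.put("1", "proxy");
        active.put("2", "callout");

        Deque<String> stack = new ArrayDeque<>();
        stack.push("1");

        System.out.println(resolveParent(stack, active));              // proxy (stack top)
        System.out.println(resolveParent(new ArrayDeque<>(), active)); // callout (fallback)
    }
}
```

The first call shows why the stack matters: the latest active span ("callout") is not necessarily the span that encloses the new one.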

Configurability and robustness:

  • Made batch span processor parameters (queue size, batch size, schedule delay, export timeout) configurable via properties, with validation and sensible defaults. (OTLPTelemetryManager.java, TelemetryConstants.java)
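
The "validation and sensible defaults" part could look roughly like the sketch below. The property keys, the default values, and the use of `java.util.Properties` are assumptions for illustration; the real keys and defaults are defined in `TelemetryConstants` and are not shown in this conversation.

```java
import java.util.Properties;

final class BspPropertySketch {

    // Hypothetical keys for illustration only.
    static final String MAX_QUEUE_SIZE = "example.bsp.max.queue.size";
    static final String MAX_BATCH_SIZE = "example.bsp.max.batch.size";

    /** Returns the property as a positive int, or defaultValue if it is missing or invalid. */
    static int getIntProperty(Properties props, String key, int defaultValue) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            if (value > 0) {
                return value;
            }
            System.err.println(key + " must be positive, got " + value + "; using " + defaultValue);
        } catch (NumberFormatException e) {
            System.err.println(key + " is not an integer: '" + raw + "'; using " + defaultValue);
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(MAX_QUEUE_SIZE, "4096");
        props.setProperty(MAX_BATCH_SIZE, "-1");
        System.out.println(getIntProperty(props, MAX_QUEUE_SIZE, 2048)); // 4096
        System.out.println(getIntProperty(props, MAX_BATCH_SIZE, 512));  // 512 (invalid value rejected)
    }
}
```

In the OpenTelemetry Java SDK, values like these would typically be handed to the `BatchSpanProcessor` builder through `setMaxQueueSize`, `setMaxExportBatchSize`, `setScheduleDelay`, and `setExporterTimeout`.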

Message context cloning:

  • Ensured that the parent stack is properly cloned when the message context is duplicated, preventing cross-branch contamination of span hierarchies. (MessageHelper.java)
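
To make the contamination risk concrete, here is a small sketch in which a `Map` again stands in for the message context. Copying the properties alone would leave both contexts pointing at the same stack object, so the hypothetical `cloneContext` helper below gives the clone its own copy. None of these names come from `MessageHelper`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class ContextCloneSketch {

    static final String PARENT_STACK_KEY = "synapse.parent.stack";

    /** Copies the properties and gives the clone an independent copy of the parent stack. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> cloneContext(Map<String, Object> original) {
        Map<String, Object> clone = new HashMap<>(original); // shallow: values are still shared
        Object stack = original.get(PARENT_STACK_KEY);
        if (stack instanceof Deque) {
            Deque<String> copy = new ArrayDeque<>((Deque<String>) stack);
            clone.put(PARENT_STACK_KEY, copy); // replace the shared reference
        }
        return clone;
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        Map<String, Object> original = new HashMap<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push("proxy-span");
        original.put(PARENT_STACK_KEY, stack);

        Map<String, Object> branch = cloneContext(original);
        ((Deque<String>) branch.get(PARENT_STACK_KEY)).push("branch-span");

        System.out.println(stack.peekFirst()); // proxy-span (original is unaffected)
    }
}
```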

These changes collectively improve the accuracy, configurability, and safety of distributed tracing in Synapse.

KalinduGandara requested a review from chanikag as a code owner May 15, 2026 06:59
KalinduGandara requested review from Copilot and removed request for chanikag May 15, 2026 06:59

coderabbitai Bot commented May 15, 2026

📝 Walkthrough

This pull request enhances OpenTelemetry tracing in Synapse by introducing a message-context-backed stack for tracking parent span wrapper IDs. It adds a new ParentSpanWrapperStackManager utility that manages per-MessageContext span ID stacks with push, pop, peek, and clone operations. The parent resolution path is updated to consult this stack before falling back to the latest active span wrapper. Additionally, batch span processor configuration is made tunable through properties (queue size, batch size, schedule delay, timeout), and the stack is integrated into span lifecycle operations and message cloning to ensure consistent parent tracking across distributed spans.

Sequence Diagram

sequenceDiagram
  participant Client
  participant SpanStore
  participant ParentSpanWrapperStackManager
  participant LatestActiveParentResolver
  participant BatchSpanProcessor

  Client->>SpanStore: addSpanWrapper(synCtx)
  SpanStore->>ParentSpanWrapperStackManager: push(spanWrapperId, synCtx)
  ParentSpanWrapperStackManager-->>SpanStore: stack updated

  Client->>LatestActiveParentResolver: resolveParent(spanStore, synCtx)
  LatestActiveParentResolver->>ParentSpanWrapperStackManager: peekParentSpanWrapperId(synCtx)
  ParentSpanWrapperStackManager-->>LatestActiveParentResolver: parentSpanWrapperId or null
  alt Parent found
    LatestActiveParentResolver->>SpanStore: getSpanWrapper(parentId)
    SpanStore-->>LatestActiveParentResolver: SpanWrapper
  else Not found
    LatestActiveParentResolver->>SpanStore: resolveLatestActiveSpanWrapper()
    SpanStore-->>LatestActiveParentResolver: SpanWrapper
  end

  Client->>SpanStore: finishSpan(synCtx)
  SpanStore->>ParentSpanWrapperStackManager: pop(synCtx)
  ParentSpanWrapperStackManager-->>SpanStore: stack updated

  Client->>BatchSpanProcessor: configured with tuned properties
  Note over BatchSpanProcessor: max queue, batch size, delay, timeout
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 64.29% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The PR description covers the main technical changes and improvements, but lacks required sections from the template such as Purpose/Goals, User stories, Release notes, Documentation, Security checks, and Test environment details. | Complete the PR description by adding Purpose, Goals, User stories, Release notes, Documentation links, Security checks (secure coding, FindSecurityBugs, secrets verification), and Test environment details including JDK versions and operating systems tested. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main changes: improved parent span resolution in OpenTelemetry tracing and configurability of Batch Span Processor parameters. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/stores/SpanStore.java (1)

96-155: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Address parent stack management inconsistency in MessageContext overloads.

The addSpanWrapper overload for Synapse MessageContext (line 96) calls ParentSpanWrapperStackManager.push (line 118), but the org.apache.axis2.context.MessageContext overload (line 132) does not. The same inconsistency exists in finishSpan, where the Synapse variant calls pop but the Axis2 variant does not. Both overloads are actively used in production code (SpanHandler.java lines 295 and 381), which means this inconsistency could cause stack corruption or incorrect parent span resolution if the Axis2 path should also integrate with the parent span wrapper stack. Verify whether the Axis2 overloads intentionally exclude stack tracking or if this is an oversight that needs to be addressed.
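
The concern can be illustrated with a toy model. It assumes, purely for illustration, that a span opened through a stack-aware path could be closed through a path that does not touch the stack; whether that actually happens in Synapse is exactly what the comment asks to verify. All names below are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class UnbalancedStackSketch {

    private final Deque<String> parentStack = new ArrayDeque<>();

    /** Models an overload that pushes the span ID when a span starts. */
    void startTracked(String spanId) {
        parentStack.push(spanId);
    }

    /** Models an overload that pops when a span finishes. */
    void finishTracked() {
        parentStack.pollFirst();
    }

    /** Models an overload that finishes a span without touching the stack. */
    void finishUntracked() {
        // intentionally empty: no pop happens on this path
    }

    String currentParent() {
        return parentStack.peekFirst();
    }

    public static void main(String[] args) {
        UnbalancedStackSketch tracker = new UnbalancedStackSketch();
        tracker.startTracked("span-A");
        tracker.finishUntracked();
        // span-A has ended, yet it is still reported as the parent for new spans
        System.out.println(tracker.currentParent()); // span-A
    }
}
```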

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/stores/SpanStore.java`
around lines 96 - 155, The Axis2 MessageContext overload of addSpanWrapper (the
method with parameter org.apache.axis2.context.MessageContext) is missing a
ParentSpanWrapperStackManager.push call that the Synapse MessageContext overload
invokes, and its corresponding finishSpan overload also lacks the matching
ParentSpanWrapperStackManager.pop; update the Axis2 addSpanWrapper to call
ParentSpanWrapperStackManager.push(spanId, msgCtx) after
activeSpanWrappers.add(...) and update the Axis2 finishSpan variant to call
ParentSpanWrapperStackManager.pop(msgCtx) at the appropriate point to mirror the
Synapse MessageContext flow (ensure you edit the methods named addSpanWrapper
and finishSpan that take org.apache.axis2.context.MessageContext and keep
behavior for anonymous sequences and componentUniqueIdWiseSpanWrappers
consistent).
🧹 Nitpick comments (1)
modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/OTLPTelemetryManager.java (1)

259-274: ⚡ Quick win

Consider consolidating duplicate validation logic.

The getIntProperty() method follows the same validation pattern as the existing getMetricIntervalSeconds() method (lines 234-256). Both load a property, parse to int, validate positivity, and fall back to a default with warnings. While the current implementation is correct, consolidating this logic would reduce duplication and improve maintainability.

♻️ Proposed refactor to consolidate logic

The existing getMetricIntervalSeconds() could be refactored to use getIntProperty():

 private int getMetricIntervalSeconds() {
-    String metricIntervalString = SynapsePropertiesLoader.getPropertyValue(
+    return getIntProperty(
             TelemetryConstants.OPENTELEMETRY_METRIC_PUSH_INTERVAL_SECONDS,
             TelemetryConstants.OPENTELEMETRY_METRIC_DEFAULT_PUSH_INTERVAL_SECONDS);
-
-    int metricIntervalSeconds;
-    try {
-        metricIntervalSeconds = Integer.parseInt(metricIntervalString);
-    } catch (NumberFormatException e) {
-        String message = "Invalid OpenTelemetry metric push interval: " + metricIntervalString + ". Using default value: "
-                + TelemetryConstants.OPENTELEMETRY_METRIC_DEFAULT_PUSH_INTERVAL_SECONDS + " seconds.";
-        logger.warn(message);
-        metricIntervalSeconds = Integer.parseInt(
-                TelemetryConstants.OPENTELEMETRY_METRIC_DEFAULT_PUSH_INTERVAL_SECONDS);
-    }
-    if (metricIntervalSeconds <= 0) {
-        logger.warn("OpenTelemetry metric push interval must be positive. Got: " + metricIntervalSeconds
-                + ". Using default value: " + TelemetryConstants.OPENTELEMETRY_METRIC_DEFAULT_PUSH_INTERVAL_SECONDS + " seconds.");
-        metricIntervalSeconds = Integer.parseInt(
-                TelemetryConstants.OPENTELEMETRY_METRIC_DEFAULT_PUSH_INTERVAL_SECONDS);
-    }
-    return metricIntervalSeconds;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/OTLPTelemetryManager.java`
around lines 259 - 274, Both getIntProperty and getMetricIntervalSeconds
implement the same load-parse-validate-positive-and-fallback pattern;
consolidate the logic by extracting a single helper or by making
getMetricIntervalSeconds call getIntProperty so there’s one canonical
implementation for loading, parsing (Integer.parseInt), checking >0, logging the
warning (including propertyName, value and default), and returning the default
on error/invalid value; update references to use the shared method (keep method
names getIntProperty and getMetricIntervalSeconds as entry points or make
getMetricIntervalSeconds delegate to getIntProperty) so duplication is removed
while preserving existing warnings and default behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/ParentSpanWrapperStackManager.java`:
- Around line 95-99: The public method ParentSpanWrapperStackManager.copyOf
currently assumes parentStack is non-null and will NPE if called with null; add
a defensive null check at the start of copyOf (referencing the method name
copyOf and its local variable clone) so that when parentStack is null it returns
an empty Stack<String> (or optionally throws a clear IllegalArgumentException)
instead of allowing a NullPointerException; ensure the method still creates and
returns a new Stack<String> populated via addAll(parentStack) only when
parentStack is non-null.
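
A minimal sketch of the null-safe variant described in this comment, returning an empty stack for null input. The method body is an assumption reconstructed from the comment's wording (`copyOf`, `clone`, `addAll`); it is not copied from the PR.

```java
import java.util.Stack;

final class NullSafeCopySketch {

    /** Returns an independent copy of parentStack, or an empty stack when it is null. */
    static Stack<String> copyOf(Stack<String> parentStack) {
        Stack<String> clone = new Stack<>();
        if (parentStack != null) {
            clone.addAll(parentStack); // preserves bottom-to-top order
        }
        return clone;
    }

    public static void main(String[] args) {
        System.out.println(copyOf(null).isEmpty()); // true

        Stack<String> source = new Stack<>();
        source.push("proxy-span");
        source.push("mediator-span");
        System.out.println(copyOf(source).peek()); // mediator-span
    }
}
```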

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5752ccf-5ebc-49a7-b284-9d7f1291eaae

📥 Commits

Reviewing files that changed from the base of the PR and between 850a5bf and 5a3fef4.

📒 Files selected for processing (8)
  • modules/core/src/main/java/org/apache/synapse/SynapseConstants.java
  • modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/OTLPTelemetryManager.java
  • modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/ParentSpanWrapperStackManager.java
  • modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/TelemetryConstants.java
  • modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/parentresolving/LatestActiveParentResolver.java
  • modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/parentresolving/ParentResolver.java
  • modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/stores/SpanStore.java
  • modules/core/src/main/java/org/apache/synapse/util/MessageHelper.java

Copilot AI left a comment

Pull request overview

This PR enhances Synapse’s OpenTelemetry tracing by introducing a per-message stack to track open span wrapper IDs for more accurate parent-span resolution, and by making OTLP BatchSpanProcessor (BSP) parameters configurable via properties. It also ensures the new tracing context state is safely cloned when message contexts are duplicated.

Changes:

  • Add ParentSpanWrapperStackManager and a new message-context property (synapse.parent.stack) to track open span wrapper IDs as a stack.
  • Update span lifecycle and parent resolution to use the stack for determining the current parent span.
  • Make BSP tuning parameters configurable (queue size, batch size, schedule delay, export timeout) with integer parsing/validation helpers.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| modules/core/src/main/java/org/apache/synapse/util/MessageHelper.java | Clones the parent-span stack when cloning message contexts to prevent cross-branch contamination. |
| modules/core/src/main/java/org/apache/synapse/SynapseConstants.java | Adds the synapse.parent.stack context property constant. |
| modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/stores/SpanStore.java | Pushes/pops span IDs to/from the per-message parent stack on span start/finish. |
| modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/TelemetryConstants.java | Introduces new BSP configuration keys and defaults. |
| modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/ParentSpanWrapperStackManager.java | New utility to manage the parent span wrapper ID stack in message context. |
| modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/parentresolving/ParentResolver.java | Routes latest-active parent resolution through the updated resolver signature. |
| modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/parentresolving/LatestActiveParentResolver.java | Uses the stack top (if present) as the preferred parent, with fallback to latest active span. |
| modules/core/src/main/java/org/apache/synapse/aspects/flow/statistics/tracing/opentelemetry/management/OTLPTelemetryManager.java | Configures BSP from properties and adds a helper for validated integer property parsing. |

Copilot AI left a comment

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

2 participants