Add monitor sampling-rate UI, configurable trace fetch page size, and monitor comparison view by nadheesh · Pull Request #1127 · wso2/agent-manager

nadheesh · 2026-06-22T05:21:51Z

Summary

Three related improvements to eval monitors:

Trace fetch pagination — page size is now configurable and memory-bound (#669)

TraceFetcher previously hardcoded a 1000-trace page size, so up to 1000 fully-parsed traces (with nested spans/payloads) could be held in memory per page. The page size is now a constructor arg (default 10, overridable per call), so peak memory stays bounded regardless of how many traces match a monitor's time window. evaluation-job sets it via a TRACE_FETCH_PAGE_SIZE constant. Also fixed the monitor-start log to print the sampling rate to two decimal places, so values like 0.25 no longer display as 0.2.

Sampling rate UI for monitors (#1126)

The samplingRate field existed in form state and was submitted, but had no visible control — every monitor was silently created with a hidden 25% default. Added a slider (1–100%) in the Data Collection section of the monitor create form, changed the default to 100% (full sampling), and tightened validation to reject 0%.

Compare results across monitors (#1101)

Added a side-by-side monitor comparison view (new compare/:monitorId route), with supporting changes to the monitor view, agent performance card, and evaluation summary card.

Testing

amp-evaluation: ruff check / ruff format --check / mypy src — clean; unit tests pass.
evaluation-job: ruff check / ruff format --check / mypy main.py — clean; unit tests pass.
console: eslint and tsc build pass for the eval, core-ui, and types packages.

Closes #669
Closes #1126
Closes #1101

Summary by CodeRabbit

New Features
- Added “Compare Monitors” page for side-by-side evaluation monitor comparison with radar charts and summary cards.
- Added a Compare action on monitor detail pages to launch the comparison flow.
Bug Fixes
- Default sampling rate for monitor creation/duplication is now 100%.
- Sampling rate validation tightened to accept only values from 1% to 100%.
Improvements
- Enhanced trace fetching with paginated iteration and deterministic trace sampling.
- Improved evaluation score calculations and radar tooltip/customization options.
Tests
- Expanded coverage for sampling, pagination, and sampling-rate forwarding behavior.

coderabbitai · 2026-06-22T05:22:05Z

Warning

Review limit reached

@nadheesh, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 46 minutes and 36 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26091af6-5218-45cf-8c27-c96f6ff7f486

📥 Commits

Reviewing files that changed from the base of the PR and between 59452e6 and 5e70301.

📒 Files selected for processing (4)

console/workspaces/core-ui/src/Route/Route.tsx
console/workspaces/core-ui/src/pages/index.tsx
console/workspaces/libs/types/src/routes/generated-route.map.ts
console/workspaces/libs/types/src/routes/routes.map.ts

📝 Walkthrough

Walkthrough

This PR implements three features: a monitor comparison page that renders side-by-side radar charts and evaluation summaries for two monitors; a sampling-rate slider (defaulting to 100%) for monitor creation with backend Monitor.run plumbing; and iterator-based paginated trace fetching with deterministic SHA-256 trace sampling in the evaluation backend.

Changes

Compare Monitors UI

Layer / File(s)	Summary
Shared score utilities and card extensions `console/workspaces/pages/eval/src/utils/monitorScoreUtils.ts`, `console/workspaces/pages/eval/src/subComponents/AgentPerformanceCard.tsx`, `console/workspaces/pages/eval/src/subComponents/EvaluationSummaryCard.tsx`	Extracts `getMean`, `computeLevelSummaries`, `computeAverageScore` into a new utility module; adds `connectNulls`, index signature, `RadarTooltipContent`, optional `title`/`renderTooltipContent` to `AgentPerformanceCard`; adds optional `title` to `EvaluationSummaryCard`.
Compare button in ViewMonitor `console/workspaces/pages/eval/src/ViewMonitor.Component.tsx`	Imports `useListMonitors` and shared score utilities; adds `compareAnchorEl` state and a "Compare" button with dropdown that navigates to the compare route with `?with=` param; refactors inline `useMemo` score computations to use shared utilities.
CompareMonitor page component `console/workspaces/pages/eval/src/CompareMonitor.Component.tsx`	New page component reading `monitorId`/`with`/`sourceTimeRange`/`targetTimeRange` from URL; performs dual monitor metadata and score fetching; computes union radar dataset and series config; renders `AgentPerformanceCard` with custom tooltip and two `EvaluationSummaryCard` blocks; shows error `Alert` when no target is selected.
Route and page registry wiring `console/workspaces/libs/types/src/routes/routes.map.ts`, `console/workspaces/libs/types/src/routes/generated-route.map.ts`, `console/workspaces/pages/eval/src/index.ts`, `console/workspaces/core-ui/src/pages/index.tsx`, `console/workspaces/core-ui/src/Route/Route.tsx`	Registers `compare/:monitorId` in route maps, exports `CompareMonitorComponent` and `compareMonitor` metadata from the eval page index, creates `LazyCompareMonitorComponent` in the core-ui registry, and adds the `<Route>` under `monitorBase`.

Sampling Rate: UI and Backend

Layer / File(s)	Summary
Sampling rate slider, schema, and default changes `console/workspaces/pages/eval/src/form/schema.ts`, `console/workspaces/pages/eval/src/CreateMonitor.Component.tsx`, `console/workspaces/pages/eval/src/subComponents/CreateMonitorForm.tsx`	Tightens `samplingRate` Zod refinement to `> 0`; changes duplicate and new-monitor defaults from 25 to 100 and payload conversion fallback from 0 to 100; adds a `Slider` (1–100%) with marks and validation caption to the form.
Backend sample_rate wiring `libs/amp-evaluation/src/amp_evaluation/runner.py`, `evaluation-job/main.py`, `evaluation-job/test_main.py`	Adds `sample_rate: Optional[float]` to `Monitor.run` with lazy fetch/sample/parse generator; adds `TRACE_FETCH_PAGE_SIZE` constant, `--sampling-rate` CLI validation (`(0,1]`), and `sample_rate=args.sampling_rate` forwarding in `main.py`; extends integration tests to assert forwarding and invalid-rate rejection.
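
One way to express the `(0, 1]` contract for `--sampling-rate` is an argparse `type=` callable, sketched below. This is an assumption about how such validation could look, not the code in `evaluation-job/main.py`; the `sampling_rate` helper, `build_parser`, and the `default=1.0` value are hypothetical.

```python
import argparse


def sampling_rate(value: str) -> float:
    """argparse `type=` callable that accepts only rates in (0, 1]."""
    try:
        rate = float(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"not a number: {value!r}")
    # The chained comparison also rejects NaN, since NaN comparisons are False.
    if not (0.0 < rate <= 1.0):
        raise argparse.ArgumentTypeError(
            f"sampling rate must be in (0, 1], got {value!r}"
        )
    return rate


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="evaluation-job")
    parser.add_argument(
        "--sampling-rate",
        type=sampling_rate,
        default=1.0,  # assumed default for this sketch only
        help="Fraction of traces to evaluate, in (0, 1]; zero is rejected.",
    )
    return parser


args = build_parser().parse_args(["--sampling-rate", "0.25"])
```

Keeping the interval notation in the help text identical to the check avoids advertising a range that the validation then rejects.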

Trace Fetch Pagination and Deterministic Sampling

Layer / File(s)	Summary
TraceFetcher iterator paging and `sample_traces` `libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py`, `libs/amp-evaluation/src/amp_evaluation/trace/__init__.py`	Adds `page_size` to `TraceFetcher.__init__`, introduces `_fetch_page` helper, replaces eager `fetch_traces` with an iterator-based paging loop with deduplication and stop conditions; adds `sample_traces` with SHA-256 deterministic sampling; exports `sample_traces` from `trace/__init__.py`.
Runner lazy iterable consumption `libs/amp-evaluation/src/amp_evaluation/runner.py`	Changes `_fetch_traces` return to `Iterable[OTELTrace]`, updates `_evaluate_traces` to accept `Iterable[Trace]` with iterator-driven `while` loop, materializes traces with `list()` in `Experiment._fetch_and_match_traces`.
Pagination and sampling tests `libs/amp-evaluation/tests/test_trace_fetcher.py`, `libs/amp-evaluation/tests/test_eval_runner.py`	Adds `TestFetchTracesPagination` covering multi-page, cursor, dedup, `max_traces`, and page-size rules; adds `TestSampleTraces` for determinism, retention rate, and invalid rates; adds `TestMonitorFetchAndSamplePipeline` with `_FakeFetcher` asserting fetch, sample pipeline, and `traces=` bypass.
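
Deterministic hash-based sampling, as summarized in the table above, can be sketched as follows. This is an illustration under stated assumptions rather than the PR's `sample_traces`: the names `keep_trace` and `sample_by_hash`, the use of the first 8 digest bytes, and the dict-shaped traces are all hypothetical.

```python
import hashlib
from typing import Dict, Iterable, Iterator

Trace = Dict[str, str]


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Decide from the trace ID alone, so the same ID always gets the same answer."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    # Integer comparison: at sample_rate == 1.0 the threshold is 2**64,
    # so every bucket passes.
    return bucket < int(sample_rate * 2**64)


def sample_by_hash(traces: Iterable[Trace], sample_rate: float) -> Iterator[Trace]:
    """Lazily filter traces; the rate is validated eagerly, before iteration."""
    if not (0.0 < sample_rate <= 1.0):
        raise ValueError(f"sample_rate must be in (0, 1], got {sample_rate}")
    return (t for t in traces if keep_trace(t["traceId"], sample_rate))


all_traces = [{"traceId": f"trace-{i}"} for i in range(10_000)]
half = [t["traceId"] for t in sample_by_hash(all_traces, 0.5)]
quarter = [t["traceId"] for t in sample_by_hash(all_traces, 0.25)]
```

Since the decision depends only on the trace ID, re-running a monitor over the same window selects the same traces, and lowering the rate keeps a subset of what a higher rate would keep.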

Sequence Diagram(s)

sequenceDiagram
  rect rgba(70, 130, 180, 0.5)
    note over EvaluationJob,TraceFetcher: Backend Evaluation Pipeline
    EvaluationJob->>EvaluationJob: validate sampling_rate in (0, 1]
    EvaluationJob->>TraceFetcher: TraceFetcher(page_size=TRACE_FETCH_PAGE_SIZE)
    EvaluationJob->>Monitor: run(start_time, end_time, sample_rate)
    Monitor->>TraceFetcher: fetch_traces(start_time, end_time) → Iterator
    TraceFetcher->>TraceObserverAPI: POST /traces/export page 1
    TraceObserverAPI-->>TraceFetcher: traces + totalCount
    TraceFetcher->>TraceObserverAPI: POST /traces/export page 2 (cursor advanced)
    TraceObserverAPI-->>TraceFetcher: traces + totalCount
    TraceFetcher-->>Monitor: lazy OTELTrace iterator
    Monitor->>Monitor: sample_traces(iterator, sample_rate) → filtered iterator
    Monitor->>Monitor: parse + yield Trace objects
    Monitor->>BaseRunner: _evaluate_traces(Iterable[Trace])
    BaseRunner-->>EvaluationJob: RunResult
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~90 minutes

Possibly related PRs

wso2/agent-manager#1102: Adds the duplicateFrom monitor creation mode and pre-fills samplingRate, directly overlapping with this PR's CreateMonitor.Component.tsx sampling-rate default and payload conversion changes.

Poem

🐇 A radar spins two monitors side by side,
Pages of traces now lazily glide,
SHA-256 flips a coin for each trace,
Sliders set sampling at a confident pace,
Compare the webs — let insights reside! 🕸️✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 56.45% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the three main changes: sampling-rate UI, configurable trace fetch page size, and monitor comparison view.
Description check	✅ Passed	The PR description clearly outlines the three enhancements, testing performed, and references to closed issues, aligning well with the template structure despite not using formal sections.
Linked Issues check	✅ Passed	All code changes directly address the three linked issues: pagination support for trace fetching [`#669`], sampling rate UI configuration [`#1126`], and monitor comparison view [`#1101`].
Out of Scope Changes check	✅ Passed	All changes are directly related to the three linked issues with no out-of-scope modifications detected across the codebase.


coderabbitai

Actionable comments posted: 7

🧹 Nitpick comments (1)

console/workspaces/pages/eval/src/CompareMonitor.Component.tsx (1)
287-291: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Use theme tokens for series colors instead of hardcoded hex values.

Line 287 and Line 291 use hardcoded colors, which can drift from theme variants (including high-contrast modes). Use palette tokens for both series colors.

As per coding guidelines, console/**/*.{ts,tsx,js,jsx} should "Use theme tokens via the sx prop instead of hardcoded colors and spacing values".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/workspaces/pages/eval/src/CompareMonitor.Component.tsx` around lines
287 - 291, The sourceColor and targetColor variables contain hardcoded hex color
values (`#3f8cff` and `#f59e0b` respectively) that can drift from theme variants
including high-contrast modes. Replace these hardcoded colors with appropriate
theme palette tokens from the palette object. For sourceColor, it already uses a
fallback to palette?.primary.main, so ensure it consistently uses theme tokens.
For targetColor, instead of the hardcoded `#f59e0b` string, access and use an
appropriate palette token such as palette?.warning.main or another suitable
theme color that maintains high contrast and aligns with the design system's
color tokens.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@console/workspaces/pages/eval/src/CompareMonitor.Component.tsx`:
- Around line 308-312: Remove the nullish coalescing operator that defaults to 0
in the sourceValue and targetValue assignments. Instead of using (meanA ?? 0) *
100 and (meanB ?? 0) * 100, check if meanA and meanB are not null before
multiplying by 100, otherwise preserve null to represent "no data" in the radar
chart. This prevents "all skipped" data from being plotted as a score of 0,
which distorts the comparison. Apply the same fix to the similar code at lines
323-324.
- Around line 150-160: The sourceTimeRange and targetTimeRange useMemo hooks are
casting arbitrary string values from searchParams directly to TraceListTimeRange
enum type without runtime validation. Add a validation function that checks if a
value is a valid TraceListTimeRange enum member before casting it. Apply this
validation to both the sourceTimeRange useMemo (which reads
searchParams.get("sourceTimeRange")) and the targetTimeRange useMemo (which
reads searchParams.get("targetTimeRange")), using it to verify the retrieved
values are legitimate enum values before assignment, and falling back to
TraceListTimeRange.SEVEN_DAYS if validation fails.

In `@console/workspaces/pages/eval/src/CreateMonitor.Component.tsx`:
- Around line 83-85: The sampling rate clamping calculation uses Math.max(0,
...) as the lower bound, but the form schema now requires a minimum sampling
rate of 1 instead of 0. When duplicating older monitors with a sampling rate of
0, this will result in an invalid prefilled value. Change the lower bound in
Math.max from 0 to 1 in the sourceMonitor.samplingRate calculation to align with
the new valid range of 1-100.

In `@evaluation-job/main.py`:
- Around line 563-569: The validation logic for the sampling_rate argument in
the code block starting at line 563 enforces that the value must be in the range
(0, 1] (exclusive of zero, inclusive of one). Find the argparse argument
definition for --sampling-rate and update its help text to accurately reflect
this constraint instead of documenting it as (0.0-1.0), which incorrectly
implies that zero is a valid value. Ensure the help text matches the actual
validation contract being enforced.

In `@libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py`:
- Around line 496-497: Add validation for the page_size parameter immediately
after it is set from either the input parameter or the default self.page_size
value. Check if page_size is less than or equal to zero and raise an appropriate
exception (such as ValueError) with a clear error message indicating that
page_size must be a positive integer. This validation should occur before the
seen_ids set is initialized to fail fast at the API boundary rather than relying
on backend error handling.
- Around line 502-510: The code currently silently returns when encountering a
page containing only previously seen traceId values, which can result in
undercounting traces when the API reports a higher total count. Instead of
silently returning in the condition checking if not new_traces, detect this
pagination stall scenario by comparing whether _total_count indicates more data
exists beyond what has been processed, and surface this condition (through
logging, raising an exception, or other error handling) rather than silently
truncating the iteration. This ensures pagination stalls are visible and don't
silently cause incomplete trace evaluation.
- Around line 499-516: The max_traces check currently happens after the trace is
yielded in the for loop iterating over new_traces, which means when
max_traces=0, one trace is still yielded before the condition triggers and
returns. Move the max_traces check to before the yield statement so that if
max_traces is set to 0 or would be exceeded, the function returns without
yielding any additional traces. Check if max_traces is not None and yielded >=
max_traces before the yield trace line to ensure the limit is respected from the
first trace.
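
The three fetcher.py findings above (validate `page_size`, surface pagination stalls, check `max_traces` before yielding) can be illustrated with the sketch below. It is not the code in `fetcher.py`: `iter_unique_traces`, `PaginationStalled`, and the pre-fetched `pages` input are hypothetical, and raising on a stall is only one of the options the review mentions (logging is another).

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set

Trace = Dict[str, str]


class PaginationStalled(RuntimeError):
    """A page added no new traces although the reported total was not reached."""


def iter_unique_traces(
    pages: Iterable[List[Trace]],
    total_count: int,
    page_size: int,
    max_traces: Optional[int] = None,
) -> Iterator[Trace]:
    # Validate in a plain function so the error is raised at call time,
    # not on the first next() of the generator.
    if page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    return _iter_unique(pages, total_count, max_traces)


def _iter_unique(
    pages: Iterable[List[Trace]], total_count: int, max_traces: Optional[int]
) -> Iterator[Trace]:
    seen_ids: Set[str] = set()
    yielded = 0
    for page in pages:
        progressed = False
        for trace in page:
            if trace["traceId"] in seen_ids:
                continue  # duplicate across (or within) pages
            # Check the limit BEFORE yielding so max_traces=0 yields nothing.
            if max_traces is not None and yielded >= max_traces:
                return
            seen_ids.add(trace["traceId"])
            progressed = True
            yielded += 1
            yield trace
        if not progressed:
            if len(seen_ids) < total_count:
                raise PaginationStalled(
                    f"stalled after {len(seen_ids)} of {total_count} traces"
                )
            return


pages = [
    [{"traceId": "a"}, {"traceId": "b"}],
    [{"traceId": "b"}, {"traceId": "c"}],
]
ids = [t["traceId"] for t in iter_unique_traces(pages, total_count=3, page_size=2)]
```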

---

Nitpick comments:
In `@console/workspaces/pages/eval/src/CompareMonitor.Component.tsx`:
- Around line 287-291: The sourceColor and targetColor variables contain
hardcoded hex color values (`#3f8cff` and `#f59e0b` respectively) that can drift
from theme variants including high-contrast modes. Replace these hardcoded
colors with appropriate theme palette tokens from the palette object. For
sourceColor, it already uses a fallback to palette?.primary.main, so ensure it
consistently uses theme tokens. For targetColor, instead of the hardcoded
`#f59e0b` string, access and use an appropriate palette token such as
palette?.warning.main or another suitable theme color that maintains high
contrast and aligns with the design system's color tokens.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a6d2212c-c48a-4dde-b70e-5488dfaf8475

📥 Commits

Reviewing files that changed from the base of the PR and between 37cc24c and 4b10b27.

📒 Files selected for processing (20)

console/workspaces/core-ui/src/Route/Route.tsx
console/workspaces/core-ui/src/pages/index.tsx
console/workspaces/libs/types/src/routes/generated-route.map.ts
console/workspaces/libs/types/src/routes/routes.map.ts
console/workspaces/pages/eval/src/CompareMonitor.Component.tsx
console/workspaces/pages/eval/src/CreateMonitor.Component.tsx
console/workspaces/pages/eval/src/ViewMonitor.Component.tsx
console/workspaces/pages/eval/src/form/schema.ts
console/workspaces/pages/eval/src/index.ts
console/workspaces/pages/eval/src/subComponents/AgentPerformanceCard.tsx
console/workspaces/pages/eval/src/subComponents/CreateMonitorForm.tsx
console/workspaces/pages/eval/src/subComponents/EvaluationSummaryCard.tsx
console/workspaces/pages/eval/src/utils/monitorScoreUtils.ts
evaluation-job/main.py
evaluation-job/test_main.py
libs/amp-evaluation/src/amp_evaluation/runner.py
libs/amp-evaluation/src/amp_evaluation/trace/__init__.py
libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py
libs/amp-evaluation/tests/test_eval_runner.py
libs/amp-evaluation/tests/test_trace_fetcher.py

nadheesh added 5 commits June 18, 2026 14:40

Paginate TraceFetcher.fetch_traces and add deterministic trace sampling

3422454

Merge branch 'wso2:main' into main

9f9e014

Make TraceFetcher page size configurable and memory-bound

2feed7a

Add sampling rate slider to monitor create form

a0649d9

Add compare view for monitors

4b10b27

coderabbitai Bot reviewed Jun 22, 2026


Fix trace-fetch tie truncation and address PR review feedback

59452e6

AnoshanJ approved these changes Jun 22, 2026


Merge branch 'wso2:main' into eval-monitor-sampling-pagination-compare

5e70301

nadheesh wants to merge 7 commits into wso2:main from nadheesh:eval-monitor-sampling-pagination-compare
