fix: annotation queue related issues by ashrafchowdury · Pull Request #4333 · Agenta-AI/agenta

ashrafchowdury · 2026-05-14T16:12:23Z

Fixed:

Annotation evaluator name and slug render issue
Testset save-back logic
String feedback input size

vercel · 2026-05-14T16:12:28Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	May 18, 2026 10:29am

coderabbitai · 2026-05-14T16:12:30Z

📝 Walkthrough

This PR enhances evaluator resolution and scenario export logic across the annotation system. It introduces evaluatorRevisionId as a primary evaluator reference, derives fallback slugs from evaluation step keys, updates all display and lookup paths to prefer revision IDs, and refines scenario export logic to handle trace queues and testcase queues differently based on completion state.

Changes

Evaluator Resolution and Export Handling

Layer / File(s)	Summary
Type Schema and Reference Normalization Helpers `web/packages/agenta-annotation/src/state/types.ts`, `web/packages/agenta-entities/src/evaluationRun/state/molecule.ts`	`AnnotationColumnDef` adds `evaluatorRevisionId` field. New normalization helpers (`normalizeStringRef`, `extractRef`, `removeOutputSuffix`, `getLastSegment`) and `getAnnotationEvaluatorSlug` function support multi-source slug derivation from evaluator references, step keys, and mapping columns. Slug precedence in `normalizeResolvedEvaluator` now prefers `ref.slug` over `evaluator.slug`.
Annotation Column Definition Mapping and Step-based Slug Resolution `web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts`, `web/packages/agenta-entities/src/evaluationRun/state/molecule.ts`	Introduces `deriveEvaluatorSlugFromStepKey` to extract slugs from step key segments, and updates evaluator metadata in step references and testcase sync to use slug fallback derivation. The `annotationColumnDefsAtomFamily` atom populates `evaluatorId`, `evaluatorRevisionId`, and applies the multi-source slug derivation logic instead of direct reference access.
Evaluator Display Title and Export Column Label Resolution `web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts`, `web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx`	`getAnnotationDisplayTitle` looks up evaluator names using `evaluatorRevisionId` with `evaluatorId` fallback. `resolveExportColumnLabel` for annotation columns keys off `evaluatorRevisionId ?? evaluatorId` and uses a fallback order of resolved name, annotation column slug, looked-up slug, then step key.
Scenario List View Column and Group Headers `web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx`	`AnnotationColumnHeader` and `AnnotationGroupHeader` resolve evaluator info using `evaluatorRevisionId ?? evaluatorId` and compute `displaySlug` from annotation column definitions for consistent tooltip and title formatting.
Evaluator Names Cell Component Refactoring `web/packages/agenta-annotation-ui/src/components/AnnotationQueuesView/cells/EvaluatorNamesCell.tsx`	`EvaluatorNamesCell` extracts deduplicated evaluator identifiers from `annotationColumnDefs` instead of flat evaluator IDs. The component now passes enriched `evaluatorId`, `evaluatorRevisionId`, and `fallbackSlug` to tag and span components, which resolve names via `evaluatorRevisionId` (fallback: `evaluatorId`) and use `fallbackSlug` in the display fallback chain.
Scenario Export and Completion-Based Filtering `web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts`, `web/packages/agenta-annotation/src/state/testsetSync.ts`	`resolveScenarioIdsForAddToTestset` filters completed scenarios for testcase queues with `scope === "all" | "complete"`. New `resolveCompletedScenarioIdsForAnnotationExport` computes completed scenario IDs for trace export. Trace export sets `requireAnnotationOutputScenarioIds` from completed scenarios. `canAddToTestsetAtom` gating requires completion for testcases, allows any scenarios for traces. `buildTestcaseExportRows` skips scenarios with no annotation output entries.
Minor UI Enhancements `web/packages/agenta-annotation-ui/src/components/AnnotationSession/AnnotationFormField.tsx`, `web/packages/agenta-annotation-ui/src/components/AnnotationSession/index.tsx`	`StringField` renders `Input.TextArea` with `autoSize` configuration for multi-line entry. `handleAddToTestsetSubmit` normalizes `commitMessage` to empty string when `params.message` is null/undefined.
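
The label fallback order described in the table above (resolved name, annotation column slug, looked-up slug, then step key, keyed off `evaluatorRevisionId ?? evaluatorId`) can be sketched roughly as follows. This is an illustrative sketch only, not the PR's code: the `AnnotationColumn` and `EvaluatorInfo` shapes, the `firstNonEmpty` helper, and the `resolveColumnLabel` name are assumptions made for the example.

```typescript
// Hypothetical, simplified shapes; the real types live in the PR's packages.
interface AnnotationColumn {
    stepKey: string
    evaluatorId?: string | null
    evaluatorRevisionId?: string | null
    evaluatorSlug?: string | null
}

interface EvaluatorInfo {
    name?: string | null
    slug?: string | null
}

// Returns the first candidate that is a non-blank string, or null.
function firstNonEmpty(...candidates: (string | null | undefined)[]): string | null {
    for (const candidate of candidates) {
        if (typeof candidate === "string" && candidate.trim() !== "") return candidate
    }
    return null
}

// Assumed fallback order, taken from the walkthrough text:
// resolved name -> column slug -> looked-up slug -> step key.
function resolveColumnLabel(
    column: AnnotationColumn,
    evaluators: Map<string, EvaluatorInfo>,
): string {
    // The lookup key prefers the revision ID over the evaluator ID.
    const lookupKey = column.evaluatorRevisionId ?? column.evaluatorId ?? null
    const info = lookupKey ? evaluators.get(lookupKey) : undefined
    return firstNonEmpty(info?.name, column.evaluatorSlug, info?.slug) ?? column.stepKey
}

const evaluators = new Map<string, EvaluatorInfo>([
    ["rev-1", {name: "Tone check", slug: "tone-check"}],
])

// Resolved through the revision ID, so the evaluator name is used.
const resolved = resolveColumnLabel(
    {stepKey: "step.tone", evaluatorId: "ev-1", evaluatorRevisionId: "rev-1"},
    evaluators,
)

// Nothing to look up and no slug, so the step key is the last resort.
const unresolved = resolveColumnLabel({stepKey: "step.unknown"}, evaluators)

console.log(resolved, unresolved)
```

Keeping the step key as the final fallback means a column always gets some label, even when the evaluator lookup has not loaded yet.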

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

Agenta-AI/agenta#4304: Includes related annotation-session UI and navigation-safe behavior changes in the same component paths.

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 60.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check	⚠️ Warning	The pull request has no description provided by the author, making it impossible to verify that the description relates to the changeset.	Add a detailed pull request description explaining the annotation queue related issues being fixed and how the changes address them.
Title check	❓ Inconclusive	The title 'fix: annotation queue related issues' is generic and doesn't clearly describe the main changes, which involve evaluator name resolution, multi-line input support, and export logic updates.	Consider using a more specific title that highlights the primary change, such as 'fix: evaluator name resolution and related annotation queue issues' or focusing on the most impactful change.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 3

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 123f19ed-9e06-4f76-b009-1dda1f76036b

📥 Commits

Reviewing files that changed from the base of the PR and between 75dc5e2 and 7598cd3.

📒 Files selected for processing (9)

web/packages/agenta-annotation-ui/src/components/AnnotationQueuesView/cells/EvaluatorNamesCell.tsx
web/packages/agenta-annotation-ui/src/components/AnnotationSession/AnnotationFormField.tsx
web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
web/packages/agenta-annotation-ui/src/components/AnnotationSession/index.tsx
web/packages/agenta-annotation/src/state/controllers/annotationFormController.ts
web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts
web/packages/agenta-annotation/src/state/testsetSync.ts
web/packages/agenta-annotation/src/state/types.ts
web/packages/agenta-entities/src/evaluationRun/state/molecule.ts

coderabbitai · 2026-05-18T10:32:20Z

+function getEvaluatorEntryKey(entry: EvaluatorEntry) {
+    return entry.evaluatorId ?? entry.evaluatorRevisionId ?? entry.evaluatorSlug ?? "unknown"
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Dedup and React keys should prioritize evaluatorRevisionId (not evaluatorId).

Current key precedence merges multiple revisions under one evaluator ID, so distinct evaluator revisions can disappear from the cell/tooltip list.

Suggested fix

 function getEvaluatorEntryKey(entry: EvaluatorEntry) {
-    return entry.evaluatorId ?? entry.evaluatorRevisionId ?? entry.evaluatorSlug ?? "unknown"
+    if (entry.evaluatorRevisionId) return `rev:${entry.evaluatorRevisionId}`
+    if (entry.evaluatorId) return `id:${entry.evaluatorId}`
+    if (entry.evaluatorSlug) return `slug:${entry.evaluatorSlug}`
+    return "unknown"
 }
@@
-        const key = col.evaluatorId ?? col.evaluatorRevisionId ?? col.evaluatorSlug
+        const key =
+            (col.evaluatorRevisionId && `rev:${col.evaluatorRevisionId}`) ||
+            (col.evaluatorId && `id:${col.evaluatorId}`) ||
+            (col.evaluatorSlug && `slug:${col.evaluatorSlug}`) ||
+            null

Also applies to: 45-46, 73-73, 95-95
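
To make the concern concrete, the sketch below compares the two key precedences on a pair of entries that share an evaluator ID but differ in revision. It is a minimal illustration under assumptions: the `EvaluatorEntry` shape is simplified, and the `dedupe`, `keyByEvaluatorId`, and `keyByRevisionId` names are hypothetical and do not appear in the PR.

```typescript
// Hypothetical, simplified entry shape for illustration only.
interface EvaluatorEntry {
    evaluatorId?: string | null
    evaluatorRevisionId?: string | null
    evaluatorSlug?: string | null
}

// Key precedence as reviewed: evaluator ID first.
function keyByEvaluatorId(entry: EvaluatorEntry): string {
    return entry.evaluatorId ?? entry.evaluatorRevisionId ?? entry.evaluatorSlug ?? "unknown"
}

// Key precedence as in the suggested fix: revision ID first, with a prefix
// per source so that an ID and a slug with the same text cannot collide.
function keyByRevisionId(entry: EvaluatorEntry): string {
    if (entry.evaluatorRevisionId) return `rev:${entry.evaluatorRevisionId}`
    if (entry.evaluatorId) return `id:${entry.evaluatorId}`
    if (entry.evaluatorSlug) return `slug:${entry.evaluatorSlug}`
    return "unknown"
}

// Keeps the first entry seen for each key.
function dedupe(
    entries: EvaluatorEntry[],
    keyOf: (entry: EvaluatorEntry) => string,
): EvaluatorEntry[] {
    const seen = new Map<string, EvaluatorEntry>()
    for (const entry of entries) {
        const key = keyOf(entry)
        if (!seen.has(key)) seen.set(key, entry)
    }
    return Array.from(seen.values())
}

// Two revisions of the same evaluator.
const entries: EvaluatorEntry[] = [
    {evaluatorId: "ev-1", evaluatorRevisionId: "rev-a"},
    {evaluatorId: "ev-1", evaluatorRevisionId: "rev-b"},
]

const byId = dedupe(entries, keyByEvaluatorId) // both collapse onto "ev-1"
const byRevision = dedupe(entries, keyByRevisionId) // "rev:rev-a" and "rev:rev-b"

console.log(byId.length, byRevision.length) // prints: 1 2
```

The same keys would typically double as React list keys, so the revision-first variant also avoids duplicate-key warnings when two revisions render side by side.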

coderabbitai · 2026-05-18T10:32:20Z

+function stripOutputSuffix(value: string | null): string | null {
+    if (!value) return null
+    const parts = value.split(".").filter(Boolean)
+    if (parts.length < 2) return value
+    return parts.slice(0, -1).join(".") || value
+}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

stripOutputSuffix is over-stripping non-output names.

This removes the last segment for any dotted string, so valid values like quality.score become quality. That can mis-derive evaluatorSlug and break downstream matching.

Suggested fix

 function stripOutputSuffix(value: string | null): string | null {
     if (!value) return null
     const parts = value.split(".").filter(Boolean)
     if (parts.length < 2) return value
-    return parts.slice(0, -1).join(".") || value
+    const last = parts.at(-1)?.toLowerCase()
+    if (last !== "output" && last !== "outputs") return value
+    return parts.slice(0, -1).join(".") || value
 }

📝 Committable suggestion

Suggested change

function stripOutputSuffix(value: string | null): string | null {
    if (!value) return null
    const parts = value.split(".").filter(Boolean)
    if (parts.length < 2) return value
    const last = parts.at(-1)?.toLowerCase()
    if (last !== "output" && last !== "outputs") return value
    return parts.slice(0, -1).join(".") || value
}
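
The difference between the two versions can be checked on a few sample values. The sketch below copies both function bodies from this review comment under hypothetical names (`stripAnyLastSegment` for the reviewed version, `stripOutputSuffixOnly` for the suggested one). It uses index access instead of `parts.at(-1)` only so that it compiles on older TypeScript targets; that substitution is an assumption of this example, not part of the suggestion.

```typescript
// Reviewed behavior: drops the last dotted segment unconditionally.
function stripAnyLastSegment(value: string | null): string | null {
    if (!value) return null
    const parts = value.split(".").filter(Boolean)
    if (parts.length < 2) return value
    return parts.slice(0, -1).join(".") || value
}

// Suggested behavior: drops the last segment only when it is
// "output" or "outputs" (compared case-insensitively).
function stripOutputSuffixOnly(value: string | null): string | null {
    if (!value) return null
    const parts = value.split(".").filter(Boolean)
    if (parts.length < 2) return value
    const last = parts[parts.length - 1]?.toLowerCase()
    if (last !== "output" && last !== "outputs") return value
    return parts.slice(0, -1).join(".") || value
}

const samples = ["quality.score", "quality.output", "quality.outputs", "quality"]

for (const sample of samples) {
    console.log(sample, stripAnyLastSegment(sample), stripOutputSuffixOnly(sample))
}
// quality.score   -> "quality" (reviewed) vs "quality.score" (suggested)
// quality.output  -> "quality" in both
// quality.outputs -> "quality" in both
// quality         -> "quality" in both
```

Only the first sample differs, which is the over-stripping case the comment describes.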

github-actions · 2026-05-18T10:39:07Z

Railway Preview Environment


Preview URL	https://gateway-production-0013.up.railway.app/w
Project	`agenta-oss-pr-4333`
Image tag	`pr-4333-9b0d204`
Status	Deployed
Railway logs	Open logs
Workflow logs	View workflow run
Updated at 2026-05-18T10:39:07.076Z

refactor: improve evaluator name resolution and implement row-based testset synchronization logic

0d022be

Merge branch 'main' into fix/annotation-queue-related-issues

4fb51a2

vercel Bot had a problem deploying to Preview May 14, 2026 16:14 Failure

ashrafchowdury added 2 commits May 15, 2026 13:07

revert

92202ff

fix: refine scenario resolution logic for testset export and validation, add trace handling

1db8d1e

vercel Bot had a problem deploying to Preview May 15, 2026 07:13 Failure

This was linked to issues May 15, 2026

Evaluator IDs shown instead of name and slug #4319

Open

String feedback fields in annotation queues are single-line #4320

Open

Annotated test cases are duplicated in test sets #4321

Open

Merge branch 'main' into fix/annotation-queue-related-issues

ba5c62f

vercel Bot deployed to Preview May 15, 2026 12:33 View deployment

Merge branch 'main' into fix/annotation-queue-related-issues

7598cd3

ashrafchowdury marked this pull request as ready for review May 18, 2026 10:28

dosubot Bot added the size:L (This PR changes 100-499 lines, ignoring generated files.), Bug Report (Something isn't working), and Frontend labels May 18, 2026

vercel Bot deployed to Preview May 18, 2026 10:29 View deployment

ashrafchowdury requested a review from ardaerzin May 18, 2026 10:30

coderabbitai Bot reviewed May 18, 2026

ardaerzin approved these changes May 18, 2026

ashrafchowdury wants to merge 6 commits into main from fix/annotation-queue-related-issues
