fix: annotation queue related issues #4333

Open
ashrafchowdury wants to merge 6 commits into main from fix/annotation-queue-related-issues

Conversation

@ashrafchowdury
Contributor

@ashrafchowdury ashrafchowdury commented May 14, 2026

Fixed:

  • Annotation evaluator name slug render issue
  • Testset save back logic
  • String feedback input size

@vercel

vercel Bot commented May 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| agenta-documentation | Ready | Preview, Comment | May 18, 2026 10:29am |

@coderabbitai

coderabbitai Bot commented May 14, 2026

📝 Walkthrough

This PR enhances evaluator resolution and scenario export logic across the annotation system. It introduces evaluatorRevisionId as a primary evaluator reference, derives fallback slugs from evaluation step keys, updates all display and lookup paths to prefer revision IDs, and refines scenario export logic to handle trace queues and testcase queues differently based on completion state.

Changes

Evaluator Resolution and Export Handling

Layer / File(s) Summary
Type Schema and Reference Normalization Helpers
web/packages/agenta-annotation/src/state/types.ts, web/packages/agenta-entities/src/evaluationRun/state/molecule.ts
AnnotationColumnDef adds evaluatorRevisionId field. New normalization helpers (normalizeStringRef, extractRef, removeOutputSuffix, getLastSegment) and getAnnotationEvaluatorSlug function support multi-source slug derivation from evaluator references, step keys, and mapping columns. Slug precedence in normalizeResolvedEvaluator now prefers ref.slug over evaluator.slug.
Annotation Column Definition Mapping and Step-based Slug Resolution
web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts, web/packages/agenta-entities/src/evaluationRun/state/molecule.ts
Introduces deriveEvaluatorSlugFromStepKey to extract slugs from step key segments, and updates evaluator metadata in step references and testcase sync to use slug fallback derivation. The annotationColumnDefsAtomFamily atom populates evaluatorId, evaluatorRevisionId, and applies the multi-source slug derivation logic instead of direct reference access.
Evaluator Display Title and Export Column Label Resolution
web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts, web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
getAnnotationDisplayTitle looks up evaluator names using evaluatorRevisionId with evaluatorId fallback. resolveExportColumnLabel for annotation columns keys off evaluatorRevisionId ?? evaluatorId and uses a fallback order of resolved name, annotation column slug, looked-up slug, then step key.
Scenario List View Column and Group Headers
web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
AnnotationColumnHeader and AnnotationGroupHeader resolve evaluator info using evaluatorRevisionId ?? evaluatorId and compute displaySlug from annotation column definitions for consistent tooltip and title formatting.
Evaluator Names Cell Component Refactoring
web/packages/agenta-annotation-ui/src/components/AnnotationQueuesView/cells/EvaluatorNamesCell.tsx
EvaluatorNamesCell extracts deduplicated evaluator identifiers from annotationColumnDefs instead of flat evaluator IDs. The component now passes enriched evaluatorId, evaluatorRevisionId, and fallbackSlug to tag and span components, which resolve names via evaluatorRevisionId (fallback: evaluatorId) and use fallbackSlug in the display fallback chain.
Scenario Export and Completion-Based Filtering
web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts, web/packages/agenta-annotation/src/state/testsetSync.ts
resolveScenarioIdsForAddToTestset filters completed scenarios for testcase queues with scope === "all" | "complete". New resolveCompletedScenarioIdsForAnnotationExport computes completed scenario IDs for trace export. Trace export sets requireAnnotationOutputScenarioIds from completed scenarios. canAddToTestsetAtom gating requires completion for testcases, allows any scenarios for traces. buildTestcaseExportRows skips scenarios with no annotation output entries.
Minor UI Enhancements
web/packages/agenta-annotation-ui/src/components/AnnotationSession/AnnotationFormField.tsx, web/packages/agenta-annotation-ui/src/components/AnnotationSession/index.tsx
StringField renders Input.TextArea with autoSize configuration for multi-line entry. handleAddToTestsetSubmit normalizes commitMessage to empty string when params.message is null/undefined.
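
The label fallback order summarized above for resolveExportColumnLabel (resolved name, then annotation column slug, then looked-up slug, then step key) can be pictured with a small sketch. This is not the PR's code: the `LabelSources` shape and the `pickExportLabel` helper are hypothetical names chosen for illustration, and treating blank strings as missing is an assumption.

```typescript
// Hypothetical input shape; the real function in the PR reads these values
// from atoms and column definitions rather than from a plain object.
interface LabelSources {
    resolvedName?: string | null
    columnSlug?: string | null
    lookedUpSlug?: string | null
    stepKey?: string | null
}

// Returns the first usable candidate in the documented precedence order.
// Assumption: empty or whitespace-only strings count as "missing".
function pickExportLabel(sources: LabelSources): string | null {
    const candidates = [
        sources.resolvedName,
        sources.columnSlug,
        sources.lookedUpSlug,
        sources.stepKey,
    ]
    for (const candidate of candidates) {
        const trimmed = candidate?.trim()
        if (trimmed) return trimmed
    }
    return null
}

// The evaluator name has not resolved yet, so the column slug is used.
const label = pickExportLabel({
    resolvedName: "",
    columnSlug: "helpfulness",
    stepKey: "evaluator-step.outputs",
})
console.log(label) // "helpfulness"
```

A plain `a ?? b ?? c` chain would skip only null and undefined, so an empty resolved name would win over a usable slug; the explicit loop avoids that under the assumption stated above.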

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

  • Agenta-AI/agenta#4304: Includes related annotation-session UI and navigation-safe behavior changes in the same component paths.
🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 60.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The pull request has no description provided by the author, making it impossible to verify that the description relates to the changeset. | Add a detailed pull request description explaining the annotation queue related issues being fixed and how the changes address them. |
| Title check | ❓ Inconclusive | The title 'fix: annotation queue related issues' is generic and doesn't clearly describe the main changes, which involve evaluator name resolution, multi-line input support, and export logic updates. | Consider using a more specific title that highlights the primary change, such as 'fix: evaluator name resolution and related annotation queue issues' or focusing on the most impactful change. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@ashrafchowdury ashrafchowdury marked this pull request as ready for review May 18, 2026 10:28
@dosubot dosubot Bot added the size:L (This PR changes 100-499 lines, ignoring generated files), Bug Report (Something isn't working), and Frontend labels May 18, 2026
@ashrafchowdury ashrafchowdury requested a review from ardaerzin May 18, 2026 10:30

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 123f19ed-9e06-4f76-b009-1dda1f76036b

📥 Commits

Reviewing files that changed from the base of the PR and between 75dc5e2 and 7598cd3.

📒 Files selected for processing (9)
  • web/packages/agenta-annotation-ui/src/components/AnnotationQueuesView/cells/EvaluatorNamesCell.tsx
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/AnnotationFormField.tsx
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/index.tsx
  • web/packages/agenta-annotation/src/state/controllers/annotationFormController.ts
  • web/packages/agenta-annotation/src/state/controllers/annotationSessionController.ts
  • web/packages/agenta-annotation/src/state/testsetSync.ts
  • web/packages/agenta-annotation/src/state/types.ts
  • web/packages/agenta-entities/src/evaluationRun/state/molecule.ts

Comment on lines +18 to +20
function getEvaluatorEntryKey(entry: EvaluatorEntry) {
    return entry.evaluatorId ?? entry.evaluatorRevisionId ?? entry.evaluatorSlug ?? "unknown"
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Dedup and React keys should prioritize evaluatorRevisionId (not evaluatorId).

Current key precedence merges multiple revisions under one evaluator ID, so distinct evaluator revisions can disappear from the cell/tooltip list.

Suggested fix
 function getEvaluatorEntryKey(entry: EvaluatorEntry) {
-    return entry.evaluatorId ?? entry.evaluatorRevisionId ?? entry.evaluatorSlug ?? "unknown"
+    if (entry.evaluatorRevisionId) return `rev:${entry.evaluatorRevisionId}`
+    if (entry.evaluatorId) return `id:${entry.evaluatorId}`
+    if (entry.evaluatorSlug) return `slug:${entry.evaluatorSlug}`
+    return "unknown"
 }
@@
-        const key = col.evaluatorId ?? col.evaluatorRevisionId ?? col.evaluatorSlug
+        const key =
+            (col.evaluatorRevisionId && `rev:${col.evaluatorRevisionId}`) ||
+            (col.evaluatorId && `id:${col.evaluatorId}`) ||
+            (col.evaluatorSlug && `slug:${col.evaluatorSlug}`) ||
+            null

Also applies to: 45-46, 73-73, 95-95
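
To see why the key precedence matters, here is a minimal, self-contained sketch of deduplication with the revision-first key from the suggestion. The `EvaluatorEntry` shape and the `dedupeEvaluatorEntries` helper are assumptions made for illustration; the component in the PR builds its list from annotationColumnDefs and may differ in detail.

```typescript
// Assumed shape, reduced to the three fields the key function reads.
interface EvaluatorEntry {
    evaluatorId?: string | null
    evaluatorRevisionId?: string | null
    evaluatorSlug?: string | null
}

// Revision-first key, prefixed so that an ID and a slug with the same
// text can never collide.
function getEvaluatorEntryKey(entry: EvaluatorEntry): string {
    if (entry.evaluatorRevisionId) return `rev:${entry.evaluatorRevisionId}`
    if (entry.evaluatorId) return `id:${entry.evaluatorId}`
    if (entry.evaluatorSlug) return `slug:${entry.evaluatorSlug}`
    return "unknown"
}

// Hypothetical helper: keeps the first entry seen for each key.
function dedupeEvaluatorEntries(entries: EvaluatorEntry[]): EvaluatorEntry[] {
    const seen = new Map<string, EvaluatorEntry>()
    for (const entry of entries) {
        const key = getEvaluatorEntryKey(entry)
        if (!seen.has(key)) seen.set(key, entry)
    }
    return Array.from(seen.values())
}

// Two revisions of the same evaluator plus one exact repeat.
const entries: EvaluatorEntry[] = [
    {evaluatorId: "ev-1", evaluatorRevisionId: "rev-a"},
    {evaluatorId: "ev-1", evaluatorRevisionId: "rev-b"},
    {evaluatorId: "ev-1", evaluatorRevisionId: "rev-a"},
]
const deduped = dedupeEvaluatorEntries(entries)
console.log(deduped.length) // 2
```

With the ID-first precedence, all three entries would share the key `ev-1` and only one would survive, which is the disappearing-revision behavior the comment describes.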

Comment on lines +280 to +285
function stripOutputSuffix(value: string | null): string | null {
    if (!value) return null
    const parts = value.split(".").filter(Boolean)
    if (parts.length < 2) return value
    return parts.slice(0, -1).join(".") || value
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

stripOutputSuffix is over-stripping non-output names.

This removes the last segment for any dotted string, so valid values like quality.score become quality. That can mis-derive evaluatorSlug and break downstream matching.

Suggested fix
 function stripOutputSuffix(value: string | null): string | null {
     if (!value) return null
     const parts = value.split(".").filter(Boolean)
     if (parts.length < 2) return value
-    return parts.slice(0, -1).join(".") || value
+    const last = parts.at(-1)?.toLowerCase()
+    if (last !== "output" && last !== "outputs") return value
+    return parts.slice(0, -1).join(".") || value
 }
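
The difference between the two versions is easiest to see side by side. The sketch below copies both bodies into standalone functions; the names `stripLastSegment` and `stripOutputSuffixGuarded` are hypothetical, and the guarded version uses index access instead of `parts.at(-1)` only so that it compiles under older TypeScript targets.

```typescript
// Behavior as currently written in the PR: always drops the last dotted segment.
function stripLastSegment(value: string | null): string | null {
    if (!value) return null
    const parts = value.split(".").filter(Boolean)
    if (parts.length < 2) return value
    return parts.slice(0, -1).join(".") || value
}

// Behavior after the suggested fix: drops the last segment only when it
// is "output" or "outputs" (case-insensitive).
function stripOutputSuffixGuarded(value: string | null): string | null {
    if (!value) return null
    const parts = value.split(".").filter(Boolean)
    if (parts.length < 2) return value
    const last = parts[parts.length - 1].toLowerCase()
    if (last !== "output" && last !== "outputs") return value
    return parts.slice(0, -1).join(".") || value
}

console.log(stripLastSegment("quality.score")) // "quality"
console.log(stripOutputSuffixGuarded("quality.score")) // "quality.score"
console.log(stripOutputSuffixGuarded("my-eval.outputs")) // "my-eval"
```

Whether "output" and "outputs" are the only suffixes that step keys can carry is not confirmed by this thread; the guard list comes from the reviewer's suggestion and would need checking against the real key formats.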

@github-actions
Contributor

Railway Preview Environment

Preview URL https://gateway-production-0013.up.railway.app/w
Project agenta-oss-pr-4333
Image tag pr-4333-9b0d204
Status Deployed
Railway logs Open logs
Workflow logs View workflow run
Updated at 2026-05-18T10:39:07.076Z
