Add per-step reward field to Action and Observation schemas #183
Open
henryjcee wants to merge 1 commit into neulab:main
Conversation
Adds an optional `reward: float | None` field to the base `Action` and `Observation` classes, enabling RL training data to carry per-step reward signals. All six concrete action/observation types inherit the field. Existing datasets are unaffected, as the field defaults to `None`.
Summary
- Adds a `reward: float | None` field to the base `Action` class (`schema/action/action.py`), inherited by `ApiAction`, `CodeAction`, and `MessageAction`
- Adds a `reward: float | None` field to the base `Observation` class (`schema/observation/observation.py`), inherited by `TextObservation`, `WebObservation`, and `ImageObservation`
- Updates `schema/SCHEMA.md` to document the new field on both base classes

Motivation
ADP currently has no mechanism to attach reward signals to individual trajectory steps. This makes it difficult to use ADP-formatted data for reinforcement learning, where per-step rewards are a core primitive.
This change adds `reward` as a first-class optional field on every action and observation, allowing datasets to record the reward received at each step of a trajectory. Some RL settings deliver reward alongside an observation, while others assign it at action time; this change supports both approaches.

Design notes
- `reward` defaults to `None`, so the change is fully backwards-compatible; all existing `sample_std.json` files validate without modification
- `reward` is a plain `float` scalar (not a distribution or vector), keeping the schema simple and composable

Tests
`pytest tests/test_standardized_schemas.py`: all 33 datasets pass.

I don't think this change requires new tests, but I'm happy to add some if useful.