fix: predict all CoQA turn answers instead of only the last turn by rahulraj-jhawar-devrev · Pull Request #3704 · EleutherAI/lm-evaluation-harness

rahulraj-jhawar-devrev · 2026-04-14T15:50:36Z

Problem

The current CoQA implementation predicts only the answer to the last turn of each conversation. The official CoQA benchmark evaluates predictions for all turn_ids and averages the results across turns.
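To make the difference concrete, here is a minimal sketch comparing last-turn scoring with all-turn averaging. It assumes a simplified token-overlap F1; the official CoQA script also normalizes answers and compares against several reference answers, which is omitted here. The function names `token_f1`, `score_last_turn`, and `score_all_turns` are illustrative and do not come from the harness.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Simplified token-overlap F1 (lowercase and whitespace split only)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match, one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def score_last_turn(predictions, references):
    """Score only the final turn of the conversation."""
    return token_f1(predictions[-1], references[-1])


def score_all_turns(predictions, references):
    """Score every turn and average across turns."""
    scores = [token_f1(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)


# Hypothetical three-turn conversation.
predictions = ["white", "in a barn", "yes"]
references = ["white", "in the barn", "no"]

print(score_last_turn(predictions, references))
print(score_all_turns(predictions, references))
```

In this made-up example the last-turn score ignores the two earlier turns entirely, while the all-turn average reflects them.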

Changes

Modified the CoQA utils to iterate over all turns instead of just the last one
Maintained the conversation context (previous Q&A pairs) for each turn prediction
Made the output format match what the official CoQA evaluation expects
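The PR text does not show the output format itself. As a rough sketch, assuming it follows the prediction file read by the official CoQA evaluation script (a JSON list of records with `id`, `turn_id`, and `answer`), per-turn answers could be serialized as below. The helper name `to_coqa_predictions` and the story id are hypothetical.

```python
import json


def to_coqa_predictions(story_id, answers):
    """Turn an ordered list of per-turn answers into CoQA-style records.

    Assumes turn ids start at 1 and follow the order of the answers.
    """
    return [
        {"id": story_id, "turn_id": turn_id, "answer": answer}
        for turn_id, answer in enumerate(answers, start=1)
    ]


# Hypothetical story id and model answers, one per turn.
records = to_coqa_predictions("story-001", ["white", "in a barn", "yes"])
print(json.dumps(records, indent=2))
```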

Implementation Details

Added a process_docs function that expands each conversation into multiple instances (one per turn)
Each expanded instance contains the story and the conversation history up to that specific turn
The model predicts the answer for each turn with the full context of the previous Q&A pairs
Bumped the version to 4.0 to reflect the change in evaluation behavior
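The expansion described above could look roughly like the following sketch. It is not the PR's actual `process_docs`: it works on a plain dict rather than a `datasets.Dataset`, the field layout (`story`, `questions["input_text"]`, `answers["input_text"]`) is an assumption about the CoQA document schema, and the history is assumed to use gold answers from earlier turns. The names `expand_turns` and `build_prompt` are hypothetical.

```python
def expand_turns(doc):
    """Expand one conversation into one instance per turn."""
    questions = doc["questions"]["input_text"]
    answers = doc["answers"]["input_text"]
    instances = []
    for turn in range(len(questions)):
        instances.append(
            {
                "id": doc["id"],
                "turn_id": turn + 1,
                "story": doc["story"],
                # Q&A pairs strictly before the current turn (gold answers assumed).
                "history": list(zip(questions[:turn], answers[:turn])),
                "question": questions[turn],
                "answer": answers[turn],
            }
        )
    return instances


def build_prompt(instance):
    """Render the story, the prior Q&A pairs, and the current question."""
    text = instance["story"] + "\n\n"
    for question, answer in instance["history"]:
        text += f"Q: {question}\n\nA: {answer}\n\n"
    text += f"Q: {instance['question']}\n\nA:"
    return text


# Hypothetical two-turn conversation.
example_doc = {
    "id": "story-001",
    "story": "Cotton was a white kitten who lived in a barn.",
    "questions": {"input_text": ["What color was Cotton?", "Where did she live?"]},
    "answers": {"input_text": ["white", "in a barn"]},
}

for instance in expand_turns(example_doc):
    print(build_prompt(instance))
    print("---")
```

Each instance carries its own `turn_id`, so per-turn predictions can later be matched back to the original conversation.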

🤖 Generated with Claude Code


CLAassistant · 2026-04-14T15:50:42Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Rahulraj Jhawar seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

fix: predict all CoQA turn answers instead of only the last turn (fixes EleutherAI#1231)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

df338dd

rahulraj-jhawar-devrev wants to merge 1 commit into EleutherAI:main from rahulraj-jhawar-devrev:rahulraj/contributions