
Fix BigBench multiple-choice crash on mixed-format tasks #3702

Open
Chessing234 wants to merge 1 commit into EleutherAI:main from Chessing234:fix/bigbench-mc-filter

Conversation

@Chessing234

Bug

BigBench multiple-choice tasks crash with ValueError on subtasks that contain a mix of multiple-choice and free-form examples in the same dataset split (e.g. kanji_ascii).

The Jinja expression {{multiple_choice_targets.index(targets[0])}} raises ValueError: 'u' is not in list (or similar) on any row where multiple_choice_targets is an empty list, because index() cannot find targets[0] in it.
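The failure can be reproduced outside Jinja, since the template expression boils down to a Python `list.index` call. The following is a minimal sketch with made-up rows (not actual kanji_ascii data); `gold_index` is a hypothetical helper name used only for illustration.

```python
# Hypothetical rows mimicking a mixed-format split; the values are made up.
mc_row = {"targets": ["b"], "multiple_choice_targets": ["a", "b", "c"]}
free_form_row = {"targets": ["u"], "multiple_choice_targets": []}


def gold_index(doc):
    """Python equivalent of {{multiple_choice_targets.index(targets[0])}}."""
    return doc["multiple_choice_targets"].index(doc["targets"][0])


print(gold_index(mc_row))  # 1

try:
    gold_index(free_form_row)
except ValueError as err:
    # An empty choice list can never contain the target, so index() raises.
    print(f"ValueError: {err}")
```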

Root cause

generate_tasks.py decides whether a subtask qualifies as multiple-choice by checking only the first example. Subtasks like kanji_ascii have multiple-choice examples at the start but free-form examples later (index 188+), where multiple_choice_targets and multiple_choice_scores are both [].
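To illustrate why a first-example check misses this, here is a rough sketch. It is not the actual code from generate_tasks.py; both function names and the sample rows are assumptions made for the example.

```python
def looks_multiple_choice_first_only(rows):
    """Classify a subtask by inspecting only its first example."""
    return len(rows[0]["multiple_choice_targets"]) > 0


def first_free_form_index(rows):
    """Return the index of the first row without choices, or None if every row has them."""
    for i, row in enumerate(rows):
        if not row["multiple_choice_targets"]:
            return i
    return None


# Hypothetical split: two multiple-choice rows followed by a free-form row.
rows = [
    {"multiple_choice_targets": ["a", "b"], "multiple_choice_scores": [1, 0]},
    {"multiple_choice_targets": ["c", "d"], "multiple_choice_scores": [0, 1]},
    {"multiple_choice_targets": [], "multiple_choice_scores": []},
]

print(looks_multiple_choice_first_only(rows))  # True
print(first_free_form_index(rows))  # 2
```

The first check accepts the subtask as multiple-choice, while a full scan finds the free-form row that later breaks the template.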

Fix

Add a process_docs filter to both multiple-choice template YAMLs that drops rows with empty multiple_choice_targets before evaluation. This follows the same pattern used by other tasks in the repo (e.g. crows_pairs, bbq).

Three files changed:

  • New lm_eval/tasks/bigbench/utils.py — one filter function
  • Modified multiple_choice_template_a_yaml — added process_docs line
  • Modified multiple_choice_template_b_yaml — added process_docs line
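
For readers who have not opened the diff, a filter of this kind might look roughly like the sketch below. This is not copied from the PR: the function names are assumptions, and the only requirement on `dataset` is a `filter(predicate)` method, which `datasets.Dataset` provides.

```python
def has_multiple_choice_targets(doc):
    """Keep only rows that actually list answer choices."""
    return len(doc["multiple_choice_targets"]) > 0


def process_docs(dataset):
    """Drop free-form rows before the multiple-choice template sees them.

    Assumes `dataset` exposes a `filter(predicate)` method, as
    `datasets.Dataset` does.
    """
    return dataset.filter(has_multiple_choice_targets)
```

In the harness, a template YAML typically points at such a function with a line like `process_docs: !function utils.process_docs`; the exact name used in this PR may differ.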

Fixes #3636

Some BigBench subtasks (e.g. kanji_ascii) contain a mix of
multiple-choice and free-form examples in the same dataset split.
The multiple-choice templates crash with a ValueError when they
encounter rows where `multiple_choice_targets` is an empty list.

Add a `process_docs` filter that drops rows without
multiple-choice targets before evaluation.

Fixes EleutherAI#3636

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Chessing234 requested a review from 0xSMT as a code owner on April 13, 2026, 10:33
Development

Successfully merging this pull request may close these issues.

bigbench benchmark breaks with jinja error
