Actions: EleutherAI/lm-evaluation-harness
Showing runs from all workflows
2,500+ workflow runs
tool_calls and reasoning: Tracking and evaluation
Unit Tests #6140: Pull request #3685 synchronize by RawthiL
tool_calls and reasoning: Tracking and evaluation
Tasks Modified #6167: Pull request #3685 synchronize by RawthiL