Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 5845454. Configure here.
    state["sandbox_id"],
    command,
    working_dir=self._workdir(state),
    timeout=self.debug_timeout or self.test_timeout,
Falsy check on debug_timeout ignores explicit zero
Low Severity
The expression self.debug_timeout or self.test_timeout uses Python's or operator, which treats 0 as falsy. If a caller explicitly passes debug_timeout=0, it will be silently ignored and self.test_timeout (default 900) will be used instead. The correct pattern for an int | None optional is self.debug_timeout if self.debug_timeout is not None else self.test_timeout.
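The difference between the two patterns can be sketched in isolation. This is a minimal illustration, not the PR's code: `TimeoutConfig` and its two method names are hypothetical stand-ins, and only the attribute names `debug_timeout` and `test_timeout` and the default of 900 come from the comment above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeoutConfig:
    """Hypothetical stand-in holding the two timeout attributes from the review."""

    test_timeout: int = 900  # default cited in the review comment
    debug_timeout: Optional[int] = None

    def falsy_fallback(self) -> int:
        # Flagged pattern: `or` treats 0 as falsy, so an explicit 0 is discarded.
        return self.debug_timeout or self.test_timeout

    def none_fallback(self) -> int:
        # Suggested pattern: only None falls back to test_timeout.
        if self.debug_timeout is not None:
            return self.debug_timeout
        return self.test_timeout


if __name__ == "__main__":
    explicit_zero = TimeoutConfig(debug_timeout=0)
    print(explicit_zero.falsy_fallback())  # falls back to test_timeout
    print(explicit_zero.none_fallback())   # keeps the explicit 0
```

The two methods agree for `None` and for any non-zero value; they differ only when `debug_timeout` is exactly `0`. Whether a zero timeout is meaningful to the sandbox call is a separate question that this sketch does not settle.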
    - entry: create sandbox and optionally run ``taskset.setup(state)``
    - debug step: ``none``, ``gold_patch``, ``command``, or ``script``
    - exit: optionally run task tests and score them
    """
New environment class missing documentation updates
Low Severity
SWEDebugEnv is a new user-facing environment class exported from the experimental composable module, but no documentation files are updated. The docs/environments.md file describes the composable module's classes (ComposableEnv, TaskSet, SandboxTaskSet, Harness, SandboxSpec) under the experimental section, and docs/reference.md lists environment classes. The new SWEDebugEnv class is not mentioned in either. This violates the rule requiring documentation updates when adding core user-facing functionality described in docs/.
Triggered by project rule: BugBot Instructions


Summary
- Adds SWEDebugEnv, a no-agent staged debugger for SWE-style SandboxTaskSet instances
- Supports a configurable debug_step (none, gold_patch, command, script), and optional test/scoring at exit
- Exports SWEDebugEnv from the experimental composable modules

Validation
- uv run pytest tests/test_swe_debug_env.py -q
- uv run ruff check verifiers/envs/experimental/composable/swe_debug_env.py tests/test_swe_debug_env.py
- ruff check, ruff format, ty (ci parity)

Note
Medium Risk
Adds a new sandbox-orchestrating environment that can execute arbitrary debug commands/scripts and short-circuit test runs, which could affect resource usage and failure classification. Also changes Multi-SWE dataset construction by no longer excluding C/C++ rows, potentially impacting evaluation mix and runtime.
Overview
Adds SWEDebugEnv, a no-agent experimental environment that creates a SWE-style sandbox, optionally runs task setup, performs one configurable debug step (none, gold_patch, command, script), and optionally runs and scores tests, recording timing and output tails plus standardized failure reasons.

Exports SWEDebugEnv via the experimental __init__ modules and adds focused pytest coverage for the pipeline and failure handling. Separately updates MultiSWETaskSet to stop filtering out C/C++ tasks during dataset build.

Reviewed by Cursor Bugbot for commit 4208539. Bugbot is set up for automated code reviews on this repo. Configure here.