skills(triage): drop "run full suite" — defer to PR CI per "ship before end_turn"#671
Merged
Conversation
…re end_turn" Triage Step 6.3's "Run the full test suite and lints" conflicts with running-in-ci's "End the turn only when work is shipped" (#661): a long suite kicked off before push pushes the agent into mid-wait end_turn and the deliverable never ships. Fold the targeted-test check into Step 2 and renumber. Closes a 2-occurrence pattern (May tend-mention #5741, June tend-triage on PRQL/prql#5988) where the agent finished real work, called ScheduleWakeup expecting resumption, and exited without commit/push. Co-Authored-By: Claude <noreply@anthropic.com>
tend-agent
commented
Jun 8, 2026
tend-agent
left a comment
Collaborator
Author
There was a problem hiding this comment.
Same pattern elsewhere: plugins/tend-ci-runner/skills/nightly/SKILL.md line 281 still says branch, fix, run full test suite, commit, push, create PR, poll CI. By this PR's own argument — an explicit step-list instruction tends to override the generic running-in-ci rule — nightly remains vulnerable to the same mid-wait-end_turn failure mode, just one of the project's longer suites away from reproducing it. Happy to push a follow-up that drops run full test suite from that line; say the word.
This was referenced Jun 9, 2026
Merged
max-sixty
added a commit
that referenced
this pull request
Jun 11, 2026
Bumps the generator version to 0.1.4, syncs the lockfile, and adds the 0.1.4 CHANGELOG section (published as the GitHub Release notes on tag push). Notable changes since 0.1.3: - Claude harnesses switch to `bypassPermissions` (#677) - GitHub Releases publish from `CHANGELOG.md` on tag push (#678) — this release is the first live exercise of that workflow - Interactive sandbox receives GitHub Actions context env vars (#664) - install-tend bot-token step restructured around scope-audit remediation (#680) - Review skill checks the rollup before APPROVE (#667); running-in-ci/triage/ci-fix skill refinements (#661, #666, #669, #670, #671) > _This was written by Claude Code on behalf of max_ 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
tend-triage's Step 6 "If fixing" tells the bot to "Run the full test suite and lints" before commit/push. That directly conflicts withrunning-in-ci's "End the turn only when work is shipped" (added in #661), which forbids backgrounding pre-push work the deliverable depends on.Today on
PRQL/prql, run 27149177431 (tend-triageon issue #5988, "Compiler crash when joining a zero-row relation"):task prqlc:pull-request(~7 min suite) withrun_in_background: true.Monitor(timeout_ms: 420000) andScheduleWakeup(delaySeconds: 180) to wait for the suite, then emittedend_turn.Last assistant text: "I've scheduled a fallback check. Waiting for the test suite to finish."
This is the second occurrence of "agent calls
ScheduleWakeup/Monitorin CI expecting resumption and the deliverable never ships" — the first wastend-mentionrun 26347739838 (May 24, fix attempt closed as #593, gate-tightening followed in #602, and the root-cause guidance shipped in #661 — but the triage skill never picked up the alignment). PRQL is still pinned tomax-sixty/tend@0.1.2, predating #661, so the conflicting Step 6.3 won. Even once consumers bump past #661, an agent reading both skills will see Step 6.3's explicit instruction and a generic principle inrunning-in-ci; the explicit instruction tends to win.Solution
Rewrite triage Step 6.3 so the two skills agree: the targeted reproduction test plus a clean compile is sufficient confidence; the comprehensive suite is PR CI's job. Renumber the following two steps (4 → 3, 5 → 4) since "Confirm the test passes" merges naturally into Step 2.
Gate assessment
end_turnleaves deliverable unshipped" acrosstend-mentionandtend-triage. Both fully diagnosed from session logs.Evidence gist (full per-run history): https://gist.github.com/a017dd295de05dc44af4479748ffe636
Sources and notes
kgutwin, non-maintainer).Bash(task prqlc:pull-request, timeout 420000)→Bash(until... done, run_in_background: true, timeout 420000)→Monitor(timeout_ms: 420000, persistent: false)→Write(/tmp/pr-body.md)→Read(.../tasks/...output)(×3) →ScheduleWakeup(delaySeconds: 180, prompt: "Check the prqlc:pull-request test output ... and continue triage of issue 5988"). Stop reason:end_turn. Nogit commit/git push/gh pr create/gh issue commentin the session.tend-mentionoccurrence: PRQL/prql Run 26347739838, terminated at 6h cap; documented in the 2026-05 evidence gist under "Acted on (High evidence, structural):tend-mention6h-cap idle afterScheduleWakeupin CI".forbid ScheduleWakeup and fire-and-forget background bash in CI) — closed at 1 occurrence pre-gate-tightening. Comment: "no! we need to change our criteria; this is structural and it's not critical". Gates were tightened in skills(review-gates): structural classification doesn't override Gate 1 #602, then skills(running-in-ci): end the turn only when work is shipped #661 added the more nuanced framing — this PR completes the picture by removing the triage-side conflict.running-in-ci's rule (post-skills(running-in-ci): end the turn only when work is shipped #661): "A targeted compile plus the tests directly exercising the change is enough local confidence to ship — leave the comprehensive matrix to CI."