fix(rewrite): preserve newline separators in compound commands#356
Merged
fix(rewrite): preserve newline separators in compound commands#356
Conversation
When an agent submits a multi-line bash command via a hook (Claude Code, Gemini, Cursor) and at least one segment matches a tokf filter, the rewrite engine used to drop the newlines between segments — gluing the adjacent commands together. With a `head -N` pipe in one line and any following token, the result was malformed: `head -1\necho` collapsed to `head -1echo`, which the shell parsed as `head -1 -e -c -h -o > 1echo` and silently created stray files (1echo, 1cat, 1tokf, …) in the agent's cwd. See #355. Root cause: rable parses `cmd1\ncmd2` as two top-level AST nodes rather than a single List node, so `compound_segments` emitted an empty separator for them. The reassembly in `rewrite_with_config_and_options` joined segments with that empty separator and the newline was lost. Fix: between consecutive top-level nodes, compute the separator as the literal source slice between each node's end-span and the next node's start-span. That captures `\n`, `;`, whitespace, comments, etc. byte-for-byte, so `seg + sep` round-trips the original source. Also add `TOKF_HOOK_LOG=/path/to/log`, an opt-in diagnostic that appends one YAML record per hook invocation (BEFORE / AFTER / outcome). The bug in #355 was invisible from `tokf rewrite` runs because the reporter tested without filters; live-session bisection needs this. Closes #355 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address minor findings from the multi-agent review of PR #355: - bash_ast: add heredoc and line-continuation cases to the round-trip test to lock in that the inter-node source slice doesn't drop heredoc bodies or backslash-newline continuations - hook: add empty-env-var, unwritable-log-path, and Ask-outcome integration tests to exercise more of the diagnostic-log call site (which has only one shared logging point, so this also covers Deny by construction even though no test wires Deny directly) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Filter Verification ReportChanged FiltersNo filter files changed in this PR. All Filters Summary✅ 143/143 test cases passed across 51 filters Generated by |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Multi-line bash commands going through the tokf hook used to glue adjacent segments together when at least one was rewritten. With a
head -Npipe in one line, this produced malformed output likehead -1echo > 1echo 2>&1— the shell created a stray file in the agent's cwd before the malformed flag errored out.Root cause
compound_segmentsiterated every top-level rable AST node with a hard-coded emptyparent_sep. rable parsescmd1\ncmd2as two top-level nodes (not aListwith a Newline operator), so the newline separator was dropped during reassembly. The fix slices the source between consecutive node spans byte-for-byte, capturing\n/;/whitespace/comments verbatim.Diagnostic logging
Also adds
TOKF_HOOK_LOG=/path/to/hook.log(off by default). When set, everytokf hook handleinvocation appends one YAML record covering BEFORE / AFTER / outcome. This was the reporter's explicit ask: bugs that only manifest in live AI-tool sessions are otherwise invisible —tokf rewrite "..."won't reproduce them when the test command isn't multi-segment with a filter match. Documented indocs/diagnostics.md.Closes #355
Test plan
cargo fmt -- --checkcargo clippy --workspace --all-targets -- -D warningscargo test --workspace— 2190 passedbash_ast_tests.rs: separator preservation, byte-for-byte round-trip across&&/||/;/newline/heredoc/line-continuationrewrite/tests.rs: explicit Hook creates stray 1<cmd> files in cwd from misformed head -1<word> rewrite #355 reproducer with filter-matched segmentscli_hook_handle.rs(7 cases): TOKF_HOOK_LOG records Allow/PassThrough/Ask, multi-line command preserved verbatim, stray-file shape negated, no log when env unset/empty, unwritable path doesn't block hookdebug_log.rsfor the YAML formatter (multi-line indenting, all hook formats, empty input)tokf rewriteof the canonical reproducer now emits separator-preserved output🤖 Generated with Claude Code