fix(rewrite): preserve newline separators in compound commands #356

Merged
mpecan merged 2 commits into main from feat/355-preserve-multiline-separators
Apr 29, 2026

@mpecan mpecan commented Apr 29, 2026

Summary

Multi-line bash commands going through the tokf hook had their adjacent segments glued together when at least one segment was rewritten. With a head -N pipe on one line, this produced malformed output like head -1echo > 1echo 2>&1: the shell created a stray file in the agent's cwd before the malformed flag errored out.

# Before fix:
$ tokf rewrite "$(printf 'cargo test\nls | head -1\necho hi')"
tokf run cargo testtokf run --baseline-pipe 'head -1' lsecho hi

# After fix:
$ tokf rewrite "$(printf 'cargo test\nls | head -1\necho hi')"
tokf run cargo test
tokf run --baseline-pipe 'head -1' ls
echo hi

Root cause

compound_segments iterated every top-level rable AST node with a hard-coded empty parent_sep. rable parses cmd1\ncmd2 as two top-level nodes (not a List with a Newline operator), so the newline separator was dropped during reassembly. The fix slices the source between consecutive node spans byte-for-byte, capturing \n/;/whitespace/comments verbatim.
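
The slicing described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `Span`, `segments_with_separators`, and `reassemble` are hypothetical names, and the spans are supplied by hand rather than taken from rable's AST.

```rust
/// Byte range of one top-level node in the source.
/// Hypothetical stand-in for whatever span type the parser exposes.
#[derive(Clone, Copy)]
struct Span {
    start: usize,
    end: usize,
}

/// Pair each node's text with the literal source that follows it, up to the
/// start of the next node (or the end of the input for the last node).
fn segments_with_separators<'a>(source: &'a str, spans: &[Span]) -> Vec<(&'a str, &'a str)> {
    spans
        .iter()
        .enumerate()
        .map(|(i, span)| {
            let next_start = spans.get(i + 1).map_or(source.len(), |next| next.start);
            (&source[span.start..span.end], &source[span.end..next_start])
        })
        .collect()
}

/// Rewrite every segment but emit each separator verbatim, so an identity
/// rewrite reproduces the input byte-for-byte.
fn reassemble(source: &str, spans: &[Span], rewrite: impl Fn(&str) -> String) -> String {
    // Keep anything that precedes the first node (leading whitespace, comments).
    let prefix_end = spans.first().map_or(source.len(), |first| first.start);
    let mut out = String::from(&source[..prefix_end]);
    for (segment, separator) in segments_with_separators(source, spans) {
        out.push_str(&rewrite(segment));
        out.push_str(separator);
    }
    out
}
```

Because the separator is copied from the source rather than rebuilt from an operator, the same path covers `\n`, `;`, surrounding whitespace, and comments without special cases.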

Diagnostic logging

Also adds TOKF_HOOK_LOG=/path/to/hook.log (off by default). When set, every tokf hook handle invocation appends one YAML record covering BEFORE / AFTER / outcome. This was the reporter's explicit ask: bugs that only manifest in live AI-tool sessions are otherwise invisible — tokf rewrite "..." won't reproduce them when the test command isn't multi-segment with a filter match. Documented in docs/diagnostics.md.
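
For a sense of what one record might look like, here is a minimal formatter sketch. The key names (`outcome`, `before`, `after`) and the layout are assumptions made for illustration; the actual format is the one documented in docs/diagnostics.md, and `yaml_block` and `format_record` are hypothetical helpers.

```rust
/// Render a possibly multi-line value as a YAML literal block scalar (`|-`),
/// indenting every line so embedded newlines stay readable in the log.
/// Limitation of this sketch: a value whose first line starts with
/// whitespace would need an explicit indentation indicator.
fn yaml_block(key: &str, value: &str) -> String {
    if value.is_empty() {
        return format!("  {key}: \"\"\n");
    }
    let mut out = format!("  {key}: |-\n");
    for line in value.lines() {
        out.push_str("    ");
        out.push_str(line);
        out.push('\n');
    }
    out
}

/// Build one list entry covering the command before and after rewriting
/// plus the hook outcome (assumed layout, see the note above).
fn format_record(before: &str, after: &str, outcome: &str) -> String {
    let mut record = format!("- outcome: {outcome}\n");
    record.push_str(&yaml_block("before", before));
    record.push_str(&yaml_block("after", after));
    record
}
```

Writing each invocation as a list entry keeps the file appendable: a new record can be added without re-reading or rewriting earlier ones.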

Closes #355

Test plan

  • cargo fmt -- --check
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo test --workspace — 2190 passed
  • New unit tests in bash_ast_tests.rs: separator preservation, byte-for-byte round-trip across &&/||/;/newline/heredoc/line-continuation
  • New integration test in rewrite/tests.rs: explicit reproducer for #355 (Hook creates stray 1<cmd> files in cwd from misformed head -1<word> rewrite) with filter-matched segments
  • New integration tests in cli_hook_handle.rs (7 cases): TOKF_HOOK_LOG records Allow/PassThrough/Ask, multi-line command preserved verbatim, stray-file shape negated, no log when env unset/empty, unwritable path doesn't block hook
  • Inline unit tests in debug_log.rs for the YAML formatter (multi-line indenting, all hook formats, empty input)
  • Verified locally that tokf rewrite of the canonical reproducer now emits separator-preserved output
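
The opt-in and failure behaviour exercised by the hook tests above (no log when the variable is unset or empty, an unwritable path never blocking the hook) could be sketched as below. The function names are hypothetical and this is not the code from this PR; only the `TOKF_HOOK_LOG` variable name comes from the description.

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::PathBuf;

/// Treat an unset or empty variable as "logging disabled".
fn log_path_from(value: Option<&str>) -> Option<PathBuf> {
    value.filter(|v| !v.is_empty()).map(PathBuf::from)
}

/// Append `record` to the log and report whether it was written.
/// I/O errors are deliberately swallowed: diagnostics must never turn
/// into a hook failure.
fn append_best_effort(path: Option<PathBuf>, record: &str) -> bool {
    path.map_or(false, |p| {
        OpenOptions::new()
            .create(true)
            .append(true)
            .open(&p)
            .and_then(|mut file| file.write_all(record.as_bytes()))
            .is_ok()
    })
}

/// Read the variable at call time and log if it points somewhere usable.
fn log_hook_record(record: &str) -> bool {
    let value = std::env::var("TOKF_HOOK_LOG").ok();
    append_best_effort(log_path_from(value.as_deref()), record)
}
```

Returning a `bool` instead of a `Result` makes it hard for a caller to propagate a logging error by accident with `?`.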

🤖 Generated with Claude Code

mpecan and others added 2 commits April 29, 2026 23:07
When an agent submits a multi-line bash command via a hook (Claude Code,
Gemini, Cursor) and at least one segment matches a tokf filter, the
rewrite engine used to drop the newlines between segments — gluing the
adjacent commands together. With a `head -N` pipe in one line and any
following token, the result was malformed: `head -1\necho` collapsed
to `head -1echo`, which the shell parsed as `head -1 -e -c -h -o > 1echo`
and silently created stray files (1echo, 1cat, 1tokf, …) in the agent's
cwd. See #355.

Root cause: rable parses `cmd1\ncmd2` as two top-level AST nodes rather
than a single List node, so `compound_segments` emitted an empty
separator for them. The reassembly in `rewrite_with_config_and_options`
joined segments with that empty separator and the newline was lost.

Fix: between consecutive top-level nodes, compute the separator as the
literal source slice between each node's end-span and the next node's
start-span. That captures `\n`, `;`, whitespace, comments, etc.
byte-for-byte, so `seg + sep` round-trips the original source.

Also add `TOKF_HOOK_LOG=/path/to/log`, an opt-in diagnostic that appends
one YAML record per hook invocation (BEFORE / AFTER / outcome). The bug
in #355 was invisible from `tokf rewrite` runs because the reporter
tested without filters; live-session bisection needs this.

Closes #355

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address minor findings from the multi-agent review of PR #356:

- bash_ast: add heredoc and line-continuation cases to the round-trip
  test to lock in that the inter-node source slice doesn't drop heredoc
  bodies or backslash-newline continuations
- hook: add empty-env-var, unwritable-log-path, and Ask-outcome
  integration tests to exercise more of the diagnostic-log call site
  (which has only one shared logging point, so this also covers Deny
  by construction even though no test wires Deny directly)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@repository-butler

Filter Verification Report

Changed Filters

No filter files changed in this PR.

All Filters Summary

✅ 143/143 test cases passed across 51 filters


Generated by tokf verify

@mpecan mpecan merged commit dc5de0f into main Apr 29, 2026
5 checks passed
@mpecan mpecan deleted the feat/355-preserve-multiline-separators branch April 29, 2026 21:33