Skip to content

fix(cli): stream UTF-8 reads in read tool again#10077

Merged
chrarnoldus merged 2 commits intomainfrom
fix/read-stream-utf8
May 8, 2026
Merged

fix(cli): stream UTF-8 reads in read tool again#10077
chrarnoldus merged 2 commits intomainfrom
fix/read-stream-utf8

Conversation

@kilo-code-bot
Copy link
Copy Markdown
Contributor

@kilo-code-bot kilo-code-bot Bot commented May 8, 2026

Summary

The previous `lines()` always called `Encoding.read`, which buffers the entire file into memory before line splitting -- even for plain UTF-8 files that never needed encoding translation. Bad for large logs and build artifacts that the caller is going to truncate at the line / 50 KB byte cap anyway.

Optimistically stream the file as UTF-8 using a fatal-mode `TextDecoder` (with `{ stream: true }` for partial multibyte sequences across chunk boundaries). Only fall back to the iconv full-file decode when the bytes turn out not to be valid UTF-8.

The streaming + retry logic lives in a new helper at `packages/opencode/src/kilocode/text-stream.ts` so the `lines` function in `tool/read.ts` stays close to upstream OpenCode's shape.

Comment thread packages/opencode/src/tool/read.ts Outdated
@kilo-code-bot
Copy link
Copy Markdown
Contributor Author

kilo-code-bot Bot commented May 8, 2026

Code Review Summary

Status: 1 Issue Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 1
SUGGESTION 0
Issue Details (click to expand)

WARNING

File Line Issue
packages/opencode/src/kilocode/text-stream.ts UTF-8 BOM regression: openUtf8 passes U+FEFF through to the consumer unchanged. For UTF-8-BOM files, the BOM leaks into line 1 of the tool output as a junk character visible to the LLM. The openDecoded fallback path (via iconv-lite) still strips BOMs, creating an inconsistency between the two paths. The main branch always stripped BOMs via Encoding.read(). Fix: add first-line BOM normalization to readLines — e.g. const line_text = count === 1 && text.charCodeAt(0) === 0xfeff ? text.slice(1) : text — or restore the first-flag stripping inside openUtf8.

Fix these issues in Kilo Cloud

Previously Resolved
  • WARNING (read.ts): Encoding detection only sampled the first 4KB — resolved by switching to optimistic UTF-8 streaming
  • WARNING (text-stream.ts): raw stream not destroyed on consumer early-exit — resolved by out.on("close", () => raw.destroy()) in commit bc5dc59
Files Reviewed (3 files)
  • .changeset/read-stream-utf8.md - 0 issues
  • packages/opencode/src/kilocode/text-stream.ts - 1 issue (latest commit: doc trimming only, no logic change)
  • packages/opencode/src/tool/read.ts - 0 issues

Reviewed by claude-sonnet-4.6 · 395,357 tokens

@kilo-code-bot kilo-code-bot Bot force-pushed the fix/read-stream-utf8 branch 2 times, most recently from 35d9ed1 to 3982ec3 Compare May 8, 2026 19:22
Comment thread packages/opencode/src/kilocode/text-stream.ts
Optimistically stream the file as UTF-8 -- the common case -- using a
fatal-mode TextDecoder so the read tool can stop pulling bytes from
disk once the line / 50KB byte cap is hit. Only fall back to a
full-buffer iconv decode when the bytes turn out not to be valid UTF-8.

The streaming + retry logic lives in a new kilo helper
(packages/opencode/src/kilocode/text-stream.ts) so the read tool's
`lines` function stays close to upstream OpenCode shape.
@kilo-code-bot kilo-code-bot Bot force-pushed the fix/read-stream-utf8 branch from 3982ec3 to 1cf0943 Compare May 8, 2026 19:30
When the consumer (readLines) hits the line/byte cap and destroys the
PassThrough, the underlying createReadStream had no link back and would
keep reading chunks to EOF in the background, defeating the early-exit
optimisation for large files.
@kilo-code-bot kilo-code-bot Bot force-pushed the fix/read-stream-utf8 branch from bc5dc59 to 67e9fdc Compare May 8, 2026 19:39
@chrarnoldus chrarnoldus merged commit 045fa89 into main May 8, 2026
13 checks passed
@chrarnoldus chrarnoldus deleted the fix/read-stream-utf8 branch May 8, 2026 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants