Fix #233: prevent YouTube/media thumbnails from being stored as base6…#237
pieer wants to merge 4 commits into
…4 and suppress diff

Two-part fix for issue #233, where adding a YouTube link to an article triggered 'File diff suppressed because one or more lines are too long':

1. toast-editor.ts: intercept clipboard HTML paste in WYSIWYG mode and strip external <img> elements before Toast UI processes them. Images copied from media sites (YouTube thumbnails, og:image, etc.) were flowing through addImageBlobHook and being stored as very long base64 strings in the markdown file. Only data: URI images (deliberately embedded by the user) are kept.
2. gitdiff.go: detect diff lines that are Markdown images with an inline base64 data URI. Instead of marking the entire file as IsIncompleteLineTooLong (which hides the whole diff with no load option), substitute a short placeholder like '[embedded base64 image, ~45 KB]' so the rest of the diff renders normally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Two fixes for issue #233: (1) a paste interceptor in
Findings
1. 🔴 Backend base64 placeholder is unreachable for real images (default config)
File:
// Handle the base64 case inside the isFragment path, where over-long lines actually land.
if isFragment {
	if len(line) > 1 && isBase64ImageDiffLine(line[1:]) {
		drained := len(line)
		for isFragment {
			lineBytes, isFragment, err = input.ReadLine()
			if err != nil {
				return lineBytes, isFragment, fmt.Errorf("unable to ReadLine: %w", err)
			}
			drained += len(lineBytes)
		}
		decodedKB := drained * 3 / 4 / 1024 // approximate
		line = string(line[0]) + fmt.Sprintf("[embedded base64 image, ~%d KB]", decodedKB)
	} else {
		curFile.IsIncomplete = true
		curFile.IsIncompleteLineTooLong = true
		for isFragment { /* drain as today */ }
	}
}
Status:
2. 🔴 Frontend re-dispatches the cleaned paste at an element the editor never listens on
File:
// Dispatch on the original target (inside view.dom) so the editor actually receives it.
// The capture listener re-runs but now finds no external <img>, so it won't re-dispatch.
const target = (e.target as HTMLElement) ?? container;
target.dispatchEvent(new ClipboardEvent('paste', {bubbles: true, cancelable: true, clipboardData: dt}));
Status:
3. 🟡 Synthetic
| Severity | Count |
|---|---|
| 🔴 High | 2 |
| 🟡 Medium | 2 |
| ⚪️ Low | 2 |
The base64 placeholder only ran in the `len(line) > maxLineCharacters` branch, but a real embedded image far exceeds the read buffer (max(maxLineCharacters, 4096), 5000 by default), so ReadLine returns it as a fragment. The pre-existing `if isFragment` block set IsIncompleteLineTooLong and drained the line before the placeholder check, and the truncated fragment was never > maxLineCharacters, so the whole file diff was still suppressed under the default config.

- Detect and substitute the placeholder inside the isFragment path (counting the drained bytes for the size estimate); keep the non-fragment branch for the rare case where maxLineCharacters is configured below the buffer size.
- Anchor isBase64ImageDiffLine to the Markdown image syntax via regex so prose or code merely mentioning a data URI is not misclassified.
- Estimate the decoded size more accurately and format it with base.FileSize, and keep the image's alt text in the placeholder.
- Add TestParsePatch_base64Image covering the fragment path and the prose guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ontainer

The paste interceptor rebuilt a DataTransfer without external images but dispatched the synthetic paste on `container`, an ancestor of the element the editor listens on (the pseudo-clipboard textarea in markdown mode, the ProseMirror contenteditable in WYSIWYG). Events don't descend into children, so the editor never received it and the entire paste was dropped whenever an external image was present. Dispatch on `e.target` instead; rebuilding the clipboard without the image item is what preserves the text while dropping the incidental thumbnail.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…gracefully
new ClipboardEvent('paste', {clipboardData}) is not honored everywhere (Safari
ignores it), so re-dispatching would carry no data and silently drop the paste.
Detect support once; when unavailable, skip the strip/re-dispatch and let the
paste proceed normally — the gitdiff base64 placeholder remains the backend
safety net, so the only downside on those engines is the (now diff-safe) base64
still being stored, rather than losing the user's pasted content.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fix #233: Ensure media previews do not trigger MAX_GIT_DIFF_LINE_CHARACTERS
Issue 1 (YouTube → base64): The mechanism is rich-text paste. When a user copies content from YouTube.com, the browser clipboard carries HTML that includes the channel thumbnail as an <img> element. In WYSIWYG mode, Toast UI processes that HTML, extracts the image, and the addImageBlobHook in toast-editor.ts:87 converts it to base64 before storing it in the markdown. This behavior is intentional for dropped or local images, but unintended for externally fetched thumbnails carried in clipboard HTML.
Issue 2 (diff suppression): parseHunks in gitdiff.go:977 truncates any line exceeding MaxGitDiffLineCharacters (5000) and sets IsIncompleteLineTooLong = true, which causes the diff template (box.tmpl:175) to suppress the entire file diff with no load button.
Fix 1 — web_src/js/features/toast-editor.ts (prevents the base64 from being created)
Added a paste intercept that runs before Toast UI processes clipboard HTML. When HTML content is pasted (e.g. copied from YouTube with a thumbnail in the rich HTML), it strips any image whose src is an external URL, replacing it with its alt text or removing it. Images with data: URIs (locally dropped/deliberately embedded) are untouched. The cleaned HTML is re-dispatched so Toast UI still processes the rest of the pasted content normally.
Fix 2 — services/gitdiff/gitdiff.go (prevents the diff from being suppressed even if base64 is present)
Added isBase64ImageDiffLine() which detects Markdown image lines containing an inline data:image/...;base64, payload. When such a line exceeds maxLineCharacters, instead of marking the entire file as IsIncompleteLineTooLong (which hides the whole diff with no load option), the base64 payload is replaced with a short placeholder like +[embedded base64 image, ~45 KB]. All other content in the file renders normally.
Co-Author: Opus 4.8