text-freetype2: Track text length explicitly to permit embedded null bytes#13093
text-freetype2: Track text length explicitly to permit embedded null bytes#130934RH1T3CT0R7 wants to merge 1 commit intoobsproject:masterfrom
Conversation
|
This appears to be a similar change to #11045 What makes this change more (or less) correct? The previous PR had discussion and landed in a method we were happy with, but fell through the cracks due to low priority. EDIT: Fixed the link, copy/paste failed me :) |
…bytes Instead of stripping null bytes (which silently discards data), track string lengths explicitly using a new text_len field in ft2_source. This passes the known buffer size to os_utf8_to_wcs, allowing null bytes to be converted as regular characters rather than truncating the string at the first null. Updates remove_cr, cache_glyphs, and get_ft2_text_width to work with explicit lengths instead of relying on null-terminated string semantics. Resolves: obsproject#7595
|
Thanks for pointing out #11045 - I wasn't aware of it when I submitted this PR. After reviewing that PR and the prior discussion (#7628, #7595), I agree that explicitly tracking string lengths (as #11045 does) is the better approach - it preserves null bytes as characters rather than silently removing them, which aligns with the maintainers' preference for representing them as invalid character glyphs. My initial approach (stripping null bytes) was essentially the same as what was rejected in #7628. I've reworked this PR to follow the explicit length-tracking approach from #11045, rebased on the current master. Key changes:
Since #11045 has been open since July 2024, hopefully this helps move the fix forward. |
|
Thanks for the follow up. The PR being old doesn't mean we're not still considering it, or that is not still correct. @kkartaltepe Does this now supersede your PR? I'd like to close one of these that are going towards the same goal, so we can focus review efforts on either one (which are scheduled for review next major version). |
|
Triage note: Need to determine if this or #11045 is better. |
Description
When 'Read from file' is enabled, files containing embedded null bytes are truncated at the first null because
strlen()treats it as end-of-string. This causes only partial file content to display.Approach
Instead of stripping null bytes (as in the initial version of this PR), this now follows the explicit length-tracking approach discussed in #11045:
text_lenfield toft2_sourcestruct to track wide string length explicitlyfilesize) toos_utf8_to_wcs_ptrinstead ofstrlen(), allowing embedded null bytes to be converted as regular characterswcslen(srcdata->text)calls withsrcdata->text_lenthroughout the pluginremove_cr(),cache_glyphs(), andget_ft2_text_width()to work with explicit lengthslibobs/util/utf8.c— the existingutf8_to_wcharalready handles embedded nulls correctly wheninsizeis providedThis is safe because null bytes never appear as continuation bytes in valid UTF-8 sequences, and
utf8_to_wchardocuments that "If UTF-8 string contains zero symbols, they will be translated as regular symbols" when explicit size is given.Testing
read_from_endpath) tested with both null-byte and clean filesResolves #7595
Checklist
clang-format