Byte buffers holding XLUnicodeString prematurely truncated #521
sftse wants to merge 5 commits into tafia:master from
|
Looks like a good patchset with tests and a sensible fix. I vote +1 for merge and if there is no dissent I will merge in a few days. |
|
Before merge I want to add the particular, spec-nonconformant |
Okay. I'll wait for that. Let me know. |
|
While revisiting this to add a unit test, I noticed that the formula that is triggering this high byte string panic is also the one from row 43 col 9 that was the point of discussion here. This makes me think there is more to this test than initially thought. The other thing I'm not able to piece together is that the If there are any doubts about how the |
|
Actually, let's hold off on merge until we can investigate this further. |
I created a test xls file called This has the following binary data for the formula: I interpreted this as: (As an aside, I said in another comment that The formula in Comparing the 2 binary data sets: There are 2 odd things here.
All in all, it looks like a junk formula except for the fact that Excel (including a couple of older versions that I have) reads it. This is one of the reasons I stopped working with the xls format once xlsx came along. :-) We could maybe just fix the Update 2: I think I understand the difference between the 2 ptgRef structures above. The first in The second is the 4 byte RgceLoc (2.5.198.109) with the row/col relative/absolute bit in the column entry. |
|
@sftse I updated the previous comment to clarify what |
|
@sftse let me know when this patchset is ready for merge. |
When reading a PtgStr (2.5.198.89), the cch byte of ShortXLUnicodeString indicates the number of characters in the string. The error here is twofold:

1. The byte buffer holding the string characters is prematurely truncated before calling fn read_unicode_string_no_cch() based on cch, although the correct length in bytes can only be known inside fn read_unicode_string_no_cch() after checking the fHighByte flag. The fix is to not truncate the buffer at all and to pass it in its entirety, so that fn read_unicode_string_no_cch() can decide how many bytes to read.
2. The offset into the buffer is then advanced based on this erroneous length, which later leads to crashes.
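To illustrate why cch alone cannot give the byte length, here is a minimal sketch of decoding a ShortXLUnicodeString from an untruncated buffer. It is not calamine's code: `read_short_xl_unicode_string` is a hypothetical helper, and the layout it assumes (one cch byte, then a flags byte with fHighByte in the lowest bit, then the characters) is my reading of the spec. The function returns the number of bytes it consumed, so the caller can advance its offset by the real length rather than by cch.

```rust
/// Hypothetical helper for illustration only; not part of calamine.
/// Decodes a ShortXLUnicodeString that starts at `buf[0]` and returns the
/// decoded string together with the number of bytes consumed.
/// Assumed layout: cch (1 byte), flags (1 byte, bit 0 = fHighByte), characters.
fn read_short_xl_unicode_string(buf: &[u8]) -> Option<(String, usize)> {
    // cch counts characters, not bytes.
    let cch = *buf.first()? as usize;
    let high_byte = (*buf.get(1)? & 0x01) != 0;

    // The byte length is only known once fHighByte has been checked.
    let byte_len = if high_byte { cch * 2 } else { cch };

    // The caller passes the whole remaining buffer; we take only what we need.
    // `get` returns None instead of panicking if the buffer is too short.
    let chars = buf.get(2..2 + byte_len)?;

    let s = if high_byte {
        // Two bytes per character, little-endian UTF-16 code units.
        let units: Vec<u16> = chars
            .chunks_exact(2)
            .map(|c| u16::from_le_bytes([c[0], c[1]]))
            .collect();
        String::from_utf16_lossy(&units)
    } else {
        // One byte per character; the high byte of each code unit is zero.
        chars.iter().map(|&b| b as char).collect()
    };

    // Header (2 bytes) plus character bytes: this is what the offset
    // should advance by.
    Some((s, 2 + byte_len))
}
```

With this shape, a caller that slices the buffer to cch bytes beforehand would cut a fHighByte string in half, and advancing the offset by cch would leave it pointing into the middle of the string, which matches the two errors described above.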
5a43311 to b9ed9a7
|
Haven't forgotten about this, thanks for your review. Rebased, will revisit. |
This PR is another batch of commits from #463.
The first commit b5b948a introduces a test that fails.
1c58893 fixes one of the mistakes, so that a different error message appears. Already fixed in upstreamed PR.
599eae8 fixes the next failure due to a spec nonconformity of the test file, which seems like a logical extension of the standard.
Edit: updated commit hashes after push