More OOM fixes by sftse · Pull Request #525 · tafia/calamine

sftse · 2025-07-07T19:02:45Z

This is another batch of commits from #463.

This adds another test case that triggers an OOM allocation.
I rechecked the work and noticed that the fn parse_dimensions is in some sense superfluous. This PR fixes the function as-is, but I don't see what the function is actually for, or whether any features depend on it. It is only used to reserve space in the Vec holding the cells, and from basic testing it always overallocates, sometimes by quite a bit.

If no features depend on parsing the dimensions it might be sensible to remove it, but we can also merge as-is.
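
To illustrate the failure mode described above, here is a minimal sketch of turning an untrusted row/column range into a reservation hint without wrapping or over-allocating. It is not calamine's code: the function name `capacity_hint` and the cap `MAX_HINT` are hypothetical and chosen only for this example.

```rust
/// Assumed upper bound on how many cells we are willing to pre-reserve.
/// The value is arbitrary and only serves the illustration.
const MAX_HINT: usize = 1 << 20;

/// Hypothetical helper: derive a capacity hint from an untrusted
/// (first, last) row and column range read from a file.
///
/// An inverted range yields 0 instead of wrapping around, and the
/// product is saturated and capped so a malformed record cannot
/// request an enormous allocation.
fn capacity_hint(first_row: u32, last_row: u32, first_col: u32, last_col: u32) -> usize {
    // checked_sub returns None when last < first, which we treat as "empty".
    let rows = last_row.checked_sub(first_row).map_or(0, |d| d as u64 + 1);
    let cols = last_col.checked_sub(first_col).map_or(0, |d| d as u64 + 1);

    // The product of two values up to 2^32 can overflow u64, so saturate.
    let cells = rows.saturating_mul(cols);

    // Convert to usize (may fail on 32-bit targets) and apply the cap.
    usize::try_from(cells).unwrap_or(usize::MAX).min(MAX_HINT)
}
```

A caller could pass the result to `Vec::with_capacity`. Since the hint only affects pre-allocation, a value that is too small costs some reallocations, while a value that is too large can abort the process.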

Copilot

Pull Request Overview

This PR adds handling for an additional OOM-related test file and tightens parsing functions to avoid over-allocation and out-of-bounds dimensions.

Extended the existing test_oom_allocation to cover OOM_alloc2.xls and verify the worksheet name.
Refactored parse_string to use a dynamic header length, early-return empty strings for zero-length records, and updated decoding offsets.
Updated parse_dimensions to clamp invalid column ranges and added a new unit test for parse_string.

Comments suppressed due to low confidence (2)

src/xls.rs:875

Add a unit test for parse_dimensions covering the branch where cf > 0xFF or cl < cf, to confirm that cf is correctly reset to 0.

    if 0xFF < cf || cl < cf {
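
A unit test along the lines Copilot suggests could look like the sketch below. The condition is the one quoted above; the standalone function `clamp_first_col` is a hypothetical stand-in, since the real check lives inside parse_dimensions and operates on values read from the record.

```rust
/// Hypothetical stand-in for the column check quoted above: if the first
/// column exceeds 0xFF, or the last column is smaller than the first,
/// the first column is reset to 0.
fn clamp_first_col(cf: u32, cl: u32) -> u32 {
    if 0xFF < cf || cl < cf {
        0
    } else {
        cf
    }
}

#[test]
fn clamps_invalid_first_column() {
    // First column above 0xFF is reset.
    assert_eq!(clamp_first_col(0x100, 0x200), 0);
    // Inverted range (last < first) is reset.
    assert_eq!(clamp_first_col(5, 3), 0);
    // A valid range is left untouched, including the 0xFF boundary.
    assert_eq!(clamp_first_col(2, 10), 2);
    assert_eq!(clamp_first_col(0xFF, 0xFF), 0xFF);
}
```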

src/xls.rs:792

Introduce a unit test for the special-case in parse_string where a two-byte zero-length record returns Ok(String::new()), ensuring the early-return branch behaves as intended.

        if 2 == r.len() && read_u16(r) == 0 {
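
The early-return branch can be sketched in isolation as follows. This is a simplified, hypothetical parser, not calamine's parse_string: it assumes a 2-byte little-endian character count, then one flags byte whose lowest bit selects 16-bit characters, then the character data. The local `read_u16` and the name `parse_string_sketch` exist only for this example.

```rust
/// Read a little-endian u16 from the first two bytes of `s`.
/// The caller must ensure `s.len() >= 2`.
fn read_u16(s: &[u8]) -> u16 {
    u16::from_le_bytes([s[0], s[1]])
}

/// Hypothetical simplified string parser illustrating the special case:
/// a record consisting of only a zero character count (no flags byte)
/// is accepted as an empty string instead of being rejected as too short.
fn parse_string_sketch(r: &[u8]) -> Result<String, &'static str> {
    if r.len() < 2 {
        return Err("record too short for a length field");
    }
    let count = read_u16(r) as usize;

    // Special case: two bytes, zero length, nothing else to read.
    if r.len() == 2 && count == 0 {
        return Ok(String::new());
    }
    if r.len() < 3 {
        return Err("missing flags byte");
    }

    let wide = r[2] & 0x01 != 0;
    let body = &r[3..];
    if wide {
        // 16-bit characters: two bytes per character, UTF-16LE.
        let bytes = body.get(..count * 2).ok_or("string body truncated")?;
        let units: Vec<u16> = bytes
            .chunks_exact(2)
            .map(|c| u16::from_le_bytes([c[0], c[1]]))
            .collect();
        String::from_utf16(&units).map_err(|_| "invalid UTF-16")
    } else {
        // 8-bit characters: each byte maps to U+0000..=U+00FF.
        let bytes = body.get(..count).ok_or("string body truncated")?;
        Ok(bytes.iter().map(|&b| b as char).collect())
    }
}
```

Every slice access is preceded by a length check or goes through `get`, so truncated input produces an `Err` rather than a panic.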

jmcnamara · 2025-07-09T00:25:25Z

LGTM. I ran Copilot review since it sometimes picks up something useful. In this case I'm not sure the comment is correct. I'm happy to merge as-is unless you want to make a change.

> I rechecked the work and noticed that the fn parse_dimensions is in some sense superfluous. This fixes the function as-is, but I don't see what the function is even doing, or if there are any features that depend on it. It only is used to reserve space on the Vec holding the cells, and from basic testing always overallocates, sometimes by quite a bit.

That would require a call by @tafia.

I'll review it in the context of the Xlsx file when I get to that part of the docs. Let's leave it in place for now.

sftse · 2025-07-09T07:57:55Z

Named constants make sense to me if there really is a meaningful name to attach to it or if it is used repeatedly. In this case it really is a magic constant extracted from the spec, with no obvious indication why it may not be larger than 0xFF. Referencing the spec is about as meaningful as it can get.

Good to merge.

jmcnamara · 2025-07-09T08:35:55Z

Merged. Thanks.

jmcnamara · 2025-07-09T09:21:25Z

@sftse Can #463 be closed now?

sftse added 6 commits July 7, 2025 12:40

test: add xls with oom allocations

0376f64

fix: tests/OOM_alloc2.xls: if first column is too large, set to 0 to prevent OOM allocation due to overflow

cb39432

fix: accept XLUnicodeString without reserved tag

f7c8508

test: expand

085ac5b

test: cover parse_string edgecase

18e1c37

fix: length checks must cover biff != Biff2-Biff5 && r.len() == 2 case, otherwise this will crash

eb83f93

sftse force-pushed the underflow2 branch from edf2bac to eb83f93 Compare July 8, 2025 08:45

jmcnamara requested a review from Copilot July 9, 2025 00:17

Copilot AI reviewed Jul 9, 2025

Comment thread src/xls.rs

jmcnamara merged commit c9d5868 into tafia:master Jul 9, 2025
4 checks passed

sftse deleted the underflow2 branch July 9, 2025 08:36

sftse mentioned this pull request Oct 23, 2025

Accept XLUnicodeRichExtendedString without reserved tag #571

Merged
