Skip to content

More OOM fixes#525

Merged
jmcnamara merged 6 commits intotafia:masterfrom
sftse:underflow2
Jul 9, 2025
Merged

More OOM fixes#525
jmcnamara merged 6 commits intotafia:masterfrom
sftse:underflow2

Conversation

@sftse
Copy link
Copy Markdown
Contributor

@sftse sftse commented Jul 7, 2025

This is another batch of commits from #463.

This adds another test case that OOM allocates.
I rechecked the work and noticed that the fn parse_dimensions is in some sense superfluous. This fixes the function as-is, but I don't see what the function is even doing, or if there are any features that depend on it. It only is used to reserve space on the Vec holding the cells, and from basic testing always overallocates, sometimes by quite a bit.

If no features depend on parsing the dimensions it might be sensible to remove, but we can also merge as-is.

@jmcnamara jmcnamara requested a review from Copilot July 9, 2025 00:17
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds handling for an additional OOM-related test file and tightens parsing functions to avoid over-allocation and out-of-bounds dimensions.

  • Extended the existing test_oom_allocation to cover OOM_alloc2.xls and verify the worksheet name.
  • Refactored parse_string to use a dynamic header length, early-return empty strings for zero-length records, and updated decoding offsets.
  • Updated parse_dimensions to clamp invalid column ranges and added a new unit test for parse_string.
Comments suppressed due to low confidence (2)

src/xls.rs:875

  • Add a unit test for parse_dimensions covering the branch where cf > 0xFF or cl < cf, to confirm that cf is correctly reset to 0.
    if 0xFF < cf || cl < cf {

src/xls.rs:792

  • Introduce a unit test for the special-case in parse_string where a two-byte zero-length record returns Ok(String::new()), ensuring the early-return branch behaves as intended.
        if 2 == r.len() && read_u16(r) == 0 {

Comment thread src/xls.rs
@jmcnamara
Copy link
Copy Markdown
Collaborator

jmcnamara commented Jul 9, 2025

LGTM. I ran Copilot review since it sometimes picks up something useful. In this case I'm not sure the comment is correct. I'm happy to merge as-is unless you want to make a change.

I rechecked the work and noticed that the fn parse_dimensions is in some sense superfluous. This fixes the function as-is, but I don't see what the function is even doing, or if there are any features that depend on it. It only is used to reserve space on the Vec holding the cells, and from basic testing always overallocates, sometimes by quite a bit.

That would require a call by @tafia.

I'll review it in the context of the Xlsx file when I get to that part of the docs. Let's leave it in place for now.

@sftse
Copy link
Copy Markdown
Contributor Author

sftse commented Jul 9, 2025

Named constants make sense to me if there really is a meaningful name to attach to it or if it is used repeatedly. In this case it really is a magic constant extracted from the spec, with no obvious indication why it may not be larger than 0xFF. Referencing the spec is about as meaningful as it can get.

Good to merge.

@jmcnamara jmcnamara merged commit c9d5868 into tafia:master Jul 9, 2025
4 checks passed
@jmcnamara
Copy link
Copy Markdown
Collaborator

Merged. Thanks.

@sftse sftse deleted the underflow2 branch July 9, 2025 08:36
@jmcnamara
Copy link
Copy Markdown
Collaborator

@sftse Can #463 be closed now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants