[C++][Parquet] Uncontrolled Memory Allocation (OOM) in Parquet Delta decoders #49955

@sivaadityacoder

Describe the bug, including details regarding any error messages, version, and platform.

An uncontrolled memory allocation vulnerability exists in the Parquet DeltaByteArrayDecoder, DeltaLengthByteArrayDecoder, and DeltaBitPackDecoder.

Currently, these decoders trust the num_values (and, implicitly, the total_value_count_) provided by the Parquet data page header. They eagerly allocate buffers sized in proportion to this unvalidated count, e.g. buffered_prefix_length_->Resize(num_prefix * sizeof(int32_t)).

An attacker can exploit this by crafting a tiny Parquet file (e.g., ~300 bytes) with a forged num_values (e.g., 1,000,000,000). When this file is opened, Arrow immediately attempts an allocation proportional to that count (about 4 GB for 1,000,000,000 int32_t entries), resulting in a std::bad_alloc and a Denial of Service (OOM) crash.
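To illustrate the kind of guard that is missing, here is a minimal sketch of a bounds-checked size computation done before any allocation. It is not Arrow code and not a proposed patch: the helper name CheckedLengthBufferBytes and the max_bytes limit are hypothetical, and how a real decoder would derive a sensible limit (for example from the size of the page data) is left open.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper, not part of Arrow. Returns the number of bytes needed
// to hold `num_values` int32_t entries, or std::nullopt when the count is
// negative, the multiplication would overflow size_t, or the result exceeds
// the caller-supplied `max_bytes` limit (an assumed policy knob).
std::optional<std::size_t> CheckedLengthBufferBytes(std::int64_t num_values,
                                                    std::size_t max_bytes) {
  if (num_values < 0) {
    return std::nullopt;  // a negative count can only come from a corrupt header
  }
  const auto count = static_cast<std::uint64_t>(num_values);
  constexpr std::uint64_t kElemSize = sizeof(std::int32_t);
  if (count > std::numeric_limits<std::size_t>::max() / kElemSize) {
    return std::nullopt;  // count * kElemSize would not fit in size_t
  }
  const std::uint64_t bytes = count * kElemSize;
  if (bytes > max_bytes) {
    return std::nullopt;  // refuse to allocate beyond the configured limit
  }
  return static_cast<std::size_t>(bytes);
}
```

With a limit of 64 MiB, a forged count of 1,000,000,000 is rejected before any Resize call, while a count of 1,000 yields 4,000 bytes. A decoder using a check like this could report a malformed-file error instead of attempting the allocation.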

(Note: This vulnerability was originally submitted to the Huntr bug bounty platform, Reference ID: 7814255d-e945-427f-ab84-6eddc3a35a37, and is being filed here at the recommendation of the ASF Security Team and Andrew Lamb.)
