Skip to content

pref: Use builtin compression for arrow ipc writer #4655

@wForget

Description

@wForget

What is the problem the feature request solves?

I benchmarked several shuffle IPC write strategies and compared five different approaches. The results show that Arrow IPC's built-in compression provides both better performance and a higher compression ratio than the other tested approaches.

The evaluated strategies are:

  1. outer-zstd-stream — a single ZSTD stream wrapping the entire IPC stream.
  2. ipc-builtin-zstd — Arrow IPC StreamWriter with built-in ZSTD compression.
  3. per-batch-zstd-stream — each IPC batch written into its own ZSTD frame (current comet shuffle style).
  4. per-batch-bulk-zstd — per-batch IPC buffer compressed with bulk ZSTD and written as length-prefixed frames.
  5. per-batch-ipc-builtin-zstd — per-batch IPC stream using Arrow IPC's built-in ZSTD compression.

Based on the benchmark results, the built-in Arrow IPC compression (ipc-builtin-zstd) achieves the best overall balance of throughput and compression efficiency.

Run ipc_write_bench benchmark:

cargo run --features shuffle-bench -p datafusion-comet-shuffle --bin ipc_write_bench -- \
  --input benchmark_data/customer.parquet \
  --output-dir /tmp/ipc_write_bench \
  --iterations 3 \
  --warmup 1

result:

=== Column Batch Write Benchmark ===
Input:       benchmark_data/customer.parquet
Batches:     210
Total rows:  1,500,000
Batch size:  8,192
Zstd level:  3
Iterations:  3 (warmup: 1)

--- outer-zstd-stream ---
  write avg:      15.853s
  write min/max:  15.702s / 16.099s
  write rows/s:   94,620
  read avg:       4.818s
  read min/max:   4.568s / 5.112s
  read rows/s:    311,306
  file size:      319.96 MiB
  verify rows:    1,500,000 (ok)

--- ipc-builtin-zstd ---
  write avg:      13.864s
  write min/max:  13.829s / 13.901s
  write rows/s:   108,195
  read avg:       4.755s
  read min/max:   4.730s / 4.804s
  read rows/s:    315,465
  file size:      318.94 MiB
  verify rows:    1,500,000 (ok)

--- per-batch-zstd-stream ---
  write avg:      15.255s
  write min/max:  15.172s / 15.303s
  write rows/s:   98,327
  read avg:       4.968s
  read min/max:   4.954s / 4.994s
  read rows/s:    301,933
  file size:      319.97 MiB
  verify rows:    1,500,000 (ok)

--- per-batch-bulk-zstd ---
  write avg:      14.510s
  write min/max:  14.395s / 14.612s
  write rows/s:   103,380
  read avg:       4.933s
  read min/max:   4.883s / 5.025s
  read rows/s:    304,095
  file size:      318.45 MiB
  verify rows:    1,500,000 (ok)

--- per-batch-ipc-builtin-zstd ---
  write avg:      13.938s
  write min/max:  13.814s / 14.010s
  write rows/s:   107,616
  read avg:       4.922s
  read min/max:   4.805s / 5.001s
  read rows/s:    304,730
  file size:      319.04 MiB
  verify rows:    1,500,000 (ok)

=== Summary ===
method                        write (s)   read (s)         size   write rows/s    read rows/s
outer-zstd-stream                15.853      4.818   319.96 MiB         94,620        311,306
ipc-builtin-zstd                 13.864      4.755   318.94 MiB        108,195        315,465
per-batch-zstd-stream            15.255      4.968   319.97 MiB         98,327        301,933
per-batch-bulk-zstd              14.510      4.933   318.45 MiB        103,380        304,095
per-batch-ipc-builtin-zstd       13.938      4.922   319.04 MiB        107,616        304,730

Describe the potential solution

No response

Additional context

Note: This proposal depends on apache/arrow-rs#10132. Arrow IPC currently uses ZSTD's default compression level (3), while Comet shuffle uses compression level 1.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions