What is the problem the feature request solves?
I benchmarked several shuffle IPC write strategies and compared five different approaches. The results show that Arrow IPC's built-in compression provides both better performance and a higher compression ratio than the other tested approaches.
The evaluated strategies are:
outer-zstd-stream — a single ZSTD stream wrapping the entire IPC stream.
ipc-builtin-zstd — Arrow IPC StreamWriter with built-in ZSTD compression.
per-batch-zstd-stream — each IPC batch written into its own ZSTD frame (current comet shuffle style).
per-batch-bulk-zstd — per-batch IPC buffer compressed with bulk ZSTD and written as length-prefixed frames.
per-batch-ipc-builtin-zstd — per-batch IPC stream using Arrow IPC's built-in ZSTD compression.
Based on the benchmark results, the built-in Arrow IPC compression (ipc-builtin-zstd) achieves the best overall balance of throughput and compression efficiency.
Run ipc_write_bench benchmark:
cargo run --features shuffle-bench -p datafusion-comet-shuffle --bin ipc_write_bench -- \
--input benchmark_data/customer.parquet \
--output-dir /tmp/ipc_write_bench \
--iterations 3 \
--warmup 1
result:
=== Column Batch Write Benchmark ===
Input: benchmark_data/customer.parquet
Batches: 210
Total rows: 1,500,000
Batch size: 8,192
Zstd level: 3
Iterations: 3 (warmup: 1)
--- outer-zstd-stream ---
write avg: 15.853s
write min/max: 15.702s / 16.099s
write rows/s: 94,620
read avg: 4.818s
read min/max: 4.568s / 5.112s
read rows/s: 311,306
file size: 319.96 MiB
verify rows: 1,500,000 (ok)
--- ipc-builtin-zstd ---
write avg: 13.864s
write min/max: 13.829s / 13.901s
write rows/s: 108,195
read avg: 4.755s
read min/max: 4.730s / 4.804s
read rows/s: 315,465
file size: 318.94 MiB
verify rows: 1,500,000 (ok)
--- per-batch-zstd-stream ---
write avg: 15.255s
write min/max: 15.172s / 15.303s
write rows/s: 98,327
read avg: 4.968s
read min/max: 4.954s / 4.994s
read rows/s: 301,933
file size: 319.97 MiB
verify rows: 1,500,000 (ok)
--- per-batch-bulk-zstd ---
write avg: 14.510s
write min/max: 14.395s / 14.612s
write rows/s: 103,380
read avg: 4.933s
read min/max: 4.883s / 5.025s
read rows/s: 304,095
file size: 318.45 MiB
verify rows: 1,500,000 (ok)
--- per-batch-ipc-builtin-zstd ---
write avg: 13.938s
write min/max: 13.814s / 14.010s
write rows/s: 107,616
read avg: 4.922s
read min/max: 4.805s / 5.001s
read rows/s: 304,730
file size: 319.04 MiB
verify rows: 1,500,000 (ok)
=== Summary ===
method write (s) read (s) size write rows/s read rows/s
outer-zstd-stream 15.853 4.818 319.96 MiB 94,620 311,306
ipc-builtin-zstd 13.864 4.755 318.94 MiB 108,195 315,465
per-batch-zstd-stream 15.255 4.968 319.97 MiB 98,327 301,933
per-batch-bulk-zstd 14.510 4.933 318.45 MiB 103,380 304,095
per-batch-ipc-builtin-zstd 13.938 4.922 319.04 MiB 107,616 304,730
Describe the potential solution
No response
Additional context
Note: This proposal depends on apache/arrow-rs#10132. Arrow IPC currently uses ZSTD's default compression level (3), while Comet shuffle uses compression level 1.
What is the problem the feature request solves?
I benchmarked several shuffle IPC write strategies and compared five different approaches. The results show that Arrow IPC's built-in compression provides both better performance and a higher compression ratio than the other tested approaches.
The evaluated strategies are:
outer-zstd-stream— a single ZSTD stream wrapping the entire IPC stream.ipc-builtin-zstd— Arrow IPCStreamWriterwith built-in ZSTD compression.per-batch-zstd-stream— each IPC batch written into its own ZSTD frame (current comet shuffle style).per-batch-bulk-zstd— per-batch IPC buffer compressed with bulk ZSTD and written as length-prefixed frames.per-batch-ipc-builtin-zstd— per-batch IPC stream using Arrow IPC's built-in ZSTD compression.Based on the benchmark results, the built-in Arrow IPC compression (
ipc-builtin-zstd) achieves the best overall balance of throughput and compression efficiency.Run ipc_write_bench benchmark:
result:
Describe the potential solution
No response
Additional context
Note: This proposal depends on apache/arrow-rs#10132. Arrow IPC currently uses ZSTD's default compression level (3), while Comet shuffle uses compression level 1.