Hybrid eager/deferred accumulation for string_agg GroupsAccumulator to reduce copying and memory usage #21469
Draft
kosiew wants to merge 9 commits into apache:main from
Optimize StringAggGroupsAccumulator to retain input and state batches with metadata instead of building a Vec<Option<String>> on every update. Assemble concatenated strings lazily in evaluate() and state(). Adjust size() to reflect retained arrays and metadata. Support EmitTo::First(n) by emitting the required prefix and renumbering retained groups. Include note for future mixed-batch compaction work.
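The deferred path described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `Batch` is a hypothetical stand-in for a retained Arrow array, `DeferredStringAgg` is an invented name, and the method signatures are simplified assumptions. Only the ideas of retaining batches, recording `(group_idx, row_idx)` pairs, and assembling strings lazily in `evaluate()` come from the description.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a retained Arrow string array:
/// a shared, immutable column of nullable strings.
type Batch = Rc<Vec<Option<String>>>;

/// Keeps input batches alive and records which rows belong to which
/// group, instead of copying strings on every update.
struct DeferredStringAgg {
    delimiter: String,
    batches: Vec<Batch>,
    /// One list per retained batch: (group_idx, row_idx) for non-null rows.
    batch_entries: Vec<Vec<(u32, u32)>>,
    num_groups: usize,
}

impl DeferredStringAgg {
    fn new(delimiter: &str) -> Self {
        Self {
            delimiter: delimiter.to_string(),
            batches: Vec::new(),
            batch_entries: Vec::new(),
            num_groups: 0,
        }
    }

    /// Record row indices only; the batch itself is shared, not copied.
    fn update_batch(&mut self, batch: Batch, group_indices: &[usize], total_num_groups: usize) {
        self.num_groups = self.num_groups.max(total_num_groups);
        let entries: Vec<(u32, u32)> = group_indices
            .iter()
            .enumerate()
            .filter(|(row, _)| batch[*row].is_some())
            .map(|(row, &group)| (group as u32, row as u32))
            .collect();
        if !entries.is_empty() {
            self.batches.push(batch);
            self.batch_entries.push(entries);
        }
    }

    /// Replay retained batches in arrival order, build one string per
    /// group, then reset the retained state.
    fn evaluate(&mut self) -> Vec<Option<String>> {
        let mut out: Vec<Option<String>> = vec![None; self.num_groups];
        for (batch, entries) in self.batches.iter().zip(&self.batch_entries) {
            for &(group, row) in entries {
                // Entries exist only for non-null rows (see update_batch).
                let value = batch[row as usize]
                    .as_deref()
                    .expect("non-null entry invariant");
                let g = group as usize;
                match out[g].as_mut() {
                    Some(s) => {
                        s.push_str(&self.delimiter);
                        s.push_str(value);
                    }
                    None => out[g] = Some(value.to_string()),
                }
            }
        }
        self.batches.clear();
        self.batch_entries.clear();
        self.num_groups = 0;
        out
    }
}
```

In this sketch a group that only ever saw null inputs evaluates to `None`, and the per-update cost is one pair of `u32` values per non-null row rather than a string copy.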
Remove unnecessary &mut self from append_rows. Consolidate repeated string-append loop into a typed private helper using ArrayAccessor. Eliminate redundant runtime null checks in favor of non-null entry invariant with debug_assert!. Simplify retain_after_emit into a single filter-and-renumber pass. Trim local ceremony in evaluate() and state() for clarity.
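A single filter-and-renumber pass of the kind mentioned for retain_after_emit might look like the sketch below. The free-function signature and the generic batch type `B` are assumptions made for illustration; the PR implements this as a method on the accumulator, and its exact shape is not shown in this excerpt.

```rust
/// After emitting the first `n` groups, drop their entries and shift the
/// remaining group indices down by `n`. Batches left with no entries are
/// released. `B` stands in for whatever batch handle is retained.
fn retain_after_emit<B>(
    batches: &mut Vec<B>,
    batch_entries: &mut Vec<Vec<(u32, u32)>>,
    n: u32,
) {
    let (kept_batches, kept_entries): (Vec<B>, Vec<Vec<(u32, u32)>>) =
        std::mem::take(batches)
            .into_iter()
            .zip(std::mem::take(batch_entries))
            .filter_map(|(batch, entries)| {
                let remaining: Vec<(u32, u32)> = entries
                    .into_iter()
                    .filter(|&(group, _)| group >= n)
                    .map(|(group, row)| (group - n, row))
                    .collect();
                // A batch with no surviving entries need not be retained.
                (!remaining.is_empty()).then_some((batch, remaining))
            })
            .unzip();
    *batches = kept_batches;
    *batch_entries = kept_entries;
}
```

Filtering and renumbering in the same iterator chain avoids a second walk over the entries, and dropping empty batches lets their memory be freed as soon as all of their rows have been emitted.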
Consolidate string-like array routing through a single StringInputArray abstraction to improve maintainability. Rename the slot appender to append_group_value for better readability of the lazy-assembly path.
Update append_rows_typed and append_batch_values_typed to accept array references instead of values. Modify call sites in StringInputArray to pass references, improving memory efficiency and consistency across function calls.
Adjust string_agg to use a hybrid accumulator that performs eager updates for lightweight workloads and switches to deferred row tracking for larger batches. Add mixed-mode regression tests that cover various batch scenarios and ensure correctness.
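The switch between eager and deferred modes could be driven by a small predicate like the one below. The constant names match those listed in the PR description, but their values here are invented, and the rule itself (many groups, or long strings on average) is an assumption; this excerpt does not spell out the actual heuristic.

```rust
// Hypothetical values: the PR names these constants, but this excerpt
// does not state what they are set to.
const DEFER_GROUP_THRESHOLD: usize = 1024;
const DEFER_PAYLOAD_LEN_THRESHOLD: usize = 64;

#[derive(Debug, PartialEq)]
enum Mode {
    Eager,
    Deferred,
}

/// Assumed rule: defer when the accumulator tracks many groups or the
/// batch carries long strings on average; otherwise copy eagerly.
fn choose_mode(total_num_groups: usize, values: &[Option<&str>]) -> Mode {
    let (count, bytes) = values
        .iter()
        .flatten()
        .fold((0usize, 0usize), |(c, b), v| (c + 1, b + v.len()));
    let avg_len = if count == 0 { 0 } else { bytes / count };
    if total_num_groups >= DEFER_GROUP_THRESHOLD || avg_len >= DEFER_PAYLOAD_LEN_THRESHOLD {
        Mode::Deferred
    } else {
        Mode::Eager
    }
}
```

A per-batch decision like this is what makes mixed eager and deferred batches possible within one accumulator, which is the case the regression tests target.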
Benchmark (#21437)
4b3ed6e to 51ac58a
Eliminate repeated match arms in string_agg.rs by introducing a local dispatch macro. This enhances clarity and readability, allowing each method to focus on intent while simplifying maintenance for future changes. The refactor preserves existing static dispatch behavior, ensuring that all targeted tests continue to pass.
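A local dispatch macro of this kind can be sketched as follows. Everything here is illustrative: the `StrAccess` trait is a hypothetical stand-in for Arrow's `ArrayAccessor`, the variant payloads are plain vectors rather than Arrow arrays, and the macro name `dispatch` is invented. The point is only that one macro expands to the same body per variant, so each arm is still monomorphized.

```rust
/// Hypothetical read-access trait, playing the role ArrayAccessor
/// plays for Arrow arrays.
trait StrAccess {
    fn num_rows(&self) -> usize;
    fn value(&self, i: usize) -> &str;
}

impl StrAccess for Vec<String> {
    fn num_rows(&self) -> usize { Vec::len(self) }
    fn value(&self, i: usize) -> &str { &self[i] }
}

impl StrAccess for Vec<Box<str>> {
    fn num_rows(&self) -> usize { Vec::len(self) }
    fn value(&self, i: usize) -> &str { &self[i] }
}

impl StrAccess for Vec<&'static str> {
    fn num_rows(&self) -> usize { Vec::len(self) }
    fn value(&self, i: usize) -> &str { self[i] }
}

/// Stand-in payloads; the real variants would wrap Arrow arrays.
enum StringInputArray {
    Utf8(Vec<String>),
    LargeUtf8(Vec<Box<str>>),
    Utf8View(Vec<&'static str>),
}

/// Expands to one match whose arms all run the same body, each with a
/// concretely typed binding (static dispatch, no trait objects).
macro_rules! dispatch {
    ($input:expr, $arr:ident => $body:expr) => {
        match $input {
            StringInputArray::Utf8($arr) => $body,
            StringInputArray::LargeUtf8($arr) => $body,
            StringInputArray::Utf8View($arr) => $body,
        }
    };
}

/// Typed helper, generic over the concrete array type.
fn join_typed<A: StrAccess>(arr: &A, delimiter: &str) -> String {
    let mut out = String::new();
    for i in 0..arr.num_rows() {
        if i > 0 {
            out.push_str(delimiter);
        }
        out.push_str(arr.value(i));
    }
    out
}

impl StringInputArray {
    fn join(&self, delimiter: &str) -> String {
        dispatch!(self, arr => join_typed(arr, delimiter))
    }

    fn num_rows(&self) -> usize {
        dispatch!(self, arr => arr.num_rows())
    }
}
```

Adding a fourth string type in this sketch means touching the enum and the macro once, rather than every method that matches on the input.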
Remove redundant num_groups field and derive emission size from values. Collapse deferred-state retention into a tighter iterator/unzip flow. Eliminate the extra append_batch_values forwarding helper. Split evaluate() into smaller private steps with replay_deferred_batches and finish_emit. Simplify should_defer and refine the deferred replay loop for clarity and efficiency.
51ac58a to baa8054
Which issue does this PR close?
string_agg GroupsAccumulator #21156
Rationale for this change
The current StringAggGroupsAccumulator eagerly copies every input string into per-group String buffers during update_batch. This approach is simple but can lead to significant memory overhead and unnecessary copying, especially for large payloads or high-cardinality groupings.
This PR introduces a hybrid strategy that defers copying when it is likely to be beneficial. Instead of immediately materializing strings, the accumulator can retain references to input batches and store lightweight (group_idx, row_idx) entries. Actual string concatenation is deferred until evaluate().
This approach aims to:
What changes are included in this PR?
Introduced a hybrid accumulation model:
Added new internal structures:
- batches: Vec<ArrayRef> to retain input arrays
- batch_entries: Vec<Vec<(u32, u32)>> to track (group_idx, row_idx) pairs
- num_groups to track total group count
Introduced StringInputArray abstraction to unify handling of:
- Utf8
- LargeUtf8
- Utf8View
Implemented heuristics to decide when to defer:
- DEFER_GROUP_THRESHOLD
- DEFER_PAYLOAD_LEN_THRESHOLD
Refactored append logic:
- append_rows_typed for deferred indexing
- append_batch_typed for eager materialization
- append_batch_values_typed for reconstruction during evaluation
Updated evaluate() to:
EmitTo::First)
Added state management improvements:
- clear_state() to fully reset buffers
- retain_after_emit() to compact deferred state after partial emits
Extended memory accounting in size() to include:
Added test:
- groups_mixed_eager_and_deferred_batches to validate correctness of hybrid behavior
Are these changes tested?
Yes.
Added a new test covering mixed eager and deferred batches
Existing tests continue to pass
The new test verifies:
Are there any user-facing changes?
No.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.