from_protobuf (part 1): Add ordinal remapping for GetStructField/GetArrayStructFields#14499
Open
thirtiseven wants to merge 3 commits into NVIDIA:main from
…tFields

Introduce PRUNED_ORDINAL_TAG (TreeNodeTag) and named Meta classes that support runtime ordinal remapping for struct field extraction expressions. This enables future schema projection optimizations (e.g., protobuf nested pruning) to rewrite field ordinals at GPU conversion time without modifying the generic runtime expression classes.

Changes:
- Add GpuStructFieldOrdinalTag with PRUNED_ORDINAL_TAG TreeNodeTag
- Add GpuGetStructFieldMeta with effectiveOrdinal that reads the tag, falling back to the original ordinal when no tag is set
- Extend GpuGetArrayStructFieldsMeta with effectiveOrdinal and effectiveNumFields for pruned array-of-struct access
- Register GetStructField with GpuGetStructFieldMeta instead of an anonymous UnaryExprMeta in GpuOverrides
- Fix Alias typeMeta to delegate to child expression, ensuring overrideDataType propagates correctly through aliases

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Made-with: Cursor
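The tag-with-fallback behaviour described in the commit message can be sketched in plain Scala. This is only an illustration under stated assumptions: `Tag` and `FieldAccess` are hypothetical stand-ins loosely modelled on Catalyst's tagging facility, and the `Int` payload of the tag is assumed rather than taken from the PR. It is not the plugin's actual `GpuGetStructFieldMeta`.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a typed tree-node tag; not Spark's real class.
final case class Tag[T](name: String)

// Toy expression node that records the ordinal it was planned with
// and can carry optional tags set by a later optimization pass.
final class FieldAccess(val ordinal: Int) {
  private val tags = mutable.Map.empty[Tag[_], Any]

  def setTagValue[T](tag: Tag[T], value: T): Unit = tags(tag) = value

  def getTagValue[T](tag: Tag[T]): Option[T] =
    tags.get(tag).map(_.asInstanceOf[T])
}

object OrdinalTagSketch {
  // Tag name taken from the PR text; the Int payload type is an assumption.
  val PRUNED_ORDINAL_TAG: Tag[Int] = Tag[Int]("PRUNED_ORDINAL_TAG")

  // Use the tagged ordinal when present, otherwise the original ordinal.
  def effectiveOrdinal(expr: FieldAccess): Int =
    expr.getTagValue(PRUNED_ORDINAL_TAG).getOrElse(expr.ordinal)
}
```

In this sketch an untagged node keeps its original ordinal, so code paths that never set the tag behave exactly as before.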
@greptile review
Contributor
Greptile Summary
This PR is Part 1 of the Changes
Minor observations (no blockers)
Confidence Score: 5/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[GetStructField / GetArrayStructFields\nSpark expression node] -->|set PRUNED_ORDINAL_TAG| B{PRUNED_ORDINAL_TAG set?}
B -- No --> C[Use expr.ordinal\nUse expr.numFields]
B -- Yes --> D[Use tagged ordinal\nDerive numFields from child schema]
C --> E[GpuGetStructField / GpuGetArrayStructFields\nGPU expression]
D --> E
E --> F[doColumnar: extract field at ordinal\nfrom GPU column view]
G[Alias meta] -->|typeMeta| H[Delegate to childExprs.head.typeMeta]
H --> I[Child effective DataType\npropagates through Alias]
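The Alias branch of the flowchart (nodes G, H, I) can be illustrated with a small self-contained model. Every name here (`ExprMeta`, `ChildMeta`, `AliasMeta`, `effectiveType`, and the toy `DataType` hierarchy) is hypothetical and chosen only for this sketch; the real plugin classes and their `typeMeta` API are not reproduced.

```scala
// Toy type hierarchy; stands in for Spark SQL data types.
sealed trait DataType
case object IntType extends DataType
final case class StructType(fieldNames: Seq[String]) extends DataType

// Hypothetical meta interface: reports the type the converted plan will produce.
trait ExprMeta {
  def originalType: DataType
  def overrideType: Option[DataType] = None
  def effectiveType: DataType = overrideType.getOrElse(originalType)
}

// A child expression whose output type may be overridden, e.g. by pruning.
final class ChildMeta(val originalType: DataType, pruned: Option[DataType])
    extends ExprMeta {
  override def overrideType: Option[DataType] = pruned
}

// An alias only renames its child, so it delegates the type question
// to the child instead of answering from its own original type.
final class AliasMeta(child: ExprMeta) extends ExprMeta {
  def originalType: DataType = child.originalType
  override def effectiveType: DataType = child.effectiveType
}
```

Without the delegation in `AliasMeta`, a consumer above the alias would still see the unpruned struct type even though the child now produces the pruned one.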
Reviews (2): Last reviewed commit: "Remove redundant tests and dead effectiv..."
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/complexTypeExtractors.scala
sql-plugin/src/test/scala/com/nvidia/spark/rapids/StructFieldOrdinalTagSuite.scala
…st ordinal Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Part 1 of #14354
Description
This PR:
Introduce PRUNED_ORDINAL_TAG (TreeNodeTag) and named Meta classes that support runtime ordinal remapping for struct field extraction expressions.
This enables future schema projection optimizations (e.g., protobuf nested pruning) to rewrite field ordinals at GPU conversion time without modifying the generic runtime expression classes.
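As a rough illustration of what "rewriting field ordinals" means after a schema is pruned, the sketch below computes a mapping from original ordinals to pruned ordinals. The helper `remap` and its signature are hypothetical and not part of this PR; how a real pruning pass would derive the kept fields is not described in the source.

```scala
object PrunedOrdinalSketch {
  // Given the full list of struct field names and the set of fields kept
  // after pruning, return original ordinal -> ordinal in the pruned struct.
  // Fields that were pruned away have no entry in the result.
  def remap(fullFields: Seq[String], keptFields: Set[String]): Map[Int, Int] =
    fullFields.zipWithIndex
      .filter { case (name, _) => keptFields.contains(name) }
      .zipWithIndex
      .map { case ((_, originalOrdinal), prunedOrdinal) =>
        originalOrdinal -> prunedOrdinal
      }
      .toMap
}
```

For a struct with fields `a, b, c, d` pruned down to `b` and `d`, an access to `d` planned at ordinal 3 would need to read ordinal 1 of the pruned struct, and the pruned struct has 2 fields rather than 4.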
Changes:
- Add GpuStructFieldOrdinalTag with PRUNED_ORDINAL_TAG TreeNodeTag
- Add GpuGetStructFieldMeta with effectiveOrdinal that reads the tag, falling back to the original ordinal when no tag is set
- Extend GpuGetArrayStructFieldsMeta with effectiveOrdinal and effectiveNumFields for pruned array-of-struct access
- Register GetStructField with GpuGetStructFieldMeta instead of an anonymous UnaryExprMeta in GpuOverrides
- Fix Alias typeMeta to delegate to child expression, ensuring overrideDataType propagates correctly through aliases.

Checklists
(Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)
Made-with: Cursor