Allow retrieving dataset top-level field names #21956

Merged
dpiparo merged 2 commits into root-project:master from vepadulano:rdf-expose-toplevel-field-names
Apr 22, 2026

@vepadulano
Member

Useful for instance when calling Snapshot and wanting to select only top-level field names, possibly further filtering the list (e.g. through regexes).
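
As a rough illustration of the use case above, the sketch below filters a list of top-level field names with a regular expression. The field names and the `select_fields` helper are hypothetical, and the list is hard-coded here in place of a real dataset query. Full-string matching is an assumption of this sketch, not necessarily what RDataFrame does.

```python
import re

# Hypothetical top-level field names; in practice this list would come from
# the dataset rather than being hard-coded.
top_level_fields = ["event_id", "jet", "muon", "muon_trigger", "weight"]


def select_fields(names, pattern):
    """Return the names that fully match the regular expression `pattern`."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


# Keep every field starting with "muon", plus the "weight" field.
selected = select_fields(top_level_fields, r"muon.*|weight")
print(selected)
```

The resulting list could then be passed as the column list of a Snapshot call.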

Personal note: I believe GetColumnNames, GetDefinedColumnNames and GetColumnType already showed a pattern; with this new method it is clear that we lack a general-purpose API to inspect the dataset schema description. I wouldn't introduce one in this PR because it needs further thought.

FYI @TomasDado

@github-actions

github-actions Bot commented Apr 20, 2026

Test Results

    22 files      22 suites   3d 13h 1m 51s ⏱️
 3 845 tests  3 844 ✅  1 💤 0 ❌
76 827 runs  76 809 ✅ 18 💤 0 ❌

Results for commit b5c100d.

♻️ This comment has been updated with latest results.

@dpiparo
Member

dpiparo commented Apr 20, 2026

Is this PR fixing #18733?

@vepadulano
Member Author

Indeed, @dpiparo, thanks! This PR fixes #18733.

vepadulano linked an issue Apr 20, 2026 that may be closed by this pull request
vepadulano force-pushed the rdf-expose-toplevel-field-names branch 2 times, most recently from 564f6cc to 0e22c5f, Apr 21, 2026 06:59
vepadulano closed this Apr 21, 2026
vepadulano reopened this Apr 21, 2026
vepadulano added the "clean build" label (Ask CI to do non-incremental build on PR) Apr 21, 2026
vepadulano force-pushed the rdf-expose-toplevel-field-names branch from 0e22c5f to aecacda, Apr 21, 2026 14:09
vepadulano added this to the 6.40.00 milestone Apr 21, 2026
@vepadulano
Member Author

@dpiparo this should go to 6.40

dpiparo modified the milestones: 6.40.00, 6.40.00-rc1 Apr 21, 2026
Useful for instance when calling Snapshot and wanting to select only top-level field names, possibly further filtering the list (e.g. through regexes).

Enabling the GetTopLevelFieldNames method in RNTuple exposed a previously
faulty interaction between it and Snapshot. When Snapshot is called with a
regex, the regex only considers top-level column names. If the regex contains
the specific name of a subfield, e.g. "columnName.dataMember", the regex would
fail even though that column exists in the dataset. This commit keeps the
default Snapshot behaviour of considering only top-level column names, but if
the regex matches none of them, it checks the full list of column names before
throwing the final error if necessary.
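
The fallback described in this commit message can be sketched as follows. This is only an illustration of the described behaviour under stated assumptions, not the ROOT implementation: the `match_columns` helper, the column names and the choice of full-string matching are all hypothetical.

```python
import re


def match_columns(pattern, top_level_names, all_names):
    """Match `pattern` against top-level names first, then against all names.

    Raises ValueError only if neither list contains a match.
    """
    regex = re.compile(pattern)

    # Default behaviour: consider only the top-level column names.
    matches = [name for name in top_level_names if regex.fullmatch(name)]
    if matches:
        return matches

    # Fallback: the regex may name a subfield such as "muon.pt".
    matches = [name for name in all_names if regex.fullmatch(name)]
    if matches:
        return matches

    raise ValueError(f"no column matches regex {pattern!r}")


# Hypothetical dataset schema used only for this example.
top_level = ["electron", "muon"]
everything = ["electron", "electron.pt", "muon", "muon.pt", "muon.eta"]

print(match_columns(r"muon", top_level, everything))
print(match_columns(r"muon\.pt", top_level, everything))
```

With this ordering, a regex that matches a top-level name never pulls in subfields, so the default selection is unchanged and the wider search only runs when the first pass finds nothing.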
vepadulano force-pushed the rdf-expose-toplevel-field-names branch from aecacda to b5c100d, Apr 22, 2026 05:21
dpiparo modified the milestones: 6.40.00, 6.40.00-rc1 Apr 22, 2026
dpiparo merged commit 13bb547 into root-project:master Apr 22, 2026
30 checks passed
@dpiparo
Member

dpiparo commented Apr 22, 2026

/backport to 6.38, 6.36

@root-project-bot

Something went wrong with the backport to 6.38: @dpiparo please see the logs

dpiparo removed the "clean build" label (Ask CI to do non-incremental build on PR) Apr 22, 2026
@dpiparo
Member

dpiparo commented Apr 22, 2026

/backport to 6.38, 6.36

@root-project-bot

Something went wrong with the backport to 6.36: @dpiparo please see the logs

dpiparo added the "in:RDataFrame" and "clean build" (Ask CI to do non-incremental build on PR) labels Apr 22, 2026

Labels

clean build (Ask CI to do non-incremental build on PR), in:RDataFrame

Development

Successfully merging this pull request may close these issues.

Get list of top-level fields from RDataFrame

3 participants