feat: support reference rel with outer references #977

Draft
yongchul wants to merge 1 commit into substrait-io:main from yongchul:planrel_with_bindings

@yongchul yongchul commented Feb 25, 2026

Motivation

Currently, a Plan can hold shared Rel subtrees that are referenced via ReferenceRel(subtree_ordinal) to express DAGs and multi-query optimizations. However, when the shared Rel contains OuterReference field references (e.g., from a correlated subquery), those outer references become free variables at the plan level — they refer to scopes that don't exist in the PlanRel context. Each ReferenceRel usage site may appear in a different relational context, so the bindings for those outer references must be supplied per-reference.

There is currently no mechanism to express this, which means correlated subqueries with outer references cannot be factored out as shared common subexpressions.

Changes

PlanRel (plan.proto)

Added a repeated OuterReferenceDeclaration field. Each declaration pairs:

  • An outer-reference-rooted FieldReference — identifying the free variable as it appears inside the Rel
  • A Type — the expected type of the value

This serves as the parameter signature for a parameterized common subexpression. Declarations are required when the Rel contains outer references.
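To make the "parameter signature" idea concrete, here is a minimal Python sketch, not the PR's implementation. `OuterRef` is a simplified, hypothetical stand-in for an outer-reference-rooted FieldReference (reduced to a `steps_out` count and a field ordinal), the type is a plain string, and `undeclared_outer_refs` is an illustrative helper name. It checks the stated requirement that outer references used inside the Rel are covered by declarations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OuterRef:
    """Hypothetical, simplified outer-reference-rooted FieldReference."""
    steps_out: int  # how many scopes outward the reference points
    field: int      # struct field ordinal within that outer scope


@dataclass(frozen=True)
class OuterReferenceDeclaration:
    """One entry of the signature: the free variable and its expected type."""
    outer_reference: OuterRef
    type: str  # stand-in for a Substrait Type, e.g. "i64"


def undeclared_outer_refs(used, declarations):
    """Return outer references used in the Rel that lack a declaration.

    Results are de-duplicated and kept in first-use order.
    """
    declared = {d.outer_reference for d in declarations}
    missing = []
    for ref in used:
        if ref not in declared and ref not in missing:
            missing.append(ref)
    return missing
```

Under these assumptions, a PlanRel whose Rel uses `OuterRef(1, 0)` and `OuterRef(1, 2)` but declares only the first would be reported as missing `OuterRef(1, 2)`.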

ReferenceRel (algebra.proto)

Added a repeated Expression.FieldReference outer_reference_bindings field. Entries are positionally matched to the outer_reference_declarations in the referenced PlanRel: outer_reference_bindings[i] provides the concrete value for outer_reference_declarations[i], evaluated in the ReferenceRel's local relational context.
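The positional matching can be sketched in Python as follows. This is an illustration under stated assumptions rather than the PR's code: bindings are modeled as field ordinals in the ReferenceRel's local context, types are plain strings, and the count and type checks are assumptions about how a consumer might validate a usage site. `Declaration` and `bind_outer_references` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """Hypothetical stand-in for one outer_reference_declarations entry."""
    outer_reference: tuple  # (steps_out, field) as used inside the shared Rel
    type: str               # expected type, e.g. "i64"


def bind_outer_references(declarations, bindings, local_field_types):
    """Pair declarations[i] with bindings[i] for one ReferenceRel usage site.

    bindings holds local field ordinals; local_field_types maps a local
    field ordinal to its type. Returns {outer_reference: local_field}.
    """
    if len(bindings) != len(declarations):
        raise ValueError(
            f"expected {len(declarations)} bindings, got {len(bindings)}"
        )
    resolved = {}
    for i, (decl, local_field) in enumerate(zip(declarations, bindings)):
        actual = local_field_types[local_field]
        if actual != decl.type:
            raise TypeError(
                f"binding {i}: expected {decl.type}, got {actual}"
            )
        resolved[decl.outer_reference] = local_field
    return resolved
```

Two usage sites of the same shared PlanRel would call `bind_outer_references` with the same declarations but different `bindings`, which is how each site supplies values from its own relational context.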

Documentation (logical_relations.md)

Added a section explaining parameterized common subexpressions with a worked example: a single query where two structurally identical correlated subqueries (at different nesting levels, accessing different fields) are factored into a shared PlanRel with two ReferenceRel usages providing different positional bindings.

AI disclaimer

This PR was assisted by Claude Sonnet 4.6.



@yongchul yongchul force-pushed the planrel_with_bindings branch from b32b72c to de7c2b6 Compare February 25, 2026 20:39
@yongchul yongchul marked this pull request as ready for review February 25, 2026 20:41
// The outer reference being declared. This must be a FieldReference with
// root_type set to OuterReference, matching a FieldReference used inside
// the Rel of this PlanRel.
Expression.FieldReference outer_reference = 1;

Alternatively, we could employ a more explicit convention such as an OuterReferencePlaceHolder, which carries (rel index, index in the declaration) to reference the declaration in the PlanRel.

The problem with this approach is that it would require rewriting the plan to replace the existing outer references with placeholders.
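The rewrite cost mentioned above can be illustrated with a small Python sketch. Everything here is hypothetical: expressions are modeled as nested lists, `OuterRefNode` stands in for an outer-reference-rooted FieldReference, and the two fields of `OuterReferencePlaceHolder` follow the (rel index, index in the declaration) pair described in the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OuterRefNode:
    """Hypothetical outer reference as it appears in the existing plan."""
    steps_out: int
    field: int


@dataclass(frozen=True)
class OuterReferencePlaceHolder:
    """Hypothetical placeholder pointing at a declaration in a PlanRel."""
    rel_index: int
    declaration_index: int


def rewrite_with_placeholders(expr, rel_index, declared):
    """Replace every OuterRefNode in expr with a placeholder.

    expr is a nested list of operators, literals, and OuterRefNode leaves;
    declared lists the outer references in declaration order. An outer
    reference missing from declared raises ValueError (from list.index).
    """
    if isinstance(expr, OuterRefNode):
        return OuterReferencePlaceHolder(rel_index, declared.index(expr))
    if isinstance(expr, list):
        return [rewrite_with_placeholders(e, rel_index, declared) for e in expr]
    return expr
```

The traversal touches every expression in the shared Rel, which is the rewriting burden that the declaration-based design in this PR avoids by keeping the existing outer references in place.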

@yongchul yongchul marked this pull request as draft April 2, 2026 18:37