11 changes: 7 additions & 4 deletions proto/substrait/algebra.proto
@@ -1334,10 +1334,13 @@ message Expression {
// The lower and upper bound specify how many rows before and after the current row
// the window should extend.
BOUNDS_TYPE_ROWS = 1;
-// The lower and upper bound describe a range of values. The window should include all rows
-// where the value of the ordering column is greater than or equal to (current_value - lower bound)
-// and less than or equal to (current_value + upper bound). This bounds type is only valid if there
-// is a single ordering column.
+// The lower and upper bound describe a range of values. When using numeric offsets (Preceding
+// or Following with offset > 0), the window includes all rows where the value of the ordering
+// column is greater than or equal to (current_value - lower bound) and less than or equal to
+// (current_value + upper bound). When ANY numeric offset is present as a bound, there must be
+// EXACTLY ONE ordering column.
+// UNBOUNDED and CURRENT ROW bounds work with 0 or more ordering columns. CURRENT ROW
+// includes all rows with matching values across all ordering columns (peer rows).
BOUNDS_TYPE_RANGE = 2;
}

18 changes: 13 additions & 5 deletions site/docs/expressions/window_functions.md
@@ -15,11 +15,19 @@ Window function signatures contain all the properties defined for [aggregate fun

When binding a window function, the binding must include the following additional properties beyond the standard aggregate binding properties:

-| Property | Description | Required |
-| ----------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
-| Partition | A list of partitioning expressions. | False, defaults to a single partition for the entire dataset |
-| Lower Bound | Bound Following(int64), Bound Trailing(int64) or CurrentRow. | False, defaults to start of partition |
-| Upper Bound | Bound Following(int64), Bound Trailing(int64) or CurrentRow. | False, defaults to end of partition |
+| Property | Description | Required |
+| ----------- | ------------------------------------------------------------ | -------- |
+| Partition | A list of partitioning expressions. Empty list means a single partition for the entire dataset. | True |
+| Order By | A list of ordering expressions with sort directions. Empty list means unordered. | True |
+| Bounds Type | ROWS or RANGE. ROWS bounds count physical rows. RANGE bounds consider value equivalence based on ordering columns. | True |
+| Lower Bound | Preceding(int64), Following(int64), CurrentRow, or Unbounded. | True |
+| Upper Bound | Preceding(int64), Following(int64), CurrentRow, or Unbounded. | True |
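
To illustrate how the Bounds Type row changes the meaning of the same lower and upper bound, here is a minimal Python sketch. It is not part of Substrait or of this change: the helper names `rows_frame` and `range_frame` are hypothetical, `None` stands in for Unbounded, an offset of 0 stands in for CurrentRow, and the rows are assumed to be already sorted ascending on a single integer ordering column.

```python
from typing import List, Optional


def rows_frame(n: int, i: int, preceding: Optional[int], following: Optional[int]) -> List[int]:
    """Row indices in a ROWS frame around row i (hypothetical helper).

    ROWS counts physical rows; None is used here to mean Unbounded.
    """
    lo = 0 if preceding is None else max(0, i - preceding)
    hi = n - 1 if following is None else min(n - 1, i + following)
    return list(range(lo, hi + 1))


def range_frame(keys: List[int], i: int, preceding: Optional[int], following: Optional[int]) -> List[int]:
    """Row indices in a RANGE frame around row i (hypothetical helper).

    RANGE compares ordering-column values: a row is included when its key lies in
    [current - preceding, current + following]. None is used here to mean Unbounded.
    """
    cur = keys[i]
    return [
        j
        for j, key in enumerate(keys)
        if (preceding is None or key >= cur - preceding)
        and (following is None or key <= cur + following)
    ]


# Ordering-column values, already sorted ascending (illustrative data only).
keys = [10, 10, 11, 13, 20]

rows_result = rows_frame(len(keys), 2, 1, 1)   # one physical row either side of row 2
range_result = range_frame(keys, 2, 1, 1)      # keys between 10 and 12 inclusive
peers_result = range_frame(keys, 0, 0, 0)      # offset 0 behaves like CurrentRow: peer rows
```

With this data the ROWS frame for row 2 covers rows 1 to 3, while the RANGE frame covers rows 0 to 2 because it selects by value rather than by position.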
+
+### RANGE Bounds with Multiple Ordering Columns
+
+When using RANGE bounds with numeric offsets (Preceding or Following with offset > 0), only a single ordering column is allowed. This is because numeric offsets require arithmetic on the ordering column values (e.g., current_value - offset), which is ambiguous with multiple columns.
+
+RANGE bounds with UNBOUNDED or CURRENT ROW work with any number of ordering columns. CURRENT ROW includes all rows with matching values across all ordering columns (peer rows).
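
The rule above could be checked by a plan validator along the following lines. This is only a sketch under stated assumptions: the `Bound` dataclass, its `kind` strings, and the functions `validate_range_bounds` and `peer_rows` are hypothetical names invented for illustration and do not correspond to any Substrait library API.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Bound:
    """Hypothetical stand-in for a window bound."""
    kind: str                     # "preceding", "following", "current_row", or "unbounded"
    offset: Optional[int] = None  # only meaningful for "preceding" / "following"


def has_numeric_offset(bound: Bound) -> bool:
    """True for Preceding or Following with offset > 0."""
    return bound.kind in ("preceding", "following") and (bound.offset or 0) > 0


def validate_range_bounds(lower: Bound, upper: Bound, num_ordering_columns: int) -> None:
    """Raise ValueError if a numeric offset is used without exactly one ordering column."""
    if (has_numeric_offset(lower) or has_numeric_offset(upper)) and num_ordering_columns != 1:
        raise ValueError(
            "RANGE bounds with a numeric offset require exactly one ordering column"
        )


def peer_rows(sort_keys: Sequence[Tuple], i: int) -> List[int]:
    """Indices of rows whose values match row i across all ordering columns."""
    return [j for j, key in enumerate(sort_keys) if key == sort_keys[i]]


# Unbounded and CurrentRow are accepted with any number of ordering columns.
validate_range_bounds(Bound("unbounded"), Bound("current_row"), 3)

# A numeric offset is accepted only with a single ordering column.
validate_range_bounds(Bound("preceding", 5), Bound("current_row"), 1)

# Peer rows for CURRENT ROW with two ordering columns (illustrative data only).
peers = peer_rows([(1, "a"), (1, "a"), (1, "b")], 0)
```

Calling `validate_range_bounds(Bound("preceding", 5), Bound("current_row"), 2)` would raise `ValueError` in this sketch, since the arithmetic `current_value - offset` has no single column to apply to.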

## Aggregate Functions as Window Functions
