Window Functions¶
Window functions perform calculations across rows of the query result. They
run after the HAVING clause but before the final query ORDER BY clause.
Invoking a window function requires an OVER clause:
function(args) OVER (
[PARTITION BY expression]
[ORDER BY expression [ASC|DESC]]
[frame]
)
The PARTITION BY clause separates rows into independent window partitions.
The window ORDER BY clause determines the order used by the window
function; ordinal references such as ORDER BY 1 are not supported in window
ORDER BY clauses.
The verified frame forms are:
{ROWS|RANGE} frame_start
{ROWS|RANGE} BETWEEN frame_start AND frame_end
frame_start and frame_end can be:
UNBOUNDED PRECEDING
expression PRECEDING
CURRENT ROW
expression FOLLOWING
UNBOUNDED FOLLOWING
If no frame is specified, a window with ORDER BY uses RANGE UNBOUNDED
PRECEDING through CURRENT ROW. A window without ORDER BY uses the
whole partition.
Aggregate Window Functions¶
Aggregate functions can be used as window functions by adding an OVER
clause. The aggregate is computed for each row over the rows in the current
window frame.
SELECT grp, v,
sum(v) OVER (
PARTITION BY grp
ORDER BY v
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_sum
FROM (VALUES (1, 10), (1, 20), (2, 5), (2, 7)) AS t(grp, v)
ORDER BY grp, v;
-- (1, 10, 10), (1, 20, 30), (2, 5, 5), (2, 7, 12)
Ranking Functions¶
- cume_dist() -> double()¶
Returns the cumulative distribution of the current row in the window partition. Peer rows in the window ordering return the same value.
- dense_rank() -> bigint()¶
Returns the rank of the current row without gaps between peer groups.
- ntile(n) -> bigint()¶
Divides the rows for each window partition into
nbuckets numbered from1ton. The bucket count must be a positive constant integer.
- percent_rank() -> double()¶
Returns
(r - 1) / (n - 1), whererisrank()for the current row andnis the number of rows in the partition. A single-row partition returns0.0.
- rank() -> bigint()¶
Returns the rank of the current row. Peer rows have the same rank, and the next rank includes gaps for the peer rows.
- row_number() -> bigint()¶
Returns a sequential number for each row, starting with one, according to the ordering within the window partition.
SELECT v,
rank() OVER (ORDER BY v),
dense_rank() OVER (ORDER BY v),
percent_rank() OVER (ORDER BY v),
cume_dist() OVER (ORDER BY v)
FROM (VALUES (10), (20), (20), (30)) AS t(v)
ORDER BY v;
-- (10, 1, 1, 0.0, 0.25)
-- (20, 2, 2, 0.3333333333333333, 0.75)
-- (20, 2, 2, 0.3333333333333333, 0.75)
-- (30, 4, 3, 1.0, 1.0)
Value Functions¶
Value functions support RESPECT NULLS and IGNORE NULLS after the
function call and before OVER. Null values are respected by default. If
IGNORE NULLS is specified, rows where the value expression is null are
excluded from the value-function calculation.
- first_value(x) -> [same as input]()¶
Returns the first value of the window frame.
- last_value(x) -> [same as input]()¶
Returns the last value of the window frame.
- nth_value(x, offset) -> [same as input]()¶
Returns the value at the specified one-based offset from the beginning of the window frame. The offset must be a positive constant integer. If the offset is greater than the number of values in the frame, null is returned.
- lead(x[, offset [, default_value]]) -> [same as input]()¶
Returns the value at
offsetrows after the current row in the window partition. The default offset is1. If the offset refers to a row outside the partition,default_valueis returned, or null when no default value is specified. Offset and default arguments must be constants.
- lag(x[, offset [, default_value]]) -> [same as input]()¶
Returns the value at
offsetrows before the current row in the window partition. The default offset is1. If the offset refers to a row outside the partition,default_valueis returned, or null when no default value is specified. Offset and default arguments must be constants.
SELECT id,
first_value(v) OVER win,
last_value(v) OVER win,
nth_value(v, 2) OVER win,
lag(v) OVER win,
lead(v, 2) OVER win
FROM (VALUES (1, 10), (2, 20), (3, 30), (4, 40)) AS t(id, v)
WINDOW win AS (
ORDER BY id
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY id;
-- (1, 10, 40, 20, NULL, 30)
-- (2, 10, 40, 20, 10, 40)
-- (3, 10, 40, 20, 20, NULL)
-- (4, 10, 40, 20, 30, NULL)