Commit 096b3ec

Merge pull request #33317 from ggevay/with-ordinality-docs
docs: Add `WITH ORDINALITY`
2 parents 580ddb1 + 0dcf8cf commit 096b3ec

3 files changed

+246
-3
lines changed

Lines changed: 238 additions & 0 deletions
@@ -0,0 +1,238 @@
1+
---
2+
title: "Table functions"
3+
description: "Functions that return multiple rows"
4+
menu:
5+
main:
6+
parent: 'sql-functions'
7+
---
8+
9+
## Overview
10+
11+
[Table functions](/sql/functions/#table-functions) return multiple rows from one
12+
input row. They are typically used in the `FROM` clause, where their arguments
13+
are allowed to refer to columns of earlier tables in the `FROM` clause.
14+
15+
For example, consider the following table whose rows consist of lists of
16+
integers:
17+
18+
```mzsql
19+
CREATE TABLE quizzes(scores int list);
20+
INSERT INTO quizzes VALUES (LIST[5, 7, 8]), (LIST[3, 3]);
21+
```
22+
23+
Query the `scores` column from the table:
24+
25+
```mzsql
26+
SELECT scores
27+
FROM quizzes;
28+
```
29+
30+
The query returns two rows, where each row is a list:
31+
32+
```
33+
scores
34+
---------
35+
{3,3}
36+
{5,7,8}
37+
(2 rows)
38+
```
39+
40+
Now, apply the [`unnest`](/sql/functions/#unnest) table function to expand the
41+
`scores` list into a collection of rows, where each row contains one list item:
42+
43+
```mzsql
44+
SELECT scores, score
45+
FROM
46+
quizzes,
47+
unnest(scores) AS score; -- In Materialize, shorthand for AS t(score)
48+
```
49+
50+
The query returns 5 rows, one row for each list item:
51+
52+
```
53+
scores | score
54+
---------+-------
55+
{3,3} | 3
56+
{3,3} | 3
57+
{5,7,8} | 5
58+
{5,7,8} | 7
59+
{5,7,8} | 8
60+
(5 rows)
61+
```
62+
63+
{{< tip >}}
64+
65+
For illustrative purposes, the original `scores` column is included in the
66+
results (i.e., the query projection). In practice, you would generally omit the
67+
original list to minimize the size of the returned data.
68+
69+
{{</ tip >}}
70+
71+
## `WITH ORDINALITY`
72+
73+
When a table function is used in the `FROM` clause, you can add `WITH
74+
ORDINALITY` after the table function call. `WITH ORDINALITY` adds a column that
75+
includes the **1**-based numbering for each output row, restarting at **1** for
76+
each input row.
77+
78+
The following example uses `unnest(...) WITH ORDINALITY` to include the `ordinality` column containing the **1**-based numbering of the unnested items:
79+
```mzsql
80+
SELECT scores, score, ordinality
81+
FROM
82+
quizzes,
83+
unnest(scores) WITH ORDINALITY AS t(score,ordinality);
84+
```
85+
86+
The results include the `ordinality` column:
87+
```
88+
scores | score | ordinality
89+
---------+-------+------------
90+
{3,3} | 3 | 1
91+
{3,3} | 3 | 2
92+
{5,7,8} | 5 | 1
93+
{5,7,8} | 7 | 2
94+
{5,7,8} | 8 | 3
95+
(5 rows)
96+
```
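
One use of the ordinality column is selecting list items by position. As a minimal sketch, reusing the `quizzes` table from the overview, the following query keeps only the first score of each quiz:

```mzsql
-- ordinality restarts at 1 for each input row, so this filter
-- keeps exactly one unnested item per quiz.
SELECT scores, score
FROM
  quizzes,
  unnest(scores) WITH ORDINALITY AS t(score, ordinality)
WHERE ordinality = 1;
```

Given the numbering shown above, this should return one row per quiz.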
97+
98+
## Table and column aliases
99+
100+
You can use table and column aliases to name both the result columns of a table function and the ordinality column, if present. For example:
101+
```mzsql
102+
SELECT scores, t.score, t.listidx
103+
FROM
104+
quizzes,
105+
unnest(scores) WITH ORDINALITY AS t(score,listidx);
106+
```
107+
108+
You can also name fewer columns in the column alias list than the number of
109+
columns in the output of the table function (plus `WITH ORDINALITY`, if
110+
present), in which case the extra columns retain their original names.
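
As a minimal sketch of a partial column alias list, assuming the `quizzes` table from the overview, the following query names only the first output column. Per the rule above, the ordinality column should then keep its default name, `ordinality`:

```mzsql
-- Only the unnested value is renamed; the ordinality column is not
-- listed in the alias list, so it is referenced by its default name.
SELECT t.score, t.ordinality
FROM
  quizzes,
  unnest(scores) WITH ORDINALITY AS t(score);
```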
111+
112+
113+
## `ROWS FROM`
114+
115+
When you select from multiple relations without specifying a relationship, you
116+
get a cross join. This is also the case when you select from multiple table
117+
functions in `FROM` without specifying a relationship.
118+
119+
For example, consider the following query that selects from two table functions
120+
without a relationship:
121+
122+
```mzsql
123+
SELECT *
124+
FROM
125+
generate_series(1, 2) AS g1,
126+
generate_series(6, 7) AS g2;
127+
```
128+
129+
The query returns every combination of rows from both:
130+
131+
```
132+
133+
g1 | g2
134+
----+----
135+
1 | 6
136+
1 | 7
137+
2 | 6
138+
2 | 7
139+
(4 rows)
140+
```
141+
142+
Using a `ROWS FROM` clause with multiple table functions, you can zip the
143+
outputs of the table functions (i.e., combine the n-th output row from each
144+
table function into a single row) instead of computing the cross product.
145+
That is, the first output rows of all the table functions are combined into the first row, the second output rows are combined into
146+
the second row, and so on.
147+
148+
For example, modify the previous query to use `ROWS FROM` with the table
149+
functions:
150+
151+
```mzsql
152+
SELECT *
153+
FROM
154+
ROWS FROM (
155+
generate_series(1, 2),
156+
generate_series(6, 7)
157+
) AS t(g1, g2);
158+
```
159+
160+
Instead of the cross product, the results are the "zipped" rows:
161+
162+
```
163+
g1 | g2
164+
----+----
165+
1 | 6
166+
2 | 7
167+
(2 rows)
168+
```
169+
170+
If the table functions in a `ROWS FROM` clause produce a different number of
171+
rows, nulls are used for padding:
172+
```mzsql
173+
SELECT *
174+
FROM
175+
ROWS FROM (
176+
generate_series(1, 3), -- 3 rows
177+
generate_series(6, 7) -- 2 rows
178+
) AS t(g1, g2);
179+
```
180+
181+
The row with the `g1` value of 3 has a null `g2` value (note that if using psql,
182+
psql prints null as an empty string):
183+
184+
```
185+
| g1 | g2 |
186+
| -- | ---- |
187+
| 3 | null |
188+
| 1 | 6 |
189+
| 2 | 7 |
190+
(3 rows)
191+
```
192+
193+
For `ROWS FROM` clauses:
194+
- You can use `WITH ORDINALITY` only on the entire `ROWS FROM` clause, not on the
195+
individual table functions within the `ROWS FROM` clause.
196+
- You can use table and column aliases only on the entire `ROWS FROM` clause,
197+
not on the individual table functions within the `ROWS FROM` clause.
198+
199+
For example:
200+
201+
```mzsql
202+
SELECT *
203+
FROM
204+
ROWS FROM (
205+
generate_series(5, 6),
206+
generate_series(8, 9)
207+
) WITH ORDINALITY AS t(g1, g2, o);
208+
```
209+
210+
The results contain the ordinality value in the `o` column:
211+
212+
```
213+
214+
g1 | g2 | o
215+
----+----+---
216+
5 | 8 | 1
217+
6 | 9 | 2
218+
(2 rows)
219+
```
220+
221+
222+
## Table functions in the `SELECT` clause
223+
224+
You can call table functions in the `SELECT` clause. They are executed as if they were at the end of the `FROM` clause, but their output columns appear at the positions where the calls occur in the `SELECT` clause.
225+
226+
However, table functions in a `SELECT` clause have a number of restrictions (similar to Postgres):
227+
- If there are multiple table functions in the `SELECT` clause, they are executed as if in an implicit `ROWS FROM` clause.
228+
- `WITH ORDINALITY` and (explicit) `ROWS FROM` are not allowed.
229+
- You can give a table function call a column alias, but not a table alias.
230+
- If a table function has multiple output columns (e.g., `regexp_extract` has an output column per capture group), these are combined into a single column of record type.
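
To illustrate the points about multiple table functions and column aliases, here is a minimal sketch that reuses the `generate_series` calls from the `ROWS FROM` section. The names `g1` and `g2` are column aliases chosen for this example:

```mzsql
-- Two table functions in the SELECT clause. Per the rules above, they
-- are evaluated as if in an implicit ROWS FROM clause, so their rows
-- are zipped rather than cross joined.
SELECT
  generate_series(1, 2) AS g1,
  generate_series(6, 7) AS g2;
```

Based on the rules above, this should produce the same two rows as the explicit `ROWS FROM` example: `(1, 6)` and `(2, 7)`.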
231+
232+
## Tabletized scalar functions
233+
234+
You can also call ordinary scalar functions in the `FROM` clause as if they were table functions. In that case, the output is treated as a table with a single row and a single column.
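
A minimal sketch, assuming the built-in scalar function `abs` (chosen here only for illustration):

```mzsql
-- abs is an ordinary scalar function. In the FROM clause, its result
-- is treated as a table with one row and one column.
SELECT *
FROM abs(-5);
```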
235+
236+
## See also
237+
238+
For a list of table functions, see the [function reference](/sql/functions/#table-functions).

doc/user/content/sql/select/_index.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ _select&lowbar;with&lowbar;ctes_, _select&lowbar;with&lowbar;recursive&lowbar;ct
3232
**DISTINCT** | <a name="select-distinct"></a>Return only distinct values.
3333
**DISTINCT ON (** _col&lowbar;ref_... **)** | <a name="select-distinct-on"></a>Return only the first row with a distinct value for _col&lowbar;ref_. If an `ORDER BY` clause is also present, then `DISTINCT ON` will respect that ordering when choosing which row to return for each distinct value of `col_ref...`. Please note that in this case, you should start the `ORDER BY` clause with the same `col_ref...` as the `DISTINCT ON` clause. For an example, see [Top K](/transform-data/idiomatic-materialize-sql/top-k/#select-top-1-item).
3434
_target&lowbar;elem_ | Return identified columns or functions.
35-
**FROM** _table&lowbar;ref_ | The tables you want to read from; note that these can also be other `SELECT` statements or [Common Table Expressions](#common-table-expressions-ctes) (CTEs).
35+
**FROM** _table&lowbar;expr_ | The tables you want to read from; note that these can also be other `SELECT` statements, [Common Table Expressions](#common-table-expressions-ctes) (CTEs), or [table function calls](/sql/functions/table-functions).
3636
_join&lowbar;expr_ | A join expression; for more details, see the [`JOIN` documentation](/sql/select/join/).
3737
**WHERE** _expression_ | Filter tuples by _expression_.
3838
**GROUP BY** _col&lowbar;ref_ | Group aggregations by _col&lowbar;ref_.

doc/user/data/sql_funcs.yml

Lines changed: 7 additions & 2 deletions
@@ -718,7 +718,10 @@
718718
url: /sql/types/jsonb#to_jsonb
719719

720720
- type: Table
721-
description: Table functions evaluate to a set of rows, rather than a single expression.
721+
description: |
722+
Table functions evaluate to a collection of rows rather than a single row. You can use the `WITH ORDINALITY` and
723+
`ROWS FROM` clauses together with table functions. For more details, see [Table functions](/sql/functions/table-functions).
724+
722725
functions:
723726
- signature: 'generate_series(start: int, stop: int) -> Col<int>'
724727
description: Generate all integer values between `start` and `stop`, inclusive.
@@ -731,7 +734,9 @@
731734
- signature: 'generate_subscripts(a: anyarray, dim: int) -> Col<int>'
732735
description: Generates a series comprising the valid subscripts of the `dim`'th dimension of the given array `a`.
733736
- signature: 'regexp_extract(regex: str, haystack: str) -> Col<string>'
734-
description: Values of the capture groups of `regex` as matched in `haystack`.
737+
description: Values of the capture groups of `regex` as matched in `haystack`. Outputs each capture group in a
738+
separate column. At least one capture group is needed. (The capture groups are the parts of the regular expression
739+
between parentheses.)
735740
- signature: 'regexp_split_to_table(text: str, pattern: str [, flags: str]]) -> Col<string>'
736741
description: |
737742
Splits `text` by the regular expression `pattern`.
