[SPARK-48783][DOCS] Update the table-valued function docs
### What changes were proposed in this pull request?

This PR updates the table-valued function SQL reference doc to include the new set of TVFs that can be used in the FROM clause of a query, as well as the examples.
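As a rough illustration of the FROM-clause usage this PR documents (a sketch assembled from the examples in the changed doc file, not extra text from the PR itself):

```sql
-- Sketch only: every statement here is taken from the examples in the changed doc.

-- Generator function in the SELECT list (the form the old examples used).
SELECT explode(array(10, 20));

-- The same function used as a TVF in the FROM clause (the form the doc now shows).
SELECT * FROM explode(array(10, 20));

-- A TVF combined with a table through a LATERAL join.
CREATE TABLE test (c1 INT);
INSERT INTO test VALUES (1);
INSERT INTO test VALUES (2);
SELECT * FROM test, LATERAL explode (ARRAY(3,4)) AS c2;
```

The doc lists the same `col` output for the first two statements; the FROM-clause form additionally allows the TVF to take part in joins, as in the last statement.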

### Why are the changes needed?

To improve the documentation.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47184 from allisonwang-db/spark-48783-tvf-docs.

Authored-by: allisonwang-db <allison.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
allisonwang-db authored and cloud-fan committed Jul 5, 2024
1 parent 310f8ea commit a2f8001
Showing 1 changed file with 17 additions and 7 deletions.
24 changes: 17 additions & 7 deletions docs/sql-ref-syntax-qry-select-tvf.md
@@ -35,6 +35,15 @@ A table-valued function (TVF) is a function that returns a relation or a set of
|**range** ( *start, end* )|Long, Long|Creates a table with a single *LongType* column named *id*, <br/> containing rows in a range from *start* to *end* (exclusive) with step value 1.|
|**range** ( *start, end, step* )|Long, Long, Long|Creates a table with a single *LongType* column named *id*, <br/> containing rows in a range from *start* to *end* (exclusive) with *step* value.|
|**range** ( *start, end, step, numPartitions* )|Long, Long, Long, Int|Creates a table with a single *LongType* column named *id*, <br/> containing rows in a range from *start* to *end* (exclusive) with *step* value, with partition number *numPartitions* specified.|
+|**explode** ( *expr* )|Array/Map|Separates the elements of array *expr* into multiple rows, or the elements of map *expr* into multiple rows and columns. Unless specified otherwise, uses the default column name col for elements of the array or key and value for the elements of the map.|
+|**explode_outer** <br> ( *expr* )|Array/Map|Separates the elements of array *expr* into multiple rows, or the elements of map *expr* into multiple rows and columns. Unless specified otherwise, uses the default column name col for elements of the array or key and value for the elements of the map.|
+|**inline** ( *expr* )|Expression|Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise.|
+|**inline_outer** <br> ( *expr* )|Expression|Explodes an array of structs into a table. Uses column names col1, col2, etc. by default unless specified otherwise.|
+|**posexplode** <br> ( *expr* )|Array/Map|Separates the elements of array *expr* into multiple rows with positions, or the elements of map *expr* into multiple rows and columns with positions. Unless specified otherwise, uses the column name pos for position, col for elements of the array or key and value for elements of the map.|
+|**posexplode_outer** ( *expr* )|Array/Map|Separates the elements of array *expr* into multiple rows with positions, or the elements of map *expr* into multiple rows and columns with positions. Unless specified otherwise, uses the column name pos for position, col for elements of the array or key and value for elements of the map.|
+|**stack** ( *n, expr1, ..., exprk* )|Seq[Expression]|Separates *expr1, ..., exprk* into n rows. Uses column names col0, col1, etc. by default unless specified otherwise.|
+|**json_tuple** <br> ( *jsonStr, p1, p2, ..., pn* )|Seq[Expression]|Returns a tuple like the function *get_json_object*, but it takes multiple names. All the input parameters and output column types are string.|
+|**parse_url** <br> ( *url, partToExtract[, key]* )|Seq[Expression]|Extracts a part from a URL.|

#### TVFs that can be specified in SELECT/LATERAL VIEW clauses:

@@ -99,39 +108,39 @@ SELECT * FROM range(5, 8) AS test;
|  7|
+---+

-SELECT explode(array(10, 20));
+SELECT * FROM explode(array(10, 20));
+---+
|col|
+---+
| 10|
| 20|
+---+

-SELECT inline(array(struct(1, 'a'), struct(2, 'b')));
+SELECT * FROM inline(array(struct(1, 'a'), struct(2, 'b')));
+----+----+
|col1|col2|
+----+----+
|   1|   a|
|   2|   b|
+----+----+

-SELECT posexplode(array(10,20));
+SELECT * FROM posexplode(array(10,20));
+---+---+
|pos|col|
+---+---+
|  0| 10|
|  1| 20|
+---+---+

-SELECT stack(2, 1, 2, 3);
+SELECT * FROM stack(2, 1, 2, 3);
+----+----+
|col0|col1|
+----+----+
|   1|   2|
|   3|null|
+----+----+

-SELECT json_tuple('{"a":1, "b":2}', 'a', 'b');
+SELECT * FROM json_tuple('{"a":1, "b":2}', 'a', 'b');
+---+---+
| c0| c1|
+---+---+
@@ -145,11 +154,11 @@ SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST');
|                                     spark.apache.org|
+-----------------------------------------------------+

--- Use explode in a LATERAL VIEW clause
+-- Use explode with LATERAL join
CREATE TABLE test (c1 INT);
INSERT INTO test VALUES (1);
INSERT INTO test VALUES (2);
-SELECT * FROM test LATERAL VIEW explode (ARRAY(3,4)) AS c2;
+SELECT * FROM test, LATERAL explode (ARRAY(3,4)) AS c2;
+--+--+
|c1|c2|
+--+--+
@@ -163,4 +172,5 @@ SELECT * FROM test LATERAL VIEW explode (ARRAY(3,4)) AS c2;
### Related Statements

* [SELECT](sql-ref-syntax-qry-select.html)
+* [LATERAL](sql-ref-syntax-qry-select-lateral-subquery.md)
* [LATERAL VIEW Clause](sql-ref-syntax-qry-select-lateral-view.html)
