[SPARK-31390][SQL][DOCS] Document Window Function in SQL Syntax Section #28220
Conversation
Test build #121296 has finished for PR 28220 at commit
cc @maropu
also cc: @viirya
docs/sql-ref-syntax-qry-window.md
Outdated
**This page is under construction**
### Description

Similarly to aggregate functions, window functions operate on a group of rows. However, unlike aggregate functions, window functions perform aggregation without reducing, calculating a return value for each row in the group. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative, or accessing the value of rows given the relative position of the current row. Spark SQL supports three types of window functions:
How about "Similarly to aggregate functions, window functions operate on a group of rows." -> "A window function operates on a group of rows and this is comparable to aggregate functions."?
"window functions perform aggregation without reducing, calculating a return value for each row in the group." is not clear. Does this mean that window functions do not compute a single aggregated value and, instead, can generate multiple aggregated values for each group?
Test build #121331 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
**This page is under construction**
### Description

Similarly to aggregate functions, window functions operate on a group of rows. However, unlike aggregate functions, window functions perform aggregation without reducing, calculating an aggregated value for each row in the specified window. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative, or accessing the value of rows given the relative position of the current row. Spark SQL supports three types of window functions:
"without reducing"? Sounds confusing. How about "without reducing the number of rows"?
And "but calculating an aggregated value for each row in the specified window."
Test build #121333 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
### Description

Window functions operate on a group of rows, referred to as a window, and calculate an aggregated value for each row based on the specified window. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative, or accessing the value of rows given the relative position of the current row. Spark SQL supports three types of window functions:
Also cc @srowen
Please feel free to rephrase. Thanks!
computing a cumulative -> computing a cumulative sum (or anything similar: average, statistic)
Thanks, it looks better. How about putting the last statement on a new line?
...the current row.
Spark SQL supports three types of window functions:
* Ranking Functions
* Analytic Functions
* Aggregate Functions
Do we need this list here? The Syntax section has the same list.
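For readers following the wording discussion above, the difference between an aggregate and a window function can be sketched as below. This is only an illustration, not part of the PR: it runs the SQL through Python's built-in sqlite3 module (SQLite 3.25 or newer is assumed) as a stand-in for Spark SQL, and the `employees` table and its rows are made up for the example.

```python
import sqlite3

# Hypothetical data; SQLite is used only as a stand-in for Spark SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Lisa", "Sales", 10000),
        ("Evan", "Sales", 32000),
        ("Alex", "Sales", 30000),
        ("Fred", "Engineering", 21000),
        ("Tom", "Engineering", 23000),
    ],
)

# Aggregate function with GROUP BY: the rows of each group collapse into one.
grouped = conn.execute(
    "SELECT dept, SUM(salary) FROM employees GROUP BY dept ORDER BY dept"
).fetchall()

# Same aggregate used as a window function: every input row is kept and
# receives the value computed over its window (here, its department).
windowed = conn.execute(
    "SELECT name, dept, SUM(salary) OVER (PARTITION BY dept) AS dept_total "
    "FROM employees ORDER BY dept, name"
).fetchall()

print(grouped)   # 2 rows, one per department
print(windowed)  # 5 rows, one per employee
```

The second query is what "calculate a return value for each row based on a group of rows" amounts to: the row count is unchanged, and only the extra column depends on the window.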
Test build #121337 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
Specifies a comma separated list of key and value pairs for partitions.<br><br>
<b>Syntax:</b><br>
<code>
{ PARTITION | DISTRIBUTE } BY partition_col_name = partition_col_val ( [ , ... ] )
nit: I found the double spaces in this line.
docs/sql-ref-syntax-qry-window.md
Outdated
MAX | MIN | COUNT | SUM | AVG | ...
</code>
<br>
Please refer <a href="api/sql/">here</a> for a complete list of Spark Aggregate Functions.
nit: "here" -> "the Built-in Function document"?
nit: "Spark Aggregate Functions." -> "Spark aggregate functions."?
I will put sql-ref-functions-builtin.html as the link for the Built-in Function document. It's broken now but will work after your PR is in.
Ur, could you revert the link back? I'm currently not sure that my PR is targeted at 3.0.
Test build #121342 has finished for PR 28220 at commit
Test build #121344 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
<dt><code><em>window_function</em></code></dt>
<dd>
<ul>
<li> Ranking Functions </li>
nit: <li> Ranking Functions </li> -> <li>Ranking Functions</li>
docs/sql-ref-syntax-qry-window.md
Outdated
MAX | MIN | COUNT | SUM | AVG | ...
</code>
<br>
Please refer to the <a href="sql-ref-functions-builtin.html">Built-in Function</a> document for a complete list of Spark aggregate functions.
nit: "Built-in Function" -> "Built-in Functions", by referring to the title in the doc: https://spark.apache.org/docs/latest/api/sql/index.html
docs/sql-ref-syntax-qry-window.md
Outdated
Specifies an ordering of the rows.<br><br>
<b>Syntax:</b><br>
<code>
{ ORDER | SORT } BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] }
Could you move the ORDER BY and PARTITION BY clauses into the Syntax section, like the Pg doc one?
[ existing_window_name ]
[ PARTITION BY expression [, ...] ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ frame_clause ]
docs/sql-ref-syntax-qry-window.md
Outdated
**This page is under construction**
### Description

Window functions operate on a group of rows, referred to as a window, and calculate an aggregated value for each row based on the specified window. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative statistic, or accessing the value of rows given the relative position of the current row.
"... calculate a return value for each row based on a group of rows"
Test build #121352 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
UNBOUNDED { PRECEDING | FOLLOWING }
| CURRENT ROW
| boolean_expression { PRECEDING | FOLLOWING }
</code> <br><br>
I think we need to describe what these clauses (RANGE, ROWS, BETWEEN, ...) are.
Test build #121371 has finished for PR 28220 at commit
docs/sql-ref-syntax-qry-window.md
Outdated
### Examples

{% highlight sql %}
|
nit: remove this blank.
Looks fine except for the existing two comments.
Could you update the screenshot in the description, too?
Test build #121393 has finished for PR 28220 at commit
@maropu I have addressed the last two comments and updated the screenshots in the description. Thanks for reviewing!
cc @srowen for final sign off.
docs/sql-ref-syntax-qry-window.md
Outdated
+-----+-----------+------+-----+

SELECT name, salary,
LAG(salary) OVER (PARTITION BY dept ORDER BY salary) as lag,
as -> AS, but definitely don't change it just for that. Looks fine. I'll merge shortly
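For context on what the LAG example in this hunk computes, here is a small sketch that can be run locally. It is not the PR's example output: the query is adapted to run under Python's built-in sqlite3 module (SQLite 3.25 or newer assumed) rather than Spark SQL, the rows are invented, and the alias `prev_salary` is a hypothetical name chosen for the illustration.

```python
import sqlite3

# Hypothetical employees table; SQLite stands in for Spark SQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Lisa", "Sales", 10000),
        ("Evan", "Sales", 32000),
        ("Alex", "Sales", 30000),
        ("Fred", "Engineering", 21000),
        ("Tom", "Engineering", 23000),
    ],
)

# LAG returns the salary of the previous row within the same department,
# ordered by salary; the first row of each partition has no predecessor.
lag_rows = conn.execute(
    "SELECT name, dept, salary, "
    "LAG(salary) OVER (PARTITION BY dept ORDER BY salary) AS prev_salary "
    "FROM employees ORDER BY dept, salary"
).fetchall()

for row in lag_rows:
    print(row)
# ('Fred', 'Engineering', 21000, None)
# ('Tom', 'Engineering', 23000, 21000)
# ('Lisa', 'Sales', 10000, None)
# ('Alex', 'Sales', 30000, 10000)
# ('Evan', 'Sales', 32000, 30000)
```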
docs/sql-ref-syntax-qry-window.md
Outdated
+-----+-----------+------+----------+

SELECT name, dept, age, CUME_DIST() OVER (PARTITION BY dept ORDER BY age
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as cume_dist FROM employees;
nit: as -> AS here, too.
I will fix this
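A runnable approximation of the CUME_DIST example may help when checking the documented output. The sketch below makes several assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or newer) rather than Spark SQL, it omits the explicit RANGE frame from the hunk, and the table contents and the `dist` alias are hypothetical.

```python
import sqlite3

# Hypothetical employees table; two Sales employees share the same age.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Alex", "Sales", 33),
        ("Lisa", "Sales", 35),
        ("Paul", "Sales", 35),
        ("Evan", "Sales", 38),
        ("Fred", "Engineering", 28),
        ("Tom", "Engineering", 33),
    ],
)

# CUME_DIST is the fraction of rows in the partition whose ordering value
# is less than or equal to the current row's value (ties share a result).
cume_rows = conn.execute(
    "SELECT name, dept, age, "
    "CUME_DIST() OVER (PARTITION BY dept ORDER BY age) AS dist "
    "FROM employees ORDER BY dept, age, name"
).fetchall()

for row in cume_rows:
    print(row)
# ('Fred', 'Engineering', 28, 0.5)
# ('Tom', 'Engineering', 33, 1.0)
# ('Alex', 'Sales', 33, 0.25)
# ('Lisa', 'Sales', 35, 0.75)
# ('Paul', 'Sales', 35, 0.75)
# ('Evan', 'Sales', 38, 1.0)
```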
Test build #121433 has finished for PR 28220 at commit
### What changes were proposed in this pull request?
Document Window Function in SQL syntax

### Why are the changes needed?
Make SQL Reference complete

### Does this PR introduce any user-facing change?
Yes
<img width="1050" alt="Screen Shot 2020-04-16 at 9 13 34 PM" src="https://user-images.githubusercontent.com/13592258/79531509-7bf5af00-8027-11ea-8291-a91b2e97a1b5.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 14 12 PM" src="https://user-images.githubusercontent.com/13592258/79531514-7e580900-8027-11ea-8761-4c5a888c476f.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 14 45 PM" src="https://user-images.githubusercontent.com/13592258/79531518-82842680-8027-11ea-876f-6375aa5b5ead.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 15 10 PM" src="https://user-images.githubusercontent.com/13592258/79531521-844dea00-8027-11ea-8948-712f054d42ee.png">
<img width="1050" alt="Screen Shot 2020-04-16 at 9 15 25 PM" src="https://user-images.githubusercontent.com/13592258/79531528-8748da80-8027-11ea-9dae-a465286982ac.png">

### How was this patch tested?
Manually build and check

Closes #28220 from huaxingao/sql-win-fun.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(cherry picked from commit 142f436)
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
Thanks! Merged to master/3.0.
Thanks, all!