[SPARK-19140][SS]Allow update mode for non-aggregation streaming queries #16520
Test build #71096 has finished for PR 16520 at commit
LGTM! ultra nit:
@@ -58,7 +62,9 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) {
* the sink
* - `complete`: all the rows in the streaming DataFrame/Dataset will be written to the sink
* every time these is some updates
*
* - `update`: only the rows that were updated in the streaming DataFrame/Dataset will
Could you also please update pyspark docs?
Test build #71157 has finished for PR 16520 at commit
@@ -665,6 +665,9 @@ def outputMode(self, outputMode):
the sink
* `complete`:All the rows in the streaming DataFrame/Dataset will be written to the sink
every time these is some updates
* `update`:only the rows that were updated in the streaming DataFrame/Dataset will be
written to the sink every time there are some updates. If the query doesn't contain
aggregations, it will be equivalent to the `append` mode.
nit: please remove the `the` before `append`.
every time these is some updates
* `update`:only the rows that were updated in the streaming DataFrame/Dataset will be
written to the sink every time there are some updates. If the query doesn't contain
aggregations, it will be equivalent to the `append` mode.
ditto
@@ -57,7 +57,8 @@ public static OutputMode Complete() {

  /**
  * OutputMode in which only the rows that were updated in the streaming DataFrame/Dataset will
- * be written to the sink every time there are some updates.
+ * be written to the sink every time there are some updates. If the query doesn't contain
+ * aggregations, it will be equivalent to the `Append` mode.
ditto
- * written to the sink every time these is some updates. This output mode can only be used in
- * queries that contain aggregations.
+ * written to the sink every time these is some updates. If the query doesn't contain
+ * aggregations, it will be equivalent to the `Append` mode.
ditto
Left a nit (occurring multiple times), otherwise LGTM!
thanks LGTM!
Test build #71159 has finished for PR 16520 at commit
Test build #71168 has finished for PR 16520 at commit
Thanks. Merging to master and 2.1.
…ries

## What changes were proposed in this pull request?

This PR allows update mode for non-aggregation streaming queries. It behaves the same as append mode if a query has no aggregations.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #16520 from zsxwing/update-without-agg.

(cherry picked from commit bc6c56e)
Signed-off-by: Shixiong Zhu <shixiong@databricks.com>
What changes were proposed in this pull request?
This PR allows update mode for non-aggregation streaming queries. It behaves the same as append mode if a query has no aggregations.
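To illustrate why the two modes coincide, here is a minimal toy model in plain Python. It is not Spark code and does not reflect how Spark implements output modes; the names `result_table` and `sink_output` are hypothetical helpers invented for this sketch. It assumes that append emits only rows new to the result table and that update emits rows that are new or whose value changed. Without aggregation, a result row never changes after it is first produced, so both rules select the same rows.

```python
from collections import Counter


def result_table(rows_so_far, aggregate):
    """Return the toy query's result table after seeing rows_so_far.

    Without aggregation, each input row maps to one result row keyed by its
    position. With aggregation, the result is a running count per word.
    """
    if aggregate:
        return dict(Counter(rows_so_far))
    return {i: row.upper() for i, row in enumerate(rows_so_far)}


def sink_output(batches, mode, aggregate=False):
    """List what a sink would receive at each trigger in this toy model."""
    if mode not in ("append", "update"):
        raise ValueError(f"unsupported mode in this sketch: {mode}")
    if mode == "append" and aggregate:
        # Append with aggregation has extra rules that this sketch skips.
        raise ValueError("append with aggregation is not modeled here")
    seen, previous, outputs = [], {}, []
    for batch in batches:
        seen.extend(batch)
        current = result_table(seen, aggregate)
        if mode == "append":
            # Only rows whose key was not in the result table before.
            emitted = {k: v for k, v in current.items() if k not in previous}
        else:
            # New rows plus rows whose value changed since the last trigger.
            emitted = {k: v for k, v in current.items() if previous.get(k) != v}
        outputs.append(emitted)
        previous = current
    return outputs


batches = [["a", "b"], ["a", "c"]]
no_agg_append = sink_output(batches, "append")
no_agg_update = sink_output(batches, "update")
agg_update = sink_output(batches, "update", aggregate=True)
```

In this model `no_agg_append` and `no_agg_update` are identical, while `agg_update` re-emits the count for `a` in the second trigger because its value changed.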
How was this patch tested?
Jenkins