[SPARK-29331][SQL] create DS v2 Write at physical plan #26001


Closed
wants to merge 2 commits

Conversation

cloud-fan
Contributor

What changes were proposed in this pull request?

Create the DS v2 write at the physical plan instead of the logical plan, for streaming writes.

Why are the changes needed?

We may need some physical information when creating the DS v2 write, e.g. #25990. This also matches the batch write path.

Does this PR introduce any user-facing change?

no

How was this patch tested?

existing tests
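For context, a hedged sketch of the idea (illustrative names only, not the actual Spark source): the streaming write gets built by the physical exec node, where execution-time details are in scope, rather than during logical planning. The `Table`/`WriteBuilder` concepts come from the DSv2 API; the surrounding plumbing below is an assumption.

```scala
// Illustrative sketch only: the exec node name and fields are assumptions.
// Before this PR: the logical plan built the streaming write during analysis.
// After: the physical node builds it, so physical information
// (e.g. the child plan's partitioning) can be passed to the source.
case class StreamingWriteExec(
    table: Table,                       // DS v2 table
    writeOptions: Map[String, String],
    query: SparkPlan) {                 // physical child: partitioning known here

  lazy val streamingWrite = {
    val builder = table.newWriteBuilder(/* write info derived here */)
    builder.buildForStreaming()
  }
}
```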

@cloud-fan
Contributor Author

cc @brkyvz @jose-torres @edrevo

@SparkQA

SparkQA commented Oct 2, 2019

Test build #111692 has finished for PR 26001 at commit e112af9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 3, 2019

Test build #111708 has finished for PR 26001 at commit 79f94f5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@edrevo left a comment


Sweet! Thanks for the change!

@@ -50,7 +50,7 @@ private[kafka010] object KafkaWriter extends Logging {
       topic: Option[String] = None): Unit = {
     schema.find(_.name == TOPIC_ATTRIBUTE_NAME).getOrElse(
       if (topic.isEmpty) {
-        throw new AnalysisException(s"topic option required when no " +
+        throw new IllegalArgumentException(s"topic option required when no " +
Contributor Author


Now we check the options at the physical planning phase, so this should not be an AnalysisException anymore.

Contributor Author


If this is unacceptable, then we may need to have analysis-time write info and runtime write info: Table.newWriteBuilder would take the analysis-time write info, and WriteBuilder.build would take the runtime write info.

I'm not sure if it's worth this complexity.
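The alternative described above could look roughly like the following hypothetical sketch. None of these traits exist in Spark's DSv2 API as-is; all names are illustrative, chosen only to show the two-phase split.

```scala
// Hypothetical sketch of splitting the write info into an analysis-time
// part and a runtime part. All traits and members here are assumptions.
trait AnalysisWriteInfo {
  def schema: StructType                // known during analysis
  def options: Map[String, String]
}
trait RuntimeWriteInfo {
  def numPartitions: Int                // only known at execution time
}

trait TableWithTwoPhaseWrite {
  // Called during analysis, so validation failures here could still
  // surface as AnalysisExceptions.
  def newWriteBuilder(info: AnalysisWriteInfo): TwoPhaseWriteBuilder
}
trait TwoPhaseWriteBuilder {
  // Physical details arrive only when the physical plan runs.
  def build(info: RuntimeWriteInfo): Write
}
```

The extra complexity this would add is exactly what the comment above questions the worth of.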

Contributor


Should it be SparkException? I think the last time we discussed these, it wasn't clear what type of exception to use after analysis. Maybe we need new exception types?

Contributor Author


We can use SparkException as well. IllegalArgumentException is a standard Java exception that indicates invalid input, so I think it's OK to use it even after analysis.
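As a small illustration of the convention being argued for (a hypothetical helper, not code from this PR): Scala's `require` throws IllegalArgumentException, which is the standard JVM way to reject invalid input at any phase, including execution.

```scala
// Hypothetical helper, for illustration only.
// Scala's require(...) throws IllegalArgumentException on failure,
// the standard JVM signal for invalid input.
def checkTopic(topic: Option[String], hasTopicAttribute: Boolean): Unit = {
  require(topic.nonEmpty || hasTopicAttribute,
    "topic option required when no topic attribute is present")
}
```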

Contributor


I think we typically want to always raise SparkException, because all exception types inherit from it, unless we are throwing an exception from a method that received an illegal argument; but that's not what is happening here.

Contributor Author


> I think we typically want to always raise SparkException because all exception types inherit from it.

In Spark SQL, no exceptions inherit from it. In fact, SparkException was rarely used in Spark SQL before we added the v2 commands. SparkException is defined in Spark core and is usually used when Spark fails to run a task.

In Spark SQL, AnalysisException and standard Java exceptions are more widely used.

Contributor


Sorry, I thought that AnalysisException inherited from SparkException. Looks like I was wrong.

@cloud-fan
Contributor Author

also cc @rdblue

@edrevo
Contributor

edrevo commented Oct 13, 2019

Anything I can do to help push this PR forward? I'd love to get this in so I can finish the PR that adds the number of partitions to DSv2.

@edrevo
Copy link
Contributor

edrevo commented Oct 23, 2019

All the checks have passed in this PR. Could it be merged?
