-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Spark][Kernel] Pass table identifier in Coordinated Commits #3608
[Spark][Kernel] Pass table identifier in Coordinated Commits #3608
Conversation
52447f6
to
a33c191
Compare
Missing some newly added files by chance?
|
48b79ff
to
668920b
Compare
Yes - fixed it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR seems to mix two things:
- Introducing the
TableDescriptor
and plumbing it through all the various method signatures and call sites. Important, but very noisy. - Introducing the
TableIdentifier
and barely starting to plumb it in a couple of spots (which just pass a hard-wired empty option).
Can we move the second part to the PR that actually plumbs through table identifiers in a meaningful way? Then this PR would be a lot easier to review, and probably that later PR as well (plus meaningful testing becomes possible there).
668920b
to
d94a825
Compare
d94a825
to
9775eac
Compare
The changes to plumb TableIdentifier is a superset of the changes to plumb TableDescriptor. This PR focuses on the TableDescriptor part and together with it also takes care of plumbing TableIdentifier in those areas. Have left specific TODOs to fix the call-sites and pass a non-empty TableIdentifier which will be taken in next set of PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kernel changes LGTM
...lts/src/main/java/io/delta/kernel/defaults/engine/DefaultCommitCoordinatorClientHandler.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just double check some files that may import java Optional unnecessarily?
catalystTableIdentifier.catalog.toSeq ++ | ||
catalystTableIdentifier.database.toSeq | ||
new TableIdentifier(namespace.toArray, catalystTableIdentifier.table) | ||
}.map(Optional.of[TableIdentifier]).getOrElse(Optional.empty[TableIdentifier]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't a(ny easy) way to directly convert a scala Option into a java Optional?
https://www.javadoc.io/doc/org.scala-lang.modules/scala-java8-compat_2.11/0.6.0/scala/compat/java8/OptionConverters.html
https://www.scala-lang.org/api/2.13.6/scala/jdk/OptionConverters$.html
(maybe it's a scala-2.13 thing we don't have a compat shim for yet?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed we currently lack that particular shim... https://github.com/delta-io/delta/blob/master/spark/src/main/scala/org/apache/spark/sql/util/ScalaExtensions.scala#L22
Maybe we should consider adding it as a follow-up item, since coordinated commit code has a lot of scala/java interop?
@@ -18,6 +18,7 @@ package org.apache.spark.sql.delta | |||
|
|||
// scalastyle:off import.ordering.noEmptyLine | |||
import java.nio.file.FileAlreadyExistsException | |||
import java.util.Optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused import?
@@ -91,6 +93,40 @@ object DeltaTestImplicits { | |||
} | |||
} | |||
|
|||
/** Helper class for working with [[TableCommitCoordinatorClient]] */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These helpers avoid having to retrofit zillions of call sites in path-based unit tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - these are very useful.
Which Delta project/connector is this regarding?
Description
This PR adds TableIdentifier in the Coordinated Commit Interface.
There will be a separate change to pass the tableName reliably to the Commit Coordinator in delta-spark.
How was this patch tested?
Existing UTs
Does this PR introduce any user-facing changes?
No