-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-21287][SQL] Move the validation of fetch size from JDBCOptions to JdbcDialect #26230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
to support data fetch in stream manner (fetch one row at a time) against MySQL database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach looks good.
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
Outdated
Show resolved
Hide resolved
Test build #4903 has finished for PR 26230 at commit
|
I think the |
Can you add some tests in
|
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala
Outdated
Show resolved
Hide resolved
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
Outdated
Show resolved
Hide resolved
…ve validateFetchSize method.
2. Add test "[SPARK-21287] Dialect validate properties" in JDBCSuite.
@maropu updated, please help review again. |
* in [[org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions]]. | ||
* @param properties The connection properties. This is passed through from the relation. | ||
*/ | ||
def validateProperties(properties: Map[String, String]): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm.. wait .. properties: Map[String, String]
this will requires Scala map instead of Java map, if we should implement the dialect from Java side. We could just send Properties as are I suspect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose scala Map because i saw the JdbcDialect.beforeFetch accepts scala Map as parameter.
I agree with you, we may consider changing the parameter type for both beforeFetch and validateProperties, but i think we can leave this to another PR.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's OK to leave as-is here.
@@ -147,14 +147,7 @@ class JDBCOptions( | |||
""".stripMargin | |||
) | |||
|
|||
val fetchSize = { | |||
val size = parameters.getOrElse(JDBC_BATCH_FETCH_SIZE, "0").toInt | |||
require(size >= 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we could just remove this condition and set it as is, and then expect the DBMS to throw an error as well. Seems like Postgres throws an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just put the discussion on SPARK-21287 here for reference :
https://issues.apache.org/jira/browse/SPARK-21287
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think that's fine too. It isn't really supposed to be negative, but there's one notable exception in MySQL's driver. Even so, we don't manually check other settings here, and so yeah I don't know if there's much value in this check in the end. I'd be OK just removing it entirely too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen @HyukjinKwon I think both of these two solutions should be ok, so I submitted another PR #26244 which just remove the check of non-negative condition in JDBCOptions.
You can decide which one to follow up, i'll close the other.
### What changes were proposed in this pull request? Remove the requirement of fetch_size>=0 from JDBCOptions to allow negative fetch size. ### Why are the changes needed? Namely, to allow data fetch in stream manner (row-by-row fetch) against MySQL database. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Unit test (JDBCSuite) This closes #26230 . Closes #26244 from fuwhu/SPARK-21287-FIX. Authored-by: fuwhu <bestwwg@163.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com> (cherry picked from commit 92b2529) Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
What changes were proposed in this pull request?
Why are the changes needed?
To fix #SPARK-21287 , namely to support data fetch in stream manner against MySQL database.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests (JDBCSuite).