-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-23238][SQL] Externalize SQLConf configurations exposed in documentation #20403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @ueshin and @BryanCutler, mind taking a look please? |
LGTM. |
Test build #86693 has finished for PR 20403 at commit
|
retest this please |
Test build #86701 has finished for PR 20403 at commit
|
Jenkins, retest this please. |
Test build #86705 has finished for PR 20403 at commit
|
retest this please |
@@ -1045,9 +1045,10 @@ object SQLConf { | |||
buildConf("spark.sql.execution.arrow.enabled") | |||
.internal() | |||
.doc("Make use of Apache Arrow for columnar data transfers. Currently available " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When true, make use of
@@ -1045,9 +1045,10 @@ object SQLConf { | |||
buildConf("spark.sql.execution.arrow.enabled") | |||
.internal() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conf will be mentioned in the doc #19575. I think this is not qualified as an internal conf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you change it in this PR and use this JIRA number https://issues.apache.org/jira/browse/SPARK-23238
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM |
Test build #86712 has finished for PR 20403 at commit
|
Test build #86716 has finished for PR 20403 at commit
|
this has been failing due to #19892 which was recently reverted |
jenkins retest this please |
Test build #86726 has started for PR 20403 at commit |
retest this please |
LGTM |
retest this please |
ok to test |
Test build #86730 has finished for PR 20403 at commit
|
retest this please. |
Test build #86737 has finished for PR 20403 at commit
|
"for use with pyspark.sql.DataFrame.toPandas, and " + | ||
"pyspark.sql.SparkSession.createDataFrame when its input is a Pandas DataFrame. " + | ||
"The following data types are unsupported: " + | ||
"MapType, ArrayType of TimestampType, and nested StructType.") | ||
.booleanConf | ||
.createWithDefault(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spark.sql.execution.arrow.maxRecordsPerBatch
is also mentioned in the doc change at #19575. Shall we also externalize it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. Let me update spark.sql.inMemoryColumnarStorage.compressed
and spark.sql.inMemoryColumnarStorage.batchSize
too. These are also exposed but internals.
Test build #86742 has finished for PR 20403 at commit
|
Let's also update the PR title. |
.booleanConf | ||
.createWithDefault(false) | ||
|
||
val ARROW_EXECUTION_MAX_RECORDS_PER_BATCH = | ||
buildConf("spark.sql.execution.arrow.maxRecordsPerBatch") | ||
.internal() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about conf. https://github.com/apache/spark/pull/19575/files#r164252424
If we want to merge this PR now, maybe revert this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can externalize this conf in that PR #19575, if we believe this conf is the one we will use in the long term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Let me take this out. Is this only one you are concerned of for now?
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Test build #86749 has finished for PR 20403 at commit
|
…mentation ## What changes were proposed in this pull request? This PR proposes to expose few internal configurations found in the documentation. Also it fixes the description for `spark.sql.execution.arrow.enabled`. It's quite self-explanatory. ## How was this patch tested? N/A Author: hyukjinkwon <gurwls223@gmail.com> Closes #20403 from HyukjinKwon/minor-doc-arrow. (cherry picked from commit 39d2c6b) Signed-off-by: hyukjinkwon <gurwls223@gmail.com>
Merged to master and branch-2.3. Thanks @ueshin, @gatorsmile, @BryanCutler, @felixcheung and @viirya. |
What changes were proposed in this pull request?
This PR proposes to expose few internal configurations found in the documentation.
Also it fixes the description for
spark.sql.execution.arrow.enabled
.It's quite self-explanatory.
How was this patch tested?
N/A