-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-22940][SQL] HiveExternalCatalogVersionsSuite should succeed on platforms that don't have wget #20147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
OK to test |
ok to test |
} | ||
|
||
try { | ||
val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Utils.doFetchFile
is a little overkill for this case, maybe we can just implement a simple downloading function with java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meh, if we can reuse code instead of rewriting it, seems OK to me
Test build #85659 has finished for PR 20147 at commit
|
val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf) | ||
result.exists() | ||
} catch { | ||
case ex: Exception => logError("Could not get file from url " + urlString + ": " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should just be a warning (it's not fatal) and you could use string interpolation.
val filename = "string-out.txt" | ||
|
||
if (!getFileFromUrl(urlString, outDir.toString, filename)) { | ||
throw new java.io.IOException("Could not get string from url " + urlString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import this
throw new java.io.IOException("Could not get string from url " + urlString) | ||
} | ||
|
||
val outputFile = new File(outDir.toString + File.separator + filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to use the constructor that takes File, String rather than construct the path. MInor thing.
I'd also say use the newer Path API but may not be what Scala APIs use.
val outputFile = new File(outDir.toString + File.separator + filename) | ||
val fis = new FileInputStream(outputFile) | ||
val result = Source.fromInputStream(fis)(Codec(encoding)).mkString | ||
fis.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
close() in a finally block?
} | ||
|
||
try { | ||
val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meh, if we can reuse code instead of rewriting it, seems OK to me
Test build #85689 has finished for PR 20147 at commit
|
val url = s"$preferredMirror/spark/spark-$version/spark-$version-bin-hadoop2.7.tgz" | ||
getStringFromUrl("https://www.apache.org/dyn/closer.lua?preferred=true") | ||
val filename = s"spark-$version-bin-hadoop2.7.tgz" | ||
val url = s"$preferredMirror/spark/spark-$version/" + filename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use filename
in interpolation also?
} | ||
} | ||
|
||
private def getStringFromUrl(urlString: String, encoding: String = "UTF-8"): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Charset
instead of String... but since you're really not using that parameter, I'd just hardcode the charset in the only place where it's used.
} | ||
|
||
private def getStringFromUrl(urlString: String, encoding: String = "UTF-8"): String = { | ||
val outDir = Files.createTempDirectory("string-") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Utils.createTempDir
; it handles deleting the directory for you later.
Another option is to use File.createTempFile
+ File.deleteOnExit
. Same result.
|
||
val contentFile = new File(outDir.toFile, filename) | ||
try { | ||
Source.fromFile(contentFile)(Codec(encoding)).mkString |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ok, but there's an easier / cleaner API for this. Look around for calls like this:
Files.toString(file, StandardCharsets.UTF_8);
try { | ||
Source.fromFile(contentFile)(Codec(encoding)).mkString | ||
} finally { | ||
contentFile.delete() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get rid of this finally block if you switch to Utils.createTempDir
.
val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf) | ||
result.exists() | ||
} catch { | ||
case ex: Exception => logWarning("Could not get file from url " + urlString + ": " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of multi-line case statements, put them in the next line, properly indented.
Also: logWarning(s"I shall prefer interpolation", ex)
|
||
try { | ||
val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf) | ||
result.exists() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessary (just return true
? doFetchFile
should always return a file that exists), but ok.
val filename = "string-out.txt" | ||
|
||
if (!getFileFromUrl(urlString, outDir.toString, filename)) { | ||
throw new IOException("Could not get string from url " + urlString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about letting the exception from doFetchFile
propagate, and only handle it as part of the retries in tryDownloadSpark
?
Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf) | ||
} | ||
|
||
private def getStringFromUrl(urlString: String, encoding: String = "UTF-8"): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems encoding: String = "UTF-8"
is not used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. I should have caught that. Will fix.
Test build #85711 has finished for PR 20147 at commit
|
Test build #85717 has finished for PR 20147 at commit
|
retest this please |
Test build #85718 has finished for PR 20147 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Merging to master / 2.3.
Left a comment but no need to address it.
outDir.mkdirs() | ||
} | ||
|
||
// propagate exceptions up to the caller of getFileFromUrl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally don't add these kind of comments since it's implied in every statement outside of a try...catch.
… platforms that don't have wget ## What changes were proposed in this pull request? Modified HiveExternalCatalogVersionsSuite.scala to use Utils.doFetchFile to download different versions of Spark binaries rather than launching wget as an external process. On platforms that don't have wget installed, this suite fails with an error. cloud-fan : would you like to check this change? ## How was this patch tested? 1) test-only of HiveExternalCatalogVersionsSuite on several platforms. Tested bad mirror, read timeout, and redirects. 2) ./dev/run-tests Author: Bruce Robbins <bersprockets@gmail.com> Closes #20147 from bersprockets/SPARK-22940-alt. (cherry picked from commit c0b7424) Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
What changes were proposed in this pull request?
Modified HiveExternalCatalogVersionsSuite.scala to use Utils.doFetchFile to download different versions of Spark binaries rather than launching wget as an external process.
On platforms that don't have wget installed, this suite fails with an error.
@cloud-fan : would you like to check this change?
How was this patch tested?