[SPARK-22940][SQL] HiveExternalCatalogVersionsSuite should succeed on platforms that don't have wget #20147

Closed
wants to merge 9 commits into master from SPARK-22940-alt

Conversation

bersprockets (Contributor)

What changes were proposed in this pull request?

Modified HiveExternalCatalogVersionsSuite.scala to use Utils.doFetchFile to download different versions of Spark binaries rather than launching wget as an external process.

On platforms that don't have wget installed, this suite fails with an error.

@cloud-fan: would you like to check this change?

How was this patch tested?

  1. test-only of HiveExternalCatalogVersionsSuite on several platforms. Tested bad mirror, read timeout, and redirects.
  2. ./dev/run-tests

@cloud-fan (Contributor)

OK to test

@HyukjinKwon (Member)

ok to test

}

try {
  val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf)
Contributor

Utils.doFetchFile is a little overkill for this case; maybe we can just implement a simple downloading function in Java.

Member

Meh, if we can reuse code instead of rewriting it, seems OK to me

@SparkQA

SparkQA commented Jan 4, 2018

Test build #85659 has finished for PR 20147 at commit c5e835e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

  val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf)
  result.exists()
} catch {
  case ex: Exception => logError("Could not get file from url " + urlString + ": "
Member

This should just be a warning (it's not fatal) and you could use string interpolation.

val filename = "string-out.txt"

if (!getFileFromUrl(urlString, outDir.toString, filename)) {
  throw new java.io.IOException("Could not get string from url " + urlString)
Member

Import this (i.e., java.io.IOException) rather than fully qualifying it.

  throw new java.io.IOException("Could not get string from url " + urlString)
}

val outputFile = new File(outDir.toString + File.separator + filename)
Member

I think you want to use the constructor that takes (File, String) rather than construct the path by hand. Minor thing.
I'd also say use the newer Path API, but that may not be what the Scala APIs use.

val outputFile = new File(outDir.toString + File.separator + filename)
val fis = new FileInputStream(outputFile)
val result = Source.fromInputStream(fis)(Codec(encoding)).mkString
fis.close()
Member

close() in a finally block?

@SparkQA

SparkQA commented Jan 4, 2018

Test build #85689 has finished for PR 20147 at commit e85b813.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val url = s"$preferredMirror/spark/spark-$version/spark-$version-bin-hadoop2.7.tgz"
getStringFromUrl("https://www.apache.org/dyn/closer.lua?preferred=true")
val filename = s"spark-$version-bin-hadoop2.7.tgz"
val url = s"$preferredMirror/spark/spark-$version/" + filename
Contributor

Use filename in interpolation also?

}
}

private def getStringFromUrl(urlString: String, encoding: String = "UTF-8"): String = {
Contributor

Use Charset instead of String... but since you're really not using that parameter, I'd just hardcode the charset in the only place where it's used.

}

private def getStringFromUrl(urlString: String, encoding: String = "UTF-8"): String = {
  val outDir = Files.createTempDirectory("string-")
Contributor

Use Utils.createTempDir; it handles deleting the directory for you later.

Another option is to use File.createTempFile + File.deleteOnExit. Same result.


val contentFile = new File(outDir.toFile, filename)
try {
  Source.fromFile(contentFile)(Codec(encoding)).mkString
Contributor

This is ok, but there's an easier / cleaner API for this. Look around for calls like this:

Files.toString(file, StandardCharsets.UTF_8);

try {
  Source.fromFile(contentFile)(Codec(encoding)).mkString
} finally {
  contentFile.delete()
Contributor

You can get rid of this finally block if you switch to Utils.createTempDir.

  val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf)
  result.exists()
} catch {
  case ex: Exception => logWarning("Could not get file from url " + urlString + ": "
Contributor

For multi-line case statements, put the body on the next line, properly indented.

Also: logWarning(s"I shall prefer interpolation", ex)


try {
  val result = Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf)
  result.exists()
Contributor

This seems unnecessary (just return true? doFetchFile should always return a file that exists), but ok.

val filename = "string-out.txt"

if (!getFileFromUrl(urlString, outDir.toString, filename)) {
  throw new IOException("Could not get string from url " + urlString)
Contributor

How about letting the exception from doFetchFile propagate, and only handle it as part of the retries in tryDownloadSpark?

Utils.doFetchFile(urlString, outDir, filename, conf, securityManager, hadoopConf)
}

private def getStringFromUrl(urlString: String, encoding: String = "UTF-8"): String = {
Member

Seems encoding: String = "UTF-8" is not used.

Contributor Author

Oops. I should have caught that. Will fix.

@SparkQA

SparkQA commented Jan 5, 2018

Test build #85711 has finished for PR 20147 at commit 3dbfffd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 5, 2018

Test build #85717 has finished for PR 20147 at commit 7b58d99.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

retest this please

@SparkQA

SparkQA commented Jan 5, 2018

Test build #85718 has finished for PR 20147 at commit 7b58d99.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) left a comment

LGTM. Merging to master / 2.3.

Left a comment but no need to address it.

  outDir.mkdirs()
}

// propagate exceptions up to the caller of getFileFromUrl
Contributor

We generally don't add this kind of comment, since it's implied for every statement outside of a try...catch.

asfgit pushed a commit that referenced this pull request Jan 5, 2018
… platforms that don't have wget

## What changes were proposed in this pull request?

Modified HiveExternalCatalogVersionsSuite.scala to use Utils.doFetchFile to download different versions of Spark binaries rather than launching wget as an external process.

On platforms that don't have wget installed, this suite fails with an error.

cloud-fan : would you like to check this change?

## How was this patch tested?

1) test-only of HiveExternalCatalogVersionsSuite on several platforms. Tested bad mirror, read timeout, and redirects.
2) ./dev/run-tests

Author: Bruce Robbins <bersprockets@gmail.com>

Closes #20147 from bersprockets/SPARK-22940-alt.

(cherry picked from commit c0b7424)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
asfgit closed this in c0b7424 on Jan 5, 2018
bersprockets deleted the SPARK-22940-alt branch on March 5, 2018 at 23:05