[SPARK-33214][TEST][HIVE] Stop HiveExternalCatalogVersionsSuite from using a hard-coded location to store localized Spark binaries. #30122

Conversation

@xkrogen (Contributor) commented Oct 21, 2020

What changes were proposed in this pull request?

This PR changes HiveExternalCatalogVersionsSuite to use a standard temporary directory, by default, to store the Spark binaries that it localizes. It also adds a new system property, spark.test.cache-dir, which can be used to specify a static location into which the Spark binaries will be localized so that they can be shared between test executions. When this property is set, the downloaded binaries are not deleted after the test runs.
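
For reference, the sketch below illustrates the intended directory-selection logic. It is only a minimal illustration under assumed names (SparkBinaryCacheDirSketch, cacheDirProp, useStaticCacheDir); the spark.test.cache-dir property name comes from this PR, but the exact identifiers and helpers in the merged code may differ.

  import java.io.File
  import java.nio.file.Files

  // Sketch only: assumed object/field names, not the exact code merged in this PR.
  object SparkBinaryCacheDirSketch {
    private val cacheDirProp = "spark.test.cache-dir"

    // True when a persistent cache location was requested via -Dspark.test.cache-dir=...
    val useStaticCacheDir: Boolean = sys.props.contains(cacheDirProp)

    // Directory into which the localized Spark binaries are downloaded and unpacked:
    // the user-provided cache dir if set, otherwise a fresh temporary directory.
    val sparkTestingDir: File = sys.props.get(cacheDirProp)
      .map(new File(_))
      .getOrElse(Files.createTempDirectory("test-spark").toFile)
  }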

Why are the changes needed?

In SPARK-22356 (PR #19579), the sparkTestingDir used by HiveExternalCatalogVersionsSuite was hard-coded to enable reuse of the downloaded Spark tarball between test executions:

  // For local test, you can set `sparkTestingDir` to a static value like `/tmp/test-spark`, to
  // avoid downloading Spark of different versions in each run.
  private val sparkTestingDir = new File("/tmp/test-spark")

However, this doesn't work, since the directory is deleted after every run:

  override def afterAll(): Unit = {
    try {
      Utils.deleteRecursively(wareHousePath)
      Utils.deleteRecursively(tmpDataDir)
      Utils.deleteRecursively(sparkTestingDir)
    } finally {
      super.afterAll()
    }
  }

Hard-coding a /tmp directory is also problematic, since in some environments that is not the proper place to store temporary files. And because the directory is deleted after every run, we currently get no caching benefit from the hard-coded path anyway.
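
Under the proposed change, the cleanup above only needs to delete the Spark testing directory when it was created as a per-run temporary directory. The sketch below shows one way to express that guard; it reuses the identifiers from the snippet above plus the assumed useStaticCacheDir flag from the earlier sketch, and is not the exact code merged by this PR.

  override def afterAll(): Unit = {
    try {
      Utils.deleteRecursively(wareHousePath)
      Utils.deleteRecursively(tmpDataDir)
      // Keep the localized Spark binaries when a static cache dir was requested via
      // -Dspark.test.cache-dir=...; otherwise clean up the per-run temporary directory.
      if (!useStaticCacheDir) {
        Utils.deleteRecursively(sparkTestingDir)
      }
    } finally {
      super.afterAll()
    }
  }

With this in place, a developer can pass -Dspark.test.cache-dir=/some/persistent/path to the test JVM to keep the downloaded binaries between runs, or omit the property to get an isolated, self-cleaning temporary directory.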

Does this PR introduce any user-facing change?

Developer-facing changes only, as this is in a test.

How was this patch tested?

The test continues to execute as expected.

@xkrogen force-pushed the xkrogen-SPARK-33214-hiveexternalversioncatalogsuite-fix branch from 8bbfc0f to c56cf1f on October 22, 2020 20:41
@xkrogen (Contributor, Author) commented Nov 3, 2020

Ping @cloud-fan @gatorsmile who were involved in the original PR.

@cloud-fan (Contributor) commented:

ok to test

@SparkQA commented Nov 4, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35188/

@SparkQA commented Nov 4, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35188/

@SparkQA commented Nov 4, 2020

Test build #130588 has finished for PR 30122 at commit c56cf1f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor) commented:

thanks, merging to master!

@cloud-fan closed this pull request in ff724d2 on Nov 4, 2020
@xkrogen deleted the xkrogen-SPARK-33214-hiveexternalversioncatalogsuite-fix branch on November 4, 2020 16:03
@xkrogen (Contributor, Author) commented Nov 4, 2020

Thanks @cloud-fan !
