
[SPARK-19231][SPARKR] add error handling for download and untar for Spark release #16589


Closed
felixcheung wants to merge 1 commit into apache:master from felixcheung:rtarreturncode

Conversation

felixcheung (Member)

What changes were proposed in this pull request?

When SparkR is starting as a package and needs to download the Spark release distribution, we need to handle errors from the download and untar steps and clean up after a failure; otherwise the install gets stuck.
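
A minimal sketch of the approach (hypothetical function name, not the actual install.R code): wrap the download and untar steps in error handling, and remove partial artifacts on failure so a retry can start clean.

robustDownloadAndUntar <- function(packageRemotePath, packageLocalPath, installDir) {
  # download.file() signals an error on failure for most methods;
  # clean up the partial tarball before re-signaling
  tryCatch(
    download.file(packageRemotePath, packageLocalPath),
    error = function(e) {
      unlink(packageLocalPath)
      stop("Fetch failed from ", packageRemotePath, ": ", conditionMessage(e))
    }
  )
  # untar() returns the exit status of the extraction; 0 means success
  status <- untar(tarfile = packageLocalPath, exdir = installDir)
  if (status != 0) {
    unlink(installDir, recursive = TRUE)  # drop the partially extracted directory
    unlink(packageLocalPath)              # the tarball itself may be corrupt
    stop("Extract archive failed: ", packageLocalPath)
  }
  invisible(installDir)
}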

How was this patch tested?

manually

SparkQA commented Jan 16, 2017

Test build #71409 has finished for PR 16589 at commit 9b21c11.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram shivaram (Contributor) left a comment

Thanks @felixcheung - Some minor inline comments

@@ -54,7 +54,7 @@
 #' }
 #' @param overwrite If \code{TRUE}, download and overwrite the existing tar file in localDir
 #' and force re-install Spark (in case the local directory or file is corrupted)
-#' @return \code{install.spark} returns the local directory where Spark is found or installed
+#' @return the (invisible) local directory where Spark is found or installed
shivaram (Contributor):

Any reason to change the return value here?

felixcheung (Member Author):

It wasn't actually changed - the return value was always invisible, see here; it just wasn't documented as such. Also, referencing the method name and saying "returns" again is redundant, since this text goes into the return section of the generated doc.
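
A toy illustration of the invisible-return point (not SparkR code): a value returned with invisible() is not auto-printed at the console but is still available for assignment, and the roxygen @return text should describe the value rather than repeat the function name.

#' @return the (invisible) directory that was located
findDir <- function() {
  dir <- "/tmp/example"  # placeholder path for illustration
  invisible(dir)
}
findDir()        # auto-prints nothing at the console
d <- findDir()   # d is "/tmp/example"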

shivaram (Contributor):

Got it. Thanks

@@ -201,14 +221,20 @@ directDownloadTar <- function(mirrorUrl, version, hadoopVersion, packageName, pa
   msg <- sprintf(fmt, version, ifelse(hadoopVersion == "without", "Free build", hadoopVersion),
                  packageRemotePath)
   message(msg)
-  downloadUrl(packageRemotePath, packageLocalPath, paste0("Fetch failed from ", mirrorUrl))
+  downloadUrl(packageRemotePath, packageLocalPath)
shivaram (Contributor):

Any reason not to use this "Fetch failed from" error message, and instead use the error message from download.file?

felixcheung (Member Author) commented Jan 16, 2017:

It's still there - both messages, "Fetch failed" and the error message from download.file, were there before.

Before, the URL string was passed several method calls deep just for the message; instead, I now handle the error higher up the stack and display the same message, as in L121, and with the URL at L176.

shivaram (Contributor):

I didn't relate this to the update at L176 - I think this is fine. In general, this file has gotten a little unwieldy, with error messages coming from different functions. I wonder if there is a better way to refactor things to set up some expectations on where errors are thrown.

felixcheung (Member Author) commented Jan 18, 2017:

Yeah, I agree. I'm trying to bubble error messages up to the top level, but doing that without exceptions to throw makes it non-trivial (never thought I'd say that!).

Passing all the extra pieces of info around just to display them in error messages doesn't seem very intuitive either.
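
One possible shape for that bubbling-up pattern, sketched with hypothetical names (not the actual code): the low-level helper signals a plain error, and each caller re-signals it with the context - the mirror URL here - that only the caller knows.

downloadUrl <- function(remotePath, localPath) {
  # signal a plain error; download.file() itself errors for most methods,
  # and the status check covers methods that only return a non-zero code
  status <- download.file(remotePath, localPath)
  if (status != 0) stop("download returned status ", status)
}

fetchFromMirror <- function(mirrorUrl, remotePath, localPath) {
  tryCatch(
    downloadUrl(remotePath, localPath),
    error = function(e) {
      # add the context only this caller has, then re-signal
      stop("Fetch failed from ", mirrorUrl, ": ", conditionMessage(e))
    }
  )
}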

@shivaram shivaram (Contributor) left a comment

Thanks @felixcheung - Change looks good to me in terms of catching the errors from untar. I think the overall file organization could be improved, but we can leave that to a future PR


asfgit pushed a commit that referenced this pull request Jan 18, 2017
[SPARK-19231][SPARKR] add error handling for download and untar for Spark release

## What changes were proposed in this pull request?

When SparkR is starting as a package and needs to download the Spark release distribution, we need to handle errors from the download and untar steps and clean up after a failure; otherwise the install gets stuck.

## How was this patch tested?

manually

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #16589 from felixcheung/rtarreturncode.

(cherry picked from commit 278fa1e)
Signed-off-by: Felix Cheung <felixcheung@apache.org>
asfgit closed this in 278fa1e Jan 18, 2017
felixcheung (Member Author):

Merged to master and branch-2.1.
@shivaram - let me know if you have anything specific in mind about this file re: your comment here; I'll open a JIRA to track. Thanks!

shivaram (Contributor):

I think it's fine to leave it as is for now - when we next add some feature or update the file, we can do some refactoring.

uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
[SPARK-19231][SPARKR] add error handling for download and untar for Spark release

Closes apache#16589 from felixcheung/rtarreturncode.
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 15, 2017
[SPARK-19231][SPARKR] add error handling for download and untar for Spark release

Closes apache#16589 from felixcheung/rtarreturncode.