Detecting corrupted or incomplete downloads #81
Comments
Thanks for bringing this up @sigmafelix. Creating a file size check function, following the first suggested approach, would be relatively simple with the
My immediate concern with this approach is its performance at scale. Retrieving the size with
Potential performance benefits using
@mitchellmanware Thank you for sharing the possible solutions. Checking status code of
Was this addressed in the most recent PR? If not, I will include it in the next round of manuscript-related changes.
@mitchellmanware It is not addressed yet. I think we could proceed with the manuscript without this functionality and add it in the next version of the package.
@mitchellmanware
When running the `beethoven` pipeline in 2022, I found that one (or more) of the GEOS-CF chemical files was downloaded incompletely (i.e., the file causing the error was 2 MB, only one-fortieth the size of a typical GEOS-CF chemical file). Post-download checking or detection of incomplete files would be helpful for users who want to download a large set of files from the internet.
I will replace the problematic file with a newly downloaded copy. @kyle-messier, could you change the write permission of the input/geos directory in the team project folder?
Considerations
- Checksums (SHA256, MD5SUM) are provided by the data source in some cases. If this information is retrievable from JSON or HTTP response headers, we could quickly verify the downloaded files against it.
- The `fs` package includes many handy functions to summarize files in `tibble`s. In this case, we could compare each file size with the typical size, or with a statistic of all downloaded files, to indicate which files were probably corrupted or incomplete.
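The two considerations above could be sketched roughly as follows. This is a language-agnostic illustration written in Python, not the package's implementation: the function names `flag_small_files` and `sha256_matches`, the median as the "typical size", and the `min_fraction` threshold of 0.5 are all assumptions made for the example.

```python
import hashlib
import statistics
from pathlib import Path


def flag_small_files(paths, min_fraction=0.5):
    """Return files whose size is below min_fraction of the median size.

    Hypothetical helper: the median and the 0.5 default are illustrative
    choices, not values taken from the package or the data source.
    """
    sizes = {Path(p): Path(p).stat().st_size for p in paths}
    if not sizes:
        return []
    typical = statistics.median(sizes.values())
    return sorted(p for p, size in sizes.items() if size < min_fraction * typical)


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Compare a file's SHA-256 digest with a checksum published by the source.

    Hypothetical helper: assumes the expected checksum is available as a
    hexadecimal string. The file is read in chunks to keep memory use low.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

With a threshold like this, a file that is one-fortieth of the typical size would be flagged, while ordinary variation between files would not. The size check only reads file metadata, so it should stay cheap for large download sets; the checksum check reads every byte and is better reserved for sources that actually publish checksums.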