Add/Remove download options for files uploaded using rsync #3350

Closed
djbrooke opened this issue Sep 13, 2016 · 9 comments

Comments

@djbrooke
Contributor

djbrooke commented Sep 13, 2016

No description provided.

@pdurbin
Member

pdurbin commented Sep 13, 2016

During the 2016-09-08 SBGrid Sprint Planning meeting ( https://docs.google.com/document/d/1wWSdKUOGA1L7UqFsgF3aOs8_9uyjnVpsPAxk7FObOOI/edit ) this issue was given an effort level of "8".

The system that @bmckinney and @pameyer are migrating from ( https://data.sbgrid.org ) only permits files to enter and leave the system via rsync because the datasets are relatively big (55 GB or so, I believe). In contrast, files in Dataverse can be downloaded via HTTP either one at a time or in batches (including "all files") as a zip. We have concerns that Glassfish or some other part of the innards of Dataverse will fall over (due to memory pressure or what have you) if someone tries to download all the files in a large dataset as a zip. So we are considering disabling that feature based on whether the dataset supports rsync, or perhaps based on the total amount of storage used by that dataset.

It's pretty easy to turn off "download all as zip" based on whether a dataset supports rsync. I don't think we currently have any mechanism for determining the total storage used by a dataset (this would be useful for implementing quotas, by the way).

@scolapasta has argued a few times that it shouldn't matter how data gets into Dataverse. We should strive to support a scenario where the author uploads files via rsync and a researcher later downloads them via zip (as long as the files aren't too big). I definitely agree with this aspiration but I think this issue needs to be scoped properly to make it into a release. What did we mean by "8"? What are we trying to achieve at this time? I'm going to assign this issue to @djbrooke @scolapasta @bmckinney and myself to discuss further.

@djbrooke djbrooke changed the title from "Add/Remove download options for files uploaded using rsync" to "Disable download button for datasets over a certain size" Sep 15, 2016
@djbrooke
Contributor Author

If a user currently tries to download a dataset that's too large, the user gets a zip file with some of the files and an additional file that communicates that the file size was too high to be successfully downloaded.

Instead of this experience, it was suggested that the download button be disabled, with a message that files can be downloaded individually. This provides a consistent experience for files uploaded using rsync and files uploaded through other methods.
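
For readers unfamiliar with the current behavior described in this comment, here is a rough sketch of a size-limited zip that ends with a note listing what was left out. This is not the Dataverse implementation: the class, the method names, and the `SKIPPED_FILES.txt` entry name are all hypothetical, and the thread does not say whether oversized files are skipped individually (as assumed here) or the archive simply stops at the first file that does not fit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch; none of these names come from the Dataverse codebase.
final class PartialZipSketch {

    /**
     * Adds files in iteration order while they fit within limitBytes
     * (measured on uncompressed size). Files that do not fit are skipped
     * and listed in a trailing note entry.
     */
    static byte[] zipWithinLimit(Map<String, byte[]> files, long limitBytes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        List<String> skipped = new ArrayList<>();
        long used = 0L;
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                byte[] content = file.getValue();
                if (used + content.length > limitBytes) {
                    skipped.add(file.getKey()); // over budget: leave it out
                    continue;
                }
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(content);
                zip.closeEntry();
                used += content.length;
            }
            if (!skipped.isEmpty()) {
                // Assumed entry name for the "additional file" mentioned above.
                String note = "Not included (zip size limit reached): "
                        + String.join(", ", skipped) + "\n";
                zip.putNextEntry(new ZipEntry("SKIPPED_FILES.txt"));
                zip.write(note.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    /** Lists entry names of an in-memory zip, in archive order. */
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }
}
```

Passing a `LinkedHashMap` keeps the file order predictable. The drawback discussed in this issue is visible in the sketch: the user only learns about missing files after opening the archive.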

@djbrooke djbrooke changed the title from "Disable download button for datasets over a certain size" to "Add/Remove download options for files uploaded using rsync" Sep 15, 2016
@djbrooke djbrooke added the ready label Sep 15, 2016
@djbrooke djbrooke removed the ready label Sep 15, 2016
@djbrooke
Contributor Author

Moving to the backlog - this is not a blocker for the SBGrid folks.

@pdurbin
Member

pdurbin commented Sep 15, 2016

We noticed this in the code and @bmckinney is going to play around with setting this to "-1":

/**
 * Download-as-zip size limit.
 * returns 0 if not specified; 
 * (the file zipper will then use the default value)
 * set to -1 to disable zip downloads. 
 */
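
As a reading aid, the three cases in that comment (0 for "not specified", -1 for "disabled", anything else as the limit) could be interpreted along these lines. This is only a sketch: the class, the method names, and the default value are hypothetical and not taken from the Dataverse code, and treating every negative value like -1 is an assumption.

```java
// Hypothetical sketch; none of these names come from the Dataverse codebase.
final class ZipLimitPolicy {

    // Assumed fallback when no limit is configured (illustrative value only).
    static final long ASSUMED_DEFAULT_LIMIT_BYTES = 100L * 1024 * 1024;

    // Sentinel from the comment above: -1 disables zip downloads.
    static final long DISABLED = -1L;

    /** Maps the raw configured value to the limit the zipper would enforce. */
    static long effectiveLimit(long configured) {
        if (configured < 0L) {
            // Assumption: any negative value is treated like -1.
            return DISABLED;
        }
        // 0 means "not specified", so fall back to the default.
        return configured == 0L ? ASSUMED_DEFAULT_LIMIT_BYTES : configured;
    }

    /** True when zip downloads should be offered at all. */
    static boolean zipEnabled(long configured) {
        return effectiveLimit(configured) != DISABLED;
    }
}
```

With this shape, the UI would only need `zipEnabled(...)` to decide whether to show the "download all" button.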

@pdurbin
Member

pdurbin commented Sep 15, 2016

> the user gets a zip file with some of the files and an additional file that communicates that the file size was too high to be successfully downloaded

Right. This issue is related (error message saved as a file within the zip): #2060

@pameyer
Contributor

pameyer commented Sep 15, 2016

On the theme of "not caring how files get in", it would make sense to hide this UI option (and disable it through the API as appropriate) if the total dataset size is larger than an admin-configurable threshold (that is, return "-1" from the above method).
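
A minimal sketch of that idea, assuming the per-file sizes of a dataset are available as a list of byte counts. All names here are hypothetical; as noted earlier in the thread, there is no existing mechanism for computing a dataset's total storage, so `totalBytes` is only a stand-in for whatever query would provide it.

```java
import java.util.List;

// Hypothetical sketch; class, method, and parameter names are illustrative.
final class DatasetZipGate {

    /** Sums file sizes in bytes; a stand-in for a real per-dataset storage query. */
    static long totalBytes(List<Long> fileSizes) {
        long total = 0L;
        for (long size : fileSizes) {
            total += size;
        }
        return total;
    }

    /**
     * Returns the zip limit to report for a dataset: -1 (zip disabled) when
     * the dataset's total size exceeds the admin threshold, otherwise the
     * configured limit unchanged.
     */
    static long zipLimitFor(List<Long> fileSizes, long adminThresholdBytes, long configuredLimit) {
        return totalBytes(fileSizes) > adminThresholdBytes ? -1L : configuredLimit;
    }
}
```

Because the decision depends only on size, it applies the same way regardless of whether the files arrived via rsync or HTTP upload.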

@pdurbin pdurbin added the SBGrid label Oct 7, 2016
@djbrooke djbrooke self-assigned this Jun 21, 2017
@pdurbin
Member

pdurbin commented Jun 25, 2017

Related to #3776 in the sense that @pameyer wants to hide buttons for his installation. That issue is about hiding buttons to restrict files because restrictions will be unenforceable without the logic we put in Glassfish. This one is about hiding buttons to download files because they're too big.

@pameyer
Contributor

pameyer commented Jun 26, 2017

This also intersects with "package files", which currently show non-functional download links (intentional, given their current "experimental" status).

@djbrooke djbrooke removed their assignment Jun 27, 2017
@djbrooke
Contributor Author

The necessary adjustments to the download process for SBGrid are covered in #3348. Once we have the case of users uploading via rsync and wanting to download individual files, we'll create a new issue.
