Source tarballs are from now on archived #2194
Great work!
Björn, this is awesome, thank you!! I would vote for automatically adding the cargo url to the package after the source has been backed up.
This is fantastic. @jxtx, @nekrut, @jgoecks, much thanks for the storage space. Agreed, cargo urls should definitely be added automatically. Should probably append to the original URL so we have a record of where Cargo Port got the tarball in the first place. @bgruening: if a meta.yaml updates source/md5/sha256 and the build number is bumped (but not the version number), is the cargo port url updated? Hopefully this shouldn't happen much, but curious to know how it's handled.
@daler this is currently not covered. It's mainly a problem of the storage organisation. Do you have an idea how to structure the depot? I guess you need to put the hash in the directory structure or something like that. But then again, what happens with packages without a checksum? @daler I think this would fit perfectly into bioconda-utils :)
@bgruening If Cargo Port were a conda-specific depot, then using build numbers in the directory structure would work. Otherwise, updating the Cargo Port tarball upon a checksum collision with meta.yaml would delete the originally stored tarball. Bad news for reproducibility. So I agree, including the checksum in the directory (or maybe just in the basename) would help. But maybe we just need to be careful on the bioconda side to enforce what we can and avoid these kinds of issues. For example, we should definitely enforce checksums on all recipes. Any cases you know of where that would not be possible? Sounds like we need a linter module in bioconda-utils...
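A minimal sketch of what such a lint check could look like (hypothetical function and structure; the actual bioconda-utils linter API may differ — this only assumes the recipe's meta.yaml has been parsed into a plain dict):

```python
# Sketch of a lint check that flags recipe sources lacking a checksum.
# `meta` is assumed to be the parsed meta.yaml as a dict (hypothetical shape).

CHECKSUM_KEYS = ("md5", "sha1", "sha256")

def lint_missing_checksum(meta):
    """Return one problem string per source entry without a usable checksum."""
    sources = meta.get("source", [])
    if isinstance(sources, dict):  # meta.yaml allows a single source mapping
        sources = [sources]
    problems = []
    for i, src in enumerate(sources):
        if "url" in src and not any(k in src for k in CHECKSUM_KEYS):
            problems.append(f"source {i}: url has no md5/sha1/sha256 checksum")
        if "git_url" in src:
            problems.append(f"source {i}: git_url cannot be checksummed; prefer url")
    return problems
```

Running it on a recipe with an unchecksummed `url` would yield one problem string; a recipe with `url` + `sha256` passes cleanly.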
We should definitely be more strict in using checksums; as mentioned above, I'm all in. Please add it to this list: #1860
All github/bitbucket URLs. Even worse, if you calculate the checksum from one of these, it can change in the future for the same commit. I think they create the archives on the fly, and as soon as they change the underlying zlib library the checksum will change.
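To illustrate why pinned checksums matter here: a fetch-and-verify step like the sketch below (hypothetical helper, stdlib only) fails loudly the moment a forge regenerates an archive with different bytes for the same commit, instead of silently building from something else.

```python
import hashlib

def verify_sha256(data: bytes, expected: str) -> bool:
    """Return True iff the payload matches the recorded sha256 checksum."""
    return hashlib.sha256(data).hexdigest() == expected

# A recipe pins the checksum of the exact tarball it was built from.
payload = b"pretend this is a release tarball"
pinned = hashlib.sha256(payload).hexdigest()

assert verify_sha256(payload, pinned)             # same bytes -> passes
assert not verify_sha256(payload + b"x", pinned)  # regenerated archive -> fails
```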
Thanks @daler!
More than 50 tarballs are already mirrored: https://depot.galaxyproject.org/software/ I will close this issue now! Please remember that we have this feature; we should re-check it after restructuring the repo and similar changes, in case we break it.
@bioconda/all from now on we will mirror all source tarballs via The Cargo Port project https://depot.galaxyproject.org/software/
The mechanism is implemented in large parts in galaxyproject/cargo-port#93.
This covers both `url`s and `git_url`s. So whenever your tarball disappears and you would like to rebuild a recipe, go to Cargo Port, get the new (archived) URL, and update the recipe with this URL.
@daler this will not make bioarchive obsolete, but we should be much safer from now on, even without using bioarchive and for all of your packages. We could even think about automatically including a second URL in every recipe that points to The Cargo Port as a fallback.
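Such a fallback could look like the sketch below. conda-build accepts a list of mirror URLs under `source/url`, tried in order; the package name and depot path here are purely illustrative, not the real Cargo Port layout:

```yaml
source:
  url:
    - https://example.org/releases/mypkg-1.0.tar.gz       # primary upstream (hypothetical)
    - https://depot.galaxyproject.org/software/mypkg-1.0.tar.gz  # Cargo Port mirror (illustrative path)
  sha256: 0000000000000000000000000000000000000000000000000000000000000000  # placeholder
```

Since both URLs must serve byte-identical tarballs to satisfy the pinned checksum, this only works for `url` sources, not `git_url`.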
Remarks:
Consider `git_url` as bad practice. It is a resource hog for GitHub, and it means more pain to mirror these packages. Essentially, we need to create tarballs out of the repositories and store them, but cannot give you a checksum to control what we did. For this reason I would encourage everyone to use `url` wherever possible, especially for GitHub repositories.
Thanks to @jxtx, @nekrut and @jgoecks and the Galaxy team for sponsoring the archive space - and thanks to @erasche for his help getting this working and for the weekend hack.
Sustainable conda packages ftw.