bugfix checksum not constant when using git_config (#2726) #2727
base: develop
@Louwrensth Thanks a lot for your contribution! I guess this change can help with getting the exact same source tarballs, but I'm not sure it's sufficient to ensure that checksums are always the same. I suspect different version of Also, is there a version constraint on the @mboisson Thoughts on this?
@boegel You are right, this PR doesn't catch all. I found this list of known issues regarding creating reproducible tarballs, which seems useful: https://wiki.debian.org/ReproducibleBuilds/Howto#Identified_problems.2C_and_possible_solutions I'm willing to make a start and go through these known solutions for Version constraint for
Long term the easier/more portable solution would certainly be to stick to using That's harder to implement though, since right now EasyBuild always verifies the checksum on the source tarballs themselves, not the unpacked directory that results from it... W.r.t.
@boegel I am mostly oblivious to the checksum problems with tarballs, so I am afraid I can be of no help. I don't have an opinion on the topic.
aa0cfa5 to 699f683
I've been playing with various modifications of this command (after https://wiki.debian.org/ReproducibleBuilds)
And it is reproducible (also if adding Then I tried to use But
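As a rough illustration of the kind of normalization the Debian reproducible-builds notes describe (fixed member order, fixed timestamps, fixed ownership), here is a minimal Python sketch. It is not the command used in this PR and not EasyBuild code; the function names are hypothetical, and it only handles regular files (empty directories and symlinks to directories are skipped, file modes are left as found).

```python
import io
import os
import tarfile


def _normalize(tarinfo):
    """Reset metadata that can differ between two checkouts of the same content."""
    tarinfo.mtime = 0
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = ""
    return tarinfo


def deterministic_tar_bytes(src_dir):
    """Return an uncompressed tar archive of src_dir with normalized metadata.

    Hypothetical helper: members are added in sorted order so the result
    does not depend on the order in which the filesystem lists entries.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # in-place sort fixes the order os.walk descends in
            for name in sorted(files):
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, src_dir)
                tar.add(path, arcname=arcname, recursive=False, filter=_normalize)
    return buf.getvalue()
```

Two directories with the same files and contents should then yield byte-identical archives, regardless of when or in which order the files were written.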
@Louwrensth Even
@boegel: Thanks for the comment. Regarding git version constraints: no issue afaik. I have tried it with git 2.10.1 and git 1.8.3.1 with identical results. I can look into the git history on how Regarding GitHub's changes: no issue afaik. Because the trick is not to run
7c75bd0 to c6b1aa6
c6b1aa6 to 171139d
171139d to 123f7bc
Okay, now Travis and the Hound are happy :) @boegel I looked into the history of
@boegel Please let me know if there is anything left you want to discuss.
@boegel: I take it you're very busy, but I wanted to let you know that this PR is ready for your final review and merge.
@Louwrensth The changes look good and make sense to me. I tested this with the following easyconfig (which is useless, but fine as a test case here):

easyblock = 'Tarball'

name = 'easybuild-framework'
version = '3.8.0'

homepage = 'http://easybuilders.github.io/easybuild'
description = "EasyBuild framework"

toolchain = {'name': 'dummy', 'version': 'dummy'}

sources = [{
    'filename': SOURCE_TAR_GZ,
    'git_config': {
        'url': 'https://github.com/easybuilders',
        'repo_name': 'easybuild-framework',
        'tag': 'easybuild-framework-v%(version)s',
        'recursive': True,
    },
}]

sanity_check_paths = {
    'files': ['eb'],
    'dirs': ['easybuild/framework'],
}

moduleclass = 'tools'

The changed implementation (still) works fine:
However, I'm still getting a different checksum when testing on two (very) different systems:
Perhaps it's a bit unfair to expect getting the exact same checksum on two systems that are so different, but I just wanted to bring that up... Thoughts? Would using One additional thing we should keep in mind is that changing the implementation will also result in getting different checksums on the same system when using an EasyBuild version that includes these changes. Maybe we should emit a clear warning when checksums are being used in easyconfigs that use
@Louwrensth Any progress on this? I.e. regarding the last comment from @boegel
@boegel:
I do not know how to get around this... Maybe bzip2 would help, but maybe it's the OS/filesystem that makes the files always slightly different before zipping... Maybe we can live with the warning? Or maybe we make use of the git hash instead of zipping+checksumming? That would break with EB's method of keeping a source tarball of each installation.
I think the main cause of getting different checksums is a different version of the tools that come into play ( One option would be to compute a collective checksum on the contents of the unpacked sources, without packing it into a tarball at all, since the contents are exactly the same on different systems, and that's what we actually care about...
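To make the "collective checksum on the unpacked sources" idea concrete, here is a minimal Python sketch. It is an assumption about how such a checksum could be computed, not something EasyBuild implements in this PR; `directory_checksum` is a hypothetical name. It covers only relative paths and file contents, so permissions, symlinks and empty directories are ignored, and a `.git` directory, if present, would be hashed like any other.

```python
import hashlib
import os


def directory_checksum(src_dir):
    """Return a SHA256 hex digest over relative paths and file contents.

    Hypothetical helper: files are visited in sorted order, and timestamps,
    ownership and archive/compression tool versions play no role.
    """
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(src_dir):
        dirs.sort()  # in-place sort fixes the order os.walk descends in
        for name in sorted(files):
            path = os.path.join(root, name)
            rel = os.path.relpath(path, src_dir).replace(os.sep, "/")
            # Hash each file separately, then feed "path NUL file-digest" into
            # the overall digest so that file boundaries stay unambiguous.
            file_digest = hashlib.sha256()
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    file_digest.update(chunk)
            digest.update(rel.encode("utf-8") + b"\0" + file_digest.digest())
    return digest.hexdigest()
```

Because nothing is packed or compressed, the result should not depend on the versions of tar, gzip or git installed on the system, only on the checked-out contents.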
By using
gzip --no-name
we avoid storing the original filename and timestamp in the gzip header, so the checksum of the resulting tarball should be constant.
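The effect of `gzip --no-name` can be sketched with Python's standard `gzip` module, which exposes the same two header fields. This is only an illustration of the header behaviour, not the code changed in this PR; `gzip_no_name` is a hypothetical helper name.

```python
import gzip
import io


def gzip_no_name(data, level=9):
    """Compress data without a stored filename and with a zero timestamp.

    Hypothetical helper mirroring the header effect of `gzip --no-name`.
    """
    buf = io.BytesIO()
    # filename="" keeps the FNAME header field out; mtime=0 fixes the timestamp
    with gzip.GzipFile(filename="", fileobj=buf, mode="wb",
                       compresslevel=level, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()
```

Calling the helper twice on the same input in the same environment gives identical bytes. The compressed body can still differ between zlib or gzip versions, which matches the concern raised in the discussion above about checksums differing across systems.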