- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.8k
Bundler lockfile checksums #5808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Thanks for opening a pull request and helping make RubyGems and Bundler better! Someone from the RubyGems team will take a look at your pull request shortly and leave any feedback. Please make sure that your pull request has tests for any changes or added functionality. We use GitHub Actions to test and make sure your change works functionally and uses acceptable conventions, you can review the current progress of GitHub Actions in the PR status window below. If you have any questions or concerns that you wish to ask, feel free to leave a comment in this PR or join our #rubygems or #bundler channel on Slack. For more information about contributing to the RubyGems project feel free to review our CONTRIBUTING guide | 
| There's a lot of work in updating the specs (see above TODOs), so please provide feedback firstly on: 
 | 
| 
 I believe a new section is the best option, since older versions of Bundler should be able to ignore new lockfile sections. You might also be able to experiment with adding the checksum after the gem name and version number and see if the parser ignores the extra field. 
 Yes, I think it should be enabled by default. Gems changing content between installs is always a bad sign, because the goal of Bundler is for gems to not change content between installs. :) 
 I think the way to test this would be to test the lockfile from this PR against the lockfile parser from Bundler 1.0.x, 1.1.x, 1.12.x, and 2.0.x, so that we know what warnings to print at the time lockfiles are upgraded to include checksums. It would still be ok to ship this PR even if some older Bundler versions can't parse the lockfile, but we would want to know which versions work and don't work before shipping. 
 As long as the checksums support separate checksums for the same gem and version but different platforms, this should be ok. For examples of this kind of gem, try checking out Nokogiri 1.13.8, which has at least 10 separate .gem files with 10 separate checksums for various platforms. | 
| I agree with @indirect. By the way I'm sorry we hardcode so many lock files in specs, makes this kind of mass lock file updates painful, but I guess it also will highlight a bunch of edge cases, so probably worth going through it anyways. | 
697ed2b    to
    0f1e31d      
    Compare
  
    a329261    to
    b8d0871      
    Compare
  
    | All specs now pass, save for one. I think this is a legit case I did not handle. I will be away for a month - so will pick this back up in November  | 
1af2ae4    to
    40df63a      
    Compare
  
    | I think I have  updated all the specs now to use the new  Only https://github.com/kuahyeow/rubygems/actions/runs/3407168585/jobs/5668783991 is failing now for Windows - no idea why. Talking about formats.... Any preferences ? For reference, here's what yarn.lock, and go.mod have  | 
| if spec.is_a?(LazySpecification) && spec.generic_platform_with_specific_installed_platform? | ||
| # If a spec started off as a generic ruby platform, and is resolved | ||
| # to something more specific on install time, we skip recording the | ||
| # checksum. | ||
| # For example if we have: | ||
| # | ||
| # specs: | ||
| # pry (0.11.3) | ||
| # | ||
| # PLATFORMS | ||
| # ruby | ||
| # java | ||
| # | ||
| # Recording the checksum for ruby does not make sense, because we | ||
| # will end up with something like: | ||
| # | ||
| # specs: | ||
| # pry (0.11.3) | ||
| # | ||
| # CHECKSUMS | ||
| # pry(0.11.3-java) | ||
| # 1f0bd9ce0282e8a20151011e4677746206a0c84b648e0a1c981b3232bb47a3b9 | ||
| # pry(0.11.3-java) | ||
| # 1f0bd9ce0282e8a20151011e4677746206a0c84b648e0a1c981b3232bb47a3b9 | ||
| # | ||
| # PLATFORMS | ||
| # ruby | ||
| # java | ||
| next | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to #3379 (comment), I had to skip "generic" platforms like "ruby".
I think this means gems like nokogiri will not have their checksums locked under legacy lockfiles where ruby is the only PLATFORMS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the cache of compact entries, here is a non-exhaustive list of gems that have more than one platform
index = Bundler::Fetcher::CompactIndex.new(nil, Bundler::Source::Rubygems::Remote.new(Bundler::URI("https://rubygems.org")), nil)
dependencies = Dir[index.send(:compact_index_client).instance_variable_get(:@cache).directory.join("info").join('*')].map {|d| File.basename d}
dependencies.select do |d|
  index.send(:compact_index_client).instance_variable_get(:@cache).dependencies(d).group_by(&:first).values.last.size > 1
end
["nio4r", "aws-crt", "msgpack", "mixlib-shellout", "ffi-yajl", "concurrent-ruby-ext", "strscan", "mysql2", "sorted_set", "puma", "rest-client", "io-wait", "hpricot", "io-console", "libxml-ruby", "unf_ext", "grpc", "win32console", "google-protobuf", "ruby-prof", "bcrypt", "chef", "hamlit", "digest", "date", "websocket-driver", "hiredis", "bcrypt_pbkdf", "binding_ninja", "murmurhash3", "ptools", "atomic", "eventmachine", "mongrel", "pg", "psych", "cgi", "bcrypt-ruby", "win32-api", "sassc", "json", "stringio", "snappy", "mixlib-archive", "ffi", "sqlite3", "racc", "pry", "libv8", "thread_safe", "jaro_winkler", "character_set", "sys-admin", "ed25519", "nokogiri", "jruby-pageant", "regexp_property_values", "linecache"]
| Personally I would lean towards the "one line with prefix" format, but I think any of these would be a great improvement over the lack of lockfile checksums that we have now. :) | 
f9310b5    to
    7fdd674      
    Compare
  
    4c52b43    to
    0a7b217      
    Compare
  
    0a7b217    to
    54b8da6      
    Compare
  
    7568170    to
    4c55c26      
    Compare
  
    We lock the checksum for each resolved spec under a new CHECKSUMS section in the lockfile. The checksum is calculated from the SHA256 checksum of the cached gem. If the locked spec does not resolve for the local platform, we preserve the locked checksum, similar to how we preserve specs.
727808e    to
    b4ee6bb      
    Compare
  
    If ruby platform specs resolve into specific platforms, skip locking checksum for that spec, as its legacy behaviour
We switch to a different gem name to avoid this conflict
Saves some disk space
0b1b831    to
    e19b499      
    Compare
  
    The special case of adding checksum for gems with ruby only lockfile, and ruby only gem (i.e. no platform specific gems) is probably YAGNI.
e19b499    to
    440e787      
    Compare
  
    | 
 | 
Checksum locking only makes sense on install. The compact index information is only available then.
440e787    to
    597932a      
    Compare
  
    | Merged in #6374. | 
What was the end-user or developer problem that led to this PR?
This mitigates issues such as cache poisoning, or dependency confusion. If the remote gem server has gems compromised via those issues, then the application installing such gems can use the locked checksum to warn
there is a potential issue. e.g. GHSA-2jmx-8mh8-pm8w
See also discussion in #3379
What is your fix for the problem, implemented in this PR?
Lock the checksum of each gem file in the lockfile, in a new
CHECKSUMSsectionNote only installed gems will lock the checksum. Gem that are not matching the current local platform will not lock its checksum (because it was never resolved) - however if they had a previous locked checksum it should be preserved)
Once we have locked checksums, we can then later add install-time checking (not in this PR)
Make sure the following tasks are checked
This is still in draft, but it shows one way to add a backwards compatible Checksums section to the lockfile
TODOs:
NAME_VERSIONregexc1191bd01cbe54c3bec85a33e16f87162f47f9ab08d0cbe6e64232b52cd84d17vssha256-c1191bd01cbe54c3bec85a33e16f87162f47f9ab08d0cbe6e64232b52cd84d17)build_gemrubyPLATFORM)CHECKSUMSsectionrack-1.0.0)