YAML cache "bypasses" permitted_classes in some cases #434

Closed

Description

I made sure the issue is in bootsnap

Steps to reproduce

The YAML cache provided by Bootsnap loads some types that MessagePack permits but that YAML does not permit by default. YAML.load_file uses YAML.load under the covers, which restricts the set of classes that can be loaded dynamically. E.g., YAML.load will load Symbol, but not Date or Time. MessagePack will load Date and Time, but not DateTime. I experimented with a few different serialized objects, but I'm not sure the list of potentially affected classes is exhaustive.
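
The default restriction can be reproduced without Bootsnap. This is a minimal sketch, assuming Psych 4 (bundled with Ruby 3.1 and later), where YAML.load permits only Symbol beyond the basic types. It calls YAML.safe_load with that same list so the result does not depend on which Psych version is installed. The `try_load` helper is hypothetical and exists only for illustration.

```ruby
require 'date'
require 'yaml'

# Hypothetical helper: load a document with the permitted classes that
# Psych 4's YAML.load uses by default (Symbol only). Returns the loaded
# object, or the Psych::DisallowedClass error if the load was rejected.
def try_load(doc, permitted_classes: [Symbol])
  YAML.safe_load(doc, permitted_classes: permitted_classes)
rescue Psych::DisallowedClass => e
  e
end

symbol_result = try_load("--- :abc\n")
date_result   = try_load("--- 2023-01-20\n")
allowed_date  = try_load("--- 2023-01-20\n", permitted_classes: [Symbol, Date])

puts symbol_result.inspect # the Symbol loads
puts date_result.class     # the Date is rejected with an error object
puts allowed_date.class    # the Date loads once it is explicitly permitted
```

The second call is the behavior I'd expect to be preserved after Bootsnap's YAML cache is enabled.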

To illustrate the issue, I wrote a small script that attempts to load a YAML document both before and after Bootsnap has been loaded. Since loading Bootsnap with the YAML cache will monkeypatch YAML, each example needs to be run in a new process to ensure everything starts in a clean state. I naively handled that in the script by making the user supply the class name (with minimal validation); if none is given, it defaults to Date.

You'd run it with one of the following:

$ ruby yaml_cache.rb
$ ruby yaml_cache.rb date
$ ruby yaml_cache.rb time
$ ruby yaml_cache.rb datetime
$ ruby yaml_cache.rb symbol

The script:

require 'bootsnap'
require 'fileutils'
require 'tempfile'
require 'yaml'

DOCUMENTS = {
  Date: "--- 2023-01-20\n",
  Time: "--- 2023-01-20 13:18:31.083375000 -05:00\n",
  Datetime: "--- !ruby/object:DateTime 2023-01-20 13:22:08.204345000 -05:00\n",
  Symbol: "--- :abc\n"
}

class_type = ARGV.empty? ? :Date : ARGV[0].capitalize.to_sym
yaml_doc = DOCUMENTS[class_type]

raise "Invalid class type '#{class_type}'" if yaml_doc.nil?

Tempfile.open("#{class_type}.yml") do |file|
  file.write yaml_doc
  file.close

  begin
    YAML.load_file file.path
  rescue Psych::DisallowedClass
    puts "Caught bad load (#{class_type}): #{$!}"
  else
    puts "#{class_type} loaded"
  end

  Bootsnap.setup(cache_dir: 'tmp/cache', compile_cache_yaml: true)

  begin
    YAML.load_file file.path
  rescue Psych::DisallowedClass
    puts "Caught bad load (#{class_type}): #{$!}"
  else
    puts "#{class_type} loaded"
  end
end

FileUtils.rm_rf 'tmp'

Expected behavior

I'd expect YAML.load_file to behave the same before and after loading Bootsnap's YAML cache. I.e., I'd expect the cache to be an optimization with no visible effect on behavior.

Actual behavior

Loading the Bootsnap YAML cache on supported Ruby interpreters (where "supported" is defined by Bootsnap internals) will allow the loading of some classes that wouldn't be allowed if the YAML cache were not used:

> ruby -v yaml_cache.rb date
ruby 3.2.0 (2022-12-25 revision a528908271) [arm64-darwin21.5.0]
Caught bad load (Date): Tried to load unspecified class: Date
Date loaded

> ruby -v yaml_cache.rb time
ruby 3.2.0 (2022-12-25 revision a528908271) [arm64-darwin21.5.0]
Caught bad load (Time): Tried to load unspecified class: Time
Time loaded

In contrast, on a Ruby interpreter where the YAML compile cache is not supported, such as TruffleRuby, we see:

> ruby -v yaml_cache.rb date
truffleruby 23.0.0-dev-71f07786, like ruby 3.1.3, GraalVM CE Native [aarch64-darwin]
Caught bad load (Date): Tried to load unspecified class: Date
Caught bad load (Date): Tried to load unspecified class: Date

> ruby -v yaml_cache.rb time
truffleruby 23.0.0-dev-71f07786, like ruby 3.1.3, GraalVM CE Native [aarch64-darwin]
Caught bad load (Time): Tried to load unspecified class: Time
Caught bad load (Time): Tried to load unspecified class: Time

To load the document properly on an interpreter not supported by Bootsnap's YAML cache, we would have to supply the permitted_classes option to YAML.load_file (e.g., YAML.load_file("my.yml", permitted_classes: [Date, Time])). While that makes the code example behave the same on all Ruby interpreters, it bypasses the YAML cache entirely because the permitted_classes kwarg is not a supported option for the cache.
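
A sketch of that workaround, written without Bootsnap so it runs anywhere. It uses YAML.safe_load on the file contents rather than passing permitted_classes to YAML.load_file, on the assumption that this behaves equivalently for Date and Time while also working on Psych versions older than 4. The `load_with_dates` helper and the document keys are hypothetical.

```ruby
require 'date'
require 'tempfile'
require 'yaml'

# Hypothetical helper: read a YAML file while explicitly permitting Date and Time.
def load_with_dates(path)
  YAML.safe_load(File.read(path), permitted_classes: [Date, Time])
end

# Write a small document containing a Date and a Time, then load it back.
# Tempfile.create with a block returns the block's value and removes the file.
config = Tempfile.create(['example', '.yml']) do |file|
  file.write("released_on: 2023-01-20\nbuilt_at: 2023-01-20 13:18:31.083375000 -05:00\n")
  file.flush
  load_with_dates(file.path)
end

puts config['released_on'].class
puts config['built_at'].class
```

This form is portable across interpreters, but as noted above it means the load is no longer served from Bootsnap's YAML cache.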

System configuration

Bootsnap version: 1.15.0

Ruby version: ruby 3.2.0 (2022-12-25 revision a528908271) and TruffleRuby 23.0.0-dev
