Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement MiniRacer for TruffleRuby #253

Merged
merged 1 commit into from
Jul 21, 2022
Merged

Conversation

bjfish
Copy link
Contributor

@bjfish bjfish commented Jul 12, 2022

This PR implements mini_racer for TruffleRuby using the GraalVM's JavaScript implementation.

It can be run locally using:

% rbenv install truffleruby+graalvm-dev
% rbenv shell truffleruby+graalvm-dev
% gu install js
% export TRUFFLERUBYOPT='--jvm --polyglot'
% bundle install 
% bundle exec rake compile
% bundle exec rake test

A previous PR with some code review is here: bjfish#1

cc: @eregon

@eregon
Copy link
Contributor

eregon commented Jul 13, 2022

Some extra details: this is a pure-Ruby implementation of the gem using the GraalVM Polyglot API and inner contexts for isolation.
It uses the same interface as the C extension and reuses all the Ruby code in lib/.
It does not use libv8-node or C/C++ code.

This fixes oracle/truffleruby#1827.

@SamSaffron
Copy link
Collaborator

Wow, Amazing, thank you so much

@SamSaffron
Copy link
Collaborator

I hope to get this reviewed early next week, @gschlager is super interested in fully testing this and he is back next week.

wrapped = lambda do |*args|
converted = @parent.send(:convert_js_to_ruby, args)
begin
result = @callback.call(*converted)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MiniRacer with V8 seems to call the @callback with unfrozen strings while TruffleRuby uses frozen strings. I don't see a test for it, but I think we should keep that behavior. Would it be possible to have unfrozen strings in converted?

Here's the spec error I'm seeing in Discourse:

FrozenError:
can't modify frozen String

# <internal:core> core/string.rb:882:in `gsub!'
# ./lib/pretty_text/helpers.rb:112:in `category_tag_hashtag_lookup'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer.rb:273:in `block in attach'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer/truffleruby.rb:15:in `block in notify_v8'
# (mini_racer):13:in `addHashtag'
# (mini_racer):93:in `applyRule'
# (mini_racer):154:in `textPostProcess'
# (mini_racer):173:in `replacer'
# (mini_racer):69:in `textReplace'
# (mini_racer):177:in `:anonymous'
# (mini_racer):4221:in `Core.process'
# (mini_racer):8431:in `MarkdownIt.parse'
# (mini_racer):8446:in `MarkdownIt.render'
# (mini_racer):578:in `cook'
# (mini_racer):126:in `cook'
# (mini_racer):1:in `:program'
# <internal:core> core/truffle/polyglot.rb:37:in `eval'
# (mini_racer):1:in `eval_in_context'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer/truffleruby.rb:132:in `block in eval_unsafe'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer/truffleruby.rb:184:in `translate'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer/truffleruby.rb:131:in `eval_unsafe'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer.rb:228:in `block (2 levels) in eval'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer.rb:369:in `timeout'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer.rb:227:in `block in eval'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer.rb:225:in `synchronize'
# /home/gerhard/.gem/truffleruby/3.0.3/bundler/gems/mini_racer-6206a1e15d3c/lib/mini_racer.rb:225:in `eval'
# ./lib/pretty_text.rb:245:in `block in markdown'
# ./lib/pretty_text.rb:605:in `block in protect'
# ./lib/pretty_text.rb:604:in `synchronize'
# ./lib/pretty_text.rb:604:in `protect'
# ./lib/pretty_text.rb:188:in `markdown'
# ./lib/pretty_text.rb:317:in `cook'
# ./spec/lib/pretty_text_spec.rb:1421:in `block (2 levels) in <top (required)>'
# ./spec/rails_helper.rb:278:in `block (2 levels) in <top (required)>'
# /home/gerhard/.gem/truffleruby/3.0.3/gems/webmock-3.14.0/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
# <internal:core> core/kernel.rb:376:in `load'
# <internal:core> core/kernel.rb:376:in `load'
# <internal:core> core/kernel.rb:376:in `load'
# <internal:core> core/kernel.rb:376:in `load'
# <internal:core> core/kernel.rb:376:in `load'
# <internal:core> core/kernel.rb:376:in `load'
# ------------------
# --- Caused by: ---
# Polyglot::ForeignException:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By design for general GraalVM interoperability, Strings from other languages are considered immutable/frozen.
So JS Strings are immutable, and when we get a corresponding Ruby String for it, it is frozen on purpose to reflect the fact that mutating the Ruby String wouldn't mutate the JS String (not possible).

I think it would be better to modify the caller code to use .dup or .gsub instead, rather than relying on a library (or a gem) returning a mutable string, that always seems an anti-pattern to me, to assume that a library returns a mutable string, and mutate it (e.g., what if the library uses that string in a Hash/cache, then you'd break the Hash/cache; or if the library would like to return a constant string, it would need to be frozen to be safe).

That said, the semantics of mini_racer for passing arguments seems to be more or less "the same as JSON.load(JSON.dump(obj))", and so a "copy" is expected.
If needed, we can change this by calling .dup in convert_js_to_ruby for strings, but I wonder if the current behavior is not safer for users. IMHO it encourages the better practice to not assume that the method returns mutable strings and you can mutate it without other side effects in the system.

Copy link
Contributor

@gschlager gschlager Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was quite surprised that MiniRacer (V8) passes mutable strings to callback functions and returns mutable strings when you call .eval. I agree, it would be best to always use frozen strings and let the caller decide what to do with it. I'm in favor of introducing a breaking change that makes MiniRacer pass/return frozen strings even for V8.

@SamSaffron What's your opinion on this subject? In either case, I think there should be tests that ensure/document that behavior.

Copy link
Collaborator

@SamSaffron SamSaffron Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see the dilema here.

In the V8 embedding at the moment we always do copies on the boundary, I can get that certain specific callers may get annoyed if they have to make yet another just to enforce this contract.

In the case of Discourse this is actually quite wasteful, given we would be copying potentially a 100k long string twice just so we can mutate it.

Maybe a to keep existing performance of MRI we add a tiny line to Discourse here?

# truffle mini racer compatability 
string = string.dup if string.frozen?

Alternatives. I can think about

  1. The contract is to provide mutable strings on boundary
    • backwards compat
    • most efficient for v8 embedding
    • wasteful in truffle land due to unconditional .dup
  2. "Flexible" contract
    • callers need to remember to do more work in truffle case
  3. Change contract to require frozen strings
    • backwards incompatible
    • any mutating code will need an extra copy of a potentially huge string
    • consistent with truffle implementation
  4. Hybrid, flag based approach
    • If you need a mutable string you do your eval with a special flag
    • potentially at the global level you define if you always want mutable back or not

I am tempted to just go with (1) here and add specification for it. Cause .dup is presumably very cheap in truffle and only starts getting expensive once you mutate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either 1 or 3 are best.
I like 3 because of the safe semantics, and IMHO a method mutating a string argument is almost always a bug.
But I understand compatibility is important of course, and that's why I think it should be both are frozen or both are not.

Yes, performance-wise the .dup should be pretty cheap on TruffleRuby once JIT-compiled, probably free even (the allocation of the RubyString before the .dup will be escape-analyzed away).
.dup is also cheap on CRuby since it's copy-on-write.

Mutating a string OTOH is never free, especially on CRuby where gsub! will e.g. have to copy all bytes after each match/replacement (on TruffleRuby Ropes and notably lazy substrings are used so there are no byte copies going on for gsub!).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this PR it doesn't make much sense to change semantics, and being compatible with the current behavior is more important. @bjfish Could you add that .dup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eregon Yes, I've added .dup to the String branch in convert_js_to_ruby.

@gschlager Please retry this spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bjfish Thank you! It's looking good. All specs for PrettyText in Discourse are green. I can't run all Discourse specs with TruffleRuby, because I'm running out of memory (32GB), but PrettyText has the most important tests.

@SamSaffron I think this is good to go. 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't run all Discourse specs with TruffleRuby, because I'm running out of memory (32GB)

We should look into that at some point, maybe something leaks. Which commands did you use to run the specs or is there documentation we could follow to run Discourse specs?

Copy link
Contributor

@gschlager gschlager Jul 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I did:

(Note: You'll need to setup a Discourse dev environment as described here in https://github.com/discourse/discourse/blob/main/docs/DEVELOPER-ADVANCED.md, but @bjfish probably has that already set up.)

# install ruby-build
git clone https://github.com/rbenv/ruby-build.git
sudo PREFIX=/usr/local ./ruby-build/install.sh

# install TruffleRuby dev build and bundler
ruby-build truffleruby+graalvm-dev ~/.rubies/truffleruby+graalvm-dev
export TRUFFLERUBYOPT='--jvm --polyglot'
chruby truffleruby+graalvm-dev
gu install js
gem install bundler:2.3.16

# checkout my TruffleRuby compatibility changes
git clone https://github.com/discourse/discourse.git
cd discourse
git checkout truffleruby

bundle install
bundle exec rspec

bundle exec rspec was chewing away on memory quite fast!
image

Also, please note that I had to use Bundler 2.3.16 to get it working, because 2.3.18 installed the wrong version of nokogiri. I think rubygems/rubygems#5694 didn't fix the problem yet.

@SamSaffron SamSaffron merged commit e0f5a7a into rubyjs:master Jul 21, 2022
@SamSaffron
Copy link
Collaborator

Amazing work, thanks heaps.

@lloeki plans to update our v8 version, when we do that I think we can announce a new release of mini_racer!

@eregon
Copy link
Contributor

eregon commented Jul 25, 2022

Could you write here when the new release is done? Then I can close oracle/truffleruby#1827.
(I subscribed to this repo's releases but I think it doesn't notify for tags)

@SamSaffron
Copy link
Collaborator

Just cut a new release @eregon , is it working on truffle?

@eregon
Copy link
Contributor

eregon commented Aug 16, 2022

Thanks!
Yes it works:

$ TRUFFLERUBYOPT="--jvm --polyglot" ruby -rmini_racer -ve 'ctx = MiniRacer::Context.new; p ctx.eval "1 + []"'
truffleruby 22.3.0-dev-4054469b, like ruby 3.0.3, GraalVM CE JVM [x86_64-linux]
"1"

The warnings when --jvm --polyglot are not passed aren't working correctly though, I'll look at that.
EDIT: that's fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants