-
Notifications
You must be signed in to change notification settings - Fork 52
Make URI friendly to Ractor #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
41ac183
to
a57e1a5
Compare
Ping @ko1 @marcandre @akr - I'd love to make URI working on Ractor and I'd appreciate your reviews. Thanks! |
Looks good. Did you check the @@to_s = Kernel.instance_method(:to_s)
def inspect
@@to_s.bind_call(self)
end |
|
Oh, I see that you're referring to |
It is always quite difficult to know what happens in real life; they would be best fixed, although it could be in a different PR if you don't want to do it. |
def self.register_scheme(name, mod) | ||
updated = SCHEMES.merge(name.to_s => mod).freeze | ||
remove_const(:SCHEMES) | ||
const_set(:SCHEMES, updated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems quite hacky and is extremely inefficient as well: for every call to register_scheme
you will typically invalidate all constant lookup caches under URI
, and potentially cause deoptimizations for JITed code.
Also the fact it's exposed as a public API seems a huge footgun given this drawback.
There must be a better way to handle this.
If there isn't, then something should be improved in Ractor.
For instance maybe having SCHEMES
initially mutable and at the end of lib/uri.rb
make it shareable?
I guess that doesn't work for "Adding custom URIs" though.
I think using an instance variable on URI
would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it's obviously not thread-safe and could lead to errors like no constant URI::SCHEMES
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are great points. I agree that we might need to come up with something better.
To me this problem sounds generic to all Ruby apps: there's a boot phase when some things might get mutated, and there's a runtime phase where multiple threads/ractors will run where we want all global state to be frozen.
It would be ideal if we could somehow make that a pattern, and then we could freeze SCHEMES
only once the code is fully loaded and everyone is done adding custom URIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for every call to
register_scheme
which is going to be called a half dozen of times when loading the library...
the fact it's exposed as a public API
why is that an issue though? It looks to me as it is a feature that there's a API for this, although it need to be documented.
If it's not to be public, then it could be made private.
Also it's obviously not thread-safe
Right. That should be in the docs. I don't see this as an issue, do not change the configuration after your multi-thread/ractor app has started.
I think using an instance variable on URI would work.
It won't. You can't access instance variables from shareable objects (like Module or Ractor)
ruby -e 'String.singleton_class.attr_accessor :foo; String.foo = 42; Ractor.new { p String.foo }.take'
# => IsolationError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be ideal if we could somehow make that a pattern
I'm planning on having a small ractor-config
gem to do this but using the same trick of using const_set
(with the same caveats, i.e. config only at load time coz it's not thread/ractor safe).
For anything that requires being mutable and shared between ractors, there is no other choice than to involve ractor communication. My gem ractor-server
makes it easy, but it seems way overkill. A huge issue is also that all apps are currently single-ractor and that the first time another Ractor is started it changes things (I think it lowers performance too). So making a Ractor-compatible & safe config without starting a new Ractor is tricky and would require a dedicated Thread using Ractor.receive_if
, but this also adds a constraint that the main Ractor can not ever do Ractor.receive
(or else it could intercept communication for that config thread).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's clear to everyone this is pretty ugly hack.
And IMHO it has no place in any stdlib or gem.
Rather, I think it shows a general problem of Ractor programming being too limiting, for no fundamental reason.
And I agree creating an extra Ractor for this is complete overkill (in terms of code, complexity, extra native thread, extra footprint).
@ko1 What do you think? This is a hack, yes?
There must be a better way.
What about allowing to read/write instance variables of modules, as long as the value is shareable?
Then it would be similar to constants, but it would actually make sense to change it multiple times.
And it would be thread-safe (no time where the ivar is not set except initially).
Another idea is we could have a hook just before the first non-main Ractor is started.
Then you could do something like Ractor.on_first_ractor { Ractor.make_shareable(URI::SCHEMES) }
.
But I'm not sure it would work well, e.g., if some code creates a Ractor while code is still loading (and that code might be autoloaded so the order is not predictable, and even if loaded explicitly, gems are often loaded "in the middle of loading the app" which could still be problematic).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which is going to be called a half dozen of times when loading the library...
Yes, maybe not a performance issue in this specific case, but the pattern clearly could be.
Do you know any other constant being redefined 6 times?
I think the more important point is redefining constants many times does not make sense conceptually.
It goes against the concept of constants itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about allowing to read/write instance variables of modules, as long as the value is shareable
That would be really nice 👍 (at minimum reading)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcandre Could file a issue on Redmine to suggest supporting reading/writing @ivars
of Module? I'll support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! https://bugs.ruby-lang.org/issues/17592
I've asked for read-only, I'll let you argue for read-write if it feels best to you.
I've merged #22 . If you have any concerns about this, Can you file a new issue or pull-request? Thanks. |
I found a clean solution to support Ractor in #26 by using a module to map scheme names to scheme classes, which also works with Ractor. |
This PR is an updated version of ruby/ruby#4007.
Right now, something as basic as parsing an URI cannot be done from a Ractor (I raised that earlier in https://bugs.ruby-lang.org/issues/17180):
fails with:
IMO, something as tiny and safe as URI::parse should be allowed to be called from a Ractor.
This PR changes the
URI
class to no longer use class instance variables, and makes it a frozen Hash that is safe to be used by multiple Ractors.I'm open to other ideas how we can make it work.
@ko1 @marcandre @akr
Related issue on ruby-lang: https://bugs.ruby-lang.org/issues/17569