Skip to content

Add and test JRuby implementation #14

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Conversation

headius
Copy link

@headius headius commented May 10, 2022

This PR incorporates a pure-Ruby port of JRuby's native timeout logic into this library.

The existing implementation spins up a new native thread per timeout request, which is inefficient for several reasons:

  • Starting a new native thread has a large nontrivial cost.
  • Setting up Ruby thread state associated with that native thread also has a large cost.
  • Waiting for the timeout thread to terminate is slow and expensive and delays processing of what would otherwise have run quickly.
  • Operations that do not time out still incur all these costs, and such operations generally make up a huge majority of timeout requests.

This last point is perhaps the most worrisome: nearly all operations that pass through this code will never time out and do not need the timeout thread, but it is still created and joined every time.

Instead, we use the JDK ScheduledThreadPoolExecutor, which spins up a single thread and schedules timeout requests into it. No more than one thread will ever be started, and operations that do not time out incur only the cost of enqueueing the timeout request.

Performance is naturally much faster than spinning up a thread each time (see benchmark and performance below).

This PR starts out as a draft for a few reasons:

  • No work has been done to optimize this naive port of our Java implementation. It is functionally nearly the same, but there's additional Ruby execution overhead that could be improved.
  • The base timeout library could be improved and used by all implemenations by using logic similar to JDK's ScheduledThreadPoolExecutor. This simplest way to do this would be to use the version of timeout that @evanphx wrote for Rubinius (now incorporated into TruffleRuby): https://github.com/oracle/truffleruby/blob/master/lib/truffle/timeout.rb
  • This logic uses JRuby internal methods JRuby.runtime (which requires us to require 'jruby') and org.jruby.RubyTime#convert_time_interval

This implementation currently passes all but one test, but I am unsure how important this test is or what exactly the expected behavior is supposed to be (test_skip_rescue):

Failure: test_skip_rescue(TestTimeout):
  [Bug #8730].
  Timeout::Error expected but nothing was raised.
/Users/headius/projects/timeout/test/lib/core_assertions.rb:411:in `assert_raise'
/Users/headius/projects/timeout/test/lib/core_assertions.rb:449:in `block in assert_raise_with_message'
/Users/headius/projects/timeout/test/lib/envutil.rb:254:in `with_default_internal'
/Users/headius/projects/timeout/test/lib/core_assertions.rb:448:in `assert_raise_with_message'
/Users/headius/projects/timeout/test/test_timeout.rb:41:in `test_skip_rescue'
     38:   def test_skip_rescue
     39:     bug8730 = '[Bug #8730]'
     40:     e = nil
  => 41:     assert_raise_with_message(Timeout::Error, /execution expired/, bug8730) do
     42:       Timeout.timeout 0.01 do
     43:         begin
     44:           sleep 3

Benchmark and results are below (on JRuby master/9.4):

require 'timeout'
require 'benchmark'

20.times {
  puts Benchmark.measure {
    10000.times { Timeout.timeout(1) { } }
  }
}

Before:

  5.450000   1.300000   6.750000 (  2.802191)
  2.770000   1.010000   3.780000 (  1.923030)
  2.140000   0.880000   3.020000 (  1.527967)
  1.350000   0.880000   2.230000 (  1.611158)
  1.130000   0.790000   1.920000 (  1.380240)
  1.170000   0.780000   1.950000 (  1.382086)
  1.170000   0.800000   1.970000 (  1.413524)
  1.160000   0.800000   1.960000 (  1.442907)
  1.180000   0.810000   1.990000 (  1.554301)
  1.230000   0.870000   2.100000 (  1.582411)
  1.290000   0.890000   2.180000 (  1.632331)
  1.120000   0.760000   1.880000 (  1.346911)
  1.140000   0.770000   1.910000 (  1.359328)
  1.150000   0.780000   1.930000 (  1.361715)
  1.120000   0.770000   1.890000 (  1.364992)
  1.150000   0.780000   1.930000 (  1.373353)
  1.130000   0.780000   1.910000 (  1.379838)
  1.140000   0.780000   1.920000 (  1.370240)
  1.120000   0.760000   1.880000 (  1.350525)
  1.120000   0.770000   1.890000 (  1.356122)

After:

  1.920000   0.160000   2.080000 (  0.790326)
  0.920000   0.110000   1.030000 (  0.475900)
  0.910000   0.080000   0.990000 (  0.391800)
  0.880000   0.090000   0.970000 (  0.427838)
  0.700000   0.070000   0.770000 (  0.351840)
  0.770000   0.110000   0.880000 (  0.348264)
  0.500000   0.070000   0.570000 (  0.396809)
  0.400000   0.060000   0.460000 (  0.204967)
  0.640000   0.070000   0.710000 (  0.254044)
  0.370000   0.060000   0.430000 (  0.176363)
  0.340000   0.050000   0.390000 (  0.173846)
  0.560000   0.080000   0.640000 (  0.231835)
  0.220000   0.060000   0.280000 (  0.171603)
  0.150000   0.040000   0.190000 (  0.142593)
  0.170000   0.050000   0.220000 (  0.162614)
  0.380000   0.050000   0.430000 (  0.200650)
  0.150000   0.040000   0.190000 (  0.137969)
  0.150000   0.040000   0.190000 (  0.139971)
  0.390000   0.060000   0.450000 (  0.205751)
  0.150000   0.050000   0.200000 (  0.137077)

And using our Java implementation, as an example of the lost performance from this naive port (which is about 4x slower):

  0.360000   0.050000   0.410000 (  0.107624)
  0.180000   0.050000   0.230000 (  0.082377)
  0.190000   0.040000   0.230000 (  0.073103)
  0.220000   0.050000   0.270000 (  0.081130)
  0.100000   0.020000   0.120000 (  0.037514)
  0.130000   0.030000   0.160000 (  0.052505)
  0.110000   0.030000   0.140000 (  0.041658)
  0.010000   0.010000   0.020000 (  0.021479)
  0.020000   0.020000   0.040000 (  0.024502)
  0.020000   0.020000   0.040000 (  0.022926)
  0.020000   0.010000   0.030000 (  0.023047)
  0.020000   0.020000   0.040000 (  0.020899)
  0.100000   0.030000   0.130000 (  0.057042)
  0.040000   0.030000   0.070000 (  0.047345)
  0.060000   0.030000   0.090000 (  0.039323)
  0.060000   0.030000   0.090000 (  0.039203)
  0.040000   0.020000   0.060000 (  0.029292)
  0.030000   0.020000   0.050000 (  0.040910)
  0.030000   0.030000   0.060000 (  0.044159)
  0.030000   0.030000   0.060000 (  0.044382)

Fixes #11.

headius added 2 commits May 11, 2022 01:53
This implementation is based on the JDK's
ScheduledThreadPoolExecutor class, which provides an efficient way
to schedule many tasks in the future without spinning up a thread
for each. This logic is ported from a Java implementation of
timeout that JRuby has shipped for many years, and passes all but
one of the timeout tests.
@eregon
Copy link
Member

eregon commented May 11, 2022

It would be great if we can get a pure-Ruby version in this gem, and then that would be used by all Ruby implementations.

I think the one part of https://github.com/oracle/truffleruby/blob/master/lib/truffle/timeout.rb which is not just standard Ruby is receive_timeout.
Truffle::Channel is just a tiny wrapper for Queue we can remove, but which has that extra receive_timeout method, which is like Queue#pop but with a timeout.
Queue currently doesn't have that: https://docs.ruby-lang.org/en/master/Thread/Queue.html
There is https://bugs.ruby-lang.org/issues/17363 to add it and several other discussions IIRC.

Maybe we can emulate receive_timeout by using the blocking Queue#pop + another thread that takes it after the timeout (i.e., it sleeps for the duration and then push some marker on the queue)?

WEBrick also has its own timeout implementation in https://github.com/ruby/webrick/blob/e4182968b1ae34f38b71247fae05acf22a8f258e/lib/webrick/utils.rb#L182-L184.
That seems to create extra threads in some cases, but I don't understand the logic from a quick look..

Performance is now roughly the same as the Java version.
@headius
Copy link
Author

headius commented May 11, 2022

I have refactored the implementation and it is now about the same speed as the Java version:

  1.970000   0.130000   2.100000 (  0.460333)
  0.620000   0.090000   0.710000 (  0.161111)
  0.070000   0.040000   0.110000 (  0.073046)
  0.060000   0.040000   0.100000 (  0.071089)
  0.080000   0.050000   0.130000 (  0.071490)
  0.070000   0.040000   0.110000 (  0.073537)
  0.070000   0.040000   0.110000 (  0.063814)
  0.050000   0.040000   0.090000 (  0.056137)
  0.050000   0.030000   0.080000 (  0.055846)
  0.050000   0.030000   0.080000 (  0.055712)
  0.060000   0.040000   0.100000 (  0.053735)
  0.050000   0.030000   0.080000 (  0.054342)
  0.060000   0.040000   0.100000 (  0.058917)
  0.060000   0.030000   0.090000 (  0.061450)
  0.050000   0.030000   0.080000 (  0.052978)
  0.050000   0.040000   0.090000 (  0.053908)
  0.050000   0.030000   0.080000 (  0.046397)
  0.040000   0.030000   0.070000 (  0.052840)
  0.050000   0.030000   0.080000 (  0.049004)
  0.070000   0.040000   0.110000 (  0.063720)

@headius
Copy link
Author

headius commented May 11, 2022

It would be great if we can get a pure-Ruby version in this gem

@eregon I agree, and the implementation by @evanphx is probably efficient enough for everyone. The lack of a queue pop timeout is the main problem, since we could not support that across all impls until 3.2. I don't see a way around that in the short term, since emulating pop timeout with a separate thread would mean we are basically implementing a timeout thread for Queue#pop in order to implement Timeout using Queue#pop.

@headius headius marked this pull request as ready for review May 11, 2022 09:33
@headius
Copy link
Author

headius commented May 11, 2022

I am marking this ready for review even though there's still the one failure in the weird rescue-skipping test. That test depends on throwing the exception object rather than raising it, in order to skip rescues. I'm not sure I agree with that behavior.

@eregon
Copy link
Member

eregon commented May 11, 2022

I'll check how we could emulate Queue#pop with timeout.
An extra thread (so just 2 threads total) is an idea,
another idea would be to use a IO.pipe, as IO.select does support a timeout (could also be via wait_readable or so if we can require io/wait).

@headius
Copy link
Author

headius commented May 11, 2022

One concern I have with the threaded impl ls the nasty races that might come from having two threads fighting over the state of the timeout job, but perhaps the as-yet-unwritten implementation can avoid serious problems. I also considered an IO.pipe but I think that would mean serializing jobs or some other inefficient hand-off mechanism.

@eregon
Copy link
Member

eregon commented May 11, 2022

though there's still the one failure in the weird rescue-skipping test. That test depends on throwing the exception object rather than raising it, in order to skip rescues. I'm not sure I agree with that behavior.

I think it'd be good to unify behavior of all Ruby implementations for that, i.e., either change so CRuby no longer uses throw, or make the logic to use throw also work for JRuby.

The test is

def test_skip_rescue
bug8730 = '[Bug #8730]'
e = nil
assert_raise_with_message(Timeout::Error, /execution expired/, bug8730) do
Timeout.timeout 0.01 do
begin
sleep 3
rescue Exception => e
end
end
end
assert_nil(e, bug8730)
end

MRI issue: https://bugs.ruby-lang.org/issues/8730
And the code implementing it:

timeout/lib/timeout.rb

Lines 29 to 50 in 9b18d10

class Error < RuntimeError
attr_reader :thread
def self.catch(*args)
exc = new(*args)
exc.instance_variable_set(:@thread, Thread.current)
exc.instance_variable_set(:@catch_value, exc)
::Kernel.catch(exc) {yield exc}
end
def exception(*)
# TODO: use Fiber.current to see if self can be thrown
if self.thread == Thread.current
bt = caller
begin
throw(@catch_value, bt)
rescue UncaughtThrowError
end
end
super
end
end

I also remember some bug, maybe in Rails that due to using throw, a rescue Exception wouldn't actually do the needed cleanup e.g. for a transaction. That's an argument against the complicated throw behavior. But it's probably been worked around by now.

@eregon
Copy link
Member

eregon commented May 11, 2022

I found that reference about Rails + transactions + Timeout using throw: https://bugs.ruby-lang.org/issues/15567#note-20
@jeremyevans' rationale makes perfect sense to me, we should never use throw for Timeout, a timeout is an error/exception in the sense it did not complete in time, it should definitely abort a transaction.
break/return/throw OTOH should not abort a transaction (and there is no way to differentiate throw from the others AFAIK so we can't differ just for throw even if we wanted).
Not to mention it's completely unintuitive that inside it's not a Timeout::Error but outside it is.
cc @ioquatix I think you agree on this as well.

@eregon
Copy link
Member

eregon commented May 11, 2022

https://spin.atomicobject.com/2014/07/07/ruby-queue-pop-timeout/ actually shows a way to implement it without an extra thread or file descriptor, just using a ConditionVariable, that seems a good way to emulate it, and then we could use the new Queue#pop with :timeout where available: https://bugs.ruby-lang.org/issues/18774

@eregon
Copy link
Member

eregon commented May 11, 2022

Currently,

p Timeout::Error.ancestors
# => [Timeout::Error, RuntimeError, StandardError, Exception, ...]

So maybe this is a reason why throw was used, because a simple begin; ...; rescue; ...; end could swallow it.
Maybe Timeout::Error should inherit from Exception, but I guess we can't change that for compatibility.
Subclassing StandardError for timeout-out like exceptions also makes some sense (it's not a "fatal exception"): https://bugs.ruby-lang.org/issues/18630#note-14

So probably it should either be another exception class which < Exception, or Exception itself to interrupt the thread timing out, and then convert to Timeout::Error outside the Timeout.timeout block (as already done).

@eregon
Copy link
Member

eregon commented May 12, 2022

See #15, that's a pure-Ruby implementation with a single Thread.
@headius Could you benchmark that on JRuby?

@eregon
Copy link
Member

eregon commented May 13, 2022

Since #15 is faster on the given benchmark, more portable and avoids duplication, I think there is no reason to merge this.

@ioquatix
Copy link
Member

@eregon it would be nice to have a root TimeoutError < StandardError and have all other cases derived from that IMHO.

@eregon
Copy link
Member

eregon commented May 14, 2022

@ioquatix That's out of scope, #15 is fully compatible on purpose, let's discuss that on https://bugs.ruby-lang.org/issues/17363 or in a separate issue.

@headius
Copy link
Author

headius commented May 15, 2022

The implementation in #15 from Rubinius, modified by @eregon for compatibiity and performance, appears to be the fastest on JRuby, so this PR is no longer needed.

https://gist.github.com/headius/71e6721577872b03125f4ad2c0c024fa

@headius headius closed this May 15, 2022
@eregon eregon mentioned this pull request May 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JRuby support
3 participants