Description
Despite the relatively limited use that tokio-rs/simulation has seen so far, @gardnervickers has been able to reproduce several complex bugs in H2 and Tonic. To make tokio-rs/simulation
more accessible to more people, I think it's worth opening a discussion on re-exporting simulation's networking types from Tokio. I believe (and I might be misquoting @gardnervickers!) that the desired end state for this feature would enable a library like Hyper to be deterministically simulated just by changing which feature flags are enabled within Tokio.
To enable this, I think that Tokio will need to make the following changes:
- A generalization of the removed
tokio::runtime::Builder::clock
to allow for setting of other hooks in Tokio as to enable deterministic simulation. - The exposure of simulation's types such as
FaultyTcpStream
through a rename and re-export under the name of the non-simulated type (In this case,tokio::net::TcpStream
) if a feature flag enablingsimulation
is toggled. - Enforce that the single-threaded runtime wakes tasks in a deterministic order. I believe the current task waking order is accidental.
A discussion between @gardnervickers, @carllerche, and I covering how this would be implemented. It's slightly messy, I apologize about that.
gvickers: I've walked back a bit lately on having the Environment handle. I think eventually we'll just want to have a way to swap TcpStream/TcpListener implementation in Tokio, like how we do with tokio_timer::Now/Delay/Timeout today.
[4:15 PM] davidbarsky: yeah, i see why it's clone right now—spawn requires 'static, so it has to be an owned value
[4:15 PM] davidbarsky: oh, can you say more about that?
[4:20 PM] gvickers: Tokio today allows you to use
tokio_timer::timer::set_default and tokio_timer::clock::with_default to setup
the internal clock used by tokio_timer. The DeterministicRuntime sets these to
the mock clock, so you don't actually need to use the Environment handle to
create Delay's or query simulation time with tokio::clock::now(). I want to have
something similar for TcpStream/TcpListener, where we can use
tokio::reactor::with_default() to set a global which can be used to create
TcpStream/TcpListener instances backed by the simulation library's network.
[4:21 PM] davidbarsky: oh, neat, yeah.
[4:21 PM] gvickers: This should let us use existing libs without too many
changes, much in the same way that existing libs which create timeouts with
tokio::timer::Timeout can work with Simulation today.
[4:25 PM] gvickers: This will probably require something like task locals though in order to scope TcpListener binds to a virtual net::SocketAddr.
[4:25 PM] gvickers: Also, to give some sort of virtual filesystem/namespacing to files
[4:26 PM] gvickers: Luckily it doesen't have to be fast or anything, so just installing a thread local at the start of every task poll should be sufficient if I understand correctly.
[4:28 PM] davidbarsky: makes sense. this would require hooks into tokio, correct?
[4:29 PM] gvickers: Yes, I think something like tokio::reactor::with_default
should be sufficient. Carl mentioned a while back that the timer hooks are going
away temporarily for the new Tokio release though, so things might change around
that.
[4:39 PM] davidbarsky: @gvickers small PR that makes it slightly more "idiomatic": https://github.com/tokio-rs/simulation/pull/9
GitHub
slightly more idiomatic rust by davidbarsky · Pull Request #9 · ...
[4:39 PM] gvickers: Nice, thank you!
[4:40 PM] davidbarsky: (take it or leave it! no need to accept it)
[4:42 PM] gvickers: I see this used a bit, type Err = Box<dyn std::error::Error + Send + Sync + 'static>;
Is the pattern to have that as the top level error type for rust applications?
[4:55 PM] davidbarsky: for examples, I like to place a boxed error up there.
[4:56 PM] davidbarsky: it's kinda like a RuntimeException; don't really care too much about what the specific error is.
[4:56 PM] davidbarsky: so to answer for your question—I think it is, for smaller applications.
[4:56 PM] davidbarsky: for bigger ones, a proper enum variant.
[4:56 PM] davidbarsky: and I see what you mean by relying on Tokio's plumbing: https://tokio-rs.github.io/tokio/doc/src/tokio/runtime/handle.rs.html#11-25
handle.rs.html -- source
Source to the Rust file `tokio/src/runtime/handle.rs`.
[5:00 PM] gvickers: Yea, I have no idea how to do it but I think it should be possible to have TcpStream/TcpListener creation dispatch to alternate internal implementations.
[5:03 PM] davidbarsky: I think some additional builder methods on
https://tokio-rs.github.io/tokio/doc/tokio/runtime/struct.Builder.html that let
you specify with_timer, with_clock (maybe last two can be collapsed?),
with_io_handle, with_spawner, etc. should do the trick.
tokio::runtime::Builder - Rust
API documentation for the Rust `Builder` struct in crate `tokio`.
[5:04 PM] gvickers: Yea that would be ideal
[5:05 PM] davidbarsky: not sure how FaultyTcpStream can be exposed, though
[5:05 PM] gvickers: The DeterministicRuntime in simulation pretty much just copies the CurrentThread runtime
[5:06 PM] gvickers: I think FaultyTcpStream could just be made to provide the backing implementation for TcpStream.
[5:06 PM] davidbarsky: (cc: @Carl Lerche, if you're online)
[5:06 PM] davidbarsky: hmm, yeah. i think see what you mean.
[5:07 PM] gvickers: My main concern is if there would be a performance impact,
but I guess each TcpStream/TcpListener now has to lookup and see if there is a
backing reactor to register with anyway, so maybe not too bad?
[5:09 PM] davidbarsky: in test or production?
[5:09 PM] davidbarsky: but yeah, i think this is something that can be benchmarked
[5:12 PM] gvickers: I guess it would be in both test and production. I think at the least it would incur a vtable lookup on poll for both TcpStream and TcpListener.
[5:19 PM] Carl Lerche: @gvickers there is a feature flag to enable this kind of stuff
[5:19 PM] Carl Lerche: test-util
[5:19 PM] Carl Lerche: Look at how time freezing is done on master now
[5:28 PM] gvickers: Right, but in order to support simulation the mock time
model on master will need to be changed a bit (or hooks introduced into the
runtime to drive it). I suppose the test-util feature flag probably couldn't be
overloaded to support both simulation and normal unit tests, since the time
models are different.
[6:05 PM] Carl Lerche: @gvickers you can add hooks in the runtime for testing that are enabled w/ test-util
[6:05 PM] Carl Lerche: @gvickers my main point is that it is ok to have "overhead" (minimal) to support testing, we just enable it w/ the feature flag
[6:05 PM] davidbarsky: would you accept a PR soonish, or only after 0.2 is out?
[6:05 PM] Carl Lerche: not until after 0.2
[6:06 PM] davidbarsky: makes sense.
[6:06 PM] Carl Lerche: basically, make a proposal
[6:09 PM] davidbarsky: sounds good. @gvickers do you have a sense as to how you'd like this API to look like?
[6:18 PM] gvickers: I think for the timer implementation, to replace Park we
need a runtime hook which fires when no task can make progress. The timer should
also allow accessing the Instant at which the next future can progress. These
two hooks can then be used to advance the clock with advance.
[6:21 PM] gvickers: As for the TcpStream/TcpListener API, I'm not sure. All the simulation really needs there is to provide alternate AsyncRead + AsyncWrite implementations and peer_addr()/local_addr().
[6:41 PM] davidbarsky: makes sense. I think it'll be possible to have the "same" TCPStream/TCPListener re-exported under the same path/name but activated under a different feature flag
[6:42 PM] davidbarsky: so, [cfg[not("simulated")]]—or whatever the syntax is—would export the FaultyVariant that's in simuation
[6:42 PM] davidbarsky: I don't know if that's what you have in mind.
[6:45 PM] gvickers: I'm not entirely sure what the limitations of compile flags
are, but the goal would be for libraries like Hyper which call TcpListener::bind
to work under simulation.