-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document actor system #1748
Document actor system #1748
Conversation
eaede77
to
906b9dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this. I left some minor suggestions.
Co-authored-by: Jason Bryan <jbryan@elastic.co> Co-authored-by: Brad Deam <54515790+b-deam@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks! This helps in building the Rally mental map.
I went through PrepareBenchmark
so far. A few remarks, a wish list for later mostly:
- For consistency,
TrackPreparator
should be renamed toTrackPreparationActor
in the charts as that's the name of the class I think. - It would be great to include 10k feet description of what each component is doing, e.g. 1)
TrackPreparationActor
loads a track and track plugins (custom extensions such as parameter sources and runners), gets a list of track processors (TrackProcessor
interface) fromTrackProcessorRegistry
and stores them in a queue. Then it sends each processor for execution toTaskExecutionActor
. 2)TackExecutionActor
executes each processor in a background task and verifies its status periodically. Once task completes, it reports its readiness withWorkComplete
. Not sure what level of details makes sense here - too much is just a repetition of code, too little forces us to go through the flow again if we revisit this after enough time. - It would be nice to explain how individual components grow as the scale of the system increases, e.g.
TrackPreparationActor
set grows with the number of load driver hosts, whileTaskExecutionActor
set can grow with the number of cores declared per load driver.
docs/architecture/actor_system.md
Outdated
|
||
At its heart, Rally is a distributed system. It has been designed that way to | ||
allow using multiple load drivers in the same benchmark, to ensure that Rally | ||
is never a bottleneck. In the vast majority of cases, using a powerful load |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another reason (the first one, IIRC) is that distributing Rally also provides an easy way to handle multiple target hosts and ensure that they all build/install/start the required ES version via the race
subcommand (this is for example the mode used on the Hetzner nightly benchmarks).
Later on we got the dedicated/decoupled esrally build/install/start/stop
subcommands (which we use e.g. with cloud based benchmarks) and these are decoupled from the actor system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current formulation isn't correct then, however my understanding is that this is achievable without a full-fledged actor system, while we could easily end up re-implementing Thespian badly to support multiple load drivers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however my understanding is that this is achievable without a full-fledged actor system, while we could easily end up re-implementing Thespian badly to support multiple load drivers.
I mostly agree. I don't think the risk of supporting >1 load drivers without the actor system would necessarily result in re-implementing Thespian, but as always, the devil is in the details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the wording slightly in ac1aec8.
I fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM -- so great to see these internal aspects of Rally documented.
One thing I wanted to ask is how come this isn't living under the official docs e.g. in the Reference Documentation or Additional Information sections? (is it because Mermaid isn't available?)
It should be possible to use Mermaid with Sphinx but it would require additional package, see https://github.com/mgaitan/sphinxcontrib-mermaid. That's work in progress. If/when this matures we can include this in official documentation. |
You can see this live at https://github.com/pquentin/rally/blob/document-actor-system/docs/architecture/actor_system.md.