runtime_api subsystem queues requests internally and likely does redundant work #829
Comments
The short term fix to get insight here is to use the …
I think that is hardly less work than the proper fix. From looking at the code, this should be a pretty trivial change.
@sandreim once we have results on the cache misses we see because of this, please add them here so we get an idea how important this is. Also, do we have data on how much faster the cache is compared to actual execution? I am guessing the difference will be quite significant, but do we know?
Burned in paritytech/polkadot#5541 in Versi, and the data shows that there are indeed cache misses due to parallelizing the same requests. ToF went up by 2.5x, and we need to move back to 4 parallel requests to see how long requests take to execute on average. With this data we can decide whether it is worth prioritizing the cache optimization sooner rather than later.
Created #822 for the cache improvement.
Use the queue, Luke
The subsystem queues requests in an internal, unbounded buffer instead of imposing back pressure on the incoming channel. Conceptually it is rather pointless to queue requests internally without bound while receiving messages over a bounded channel. The subsystem at least logs a warning when that internal buffer reaches a certain size, but back-pressuring on the receiving channel would at the very least help make issues visible immediately.
We now have metrics for the time of flight (ToF) of messages through a channel, we monitor channel sizes, and we will soon have a metric for whenever a channel gets full (and thus imposes back pressure). We would like this tooling to expose issues as well as possible, so for that reason alone we should fix this.
A similar issue exists for PVF execution, which also uses internal unbounded channels.
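For illustration, here is a minimal sketch of the intended pattern (using tokio primitives and hypothetical names, not the subsystem's actual code): in-flight work is bounded by a semaphore, and while all slots are busy the loop stops pulling from the bounded channel, so back pressure propagates to senders instead of piling up in an internal queue.

```rust
use std::sync::Arc;

use tokio::sync::{mpsc, Semaphore};

// Hypothetical request type standing in for a runtime API request.
#[derive(Debug)]
struct Request(u64);

async fn run(mut rx: mpsc::Receiver<Request>, max_parallel: usize) {
    let slots = Arc::new(Semaphore::new(max_parallel));
    while let Some(request) = rx.recv().await {
        // Wait for a free execution slot before pulling more work. While we
        // wait, the bounded channel fills up and `send().await` on the other
        // end starts blocking -- exactly the back pressure we want visible.
        let permit = slots
            .clone()
            .acquire_owned()
            .await
            .expect("semaphore is never closed");
        tokio::spawn(async move {
            execute(request).await;
            drop(permit); // frees the slot once execution finishes
        });
    }
}

async fn execute(request: Request) {
    // Placeholder for the actual runtime API call.
    println!("executed {:?}", request);
}
```

With this shape there is no internal queue to warn about: the only buffering is the bounded channel itself, which the existing channel-size and ToF metrics already observe.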
Be lazy
We also noticed that we execute requests in parallel and have a cache. The issue is that we do unnecessary work when identical requests arrive at roughly the same time: we spawn an execution for each of them, instead of spawning only one and then serving the others from the cache once it succeeds.
What we should do instead is keep track of the requests that are currently being executed and not spawn identical ones while they are running, but rather wait for the result to hit the cache and then serve all of them from there, as sketched below.
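A minimal sketch of such request coalescing, with hypothetical types (this is not the subsystem's actual code): a map of in-flight requests collects reply senders for duplicates, only the first occurrence spawns an execution, and when the result lands it populates the cache and answers all waiters at once.

```rust
use std::collections::HashMap;

use tokio::sync::{mpsc, oneshot};

// Hypothetical stand-ins: a runtime API request key and its result.
type Key = (u64 /* relay parent */, &'static str /* request kind */);
type Value = u64;

struct Dispatcher {
    cache: HashMap<Key, Value>,
    // Requests currently executing, with everyone waiting on the result.
    in_flight: HashMap<Key, Vec<oneshot::Sender<Value>>>,
    // Channel back into the main loop for completed executions.
    done_tx: mpsc::Sender<(Key, Value)>,
}

impl Dispatcher {
    fn handle_request(&mut self, key: Key, reply: oneshot::Sender<Value>) {
        if let Some(value) = self.cache.get(&key) {
            let _ = reply.send(*value); // cache hit: answer immediately
        } else if let Some(waiters) = self.in_flight.get_mut(&key) {
            waiters.push(reply); // identical request already running: just wait
        } else {
            // First occurrence: remember the waiter and spawn one execution.
            self.in_flight.insert(key, vec![reply]);
            let done_tx = self.done_tx.clone();
            tokio::spawn(async move {
                let value = execute(key).await;
                let _ = done_tx.send((key, value)).await;
            });
        }
    }

    // Called from the main loop when a spawned execution reports back.
    fn handle_done(&mut self, key: Key, value: Value) {
        self.cache.insert(key, value);
        for waiter in self.in_flight.remove(&key).unwrap_or_default() {
            let _ = waiter.send(value);
        }
    }
}

async fn execute(key: Key) -> Value {
    // Placeholder for the actual runtime API execution.
    key.0
}
```

Completed executions are reported back over `done_tx` and routed to `handle_done` from the subsystem's main loop, so all bookkeeping stays single-threaded and no locking is needed.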