runtime_api subsystem queues requests internally and likely does redundant work #829

Open
eskimor opened this issue May 10, 2022 · 6 comments

eskimor (Member) commented May 10, 2022

Use the queue, Luke

Instead of imposing back pressure on the incoming channel, the subsystem queues requests in an unbounded internal buffer. Conceptually it is rather pointless to queue requests unbounded internally while receiving messages over a bounded channel. The subsystem does at least log a warning when that internal buffer reaches a certain size, but applying back pressure on the receiving channel would at the very least make problems visible immediately.

We now have metrics for the ToF (time of flight) of messages through a channel, we also monitor channel sizes, and we will soon have a metric for whenever a channel gets full (and thus imposes back pressure). We would like this tooling to expose issues as well as possible, so for that reason alone we should fix this.

A similar issue exists for PVF execution, which also uses internal unbounded channels.
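
To make the back-pressure argument concrete, here is a minimal, generic sketch (plain std channels, not the subsystem's actual channel types): with a bounded channel the producer blocks as soon as the consumer falls behind, so the pressure shows up at the sending side instead of accumulating in a hidden internal buffer.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    // A bounded ("sync") channel: `send` blocks once 4 messages are
    // buffered, so a slow consumer pushes back on the producer instead
    // of letting an internal queue grow without bound.
    let (tx, rx) = mpsc::sync_channel::<u32>(4);

    let producer = thread::spawn(move || {
        for request in 0..16 {
            // Blocks here whenever 4 requests are already waiting,
            // making the bottleneck visible at the sending side.
            tx.send(request).expect("receiver alive");
            println!("queued request {request}");
        }
    });

    for request in rx {
        // Simulate slow request execution.
        thread::sleep(Duration::from_millis(20));
        println!("executed request {request}");
    }

    producer.join().unwrap();
}
```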

Be lazy

We also noticed that we execute requests in parallel and have a cache. The issue here is that we do unnecessary work if identical requests arrive at roughly the same time: we spawn each one of them, instead of spawning only one and then serving the others from the cache once it succeeds.

What we should do instead is keep track of the requests that are currently being executed and not spawn identical requests while they are running, but rather wait for the result to hit the cache and then serve all of them from there.
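
A minimal sketch of that idea, assuming hypothetical `Request`/`Response` types rather than the actual runtime API messages: keep a map of in-flight requests and let callers of an identical request register as waiters instead of spawning their own execution.

```rust
use std::collections::{hash_map::Entry, HashMap};
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical request/response types standing in for the real runtime
// API messages; the real subsystem keys requests differently.
type Request = String;
type Response = String;

/// Coalesces identical in-flight requests: the first caller executes the
/// request, later callers with the same key just wait for that result.
struct Coalescer {
    in_flight: Mutex<HashMap<Request, Vec<Sender<Response>>>>,
}

impl Coalescer {
    fn new() -> Arc<Self> {
        Arc::new(Self { in_flight: Mutex::new(HashMap::new()) })
    }

    fn get(&self, req: Request, execute: impl FnOnce(&Request) -> Response) -> Response {
        // Check whether an identical request is already in flight; if so,
        // register as a waiter instead of spawning redundant work.
        let waiter_rx = {
            let mut in_flight = self.in_flight.lock().unwrap();
            match in_flight.entry(req.clone()) {
                Entry::Occupied(mut entry) => {
                    let (tx, rx) = mpsc::channel();
                    entry.get_mut().push(tx);
                    Some(rx)
                }
                Entry::Vacant(entry) => {
                    entry.insert(Vec::new());
                    None
                }
            }
        };

        if let Some(rx) = waiter_rx {
            return rx.recv().expect("the executing caller serves all waiters");
        }

        // We are the first caller for this key: execute once, outside the lock.
        let result = execute(&req);

        // Serve everyone who queued up behind us and clear the entry.
        let waiters = self.in_flight.lock().unwrap().remove(&req).unwrap_or_default();
        for waiter in waiters {
            let _ = waiter.send(result.clone());
        }
        result
    }
}

fn main() {
    let coalescer = Coalescer::new();
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let c = Arc::clone(&coalescer);
            thread::spawn(move || {
                c.get("session_info".into(), |req| {
                    // Pretend the runtime call is slow; timing permitting,
                    // only one of the four threads ends up in here.
                    thread::sleep(Duration::from_millis(100));
                    format!("result for {req} (computed by thread {i})")
                })
            })
        })
        .collect();

    for h in handles {
        println!("{}", h.join().unwrap());
    }
}
```

In the real subsystem the result would additionally be written to the existing cache before the waiters are served, so the cache stays the single source of truth.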

drahnr commented May 12, 2022

The short-term fix to get insight here is to use the metered::* types internally as well. That would not solve the issue in principle, but it would at least shed some light.
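
For illustration only, a generic sketch of what such instrumentation gives you (this is not the actual metered::* API): wrap the channel so the number of queued-but-unprocessed messages is observable and can be exported as a metric.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};

// Sender half that counts how many messages are sitting in the channel.
struct CountingSender<T> {
    inner: mpsc::Sender<T>,
    queued: Arc<AtomicUsize>,
}

// Receiver half sharing the same counter.
struct CountingReceiver<T> {
    inner: mpsc::Receiver<T>,
    queued: Arc<AtomicUsize>,
}

fn counting_channel<T>() -> (CountingSender<T>, CountingReceiver<T>) {
    let (tx, rx) = mpsc::channel();
    let queued = Arc::new(AtomicUsize::new(0));
    (
        CountingSender { inner: tx, queued: Arc::clone(&queued) },
        CountingReceiver { inner: rx, queued },
    )
}

impl<T> CountingSender<T> {
    fn send(&self, msg: T) -> Result<(), mpsc::SendError<T>> {
        // Count the message before handing it over, undoing the count on
        // failure, so the receiver never observes a wrapped-around value.
        self.queued.fetch_add(1, Ordering::Relaxed);
        self.inner.send(msg).map_err(|e| {
            self.queued.fetch_sub(1, Ordering::Relaxed);
            e
        })
    }

    /// Number of messages sent but not yet received; this is the value
    /// one would export as a Prometheus gauge.
    fn queue_len(&self) -> usize {
        self.queued.load(Ordering::Relaxed)
    }
}

impl<T> CountingReceiver<T> {
    fn recv(&self) -> Result<T, mpsc::RecvError> {
        let msg = self.inner.recv()?;
        self.queued.fetch_sub(1, Ordering::Relaxed);
        Ok(msg)
    }
}

fn main() {
    let (tx, rx) = counting_channel();
    tx.send("request A").unwrap();
    tx.send("request B").unwrap();
    println!("queued: {}", tx.queue_len()); // 2
    rx.recv().unwrap();
    println!("queued: {}", tx.queue_len()); // 1
}
```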

eskimor (Member, Author) commented May 12, 2022

I think that is hardly less work than the proper fix. From looking at the code, this should be a pretty trivial change.

eskimor changed the title from "runtime_api subsystem queues requests internally" to "runtime_api subsystem queues requests internally and likely does redundant work" on May 13, 2022

eskimor (Member, Author) commented May 13, 2022

@sandreim once we have results on the cache misses we get because of this, please add them here so we get an idea of how important this is. Also, do we have data on how much faster the cache is versus actually executing the call? I am guessing the difference is quite significant, but do we know?

sandreim self-assigned this on May 13, 2022

sandreim (Contributor) commented May 17, 2022

Burned in paritytech/polkadot#5541 on Versi; the data shows that there are indeed cache misses due to parallelizing identical requests. ToF went up by about 2.5x, so we need to move back to 4 parallel requests to see how long requests take to execute on average. With this data we can decide whether it is worth prioritizing the cache optimization sooner rather than later.

[Screenshots attached: Versi metrics, 2022-05-17 18:49:30 and 18:47:53]

sandreim (Contributor) commented May 18, 2022

The runtime calls are on average very cheap, as can be seen in the picture below. I think we should not implement this cache improvement now, but keep it for later, when we have run out of bigger optimisation opportunities.

[Screenshot attached: 2022-05-18 12:05:19]

sandreim (Contributor) commented

Created #822 for the cache improvement.

Sophia-Gold transferred this issue from paritytech/polkadot on Aug 24, 2023