Expose and use claim queue for asynchronous on-demand operability #1797

Open
6 of 8 tasks
Tracked by #1829
eskimor opened this issue Oct 5, 2023 · 3 comments

eskimor commented Oct 5, 2023

In order for on-demand (and other, more exotic forms of scheduling) to take advantage of asynchronous backing, we need to expose the claim queue to the node side and have validators accept collations for any ParaId that is in the claim queue, not just the next scheduled entry (see the sketch after the task list).

  • Expose ClaimQueue via a runtime api and use it in collation-generation #3580
  • Make use of it in collator protocol so collations will be accepted in advance.
  • Make use of it in collators: Implement strategies for the right lookahead in collators (see comment below)
  • Fix validator usage as well. E.g. in backing we will need to be more lenient and accept statements for any para in the current claim queue.
  • Orthogonal: Is the runtime (inclusion) already lenient enough? This is not directly related to exposing the claim queue, but we need to double check that the runtime itself takes it into account everywhere.
  • Ensure fairness between paras in the collator protocol.
  • Statement distribution and other subsystems need to take the full claim queue into account, not only the next scheduled entry.
  • Verify that asynchronous backing works as expected with a core shared by at least three paras, in round robin.
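
For illustration, here is a minimal node-side sketch of that acceptance rule, assuming the claim queue is exposed per core as an ordered list of ParaIds (the concrete runtime API and primitive types may differ):

```rust
use std::collections::{BTreeMap, VecDeque};

// Simplified stand-ins for the node-side primitives; the real types live in
// polkadot-primitives and may differ.
type CoreIndex = u32;
type ParaId = u32;

/// Assumed shape of the exposed claim queue: for every core, the ordered list of
/// paras that have upcoming claims on it.
type ClaimQueue = BTreeMap<CoreIndex, VecDeque<ParaId>>;

/// A validator should accept a collation for `para` on `core` if the para has *any*
/// claim on that core, not only if it is the very next scheduled entry.
fn can_accept_collation(claim_queue: &ClaimQueue, core: CoreIndex, para: ParaId) -> bool {
    claim_queue
        .get(&core)
        .map_or(false, |queue| queue.contains(&para))
}

fn main() {
    let mut cq: ClaimQueue = BTreeMap::new();
    cq.insert(0, VecDeque::from([1000, 2000, 1000]));

    // Para 2000 is not next on core 0, but it has a claim, so its collation is acceptable.
    assert!(can_accept_collation(&cq, 0, 2000));
    // Para 3000 has no claim on core 0 at all.
    assert!(!can_accept_collation(&cq, 0, 3000));
}
```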

Fairness in collator protocol

To ensure one para can not starve another in the collator protocol, we should become smarter about which collations to fetch. E.g. if we have the following claim queue:

[a, b, c, d]

and we have the following advertisements:

a,a,a,a,a,a,b,c,d

Our fetches should look like round robin:

a,b,c,d,a,a,a,a,a
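
A toy sketch of that ordering rule (hypothetical helper, not the actual collator-protocol code; claim queue and advertisements reduced to plain characters):

```rust
use std::collections::{HashMap, VecDeque};

/// Reorder advertised collations so paras with claims are fetched round-robin in
/// claim queue order instead of strictly first-come-first-served.
fn round_robin_fetch_order(claim_queue: &[char], adverts: &[char]) -> Vec<char> {
    // Bucket the advertisements per para, preserving arrival order within a para.
    let mut buckets: HashMap<char, VecDeque<char>> = HashMap::new();
    for &para in adverts {
        buckets.entry(para).or_default().push_back(para);
    }

    // Cycle through the claim queue, taking at most one advertisement per para per
    // round; advertisements for paras without a claim are simply never fetched.
    let mut order = Vec::with_capacity(adverts.len());
    loop {
        let mut progressed = false;
        for para in claim_queue {
            if let Some(advert) = buckets.get_mut(para).and_then(|b| b.pop_front()) {
                order.push(advert);
                progressed = true;
            }
        }
        if !progressed {
            break;
        }
    }
    order
}

fn main() {
    let fetches = round_robin_fetch_order(
        &['a', 'b', 'c', 'd'],
        &['a', 'a', 'a', 'a', 'a', 'a', 'b', 'c', 'd'],
    );
    assert_eq!(fetches, vec!['a', 'b', 'c', 'd', 'a', 'a', 'a', 'a', 'a']);
}
```
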
eskimor commented Mar 12, 2024

A little illustration of how collation providing will work with the claim queue. Most of the time (when running), the second item in the claim queue will be the most relevant one, as this is the one that can be provided "on time". The first item in the claim queue is meant to be backed in the very next block (synchronous backing). So when starting up, we will provide the first item, but we will be too slow, so it will go into the block after the next one. In the meantime we will already provide the second item, which will then be ready for the block afterwards - now we are "in sync" and will always prepare the second item in the claim queue, while the first one is already waiting to be picked up by the block producer.

[Image: illustration of collation providing with the claim queue]

If para b sees the following claim queue for the current relay parent, it should start preparing the collation to get it in on time:

[a, b, c]

Looking even further ahead (if the claim queue is long enough) is not required for operation, but it is a legitimate block production strategy. E.g. to avoid more relay chain forks (and thus unnecessary work), one could deliberately pick a relay parent that already has a child block and then pick the third item in the claim queue (which corresponds to the second item in the claim queue of the most recent relay parent).
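
As a sketch (hypothetical helper, para ids made up, not the actual collator code), the decision could boil down to a claim queue membership check within some lookahead:

```rust
/// Given the claim queue of the para's assigned core at a relay parent, decide
/// whether to start building now. A lookahead of 2 captures the "steady state"
/// described above (first or second slot); a larger lookahead is the optional,
/// more fork-averse strategy.
fn should_start_building(claim_queue: &[u32], para: u32, lookahead: usize) -> bool {
    claim_queue.iter().take(lookahead).any(|&p| p == para)
}

fn main() {
    // Claim queue [a, b, c] with a = 1000, b = 2000, c = 3000.
    let queue = [1000, 2000, 3000];
    // Para b sees itself within the first two slots -> start building.
    assert!(should_start_building(&queue, 2000, 2));
    // Para c only starts now if it deliberately looks one slot further ahead.
    assert!(!should_start_building(&queue, 3000, 2));
    assert!(should_start_building(&queue, 3000, 3));
}
```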

github-merge-queue bot pushed a commit that referenced this issue Mar 20, 2024
…tion` (#3580)

The PR adds two things:
1. Runtime API exposing the whole claim queue
2. Consumes the API in `collation-generation` to fetch the next
scheduled `ParaEntry` for an occupied core.

Related to #1797
dharjeezy pushed a commit to dharjeezy/polkadot-sdk that referenced this issue Mar 24, 2024

alindima commented

This code in prospective-parachains will also need to be changed to use the claim queue API:

async fn fetch_upcoming_paras<Context>(

I think it should look ahead according to allowed_ancestry_len or scheduling_lookahead, in order to allow collations coming from upcoming paras.
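
A rough sketch of that lookahead, assuming the per-core claim queue shape from above (hypothetical helper, not the actual prospective-parachains code):

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

type CoreIndex = u32;
type ParaId = u32;

/// Every para that has a claim within the configured lookahead on any core counts
/// as upcoming and may send collations.
fn upcoming_paras(
    claim_queue: &BTreeMap<CoreIndex, VecDeque<ParaId>>,
    lookahead: usize,
) -> BTreeSet<ParaId> {
    claim_queue
        .values()
        .flat_map(|queue| queue.iter().take(lookahead))
        .copied()
        .collect()
}

fn main() {
    let mut cq = BTreeMap::new();
    cq.insert(0, VecDeque::from([1000, 2000, 1000]));
    cq.insert(1, VecDeque::from([3000, 3000]));
    // With a lookahead of 2 (e.g. `scheduling_lookahead`), paras 1000, 2000 and 3000
    // all have upcoming claims.
    assert_eq!(upcoming_paras(&cq, 2), BTreeSet::from([1000, 2000, 3000]));
}
```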

eskimor changed the title from "Expose claim queue for asynchronous on-demand operability" to "Expose and use claim queue for asynchronous on-demand operability" on Apr 5, 2024

bkchr commented Apr 9, 2024

  • Make use of it in collators: Implement strategies for the right lookahead in collators (see comment below)

#3168 is included there. The collation generation subsystem will not be used for this anymore; the lookahead collator already does not use the collation generation subsystem to trigger block production.

github-merge-queue bot pushed a commit that referenced this issue Jun 19, 2024
Implements most of
#1797

Core sharing (two or more parachains scheduled on the same core with the same
`PartsOf57600` value) was not working correctly. The expected behaviour is to
have a Backed and an Included event in each block for the paras sharing the
core, and the paras should take turns. E.g. for two paras we expect:
Backed(a); Included(a)+Backed(b); Included(b)+Backed(a); etc. Instead, each
block contains just one event and there are a lot of gaps (blocks without
events) during the session.

Core sharing should also work when collators are building collations
ahead of time.

TODOs:

- [x] Add a zombienet test verifying that the behaviour mentioned above
works.
- [x] prdoc

---------

Co-authored-by: alindima <alin@parity.io>
TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this issue Aug 2, 2024