Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PVF should accept data based on preimages #811

Open
rphmeier opened this issue Jun 6, 2022 · 6 comments
Open

PVF should accept data based on preimages #811

rphmeier opened this issue Jun 6, 2022 · 6 comments
Labels
I6-meta A specific issue for grouping tasks or bugs of a specific category. I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task.

Comments

@rphmeier
Copy link
Contributor

rphmeier commented Jun 6, 2022

A common issue is that parachains need to include data which is large and takes up PoV space. We can use the Preimage pallet to have validators load this data from the relay chain state and pass it into the PVF instead. An initial use-case is smart contract code. PVF upgrades may also be optimized with this method cc #971

Summary of changes:

  • CandidateReceipts should have a data_dependencies: Vec<Hash> which are the hashes of all the data that should be passed into the PVF
  • The relay-chain runtime should reject candidates which either have too many (governance configurable) hashes or candidates which have any hashes where the preimage pallet doesn't have the preimage for.
  • Preimages should last for T (~24 hours) after being 'removed'. During this 'removal' phase they cannot be referenced by candidate receipts
  • The PVF should accept a parameter Vec<(Hash, Vec<u8>)> which are all the preimages for the candidate receipt.
  • ValidateFromExhaustive should accept a HashMap<Hash, Vec<u8>> for candidate execution
  • Validators, before executing a block (in backing, approval, or disputes), should pull the preimages from the state with a runtime API called at any recent relay-chain block. If any preimages in a candidate are unavailable, then the block is invalid

We don't want to allow too many preimages because it adds runtime, network, and execution overhead for every candidate because of the added checks and candidate receipt size.

This also allows data (e.g. smart contract code) to be reused across parachains in a way that doesn't add to the PoV size. Collator logic and parachain runtimes can be customized to detect data at block authoring time which is available through the preimages pallet and opt to include it that way as opposed to the PoV state. Users will need to pay to store data on the relay-chain.

@Polkadot-Forum
Copy link

This issue has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/altering-polkadots-fork-choice-to-reduce-da-load/3389/4

@burdges
Copy link

burdges commented Jul 12, 2023

An example of a pre-image would be a block meant to be committed to now, but executed somewhat later?

@shawntabrizi
Copy link
Member

shawntabrizi commented Jul 12, 2023

@burdges I think common preimages would be, for example, popular smart contracts.

Or any other data which stays mostly the same, but is read often. I could imagine multisigs would be another candidate for example, although impact to DA may be small in this example.

But I do think it would be cool if, for an on-demand parachain, they can just upload the whole block beforehand, as a part of the collation, and then it would not involve the DA limits at all. I could imagine such a block could be many times the maximum block size, since multiple transactions could provide parts of the final block which is eventually concatenated together.

Stuff like that.

@Sophia-Gold Sophia-Gold transferred this issue from paritytech/polkadot Aug 24, 2023
@the-right-joyce the-right-joyce added I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. T8-parachains_engineering and removed I10-optimisation labels Aug 25, 2023
@the-right-joyce the-right-joyce added the I6-meta A specific issue for grouping tasks or bugs of a specific category. label Oct 11, 2023
claravanstaden pushed a commit to Snowfork/polkadot-sdk that referenced this issue Dec 8, 2023
* Cache milagro G1 pubkeys

* Update cumulus

* More refactoring

* Fix parachain test in CI

* Revert to use BoundedVec for SyncCommitteePrepared

* Use DispatchError

* Revert "Revert to use BoundedVec for SyncCommitteePrepared"

This reverts commit 22274fad4e604152a347fc03cfb9eefea28fbc6b.

* Revert "Revert to use BoundedVec for SyncCommitteePrepared"

This reverts commit 22274fad4e604152a347fc03cfb9eefea28fbc6b.

* optimize things

* Update Milagro

* More refactoring

* Uncomment benchmark

* Pin cumulus branch

* Update rust toolchain & More polish

* Fix build

* Update cumulus

---------

Co-authored-by: Vincent Geddes <vincent.geddes@hey.com>
@eskimor
Copy link
Member

eskimor commented Apr 5, 2024

Plain pre-images won't work as we require probabilistic finality on them for use in parachains.

@shawntabrizi
Copy link
Member

Plain pre-images won't work as we require probabilistic finality on them for use in parachains.

Can you elaborate here?

If data exists on the relay chain, (even if the block is not yet finalized), why can't it be used in the parachain validation function?

Either the block, the data, and the parachain is all finalized, or none of it is.

Perhaps we are not eye to eye on what this issue is trying to call out.

@burdges
Copy link

burdges commented Apr 5, 2024

We need all PVF inputs to be available to all validators, meaning either (1) the data exists in relay chain state, or else (2) we delcared the data available in the availability system. We're damaging (1) slightly in async backing, so the simplest solution becomes probabalistic finality, aka two epochs delay, so like PVFs work now.

Afaik, Jam specifically avoids solving this problem, by doing relay chain IO only from the accumulate function, maybe unrealstic.

We should probably utilize the availability system more agressively here, meaning parablock headers could name multiple availability stored blobs which they desire, even portions of blobs give like (hash,start,end). This would be a bunch of work, but it gives us more flexibility going forward than other solutions, which also require lots of work.

In particular, I doubt we remember who possesses which availability chunks now. We'd retain this off-chain somehow, but off-chain messaging requires this anyways, and worse that validators provide this information to collators simular to how bittorrent trackers work.

@athei athei moved this to Nice to have in Smart Contracts Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I6-meta A specific issue for grouping tasks or bugs of a specific category. I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task.
Projects
Status: Nice to have
Status: Backlog
Status: To do
Development

No branches or pull requests

8 participants
@burdges @eskimor @shawntabrizi @rphmeier @the-right-joyce @Polkadot-Forum and others