AccountsDB2.0: base layer for account state outside of validator #15
Some notes from a discussion with @grooviegermanikus on a streaming architecture for account state.

Terms:
Overall architecture
So say the tool we want to create, which recreates accountsdb outside of the validator, is called the accountsdb-consumer.
The accountsdb-consumer would rebuild the database starting from slot 1231121 and also start a grpc subscription that feeds any account_writes from slot 1231121 onwards.

Producer plugin

The goal of the producer plugin in the validator would be to only produce a buffer of account_writes and blocks. This could involve inserting rows into a postgres database, pushing messages to a kafka queue, or building an in-memory buffer. This plugin would be pretty simple. For example, in the postgres case it could involve just inserting account writes into a table, as in the sketch below:
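A minimal sketch of what that buffer table and insert could look like, assuming a postgres buffer; the table name, columns, and types are illustrative assumptions, not an actual schema from the project:

```rust
// Hypothetical schema for the account-write buffer (all names illustrative).
// (pubkey, slot, write_version) identifies a write uniquely, so redelivered
// writes can be dropped with ON CONFLICT DO NOTHING.
const CREATE_ACCOUNT_WRITE: &str = r#"
CREATE TABLE IF NOT EXISTS account_write (
    pubkey        BYTEA  NOT NULL,
    slot          BIGINT NOT NULL,
    write_version BIGINT NOT NULL,
    owner         BYTEA  NOT NULL,
    lamports      BIGINT NOT NULL,
    executable    BOOL   NOT NULL,
    rent_epoch    BIGINT NOT NULL,
    data          BYTEA  NOT NULL,
    PRIMARY KEY (pubkey, slot, write_version)
)"#;

// The plugin's account-update callback would then execute something like:
const INSERT_ACCOUNT_WRITE: &str = r#"
INSERT INTO account_write
    (pubkey, slot, write_version, owner, lamports, executable, rent_epoch, data)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
ON CONFLICT DO NOTHING"#;
```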
A separate process could prune according to some rule, e.g. n days, n updates, n slots, n disk size, etc.; see the sketch at the end of this comment.

Consumer: grpc or other means?

In theory, if the producer plugin inserts into PG, we could have a consumer that also reads directly from the PG buffer. However, I propose for the initial design that we make this a grpc interface. Why? Because most of the interesting logic to resolve account state and produce an accountsdb copy will be in the consumer. However, another option would be to make the consumer just some kind of crate that could be imported into any project that wants to write a consumer which is a source or sink. Each producer plugin/buffer source would need the corresponding consumer implementation. Why not make the
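Returning to the pruning rule mentioned above, a minimal sketch of an n-slots retention pass, assuming the hypothetical account_write table from the earlier sketch; a real pruner would have to be coordinated with consumers so it never deletes writes newer than the snapshot they rebuild from:

```rust
// Hypothetical n-slots retention: delete buffered writes more than $1 slots
// behind the newest buffered slot. Safe only if every consumer rebuilds from
// a snapshot at least as new as the prune horizon (the buffer is a buffer,
// not the source of truth).
const PRUNE_BY_SLOT: &str = r#"
DELETE FROM account_write
WHERE slot < (SELECT COALESCE(MAX(slot), 0) FROM account_write) - $1"#;
```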
---
Thanks a lot for the writeup! I wonder if we can throw in some real-world quantities: number of messages, gigabytes to handle.
---
Building blocks:
- data size (PostgreSQL store): 600GB for all of AccountsDB (compressed account state on validator: 40GB)
- code goes here: https://github.com/solana-rpc-community/solana-rpc2-accountsdb
A lot of good points here, a few thoughts from going through it.

DAS for token methods
Sourcing account hashes
Deletes
General thought on the streaming architecture: wondering if we should consider, either as a replacement for streaming or in addition to it, the option to pull data from geyser instead of relying on a stream of data. This has a few benefits.
Some downsides
---
Implement an accountsdb outside of the validator that matches the validator's accountsdb.
Requirements:
How do we derive the account state / export it from the validator?
An important part is that we need to ensure at-least-once delivery for every account so that we can accurately resolve account state.
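To illustrate why at-least-once is enough: if the consumer applies writes idempotently, keeping only the highest (slot, write_version) per pubkey, then duplicates and redeliveries are harmless no-ops. A minimal sketch, with struct and field names assumed for illustration:

```rust
use std::collections::HashMap;

// Illustrative shape of a streamed write; field names are assumptions.
#[derive(Clone)]
struct AccountWrite {
    pubkey: [u8; 32],
    slot: u64,
    write_version: u64,
    data: Vec<u8>,
}

// Apply a possibly-redelivered write: keep only the newest
// (slot, write_version) per pubkey, so replays cannot corrupt state.
fn apply(state: &mut HashMap<[u8; 32], AccountWrite>, write: AccountWrite) {
    let is_newer = state.get(&write.pubkey).map_or(true, |existing| {
        (write.slot, write.write_version) > (existing.slot, existing.write_version)
    });
    if is_newer {
        state.insert(write.pubkey, write);
    }
}
```

Deletes (flagged elsewhere in this thread) complicate this, since a missing account must be distinguishable from one whose writes simply have not arrived yet.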
Is there an argument for not touching token methods/token support and letting https://github.com/metaplex-foundation/digital-asset-rpc-infrastructure handle that?
There are a few tricky issues to handle within this:
There are some existing discussions that are worth looking at: