This issue supersedes both #527 and #616, and probably splits out into a few sub-issues itself.
The goals are:
- Let clients sync to an account state close to the chain tip - but not necessarily to the chain tip itself. This is needed for FPI calls: while a transaction is in progress, the chain tip may move, and so the block against which we are executing the transaction may be slightly behind the chain tip.
- Do not require clients to download full account state (specifically, all contents of storage maps and the entire vault) to be able to work with accounts. This basically means that clients should be able to download partial vaults and partial maps. Another consequence is that all responses should have bounded size (i.e., no bigger than 2MB - 4MB).
- Make the requests and the sync process as simple and "light-weight" as possible for the general case (i.e., small accounts), while still supporting edge cases of large accounts.
The way I propose to accomplish these goals is to split the process into two steps:
- The user makes a request to a single endpoint to get basic info about an account (or a set of accounts). For small accounts, this could return all account data.
- For larger accounts, the user would follow this up with calls to the SyncStorageMaps and SyncAccountVault endpoints to download the data that was not returned from the first request.
We already have SyncStorageMaps and SyncAccountVault endpoints, and so this issue is about the first point above.
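To make the two-step flow concrete, here is a hedged Rust sketch; all types and function names below are hypothetical stand-ins, not the actual node or client APIs.

```rust
// Hypothetical sketch of the two-step sync flow. All names below are made up
// for illustration; they are not the actual miden-node or client APIs.

struct AccountProofResponse {
    // True when the account is small enough that the first request already
    // returned the full account state.
    full_state_returned: bool,
}

// Stand-in for the proposed single-account endpoint (step 1).
fn account_proof(_account_id: u64, _block_num: u32) -> AccountProofResponse {
    AccountProofResponse { full_state_returned: true }
}

// Stand-ins for the existing follow-up endpoints (step 2).
fn sync_storage_maps(_account_id: u64) {}
fn sync_account_vault(_account_id: u64) {}

fn sync_account(account_id: u64, block_num: u32) -> bool {
    // Step 1: basic info; for small accounts this is all the data we need.
    let resp = account_proof(account_id, block_num);

    // Step 2: only large accounts need the dedicated sync endpoints.
    if !resp.full_state_returned {
        sync_storage_maps(account_id);
        sync_account_vault(account_id);
    }
    resp.full_state_returned
}

fn main() {
    // A small account completes in a single round trip.
    assert!(sync_account(42, 1_000));
}
```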
My current thinking is that this endpoint should work with just a single account. The main reason is that if we make it work with more than one account, keeping the response size small becomes very challenging. For example, if the user requests data for 10 accounts, even 200KB per account could overflow the limit. So, we'd probably need to keep the number of requested accounts pretty small - and at that point, we might as well have it work for a single account only.
The endpoint would be most similar to the current AccountProofs endpoint, so I'll call it AccountProof for now, though we should probably come up with a better name. The endpoint should return:
- For private and public accounts:
- Account state commitment.
- Account witness. We can consider making this optional as this is currently only useful for FPI calls, but I think this may be generally useful in the future.
- For public accounts only:
- Account code, if needed. If included, this would not exceed 64KB, as this is the limit we are thinking of imposing on account code (though there is an argument for making it even bigger - maybe 128KB or more).
- Account storage header. This would include values of all value slots and roots of all map slots (and some metadata). Since we can have at most 256 storage slots, this part cannot exceed 16KB.
- Additional storage map data: the user could request values under specific keys for specific maps. Here, we could return either partial storage maps or the full list of key-value pairs if the SMTs are small enough. We'd need to come up with a strategy for determining the appropriate limits here (which will require some benchmarking), but we could probably allocate a couple of MB for this segment.
- Additional account vault data: if the user is interested, we can send a limited amount of vault data. For example, if the account contains under 1000 assets, we could probably just send all the assets, and this would take less than 64KB (accounts with more assets than that would need to use SyncAccountVault).
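As a rough sanity check on the size bounds above, the following sketch runs the arithmetic; the 64-byte-per-slot and 32-byte-per-asset figures are my assumptions for illustration, not numbers from the protocol.

```rust
fn main() {
    // Assumption: each storage-header entry is a 32-byte value plus about
    // 32 bytes of metadata, i.e., 64 bytes per slot.
    const SLOT_ENTRY_BYTES: usize = 64;
    const MAX_SLOTS: usize = 256;

    // 256 slots * 64 B = 16 KB, matching the bound quoted above.
    let header_bytes = MAX_SLOTS * SLOT_ENTRY_BYTES;
    assert_eq!(header_bytes, 16 * 1024);

    // Assumption: 32 bytes per asset. 1000 assets stay well under 64 KB even
    // with generous per-asset overhead.
    const ASSET_BYTES: usize = 32;
    let vault_bytes = 1_000 * ASSET_BYTES;
    assert!(vault_bytes < 64 * 1024);

    // Code (<= 64 KB) + storage header + small vault fit comfortably within a
    // 2 MB response, leaving a couple of MB of a 4 MB budget for map data.
    let fixed_parts = 64 * 1024 + header_bytes + vault_bytes;
    assert!(fixed_parts < 2 * 1024 * 1024);
    println!("fixed parts: {} KB", fixed_parts / 1024);
}
```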
The request message for this endpoint could look something like this:
```protobuf
message AccountProofRequest {
    // ID of the account for which we want to get data.
    account.AccountId account_id = 1;

    // Block at which we'd like to get this data. Must be close to the chain tip.
    fixed32 block_num = 2;

    // Request for additional account details; valid only for public accounts.
    optional AccountDetailRequest details = 3;
}

message AccountDetailRequest {
    // Code commitment last known to the client. The response will include account
    // code only if its commitment differs from this value.
    //
    // We could also extend this methodology to account storage and the asset vault
    // to return data only if the client doesn't already have it.
    primitives.Digest code_commitment = 1;

    // A flag indicating whether the response should include asset vault data;
    // assets will be returned only if the account contains a small number of assets
    // (e.g., under 1000).
    //
    // We could also make this more granular and request assets under specific keys,
    // but I'm not sure this is needed at the moment.
    bool include_assets = 2;

    // Additional requests, one per storage map.
    repeated StorageMapDetailRequest storage_maps = 3;
}

message StorageMapDetailRequest {
    // Storage slot index ([0..255]).
    uint32 slot_index = 1;

    // Note: oneof fields share the message's field-number space, and repeated
    // fields are not allowed directly inside a oneof, hence the numbering and
    // the KeyList wrapper below.
    oneof slot_data {
        // A flag asking to return all storage map data; valid only for small
        // storage maps (e.g., with fewer than 1000 entries).
        bool all_entries = 2;

        // A list of map keys (Digests) associated with this storage slot.
        KeyList map_keys = 3;
    }
}

message KeyList {
    repeated primitives.Digest keys = 1;
}
```

And the response message could look roughly like this:
```protobuf
message AccountProof {
    // Account ID, current state commitment, and SMT path.
    AccountWitness witness = 1;

    // Additional details for public accounts.
    optional AccountDetails details = 2;
}

message AccountDetails {
    // Account header.
    AccountHeader header = 1;

    // Account code; empty if the code commitments matched.
    optional bytes code = 2;

    // Account asset vault data; empty if the vault commitments matched.
    optional AccountVaultDetails vault_details = 3;

    // Account storage data; empty if the storage commitments matched.
    optional AccountStorageDetails storage_details = 4;
}

message AccountVaultDetails {
    // A flag that is set to true if the account contains too many assets. This
    // indicates to the user that the SyncAccountVault endpoint should be used to
    // retrieve the account's assets.
    bool too_many_assets = 1;

    // When too_many_assets == false, this will contain the list of assets in the
    // account's vault.
    repeated Asset assets = 2;
}

message AccountStorageDetails {
    // Account storage header (storage slot info for up to 256 slots).
    AccountStorageHeader header = 1;

    // Additional data for the requested storage maps.
    repeated AccountStorageMapDetails map_details = 2;
}

message AccountStorageMapDetails {
    // Slot index of the storage map.
    uint32 slot_index = 1;

    // A flag that is set to true if all_entries == true was used in the request
    // for this storage map and the map contains too many entries. This indicates
    // to the user that the SyncStorageMaps endpoint should be used to get all
    // storage map data.
    bool too_many_entries = 2;

    // Note: repeated fields are not allowed directly inside a oneof, hence the
    // StorageMapEntryList wrapper.
    oneof data {
        // Populated if all_entries was requested OR if the storage map contains
        // a small number of entries.
        StorageMapEntryList entries = 3;

        // If specific keys were requested AND the storage map is large, we will
        // return a partial storage map.
        PartialStorageMap partial_map = 4;
    }
}

message StorageMapEntryList {
    repeated StorageMapEntry entries = 1;
}
```

To support this structure, we will need to modify the database a bit. Specifically:
- We'll need to remove the vault and storage fields from the accounts table. These would be replaced with vault_commitment and storage_header fields respectively.
- We'll need to add a block_num field to the accounts table, as it will now contain account records for multiple blocks (we could also add an is_latest field, but I'm not sure yet if it is needed). We'd also need to implement a pruning strategy for this table to retain only the data close to the chain tip, but this could be done as a part of #1175.
- To be able to serve partial storage maps, we'll need to keep the Merkle path data somewhere. Previously this was in the storage field of the accounts table, but that field is going away. How to handle this is an issue in itself, but for now, we could probably use the MerkleStore struct to keep all the data in memory. It would work similarly to the account tree, nullifier tree, and chain MMR structures that we have now (i.e., it would be built on node startup).
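To make the (account_id, block_num) keying concrete, here is a hedged in-memory sketch of the reworked accounts table and its pruning behavior; all names and field shapes are assumptions, and this is a stand-in for the real database and MerkleStore, not an implementation of them.

```rust
use std::collections::{BTreeMap, HashMap};

// Hypothetical row shape after the schema change: the full vault/storage blobs
// are replaced by a vault commitment and a storage header.
#[derive(Clone)]
struct AccountRecord {
    vault_commitment: [u8; 32],
    storage_header: Vec<u8>,
}

// In-memory stand-in for the reworked accounts table, keyed by
// (account_id, block_num) so states for multiple blocks can coexist.
struct AccountStates {
    records: BTreeMap<(u64, u32), AccountRecord>,
}

impl AccountStates {
    fn new() -> Self {
        Self { records: BTreeMap::new() }
    }

    fn insert(&mut self, account_id: u64, block_num: u32, record: AccountRecord) {
        self.records.insert((account_id, block_num), record);
    }

    // Latest record at or before `block_num`: the lookup a client syncing
    // "close to the chain tip" would hit.
    fn get_at(&self, account_id: u64, block_num: u32) -> Option<&AccountRecord> {
        self.records
            .range((account_id, 0)..=(account_id, block_num))
            .next_back()
            .map(|(_, rec)| rec)
    }

    // Drop records older than `keep_from`, but always retain the newest state
    // of each account so lookups never come up empty.
    fn prune_before(&mut self, keep_from: u32) {
        let mut latest: HashMap<u64, u32> = HashMap::new();
        for &(id, blk) in self.records.keys() {
            let entry = latest.entry(id).or_insert(blk);
            if blk > *entry {
                *entry = blk;
            }
        }
        self.records
            .retain(|&(id, blk), _| blk >= keep_from || latest[&id] == blk);
    }
}

fn main() {
    let rec = AccountRecord { vault_commitment: [0; 32], storage_header: vec![] };
    let mut states = AccountStates::new();
    states.insert(1, 10, rec.clone());
    states.insert(1, 20, rec.clone());
    states.insert(2, 5, rec);

    // Reads resolve to the newest record at or before the requested block.
    assert!(states.get_at(1, 15).is_some());

    // Pruning keeps recent records plus each account's latest state.
    states.prune_before(15);
    assert!(states.get_at(1, 15).is_none()); // the block-10 record was pruned
    assert!(states.get_at(2, 100).is_some()); // latest state of account 2 kept
}
```

The BTreeMap keeps rows ordered by (account_id, block_num), so "latest state at or before block N" is a simple backward range scan, mirroring what an indexed SQL query on the real table would do.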