Conversation
🧪 Network Tests

To run network tests for this PR, use: `gh workflow run network-tests.yml -f pr_number=983`

Available test options:

Test types:

Results will be posted as workflow runs in the Actions tab.

❌ Python formatting check failed in CI. Please run
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #983      +/-   ##
==========================================
+ Coverage   54.40%   54.69%   +0.29%
==========================================
  Files         402      406       +4
  Lines       67237    69440    +2203
  Branches    67237    69440    +2203
==========================================
+ Hits        36582    37982    +1400
- Misses      28846    29549     +703
- Partials     1809     1909     +100
```

☔ View full report in Codecov by Sentry.
Force-pushed:
- cfd4ce6 to 18f40f7
- 38f8a69 to 4146a59
- 7f35ecc to 9723205
- 8d2bc28 to af375e8
- 929cb77 to 45c1d4e
- add HAS_STATE_PARTS flag to existing blocks in the `block_handles`
+ do not double store partition subtree root cell in the main storage
…d check downloaded
…split state on parts
…AS_STATE_MAIN exists, add HAS_PERSISTENT_SHARD_STATE_PARTS when HAS_PERSISTENT_SHARD_STATE_MAIN exists
Force-pushed: 5291a09 to 5d89709
```rust
let index = indices_buffer[i];
let hash = unsafe { *keys[i].cast::<[u8; 32]>() };
stack.push((index, StackItem::New(hash)));
let child_hash = unsafe { *keys[i].cast::<[u8; 32]>() };
```

Check failure: Code scanning / CodeQL: Access of invalid pointer (High)
Copilot Autofix (AI, 16 days ago)
General fix: Avoid storing raw pointers that may be null or otherwise invalid and later dereferencing them with `unsafe`. Instead, store the data itself (here, the 32-byte hash) or a safe reference whose lifetime is clearly valid, and let Rust's type system enforce safety.

Concrete approach here:
- Replace `keys` from `[*const u8; 4]` with an array of owned hashes: `[[u8; 32]; 4]`.
- When iterating `for hash in &references_buffer`, copy `*hash` directly into `keys[preload_count]` instead of storing `hash.as_ptr()`.
- Later, when preloading, use `let child_hash = keys[i];` instead of dereferencing a raw pointer.
- This removes the need for `unsafe` in this block and makes `keys` always contain valid, fully initialized values for indices `< preload_count`.

Changes are all local to core/src/storage/persistent_state/shard_state/writer.rs in the shown region around lines 465 and 511. No new methods or imports are required; `HashBytes` is already known to be a `[u8; 32]`-like type, and copying it by value is fine.
```diff
@@ -462,7 +462,7 @@
     let mut reference_indices = SmallVec::with_capacity(references_buffer.len());

     let mut indices_buffer = [0; 4];
-    let mut keys = [std::ptr::null(); 4];
+    let mut keys: [[u8; 32]; 4] = [[0; 32]; 4];
     let mut preload_count = 0;

     for hash in &references_buffer {
@@ -473,7 +473,7 @@
             entry.insert((remap_index, false));

             indices_buffer[preload_count] = remap_index;
-            keys[preload_count] = hash.as_ptr();
+            keys[preload_count] = *hash;
             preload_count += 1;

             remap_index
@@ -482,7 +482,7 @@
             let (remap_index, written) = *entry.get();
             if !written {
                 indices_buffer[preload_count] = remap_index;
-                keys[preload_count] = hash.as_ptr();
+                keys[preload_count] = *hash;
                 preload_count += 1;
             }
             remap_index
@@ -508,7 +508,7 @@

         for i in 0..preload_count {
             let index = indices_buffer[i];
-            let child_hash = unsafe { *keys[i].cast::<[u8; 32]>() };
+            let child_hash = keys[i];
             stack.push((index, StackItem::New(child_hash)));
         }
     }
```
RATIONALE
Support local state data sharding on parts: the shard accounts cells tree is split into parts by shards at the configured `split_depth`. The top of the state tree is stored in the main cells db, and each parts subtree is stored in a separate physical database that can be placed on a separate disk.
Pull Request Checklist

NODE CONFIGURATION MODEL CHANGES
[Yes]
Added `core_storage.state_parts` section: `split_depth: 0` - no parts used. Default value is `state_parts: null`, which means no parts are configured.
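A minimal sketch of how this section might be modelled in the node config, written as serde structs; only the `state_parts` / `split_depth` names come from this description, the struct names and field set are assumptions:

```rust
use serde::Deserialize;

/// Hypothetical shape of the new `core_storage.state_parts` section.
#[derive(Debug, Default, Deserialize)]
struct CoreStorageSketch {
    /// `null` / `None` (the default): no parts are configured.
    #[serde(default)]
    state_parts: Option<StatePartsSketch>,
}

#[derive(Debug, Deserialize)]
struct StatePartsSketch {
    /// `0` - no parts used; a depth `d` would give 2^d parts.
    split_depth: u8,
}

// e.g. `{"state_parts": {"split_depth": 2}}` or `{"state_parts": null}` in the JSON config.
```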
BLOCKCHAIN CONFIGURATION MODEL CHANGES
[None]
COMPATIBILITY
Affected features:
Fully compatible.
State will be saved with parts if they are specified in the config. If state was saved with parts, it will be read with parts. If it was saved without parts (e.g. before the update), it will be read without.
Parts map `{key - cell hash: value - shard prefix}` will be saved to `CellsDB.shard_states` right after the root cell hash. If no parts are used, nothing is added to the `ShardStates` value, so existing values in the `ShardStates` table will be treated as "no parts used".

A new flag `HAS_STATE_PARTS = 1 << 13` is added to the `BlockHandle` bit flags. It means that no parts were used, or that all required state parts were successfully stored in separate storages. Now `BlockHandle.has_state()` returns `true` only when both the new flag and the old one, `HAS_STATE_MAIN = 1 << 3`, are set. The migration script (0.0.4 -> 0.0.5) sets `HAS_STATE_PARTS` for all existing block handles.

One more new flag `HAS_PERSISTENT_SHARD_STATE_PARTS = 1 << 14` is added. It means that when the persistent state was stored, no parts were used, or that all required persistent parts were successfully stored. Now `BlockHandle.has_persistent_shard_state()` returns `true` only when both the new flag and the old one, `HAS_PERSISTENT_SHARD_STATE_MAIN = 1 << 4`, are set. The migration script (0.0.4 -> 0.0.5) also sets `HAS_PERSISTENT_SHARD_STATE_PARTS` for existing block handles which have the `HAS_PERSISTENT_SHARD_STATE_MAIN` flag.
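Written out as a sketch, the flag layout and the combined checks described above look roughly like this (the `BlockHandleSketch` type and the `migrate` helper are illustrative; only the flag names, bit positions, and migration rules come from this description):

```rust
const HAS_STATE_MAIN: u32 = 1 << 3;
const HAS_STATE_PARTS: u32 = 1 << 13;
const HAS_PERSISTENT_SHARD_STATE_MAIN: u32 = 1 << 4;
const HAS_PERSISTENT_SHARD_STATE_PARTS: u32 = 1 << 14;

struct BlockHandleSketch {
    flags: u32,
}

impl BlockHandleSketch {
    /// `true` only when both the old and the new flag are set.
    fn has_state(&self) -> bool {
        let mask = HAS_STATE_MAIN | HAS_STATE_PARTS;
        self.flags & mask == mask
    }

    /// Same rule for the persistent shard state.
    fn has_persistent_shard_state(&self) -> bool {
        let mask = HAS_PERSISTENT_SHARD_STATE_MAIN | HAS_PERSISTENT_SHARD_STATE_PARTS;
        self.flags & mask == mask
    }

    /// What the 0.0.4 -> 0.0.5 migration does per handle: backfill the new
    /// flags so old databases keep reporting `has_state() == true`.
    fn migrate(&mut self) {
        self.flags |= HAS_STATE_PARTS;
        if self.flags & HAS_PERSISTENT_SHARD_STATE_MAIN != 0 {
            self.flags |= HAS_PERSISTENT_SHARD_STATE_PARTS;
        }
    }
}
```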
Persistent state files will be split into a main file and parts if the persistent state was stored while state split on parts was configured, e.g.: `{block_id}.boc`, `{block_id}_part_{shard prefix hex}.boc`. When we store the persistent main file, we take the parts roots, then get their children and make absent cells from them. This way we preserve the shard state tree `root_hash` unchanged and consistent, but do not store parts subtrees in the main file, only their roots. This way we store the part subtree root cell both in the main file and in the part.
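A hypothetical helper that just follows the file-naming scheme above; the real code may format the shard prefix differently (width, case, workchain handling):

```rust
/// Illustrative only: `{block_id}.boc` for the main file and
/// `{block_id}_part_{shard prefix hex}.boc` for each part.
fn persistent_file_names(block_id: &str, part_prefixes: &[u64]) -> Vec<String> {
    let mut names = vec![format!("{block_id}.boc")];
    for prefix in part_prefixes {
        names.push(format!("{block_id}_part_{prefix:016x}.boc"));
    }
    names
}
```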
We additionally store a metadata file `{block_id}.meta.json`. It stores the split depth used to store the persistent parts. It is used to preload persistent states into the descriptor cache for the RPC server.
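A sketch of what `{block_id}.meta.json` might carry; per the description it only needs the split depth used when the parts were stored, and the struct name here is an assumption:

```rust
use serde::{Deserialize, Serialize};

/// Hypothetical shape of `{block_id}.meta.json`.
#[derive(Debug, Serialize, Deserialize)]
struct PersistentStateMetaSketch {
    /// Split depth used to store the persistent parts.
    part_split_depth: u8,
}

// The RPC server's descriptor-cache preload could read it roughly as:
// let meta: PersistentStateMetaSketch =
//     serde_json::from_str(&std::fs::read_to_string(path)?)?;
```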
When a node finds a persistent state, it gets the `part_split_depth` value. Then the node stores the state from the main file. If `part_split_depth > 0`, it calculates parts info (hash - prefix) and requests parts using the prefixes. Then it stores the parts and checks that their root hashes match the calculated parts info. The previous persistent state file format is fully backward compatible.
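A sketch of the "calculates parts info" step, assuming the usual tagged shard-prefix encoding where a prefix at depth `d` keeps the top `d` address bits followed by a single marker bit; the real implementation may compute this differently:

```rust
/// Illustrative only: enumerate the 2^d shard prefixes at split depth `d`,
/// assuming the common "top bits + termination bit" u64 encoding
/// (e.g. depth 0 -> 0x8000_0000_0000_0000).
fn shard_prefixes_at_depth(d: u8) -> Vec<u64> {
    assert!(d < 64);
    (0u64..(1u64 << d))
        .map(|i| (2 * i + 1) << (63 - d as u32))
        .collect()
}

// depth 2 -> [0x2000.., 0x6000.., 0xA000.., 0xE000..]: one prefix per part,
// which the node can then use to request each part and verify its root hash.
```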
Manual compatibility tests were passed:
- `core_storage.state_parts = null` or remove the param from config
- `feat/split-state-storage-persistent` branch version
- `.temp/config1.json`
- `.temp/config1.json`

Manual persistent state test 1:
(`run-node4.sh` script where the `--import-zerostate` arg is omitted.) If nodes 1-3 were configured without state split on parts and the persistent state was created without parts, it will be correctly downloaded and stored on node 4 (which is configured with 4 parts). We will have a single persistent state file, but the state will be split and stored into 4 different storages.
SPECIAL DEPLOYMENT ACTIONS
[Not Required]
Without additional changes in the node config, it works with a single part, without splitting.
PERFORMANCE IMPACT
[Expected impact]
TESTS
Unit Tests
Network Tests
[No coverage]
Manual Tests
Performance testing:
(metrics are in the PERFORMANCE IMPACT block)