Remove CGC from data_availability checker #7033

Open · wants to merge 5 commits into base: unstable

Conversation

dapplion (Collaborator)

Issue Addressed

Validator custody makes the CGC and set of sampling columns dynamic. Right now this information is stored twice:

  • in the data availability checker
  • in the network globals

If that state becomes dynamic, we must keep the two copies in sync, either by updating both on every change or by guarding the state behind a mutex. However, I noted that we don't really have to keep the CGC inside the data availability checker: all consumers can read it from the network globals, and we can update make_available to read the expected count of data columns from the block, as sketched below.
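
A minimal sketch of the idea with simplified, hypothetical types (the real Lighthouse types and signatures differ): the CGC lives only on the network globals, and make_available derives the expected column count from the block itself.

struct NetworkGlobals {
    /// Single source of truth for the custody group count (CGC).
    custody_group_count: u64,
}

struct RpcBlock {
    /// Expected custody column count, carried by the block itself.
    custody_columns_count: usize,
}

// The availability checker no longer stores its own CGC copy.
struct DataAvailabilityChecker;

impl DataAvailabilityChecker {
    /// Reads the expected column count from the block, not from
    /// state duplicated on the checker.
    fn make_available(&self, block: &RpcBlock, received_columns: usize) -> bool {
        received_columns >= block.custody_columns_count
    }
}

fn main() {
    let globals = NetworkGlobals { custody_group_count: 8 };
    let block = RpcBlock { custody_columns_count: globals.custody_group_count as usize };
    assert!(DataAvailabilityChecker.make_available(&block, 8));
}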

@dapplion dapplion added the das Data Availability Sampling label Feb 24, 2025
@dapplion dapplion requested a review from jxs as a code owner February 24, 2025 23:40
@dapplion dapplion added the ready-for-review The code is ready for review label Feb 24, 2025
@@ -104,6 +109,8 @@ impl<E: EthSpec> RpcBlock<E> {
Self {
block_root,
block: RpcBlockInner::Block(block),
// Block has zero columns
custody_columns_count: 0,
dapplion (Collaborator, Author)

An RPC block without columns will never read this value.

if received_column_count >= total_column_count {
return ReconstructColumnsDecision::No("all columns received");
}
// Only supernodes receive >= 50% of columns
dapplion (Collaborator, Author)

Removed the "not required for full node" condition as it's redundant and would force us to pipe the custody group here somehow

Member

I had a quick look and didn't see any existing check to ensure this wasn't being called on a non-full node. Why do you say it's redundant?

Could you not do this check with:

if let Some(custody_columns_count) = pending_components
    .executed_block
    .as_ref()
    .map(|diet_block| diet_block.custody_columns_count())
{
    if custody_columns_count != total_column_count {
        return ReconstructColumnsDecision::No("not required for full node");
    }
}

dapplion (Collaborator, Author)

What you suggest will cause weird behavior:

  • if block is there: reconstruction happens only if custody_columns_count = 100%
  • if block is not there: reconstruction happens only if custody_columns_count = 50%

Whether the block is there before we attempt reconstruction depends on network conditions and timing.
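
A toy model of the decision logic to illustrate the inconsistency (all names here are hypothetical, not the PR's code):

/// `block_cgc` is Some(..) only once the executed block has arrived.
fn should_reconstruct(block_cgc: Option<usize>, received: usize, total: usize) -> bool {
    if received >= total {
        return false; // all columns received
    }
    if let Some(cgc) = block_cgc {
        // With the suggested guard, a node whose CGC is below 100%
        // never reconstructs while the block is present...
        if cgc != total {
            return false;
        }
    }
    // ...but with the block absent, the only remaining gate is holding
    // >= 50% of the columns, so the same node behaves differently
    // depending on whether the block happened to arrive first.
    2 * received >= total
}

fn main() {
    // A node custodying 64 of 128 columns, all 64 received:
    assert!(should_reconstruct(None, 64, 128)); // block absent: reconstructs
    assert!(!should_reconstruct(Some(64), 64, 128)); // block present: does not
}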

let mut expected_block_root = anchor_info.oldest_block_parent;
let mut prev_block_slot = anchor_info.oldest_block_slot;
let mut new_oldest_blob_slot = blob_info.oldest_blob_slot;
let mut new_oldest_data_column_slot = data_column_info.oldest_data_column_slot;

- let mut blob_batch = Vec::<KeyValueStoreOp>::with_capacity(blob_batch_size);
+ let mut blob_batch = Vec::<KeyValueStoreOp>::new();
dapplion (Collaborator, Author)

Using with_capacity here feels like a micro-optimization unless I'm missing something. I dropped it since we no longer have access to the column count here.

@@ -815,6 +815,11 @@ where
(0..self.validator_keypairs.len()).collect()
}

pub fn get_sampling_column_count(&self) -> usize {
// Default column custody count
8
dapplion (Collaborator, Author)

I believe this should have no effect on tests.

block,
columns.clone(),
// TODO
columns.len(),
dapplion (Collaborator, Author)

This assumes the CGC equals the number of columns received, which makes sense. We can change it later if we want to do more complex tests.

@@ -145,13 +152,16 @@ impl<E: EthSpec> RpcBlock<E> {
Ok(Self {
block_root,
block: inner,
// Block is before PeerDAS
custody_columns_count: 0,
})
}

Member

might be worth having consistent terminology for:

  1. Blocks
  2. Post-Deneb Blocks (blocks with blobs)
  3. Post-PeerDAS Blocks (blocks with custody columns)

Not sure what terminology that would be, but if we don't already have one / can't think of one, we should at least rename the new function above to something like new_with_blobs.

dapplion (Collaborator, Author) · Mar 9, 2025

I want to get rid of RpcBlock completely, so this assignment will become irrelevant. Instead we'd have a chain segment as an enum, as you suggest.
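
One possible shape for that enum, with placeholder types (purely illustrative, not code from this PR):

// Placeholder payload types for the sketch.
struct Block;
struct Blob;
struct Column;

/// Hypothetical chain-segment item distinguishing the three block kinds
/// listed above.
#[allow(dead_code)]
enum ChainSegmentBlock {
    /// Pre-Deneb: block only.
    Block(Block),
    /// Post-Deneb: block with blobs.
    BlockWithBlobs(Block, Vec<Blob>),
    /// Post-PeerDAS: block with custody columns and the expected count.
    BlockWithColumns(Block, Vec<Column>, usize),
}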

ethDreamer (Member) left a comment

Some tests still need to be fixed.

Some(block_root),
block,
custody_columns,
expects_custody_columns.len(),
Member

The expects_custody_columns field on RangeBlockComponentsRequest could use a documentation comment

dapplion (Collaborator, Author)

Added
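
For reference, the comment could look something like this (hypothetical wording; the PR's actual comment may differ):

pub struct RangeBlockComponentsRequest {
    /// The custody column indices this range request expects to receive
    /// for each block, captured when the request is created.
    pub expects_custody_columns: Vec<u64>,
    // ... other fields elided
}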


@@ -33,6 +33,8 @@ pub struct NetworkGlobals<E: EthSpec> {
/// The computed sampling subnets and columns is stored to avoid re-computing.
pub sampling_subnets: HashSet<DataColumnSubnetId>,
pub sampling_columns: HashSet<ColumnIndex>,
/// Constant custody group count (CGC) set at startup
custody_group_count: u64,
Member

Do you think it might be worth using one naming convention (like custody_group_count) everywhere, instead of also using custody_columns_count?

Member

Also, this isn't behind any RwLock or anything, but it was my understanding that this could change as the node synced more columns / added more validators? Or is that just not implemented yet?

dapplion (Collaborator, Author)

not implemented yet
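
If the CGC does become dynamic later, one option is to guard it behind an RwLock, roughly like the sketch below (an assumption, not code from this PR):

use std::sync::RwLock;

struct NetworkGlobals {
    /// CGC, writable once validator custody can change it at runtime.
    custody_group_count: RwLock<u64>,
}

impl NetworkGlobals {
    fn custody_group_count(&self) -> u64 {
        *self.custody_group_count.read().expect("CGC lock poisoned")
    }

    fn set_custody_group_count(&self, cgc: u64) {
        *self.custody_group_count.write().expect("CGC lock poisoned") = cgc;
    }
}

fn main() {
    let globals = NetworkGlobals { custody_group_count: RwLock::new(8) };
    globals.set_custody_group_count(128); // e.g. the node becomes a supernode
    assert_eq!(globals.custody_group_count(), 128);
}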

@ethDreamer ethDreamer added waiting-on-author The reviewer has suggested changes and awaits their implementation. and removed ready-for-review The code is ready for review labels Mar 11, 2025
@dapplion dapplion added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits their implementation. labels Mar 12, 2025