refactor: improve resolve pool logic #591
Conversation
Signed-off-by: Sander Pick <sanderpick@gmail.com>
```rust
fn get_pending_read_requests(
```
new method
some rearranging here to match method order
```diff
-pub fn get_open_read_requests<BS: Blockstore>(
+pub fn get_read_requests_by_status<BS: Blockstore>(
```
new method
some rearranging here to match method order
```diff
-Ok(Err(e)) => {
+Err(e) | Ok(Err(e)) => {
```
vote "nay" in the case of any error
It would mean that if the service on the other side is not responding, we will re-enqueue the task, right? Does that mean that if the other service is not responding, the task queue will keep growing?
the task queue is now bounded by the concurrency config setting
Ah, okay. Sounds good.
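The change above can be sketched as an or-pattern over a nested `Result`. This is a minimal illustration with hypothetical types (the real code operates on the resolver's task outcome): a failed call (`Err`) and a successful call that returned an error (`Ok(Err)`) are matched by one arm, so any error produces a "nay" vote.

```rust
// Hypothetical types: the outer Result models the call itself, the inner
// one models the service's response.
fn vote(outcome: Result<Result<(), String>, String>) -> bool {
    match outcome {
        // Fully resolved: vote "yea".
        Ok(Ok(())) => true,
        // Transport error or application error: vote "nay" either way.
        Err(_) | Ok(Err(_)) => false,
    }
}
```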
```rust
let local_resolving_blobs_count =
    local_blobs_count.saturating_sub(local_finalized_blobs.len());
let added_blobs_fetch_count = chain_env
    .blob_concurrency
    .saturating_sub(local_resolving_blobs_count as u32);
```
limit how many added blobs we move to pending
(all changes are mirrored for read requests)
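The bound in the snippet above can be isolated as a small helper. This is a sketch with hypothetical names: the number of newly `added` blobs promoted to `pending` is capped by the remaining concurrency budget, with saturating subtraction so the count never underflows.

```rust
// Hypothetical helper: how many new blobs may be promoted this pass.
fn added_fetch_budget(blob_concurrency: u32, local_blobs_count: usize, finalized_count: usize) -> u32 {
    // Blobs still resolving locally (saturates to zero, never underflows).
    let resolving = local_blobs_count.saturating_sub(finalized_count);
    // Remaining slots; zero when we are already at or over the limit.
    blob_concurrency.saturating_sub(resolving as u32)
}
```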
```rust
// If the local blobs pool is empty and there are pending blobs on-chain,
// we may have restarted the validator. We can hydrate the pool here.
```
hydrate from the prepare step if the local pool is empty (maybe from a restart)
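The restart heuristic can be sketched as a predicate (hypothetical signature, not the PR's actual function): hydrate only when the local pool is empty while the chain still reports pending blobs, which suggests the validator restarted and lost its in-memory state.

```rust
// Hypothetical predicate for the hydration condition described above.
fn needs_hydration(local_pool_len: usize, pending_on_chain: usize) -> bool {
    local_pool_len == 0 && pending_on_chain > 0
}
```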
```rust
    resolving = local_resolving_blobs_count,
    finalized = local_finalized_blobs.len(),
    "blob pool counts"
);
tracing::debug!(
    added = added_blobs_fetched_count,
    pending = pending_blobs_fetched_count,
    "blob fetched counts"
```
clean up logging
```rust
// Reject the proposal if the current processor is not keeping up with blob
// resolving.
```
give slow processors the opportunity to reject
Makes sense to do this here, but does it mean that new blocks will be created more slowly if proposals are rejected?
if the block doesn't have enough votes, yes, it would have to advance to another voting "round". if the rejector doesn't have much relative power, the block is still accepted.
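The voting-power point can be sketched numerically, assuming the usual Tendermint-style quorum rule (a block needs strictly more than two thirds of total voting power behind it): a rejector with little relative power cannot block acceptance on its own.

```rust
// Sketch of a >2/3 quorum check; integer arithmetic avoids floating point.
fn block_accepted(yea_power: u64, total_power: u64) -> bool {
    3 * yea_power > 2 * total_power
}
```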
```diff
 let client = FendermintClient::new_http(tendermint_url, None)?;
-let iroh_manager = IrohManager::from_addr(Some(iroh_addr));
+let iroh_addr = iroh_addr
+    .to_socket_addrs()?
```
Makes sense. Why did we start using Iroh Manager earlier? Does it provide connection pooling?
It provides lazy connection creation. It's something we wrote, and it's getting removed with the iroh upgrade work.
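The replacement can be sketched with the standard library: resolve the configured address to a concrete `SocketAddr` up front via `to_socket_addrs` (as the diff above now does), instead of deferring connection setup to a lazy manager. The helper name here is hypothetical.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};

// Resolve an address string eagerly, taking the first result.
fn resolve_first(addr: &str) -> io::Result<SocketAddr> {
    addr.to_socket_addrs()?
        .next()
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no address resolved"))
}
```

Eager resolution surfaces a bad address at startup rather than on first use, which is the failure mode behind the "Iroh node is not running" errors.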
```rust
    Ok((count, done))
}

/// Count all items and resolved and failed items.
```
Suggested change:

```diff
-/// Count all items and resolved and failed items.
+/// Count all items including resolved and failed items.
```
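The suggested doc comment can be illustrated with a hypothetical status enum (not the PR's actual types): the total count includes resolved and failed items rather than excluding them.

```rust
#[derive(PartialEq, Eq)]
enum Status {
    Pending,
    Resolved,
    Failed,
}

/// Count all items including resolved and failed items.
fn count_items(items: &[Status]) -> (usize, usize, usize) {
    let total = items.len(); // every item, regardless of status
    let resolved = items.iter().filter(|s| **s == Status::Resolved).count();
    let failed = items.iter().filter(|s| **s == Status::Failed).count();
    (total, resolved, failed)
}
```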
```rust
);

// Once the read request is closed, we can clean up the votes
let mut request_id = read_request.id.as_bytes().to_vec();
```
👍
avichalp left a comment
Looks good to me. Posted some questions for my understanding.
@avichalp doesn't look like I'll have time to make changes here today before we make a release for tomorrow. I'll do a follow-up to address the doc comment and shut down fendermint if the resolver service exits.

sounds good. created a new issue here #593 (comment)
- … `added` blobs to move to `pending` without considering the current size of the pool
- … `IrohManager` … to the Object API that was causing the "Iroh node is not running" errors.