-
Notifications
You must be signed in to change notification settings - Fork 11.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consensus Adapter #1449
Consensus Adapter #1449
Conversation
Make me a favour on the checkpointing side, which is also going to use the the consensus:
|
You have been too fast :) I was writing the description of the PR (but I am a slow writer)
Good idea fixing this |
Codecov Report
@@ Coverage Diff @@
## main #1449 +/- ##
========================================
Coverage 82.19% 82.19%
========================================
Files 104 107 +3
Lines 21145 21385 +240
========================================
+ Hits 17380 17578 +198
- Misses 3765 3807 +42
Continue to review full report at Codecov.
|
Fixed |
@lxfind - I want to get this PR landed. Who is the best to review it? |
Do we know what we want the interface between consensus and sui to be? As in we expect this to simply be a point to point link between narwhal and sui? Maybe we can reuse some of the grpc work that's being worked on for this interface as well? |
Yes eventually we may use the grpc interface. The challenge is that the Narwhal interface also needs to work for Celo and Somelier (not only Sui), so I am not sure this can be done before dev net The current interface is simple: Sui submits serialized certificates to Narwhal (who see them as bytes). Consensus sequences those certificates and assigns a lock to each shared object. Sui (or the client) then pick up the certificate and execute it (on the Sui side). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given our timeline we can get this in once the tests are passing (they seemed to timeout?). I think we'll want to spend a bit more time looking at our interfaces but we can do that at a future time.
Tests work fine on my local machine, debugging the CI now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation is going to run out of fds (not to mention ports) fast.
let consensus_address = if let Some(val) = json.get("consensus_address") { | ||
SocketAddr::deserialize(val).map_err(serde::de::Error::custom)? | ||
} else { | ||
let port = PORT_ALLOCATOR | ||
.lock() | ||
.map_err(serde::de::Error::custom)? | ||
.next_port() | ||
.ok_or_else(|| serde::de::Error::custom("No available port."))?; | ||
format!("127.0.0.1:{port}").parse().unwrap() | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heads up in regard to #1469 : what we're really managing here is a bind address (for lack of a way to determine the public address), and we should probably use variable names that reflect this. Dropping a log line here or in make_authority
would also help.
cc. @bmwill for an ubiquitous pain point of the current implementation: we bind where we (think we) are reachable, and in absence of that public address, bind to localhost 😅
|
||
Some(output) = self.rx_consensus_output.recv() => { | ||
// Notify the caller that the transaction has been sequenced. | ||
let (outcome, transaction) = output; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You could bind this pattern two lines earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in #1467
if let Some(repliers) = self.pending.get_mut(&digest) { | ||
if let Some(replier) = repliers.pop_front() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if let Some(repliers) = self.pending.get_mut(&digest) { | |
if let Some(replier) = repliers.pop_front() { | |
if let Some(replier) = self.pending.get_mut(&digest).and_then(|r| r.pop_front()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #1467
// Send certificate to consensus | ||
let serialized = serialize_consensus_transaction(certificate); | ||
let bytes = Bytes::from(serialized.clone()); | ||
// TODO [issue #1452]: We are re-creating a connection every time. This is wasteful but does not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a couple other ways to approach this:
- Use a real RPC like framework that just handles these things for us (e.g. gRPC like we're doing for our other narwhal integrations)
- store an
Option<TcpStream>
and if its none do the reconnection otherwise use the existing connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially went for option 2 (store an Option<TcpStream>
) but this function cannot take a mutable reference to self (for reasons linked to authority state). but what francois mentions is a problem I will try to fix it in a separate PR
What this PR does
This PR introduces the
Consensus Adapter
module who bridges the Sui with consensus. It works in the following way:What this PR does not do