-
Couldn't load subscription status.
- Fork 289
authentication protocol for udp #2417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
this doesn't include integration code (e.g discovery, dataplane and raptorcast changes) and i still need to add more docs to the code itself |
f088da4 to
070fb7c
Compare
070fb7c to
98a6b9b
Compare
|
|
||
| pub fn generate_cookie(cookie_secret: &[u8; 32], nonce: u64, remote_addr: &SocketAddr) -> [u8; 16] { | ||
| let mut address_bytes = [0u8; 16]; | ||
| match remote_addr.ip() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remote port is not used. According to the spec: The initiator address is a concatenation of the message source IP address and source port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, i forgot about port, will add
| } | ||
| } | ||
|
|
||
| #[derive(Clone, Debug, Zeroize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no ZeroizeOnDrop. Is it intentional? It could be useful for tmp_x variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i am using Zeroizing for chaining_key and hash output. i think there was no need to zeroize every tmp variable, will recheck
| .expect("ephemeral private key must be set"), | ||
| &msg.ephemeral_public, | ||
| )?; | ||
| let temp = keyed_hash!(state.chaining_key.as_ref(), ecdh_ee.as_ref()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ephemeral private key of the initiator's state can be erased here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess i can .take() it from the state, but the state itself is dropped (and private key is erased at that point) soon after exiting this function
| Ok(result) => { | ||
| result.map(|(timer, r)| (timer, None, r.rekey, Some(r.terminated))) | ||
| } | ||
| Err(_) => None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it safe to ignore an error here and at similar places below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tick() shouldn't be fallible, i will remove Result from every such function in a follow up
| ) | ||
| .map_err(CookieError::CookieDecryptionFailed)?; | ||
|
|
||
| Ok(reply.encrypted_cookie) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is worth adding a comment that this function decrypts the cookie and returns its value. No comments, and a returned reply.encrypted_cookie is misleading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add
| Ok((terminated, rekey)) | ||
| } | ||
|
|
||
| pub fn handle_rekey_timer(&mut self) -> RekeyEvent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is called handle_rekey_timer, but nothing is handled here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, i will refactor this
|
|
||
| let mut session = Initiator { | ||
| handshake_state, | ||
| common, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it should be called session state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean whole struct, e.g rename CommonSessionData to SessionState and all related variables?
i can rename, seems like a better choice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct:
common -> session_state
CommonSessionData -> SessionState
The Noise protocol framework uses CipherState, SymmetricState, HandshakeState, you can follow that naming for other relevant types: SessionState, ProtocolState, ...
|
|
||
| use crate::error::{ProtocolErrorContext, Result}; | ||
|
|
||
| pub struct Cookies { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no zeroization for secrets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think that it accomplishes anything. those secrets can be erased only when authentication protocol is terminated (e.g program closed for example).
and it doesn't matter if they are stolen when protocol is no longer working
| refresh_duration: Duration, | ||
| ) -> Self { | ||
| let mut cookie_secret = [0u8; 32]; | ||
| rng.fill_bytes(&mut cookie_secret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the responder is under load, sends a handshake response with a cookie, and then immediately crashes? The initiator is an honest node, and it will send the new handshake initiation with the cookie that can't be verified after reloading. Will the reloaded node accept that handshake after timeout?
P.S. It is easier to ask that explicitly than to go through the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if responder is under load after crash then such message will have invalid mac2 and will be rejected.
if responder is not under load then invlalid mac2 will be ignored and initiation will pass.
do you think it is worth it to persist cookie secret to handle this case? to me it seems not worth it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if the responder is still under load after its crash, the honest initiator will not be able to establish the connection until some timer is triggered. In WireGuard, it seems to me, this would be Reject-After-Time × 3 seconds.
I am trying to understand which timeout defines that, and in how many seconds the honest peer will be able to establish a connection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initiator will retry if it doesn't get reply in session_timeout with some jitter
| pub ip_rate_limit_window: Duration, | ||
| pub max_requests_per_ip: usize, | ||
| pub ip_history_capacity: usize, | ||
| pub psk: [u8; 32], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add zeroization for Config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it makes sense to erase psk, as it meant to be long term secret
| pub rekey_deadline: Option<Duration>, | ||
| pub session_timeout_deadline: Option<Duration>, | ||
| pub max_session_duration_deadline: Option<Duration>, | ||
| pub stored_cookie: Option<[u8; 16]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add zeroization for the cookie secret?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it only allows attacker to bypass dos-filter.
it doesn't cost much to wrap this in Zerozing, but practically it seemed to me unnecessary
| use crate::{Initiator, Responder, Transport}; | ||
| use monad_wireauth_session::{Config, SessionIndex}; | ||
|
|
||
| pub struct API<C: Context> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In WireGuard:
after an initiator receives a handshake response message (section 5.4.3),
the responder cannot send transport data messages (section 5.4.6)
until it has received the first transport data message from the initiator.
And, further, transport data messages encrypted using the previous secure session might be in transit after a new secure session has been created.
For these reasons, WireGuard keeps in memory the current secure session,
the previous secure session, and the next secure session for the case of an unconfirmed session
Could you point me to the code where this logic is implemented? Or explain why this is not implemented and why the protocol does not need that functionality.
I think it should be in
monad-bft/monad-wireauth-api/src/api.rs
Line 400 in 98a6b9b
| fn decrypt_data_packet(&mut self, packet: &mut [u8], remote_addr: SocketAddr) -> Result<bool> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after an initiator receives a handshake response message (section 5.4.3),
the responder cannot send transport data messages (section 5.4.6)
until it has received the first transport data message from the initiator.
implemented by having a separate type for Responder, that type can't send messages, but can accept them (decrypt method), and on first successfully decrypted message it transitions to Transport type (establish method)
And, further, transport data messages encrypted using the previous secure session might be in transit after a new secure session has been created.
For these reasons, WireGuard keeps in memory the current secure session,
the previous secure session, and the next secure session for the case of an unconfirmed session
in api crate (state.rs) i track sessions with different state separately, sessions in Transport state can't be replaced by unconfirmed sessions, once Transport is created there is a call handle_established that evicts older sessions. that covers current secure session (Transport type) and separate unconfirmed session. so initiator or responder will never evict established session until new session transitions to Transport state
there is currently an expectation that both sides will initiate connect before sending messages, and i track 1 of each initiated and accepted instead of previous secure session. it addresses a problem that older session may get evicted during rekey while there are inflight messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though this last part is not very intuitive, i will check how to transition to simply storing 2 last established sessions instead
| initiating_sessions: HashMap::new(), | ||
| responding_sessions: HashMap::new(), | ||
| transport_sessions: HashMap::new(), | ||
| last_open_session_by_public_key: HashMap::new(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does open mean? Who opened it? Can we use initiator/responder-based word?
Does Open mean not yet established?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be established or transport last_transport_session_by_public_key
|
|
||
| self.transport_sessions.insert(session_id, transport); | ||
|
|
||
| terminated_sessions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not obvious without comments why the function called handle_established returns terminated_sessions also in
monad-bft/monad-wireauth-api/src/api.rs
Line 204 in 98a6b9b
| let terminated_sessions = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those are replaced sessions, i will take a look if it can be simplified
|
|
||
| let mut initiator = self | ||
| .state | ||
| .remove_initiator(&response.receiver_index.get().into()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the State removed before the response is authenticated/validated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, i forgot to update this method to use get -> validate -> remove , similarly to decrypt_data_packet
| response: &mut HandshakeResponse, | ||
| remote_addr: SocketAddr, | ||
| ) -> Result<()> { | ||
| if !self.check_under_load(remote_addr, response.sender_index.get(), response)? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In WireGuard, verify MAC1 first — always. Check the MAC2 if the host is under load.
Here, first check is whether we are under load, and then check MAC1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i will update to run mac1 check before this, the same way it is done in accept_handshake_init
| } | ||
|
|
||
| let duration_since_start = self.context.duration_since_start(); | ||
| let local_index = self.state.allocate_session_index(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We allocate and insert a new index here. What if we allocate it and then fail on validate_init? My understanding is that the error is returned, but handle_terminate is not called, leaving the index permanently marked as “allocated.”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats a mistake
|
@dshulyak When can I review the fixes for the findings? I think it makes sense to do one more round of the security review after you fix all current findings/comments. |
|
@dnkolegov-ar i will push changes by wed 29th |
implementation of the #2091