-
Notifications
You must be signed in to change notification settings - Fork 411
Define NodeAlias
struct and Display
impl
#1544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a tiny formatting nit.
Also thought whether we want to use this in ChannelManager
and introduce more checks for user-defined aliases via using a constructor (e.g., enforcing a non-empty alias), but that's probably too opinionated and it's best to keep things simple and local here.
lightning/src/routing/gossip.rs
Outdated
@@ -923,6 +923,47 @@ impl_writeable_tlv_based!(NodeAnnouncementInfo, { | |||
(10, addresses, vec_type), | |||
}); | |||
|
|||
/// A user-defined name for a node, which may be used when displaying the node in a graph. | |||
/// | |||
/// Node aliases provide a potential avenue for injection attacks. See [security considerations]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should specify this, its not like its complicated - the node alias is set by random third party nodes, care needs to be taken with any processing of it period.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, missed this
lightning/src/routing/gossip.rs
Outdated
let alias = match core::str::from_utf8(bytes) { | ||
Ok(alias) => { | ||
alias.chars() | ||
.map(|c| if c.is_control() { control_symbol } else { c }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really worth mapping to replacement and such? Maybe its easier to just run to the first non-printable character and move on, it'd save us a chunk of code and be much easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meh, it's all pretty straight forward while the suggestion would result in surprising behavior, IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that surprising? You had garbage in the node alias, garbage being ignored seems fine. I'm not a huge fan of the amount of code here trying to make sure we print as much of the alias as we can even though the alias author shoved garbage in. That's not really our problem, we don't need to bend over backwards for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... the node in question isn't the user. Why should our users suffer for their bad behavior? If a node's alias contains a tab or newline, it would be surprising if we show a truncated or empty string, depending on its position.
I don't really understand the pushback here. This is pretty standard behavior and is a trivial amount of code. The alternative proposals are to truncate, as you initially suggested, or filter as you are now suggesting, which the only difference is swapping map
for filter
with a slightly simpler lambda. If you just don't like using a replacement character, that's fine, but I don't think the amount of code here is really a valid argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... the node in question isn't the user. Why should our users suffer for their bad behavior? If a node's alias contains a tab or newline, it would be surprising if we show a truncated or empty string, depending on its position.
Right, I guess I don't see it as punishing the user. I see it as punishing the node who shoved garbage in their alias. Eg a tab may break someone printing node aliases in a tab-delimited file, so it's totally reasonable to just remove the crap the node tried to shove into the next column.
I don't really understand the pushback here. This is pretty standard behavior and is a trivial amount of code.
Yea, I think it's mostly all the iterators and allocating a string. Writing is usually slow/allocating anyway, but it's annoying to allocate just to filter. More generally, I find a bunch of iterators and chaining to be really hard to read, especially when there's a limit applied based on an index calculated elsewhere. I'm probably special here so not gonna hold this up anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I guess I don't see it as punishing the user. I see it as punishing the node who shoved garbage in their alias. Eg a tab may break someone printing node aliases in a tab-delimited file, so it's totally reasonable to just remove the crap the node tried to shove into the next column.
Let's say confusing the user then. 😉 e.g., When comparing to what a node explorer may show, a malicious node could be similarly aliased with but with some control characters thrown in. Probably no real security concern, but such a node could attempt to masquerade as another and would at very least be confusing.
Yea, I think it's mostly all the iterators and allocating a string. Writing is usually slow/allocating anyway, but it's annoying to allocate just to filter. More generally, I find a bunch of iterators and chaining to be really hard to read, especially when there's a limit applied based on an index calculated elsewhere. I'm probably special here so not gonna hold this up anymore.
FWIW, the old sanitize_string
from the sample was allocating a string. Not sure if we can get around that with Formatter
's interface. I guess we could write char-by-char using char::encode_utf8
, though not sure how much more readable that would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, without any allocations:
diff --git a/lightning/src/routing/gossip.rs b/lightning/src/routing/gossip.rs
index 0369a87c..1e3be5dd 100644
--- a/lightning/src/routing/gossip.rs
+++ b/lightning/src/routing/gossip.rs
@@ -935,20 +935,24 @@ impl fmt::Display for NodeAlias {
let control_symbol = core::char::REPLACEMENT_CHARACTER;
let first_null = self.0.iter().position(|b| *b == 0).unwrap_or(self.0.len());
let bytes = self.0.split_at(first_null).0;
- let alias = match core::str::from_utf8(bytes) {
+ match core::str::from_utf8(bytes) {
Ok(alias) => {
- alias.chars()
- .map(|c| if c.is_control() { control_symbol } else { c })
- .collect::<String>()
+ for c in alias.chars() {
+ let mut bytes = [0u8; 4];
+ let c = if !c.is_control() { c } else { control_symbol };
+ f.write_str(c.encode_utf8(&mut bytes))?;
+ }
},
Err(_) => {
- bytes.iter().map(|b| *b as char)
+ for c in bytes.iter().map(|b| *b as char) {
// Display printable ASCII characters
- .map(|c| if c >= '\x20' && c <= '\x7e' { c } else { control_symbol })
- .collect::<String>()
+ let mut bytes = [0u8; 4];
+ let c = if c >= '\x20' && c <= '\x7e' { c } else { control_symbol };
+ f.write_str(c.encode_utf8(&mut bytes))?;
+ }
},
};
- f.write_str(&alias)
+ Ok(())
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a malicious node could be similarly aliased with but with some control characters thrown in. Probably no real security concern, but such a node could attempt to masquerade as another and would at very least be confusing.
Nodes can always alias with others? There's nothing stopping a node from putting whatever they want?
FWIW, the old sanitize_string from the sample was allocating a string.
Yea, I was more okay with it in the sample, which tends to be gratuitous with the allocations in a few places....it is just a sample, after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol, true... guess that example doesn't make much sense. Pushed a version without allocations.
Codecov Report
@@ Coverage Diff @@
## main #1544 +/- ##
==========================================
+ Coverage 90.94% 91.70% +0.75%
==========================================
Files 80 80
Lines 43469 47899 +4430
Branches 43469 47899 +4430
==========================================
+ Hits 39533 43925 +4392
- Misses 3936 3974 +38
Continue to review full report at Codecov.
|
LGTM, feel free to squash. |
Provide a wrapper struct for 32-byte node aliases, which implements Display for printing. Support the UTF-8 character encoding, but replace control characters and terminate at the first null character. Fall back to ASCII if the byte sequence is an invalid encoding.
be40a82
to
21aff6f
Compare
Provide a wrapper struct for 32-byte node aliases, which implements
Display
for printing. Support the UTF-8 character encoding, but replace control characters and terminate at the first null character. Fall back to ASCII if the byte sequence is an invalid encoding.This is being upstreamed from the sample node. But also adds support for invalid UTF-8 and replaces rather than filters control characters.