-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for escaping OsStr
#9
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
src/lib.rs
Outdated
if cfg!(unix) || env::var("MSYSTEM").is_ok() { | ||
unix::escape_os_str(s) | ||
} else { | ||
unimplemented!("windows::escape_os_str") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this can land without an implementation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think you can use std::os::windows::ffi::OsStrExt::encode_wide here to do the escaping then convert it back using std::os::windows::ffi::OsStringExt::from_wide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NobodyXu I've added a first draft of an implementation of escape_os_str
for Windows. Whenever you can find the time, I'd really appreciate a review and some assistance with a test case for test_escape_os_str_from_bytes
.
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
src/lib.rs
Outdated
/// assert!(!needs_escape(b'\\' as u16)); | ||
/// ``` | ||
pub fn needs_escape(wide_byte: u16) -> bool { | ||
let (high, low) = ((wide_byte >> 8) as u8, (wide_byte & 0xFF) as u8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use char::from_u16 here instead?
src/lib.rs
Outdated
/// ``` | ||
/// [msdn]: http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx | ||
pub fn escape_os_str(s: &OsStr) -> Cow<'_, OsStr> { | ||
let encoded: Vec<u16> = s.encode_wide().collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let encoded: Vec<u16> = s.encode_wide().collect(); | |
let encoded: Vec<u16> = s.encode_wide(); |
Let's avoid this collect and instead clone the iterator.
src/lib.rs
Outdated
let mut chars = encoded.into_iter().peekable(); | ||
loop { | ||
let mut nslashes = 0; | ||
while let Some(&c) = chars.peek() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use Peekable::next_if_eq here
src/lib.rs
Outdated
break; | ||
} | ||
} | ||
match chars.next() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could rewrite this into a for loop and execute the None branch after the for loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm. Do you think this refactoring improves readability significantly?
Also, aren't we iterating at a dynamic rate (i.e. we skip through runs of \\
s but process non-backslashes one character at a time? I could be wrong but I feel like this "run-length encoding" kind of implementation is probably more readable as it stands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I just realize that you cannot refactor into a for loop due to use of peekable.
Thanks for the quick review. I've applied some suggested changes and finished the tests. lmk what you think. |
src/windows.rs
Outdated
/// [msdn]: http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx | ||
pub fn escape(s: Cow<str>) -> Cow<str> { | ||
let needs_escape = s.chars().any(|c| matches!(c, '"' | '\t' | '\n' | ' ')); | ||
if s.is_empty() || !needs_escape { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little bit confused on why we should for s.is_empty() || !needs_escape
here in windows, but check for !s.is_empty() && needs_escape
on unix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they should be functionally equivalent but I've refactored them to look pretty similar now. lmk what you think.
|
||
use std::borrow::Cow; | ||
use std::env; | ||
#[cfg(unix)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making these modules conditionally compiled is a breaking change.
let encoded = s.encode_wide(); | ||
let needs_escaping = encoded.clone().any(needs_escape); | ||
|
||
if s.is_empty() || !needs_escaping { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a typo
if s.is_empty() || !needs_escaping { | |
if !s.is_empty() && !needs_escaping { |
Hi, thanks for this project!
A recent PR in
openssh-rust/openssh
provides an implementation forescape(&OsStr) -> Cow<'_, OsStr>
that I think could be a good fit here.It was suggested in a code review that it is preferable to use
shell_escape
for all the escaping needs rather than rolling out a custom implementation. However, currenlyshell_escape
only works withCow<str>
. This PR is an attempt to extend the escaping toOsStr
s as well.iiuc #5 is a relevant issue that we may be able to resolve with this PR.
Let me know if you have any suggestions/comments about these changes and I'll be happy to improve them accordingly.