Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for escaping OsStr #9

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

aalekhpatel07
Copy link

Hi, thanks for this project!

A recent PR in openssh-rust/openssh provides an implementation for escape(&OsStr) -> Cow<'_, OsStr> that I think could be a good fit here.

It was suggested in a code review that it is preferable to use shell_escape for all the escaping needs rather than rolling out a custom implementation. However, currenly shell_escape only works with Cow<str>. This PR is an attempt to extend the escaping to OsStrs as well.

iiuc #5 is a relevant issue that we may be able to resolve with this PR.

Let me know if you have any suggestions/comments about these changes and I'll be happy to improve them accordingly.

Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
src/lib.rs Outdated
if cfg!(unix) || env::var("MSYSTEM").is_ok() {
unix::escape_os_str(s)
} else {
unimplemented!("windows::escape_os_str")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can land without an implementation here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think you can use std::os::windows::ffi::OsStrExt::encode_wide here to do the escaping then convert it back using std::os::windows::ffi::OsStringExt::from_wide.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NobodyXu I've added a first draft of an implementation of escape_os_str for Windows. Whenever you can find the time, I'd really appreciate a review and some assistance with a test case for test_escape_os_str_from_bytes.

Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
Signed-off-by: Aalekh Patel <aalekh.gwpeck.7998@icloud.com>
@aalekhpatel07 aalekhpatel07 marked this pull request as draft February 11, 2023 22:15
@aalekhpatel07 aalekhpatel07 marked this pull request as ready for review February 14, 2023 03:38
src/lib.rs Outdated
/// assert!(!needs_escape(b'\\' as u16));
/// ```
pub fn needs_escape(wide_byte: u16) -> bool {
let (high, low) = ((wide_byte >> 8) as u8, (wide_byte & 0xFF) as u8);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use char::from_u16 here instead?

src/lib.rs Outdated
/// ```
/// [msdn]: http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx
pub fn escape_os_str(s: &OsStr) -> Cow<'_, OsStr> {
let encoded: Vec<u16> = s.encode_wide().collect();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let encoded: Vec<u16> = s.encode_wide().collect();
let encoded: Vec<u16> = s.encode_wide();

Let's avoid this collect and instead clone the iterator.

src/lib.rs Outdated
let mut chars = encoded.into_iter().peekable();
loop {
let mut nslashes = 0;
while let Some(&c) = chars.peek() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use Peekable::next_if_eq here

src/lib.rs Outdated
break;
}
}
match chars.next() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could rewrite this into a for loop and execute the None branch after the for loop.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. Do you think this refactoring improves readability significantly?

Also, aren't we iterating at a dynamic rate (i.e. we skip through runs of \\s but process non-backslashes one character at a time? I could be wrong but I feel like this "run-length encoding" kind of implementation is probably more readable as it stands.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I just realize that you cannot refactor into a for loop due to use of peekable.

@aalekhpatel07
Copy link
Author

Thanks for the quick review. I've applied some suggested changes and finished the tests. lmk what you think.

src/windows.rs Outdated
/// [msdn]: http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx
pub fn escape(s: Cow<str>) -> Cow<str> {
let needs_escape = s.chars().any(|c| matches!(c, '"' | '\t' | '\n' | ' '));
if s.is_empty() || !needs_escape {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit confused on why we should for s.is_empty() || !needs_escape here in windows, but check for !s.is_empty() && needs_escape on unix.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they should be functionally equivalent but I've refactored them to look pretty similar now. lmk what you think.


use std::borrow::Cow;
use std::env;
#[cfg(unix)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making these modules conditionally compiled is a breaking change.

let encoded = s.encode_wide();
let needs_escaping = encoded.clone().any(needs_escape);

if s.is_empty() || !needs_escaping {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo

Suggested change
if s.is_empty() || !needs_escaping {
if !s.is_empty() && !needs_escaping {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants