Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for \x?? and improve escapes #84

Merged
merged 1 commit into from
Feb 13, 2018
Merged

Add support for \x?? and improve escapes #84

merged 1 commit into from
Feb 13, 2018

Conversation

torkleyy
Copy link
Contributor

@torkleyy torkleyy commented Jan 21, 2018

This PR seems to contain a regression.

Fixes #82

@torkleyy
Copy link
Contributor Author

This was a stupid mistake, fixed it now :)

This is ready for review.

@torkleyy torkleyy removed the working label Jan 21, 2018
@torkleyy
Copy link
Contributor Author

r? @pyfisch

@pyfisch
Copy link
Contributor

pyfisch commented Jan 21, 2018

Thanks. How about using char::escape_debug instead for both chars and strings.
There is also a u32::from_str_radix to parse hexadecimal numbers.
Rust itself only support very few special escapes and RON should IMHO not invent new ones (or copy them from JSON).

@torkleyy
Copy link
Contributor Author

torkleyy commented Jan 21, 2018 via email

@torkleyy
Copy link
Contributor Author

@pyfisch Is there a reason you suggested escape_debug over e.g. escape_default?

@pyfisch
Copy link
Contributor

pyfisch commented Feb 10, 2018

@torkleyy That I like to be able to read Unicode characters. (escape_unicode escapes just everything)

@torkleyy
Copy link
Contributor Author

torkleyy commented Feb 10, 2018 via email

@kvark
Copy link
Collaborator

kvark commented Feb 11, 2018

bors delegate=pyfisch

@bors
Copy link
Contributor

bors bot commented Feb 11, 2018

✌️ pyfisch can now approve this pull request

@torkleyy
Copy link
Contributor Author

torkleyy commented Feb 11, 2018

I finally finished this :) r? @pyfisch @kvark

Will squash once approved.

Copy link
Contributor

@pyfisch pyfisch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Added a few comments.

src/parse.rs Outdated
@@ -94,7 +95,7 @@ impl<'a> Bytes<'a> {
let c = self.eat_byte()?;

if c != b'\\' && c != b'\'' {
return self.err(ParseError::InvalidEscape);
return self.err(ParseError::InvalidEscape("TODO"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind to resolve the TODO?

src/parse.rs Outdated
Ok(n)
}

/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the commented out code.

src/ser/mod.rs Outdated

fn serialize_escaped_str(&mut self, value: &str) {
self.output += "\"";
self.output += &value.chars().flat_map(|c| c.escape_default()).collect::<String>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid the string allocation and immediately push the characters to the output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err right, I was trying it out and forgot about it then. Thanks!

let s: String = (1..128).into_iter().flat_map(from_u32).collect();

check_same(s);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some test with non-ascii characters would be nice.

For example and ß. (They should not be escaped)

src/parse.rs Outdated
}
if bytes == 0 {
return self.err(ParseError::InvalidEscape(
"Null character (\\u{0}) not allowed",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the null character not allowed? "foo\0bar" is a valid string in Rust.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should allow it in RON though, since that would make usage in e.g. C really difficult.

src/parse.rs Outdated

fn parse_str_escape(&mut self, store: &mut Vec<u8>) -> Result<()> {
use std::iter::repeat;

match self.eat_byte()? {
b'\'' => store.push(b'\''),
b'"' => store.push(b'"'),
b'\\' => store.push(b'\\'),
b'b' => store.push(b'\x08'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these escapes are no standard Rust escapes.

@torkleyy
Copy link
Contributor Author

Okay, addressed all comments except one (about \0).

@pyfisch
Copy link
Contributor

pyfisch commented Feb 11, 2018

I don't like the current handling of NUL. Currently the string "foo\0bar" will successfully be serialized. But it will fail deserialization.

My assumption is that I can serialize any Rust data type with RON and deserialize it again and get an equivalent representation. Special casing strings with a NUL character breaks this assumption.

@torkleyy
Copy link
Contributor Author

You're right that serialization and deserialization should allow the same things. If we decide to exclude \0 I will add a check to the serialization.

@kvark @ozkriff @Xaeroxe What do you think about this?

My main concern is that adding support for \0 would make using RON in e.g. C really difficult. Some parsers for e.g. JSON have to use special hacks and callbacks to make \0 inside a string work.

@pyfisch
Copy link
Contributor

pyfisch commented Feb 11, 2018

My primary concern is that if we exclude \0 somewhere in Servo a string with a \0 will occur and on serialization RON panics. In this case servo will probably be forced to use a different format without this quirk. (I think the current use cases for debugging in Servo and WebRender do not allow strings with \0 but it is better to be safe)

@torkleyy
Copy link
Contributor Author

It doesn't panic, just gives you an error. And really, it doesn't make any sense to include \0 in a string. If a format supports that it's a bug, not a feature.

@kvark
Copy link
Collaborator

kvark commented Feb 11, 2018

C interop and serialization are somewhat orthogonal. I could see a problem with other languages implementing RON support and treating this differently. For that we should have NULL handling defined as implementation dependent. There are cases in OpenGL/Vulkan spec where they do the same: it's not undefined behavior, and it's not standard, it's a defined per implementation behavior.

TL;DR: we should support \0 in this implementation.

Copy link
Contributor

@pyfisch pyfisch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect!

@pyfisch
Copy link
Contributor

pyfisch commented Feb 13, 2018

bors delegate+

Merge it after squashing.

@bors
Copy link
Contributor

bors bot commented Feb 13, 2018

✌️ torkleyy can now approve this pull request

Add support for "\x??"

Add more tests, fix bug

Finish string escapes

Add non ascii test, fix nits

Remove non-standard escapes

Implement char escapes

Allow NUL
@torkleyy
Copy link
Contributor Author

bors r=pyfisch

bors bot added a commit that referenced this pull request Feb 13, 2018
84: Add support for \x?? and improve escapes r=pyfisch a=torkleyy

~~**This PR seems to contain a regression.**~~

Fixes #82
@bors
Copy link
Contributor

bors bot commented Feb 13, 2018

Build succeeded

@bors bors bot merged commit 2e19ca2 into ron-rs:master Feb 13, 2018
@torkleyy torkleyy deleted the x branch February 13, 2018 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants