-
Notifications
You must be signed in to change notification settings - Fork 683
Global object unescape routine #606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
/* 3. */ | ||
MEM_DEFINE_LOCAL_ARRAY (input_start_p, input_size, lit_utf8_byte_t); | ||
ecma_string_to_utf8_string (input_string_p, input_start_p, (ssize_t) (input_size)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you, please, add assertion that the return value is not negative, as in 7064a2c?
dad3532
to
fb045a1
Compare
I've updated the patch |
/* 3. */ | ||
MEM_DEFINE_LOCAL_ARRAY (input_start_p, input_size, lit_utf8_byte_t); | ||
ssize_t sz = ecma_string_to_utf8_string (input_string_p, input_start_p, (ssize_t) (input_size)); | ||
JERRY_ASSERT (sz > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string's size can be zero.
fb045a1
to
5e287fa
Compare
Please try unescape("%ud800\udc00") === "\ud800\udc00" |
|
||
lit_utf8_size_t lit_size = lit_code_point_to_utf8 (code_point, output_char_p); | ||
output_char_p += lit_size; | ||
output_length += lit_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that the output_length
is synchronized with output_char_p
. In the case, we could remove output_length
and calculate it as output_char_p - input_start_p
.
5e287fa
to
1824e76
Compare
lit_utf8_size_t input_size = ecma_string_get_size (input_string_p); | ||
|
||
/* 3. */ | ||
MEM_DEFINE_LOCAL_ARRAY (input_start_p, input_size, lit_utf8_byte_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment why input_size >= output_size, and an assert at the end. I don't think this is trivial.
E.g. %xx is 3 byte long, and the maximum is 0xff, which encoded as 2 bytes in UTF8. Etc.
1824e76
to
fe63e79
Compare
I've updated the patch. |
/* The length of input string is always greater than output string | ||
* so we re-use the input string buffer. | ||
* E.g. %xx is 3 byte long, and the maximum is 0xff, which encoded | ||
* as 2 bytes in UTF8. Etc. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just suggested an text idea. I think this would sound better:
The %xx is three byte long, and the maximum encoded value is 0xff, which maximum encoded length is two byte. Similar to this, the maximum encoded length of %uxxxx is four byte.
One minor thing and LGTM after that. |
f0f0e4a
to
b2ede9d
Compare
@zherczeg , do we you have other any comments? |
This patch conflits with CESU8. I would wait to land that first, update this, and land it. |
@zherczeg CESU8 landed |
664e4de
to
8d6dabe
Compare
I've updated the patch. |
* 8 found valid '%uwxyz' pattern | ||
*/ | ||
uint8_t status = 0; | ||
lit_code_point_t hex_digits = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's enough to use an ecma_char_t
for hex_digits
, the decoded value will always be less than 0xFFFF.
8d6dabe
to
494c5ad
Compare
LGTM |
|
LGTM |
494c5ad
to
846bfb0
Compare
JerryScript-DCO-1.0-Signed-off-by: Szilard Ledan szledan.u-szeged@partner.samsung.com
846bfb0
to
36e90d9
Compare
JerryScript-DCO-1.0-Signed-off-by: Szilard Ledan szledan.u-szeged@partner.samsung.com