Skip to content

Fix size types #943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 8, 2016
Merged

Conversation

akosthekiss
Copy link
Member

Related issue #942

@akosthekiss akosthekiss added the bug Undesired behaviour label Mar 8, 2016
@akosthekiss
Copy link
Member Author

Note: this PR does not only solve the related issue but does a lot more. The reason is that the issue pointed at a problem (incorrect or illogical use of various size_t types) that could have been solved two ways: either with a quick hack (no), or by going through (hopefully) all ecma-to-utf8 string conversion routines (and some more) and applying and propagating all required changes (that's the proper way to go, IMHO).

* @return number of bytes, actually copied to the buffer - if string's content was copied successfully;
* otherwise (in case size of buffer is insufficient) - negative number, which is calculated
* as negation of buffer size, that is required to hold the string's content.
* @return number of bytes, actually copied to the buffer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will happen if the buffer is too small?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we fail with an assert (in debug). That's aligned with other similar functions (e.g., ecma_uint32_to_utf8_string).

Actually, at almost all call sites, we rely on the buffer being preallocated to the exact size of the string. Perhaps the only counter example is ecma_compare_ecma_strings_relational, where a buffer of fixed size is available and there was a logic to decide whether to used that fixed buffer or allocate a new one, and that logic really relied on a negative return value from ecma_string_to_utf8_string, but it was to rewrite to use ecma_string_get_size instead.

@LaszloLango
Copy link
Contributor

This might take effect on performance too. @bzsolt, could you measure it on RPi2?

@bzsolt
Copy link
Member

bzsolt commented Mar 8, 2016

RPi 2 results
master: e926aa7

                  Benchmark |                   Perf(+ is better)
                 3d-cube.js |   2.538s ->   2.582s :  -1.734%  (+-1.268%) : [-]
     access-binary-trees.js |   1.366s ->   1.356s :  +0.732%  (+-1.163%) : [≈]
         access-fannkuch.js |   7.654s ->   7.616s :  +0.496%  (+-0.852%) : [≈]
            access-nbody.js |   2.938s ->   2.930s :  +0.272%  (+-1.038%) : [≈]
bitops-3bit-bits-in-byte.js |   1.666s ->   1.682s :  -0.960%  (+-1.517%) : [≈]
     bitops-bits-in-byte.js |   2.404s ->   2.456s :  -2.163%  (+-0.671%) : [-]
      bitops-bitwise-and.js |   3.274s ->   3.248s :  +0.794%  (+-1.174%) : [≈]
      bitops-nsieve-bits.js |  20.134s ->  20.110s :  +0.119%  (+-0.829%) : [≈]
   controlflow-recursive.js |   0.920s ->   0.930s :  -1.087%  (+-0.000%) : [-]
              crypto-aes.js |   3.544s ->   3.568s :  -0.677%  (+-0.924%) : [≈]
              crypto-md5.js |   7.130s ->   7.126s :  +0.056%  (+-1.033%) : [≈]
             crypto-sha1.js |   3.888s ->   3.902s :  -0.360%  (+-1.288%) : [≈]
       date-format-tofte.js |   2.184s ->   2.182s :  +0.092%  (+-1.154%) : [≈]
       date-format-xparb.js |   0.880s ->   0.864s :  +1.818%  (+-1.282%) : [+]
             math-cordic.js |   2.862s ->   2.902s :  -1.398%  (+-1.122%) : [-]
       math-partial-sums.js |   1.586s ->   1.588s :  -0.126%  (+-1.590%) : [≈]
      math-spectral-norm.js |   1.428s ->   1.442s :  -0.980%  (+-0.916%) : [-]
           string-base64.js |  31.038s ->  31.048s :  -0.032%  (+-1.075%) : [≈]
            string-fasta.js |   3.444s ->   3.360s :  +2.439%  (+-1.068%) : [+]
            Geometric mean: |                Speed up: -0.136% (+-0.253%) : [≈]

Binary sizes (bytes): 187,160 -> 187,164

@LaszloLango
Copy link
Contributor

@bzsolt, thanks. @akiss77, LGTM. Please squash before merge.

@akosthekiss
Copy link
Member Author

Honestly, I've expected both size and performance gain. Two dead functons have been eliminated, and ecma_string_to_utf8_string (which seems to be a quite often called function) does not call ecma_string_get_size anymore in release. I expected these to show somewhere. :(

Anyway, the compiler error is fixed at least, as well as abuse of the size_t types.

@zherczeg
Copy link
Member

zherczeg commented Mar 8, 2016

I am happy you spent time on checking this. Thank you.
LGTM

E.g.,
* `ssize_t` was used where `lit_utf8_size_t` or `jerry_api_size_t`
  would have been correct,
* `lit_utf8_size_t` was used where `ecma_length_t` would have been
  correct.

Note, the patch also includes internal and public API changes:
* `ecma_string_to_utf8_string` does not return negative value if
   output buffer is not large enough to contain the string; the
   buffer is expected to be large enough. (`ecma_string_get_size`
   can be used to retrieve the required size.)
* `jerry_api_string_to_char_buffer` adapts the same logic (and
  `jerry_api_get_string_size` can be used to determine the
  required size of the buffer).

Related issue: jerryscript-project#942

JerryScript-DCO-1.0-Signed-off-by: Akos Kiss akiss@inf.u-szeged.hu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Undesired behaviour
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants