Skip to content

Improve the string descriptor. #1036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 9, 2016

Conversation

robertsipka
Copy link
Contributor

Separate the utf-8 and ascii strings.
In case of ascii strings the size is equal to the length, so we able to store this information
in the string descriptor instead of using the string header to store it.

JerryScript-DCO-1.0-Signed-off-by: Robert Sipka rsipka.uszeged@partner.samsung.com

mem_cpointer_t ascii_collection_cp;
/** Size of ascii string in bytes */
uint16_t size;
} ascii_string;
Copy link
Member

@zherczeg zherczeg May 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ascii_string_t, and missing comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

ECMA_STRING_CONTAINER_HEAP_CHUNKS, /**< actual data is on the heap
in a ecma_collection_chunk_t chain */
ECMA_STRING_CONTAINER_HEAP_UTF8_STRING, /**< actual data is on the heap as an utf-8 string */
ECMA_STRING_CONTAINER_HEAP_ASCII_STRING, /**< actual data is on the heap as an ascii string */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put ASCII_STRING first.

ECMA_STRING_CONTAINER_HEAP_NUMBER, /**< actual data is on the heap as a ecma_number_t */
ECMA_STRING_CONTAINER_HEAP_UTF8_STRING, /**< actual data is on the heap as an utf-8 string */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this right after the ECMA_STRING_CONTAINER_HEAP_ASCII_STRING.

@robertsipka
Copy link
Contributor Author

Thanks for the comments, I've updated and rebased with master.

{
string_desc_p->refs_and_container = ECMA_STRING_CONTAINER_HEAP_ASCII_STRING | ECMA_STRING_REF_ONE;
const size_t data_size = new_size;
void *data_p = (void *) mem_heap_alloc_block (data_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you use lit_utf8_byte_t * here, if you cast void* to it anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I've updated these pointers.

@zherczeg
Copy link
Member

zherczeg commented May 6, 2016

What is the effect on memory of this patch?

@robertsipka
Copy link
Contributor Author

                  Benchmark |          RSS(+ is better) |                   Perf(+ is better)
                 3d-cube.js |   84k ->   84k :  +0.000% |   2.170s ->   2.156s :  +0.645%  (+-1.079%) : [≈]
     access-binary-trees.js |   72k ->   72k :  +0.000% |   1.192s ->   1.176s :  +1.342%  (+-1.215%) : [+]
         access-fannkuch.js |   32k ->   32k :  +0.000% |   6.300s ->   6.332s :  -0.508%  (+-0.908%) : [≈]
            access-nbody.js |   36k ->   36k :  +0.000% |   2.374s ->   2.348s :  +1.095%  (+-0.609%) : [+]
bitops-3bit-bits-in-byte.js |   24k ->   24k :  +0.000% |   1.364s ->   1.360s :  +0.293%  (+-0.824%) : [≈]
     bitops-bits-in-byte.js |   24k ->   24k :  +0.000% |   2.036s ->   2.030s :  +0.295%  (+-0.552%) : [≈]
      bitops-bitwise-and.js |   24k ->   24k :  +0.000% |   2.798s ->   2.858s :  -2.144%  (+-1.475%) : [-]
      bitops-nsieve-bits.js |  164k ->  164k :  +0.000% |   3.932s ->   3.906s :  +0.661%  (+-1.164%) : [≈]
   controlflow-recursive.js |  124k ->  124k :  +0.000% |   0.802s ->   0.800s :  +0.249%  (+-1.145%) : [≈]
              crypto-aes.js |  104k ->  104k :  +0.000% |   2.874s ->   2.868s :  +0.209%  (+-0.716%) : [≈]
              crypto-md5.js |  168k ->  168k :  +0.000% |   1.180s ->   1.180s :  +0.000%  (+-0.000%) : [≈]
             crypto-sha1.js |  116k ->  116k :  +0.000% |   1.116s ->   1.118s :  -0.179%  (+-1.306%) : [≈]
       date-format-tofte.js |   56k ->   56k :  +0.000% |   1.716s ->   1.716s :  +0.000%  (+-0.929%) : [≈]
       date-format-xparb.js |   52k ->   52k :  +0.000% |   0.722s ->   0.720s :  +0.277%  (+-1.272%) : [≈]
             math-cordic.js |   28k ->   28k :  +0.000% |   2.446s ->   2.442s :  +0.164%  (+-1.031%) : [≈]
       math-partial-sums.js |   24k ->   24k :  +0.000% |   1.324s ->   1.312s :  +0.906%  (+-1.094%) : [≈]
      math-spectral-norm.js |   32k ->   32k :  +0.000% |   1.196s ->   1.188s :  +0.669%  (+-1.212%) : [≈]
           string-base64.js |  156k ->  148k :  +5.128% |   5.234s ->   5.278s :  -0.841%  (+-0.737%) : [-]
            string-fasta.js |   40k ->   40k :  +0.000% |   2.662s ->   2.676s :  -0.526%  (+-0.951%) : [≈]
            Geometric mean: |     RSS reduction: 0.277% |                 Speed up: 0.140% (+-0.233%) : [≈]

@zherczeg
Copy link
Member

zherczeg commented May 6, 2016

Cool a string based test improved!

@LaszloLango
Copy link
Contributor

LaszloLango commented May 6, 2016

@robertsipka, could you also compare with '--mem-stats'? @bzsolt can help if you don't know how to.

@robertsipka robertsipka force-pushed the ascii_string branch 2 times, most recently from 448d2ee to cee537e Compare May 6, 2016 11:22
@robertsipka
Copy link
Contributor Author

Benchmark Mem heap (bytes)
3d-cube.js 32760 -> 32752
access-binary-trees.js 49144 -> 49144
access-fannkuch.js 8184 -> 8184
access-nbody.js 8184 -> 8184
bitops-3bit-bits-in-byte.js 928 -> 920
bitops-bits-in-byte.js 872 -> 864
bitops-bitwise-and.js 560 -> 552
bitops-nsieve-bits.js 142736 -> 142728
controlflow-recursive.js 11112 -> 11104
crypto-aes.js 65544 -> 65544
crypto-md5.js 140920 -> 141048
crypto-sha1.js 91680 -> 91648
date-format-tofte.js 24568 -> 24568
date-format-xparb.js 16400 -> 16384
math-cordic.js 2384 -> 2376
math-partial-sums.js 1864 -> 1856
math-spectral-norm.js 8184 -> 8176
string-base64.js 93664 -> 93624
string-fasta.js 16376 -> 16376

string_desc_p->refs_and_container = ECMA_STRING_CONTAINER_HEAP_UTF8_STRING | ECMA_STRING_REF_ONE;
const size_t data_size = new_size + sizeof (ecma_string_heap_header_t);
ecma_string_heap_header_t *data_p = (ecma_string_heap_header_t *) mem_heap_alloc_block (data_size);
lit_utf8_size_t bytes_copied = ecma_string_to_utf8_string (string1_p,
(lit_utf8_byte_t *) (data_p + 1),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong indentation

mem_cpointer_t ascii_collection_cp;
/** Size of ascii string in bytes */
uint16_t size;
} ascii_string_t;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to ascii_string, this is not a typedef

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

@LaszloLango
Copy link
Contributor

LGTM

1 similar comment
@zherczeg
Copy link
Member

zherczeg commented May 9, 2016

LGTM

Separate the utf-8 and ascii strings.
In case of ascii strings the size is equal to the length, so we able to store this information
in the string descriptor instead of using the string header to store it.

JerryScript-DCO-1.0-Signed-off-by: Robert Sipka rsipka.uszeged@partner.samsung.com
@robertsipka
Copy link
Contributor Author

Thanks, I've rebased it with master.

@LaszloLango LaszloLango merged commit de1912f into jerryscript-project:master May 9, 2016
@robertsipka robertsipka deleted the ascii_string branch November 24, 2016 08:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants