Skip to content

Storing byte code size in the byte code header. #948

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 16, 2016

Conversation

zherczeg
Copy link
Member

@zherczeg zherczeg commented Mar 9, 2016

This reduces the memory consumption, because the new allocator uses less memory if
the size as available when a block is freed. Snapshot generation is also simplified.

@LaszloLango LaszloLango added enhancement An improvement memory consumption Affects memory consumption labels Mar 9, 2016
@@ -1316,12 +1316,12 @@ void
ecma_bytecode_ref (ecma_compiled_code_t *bytecode_p) /**< byte code pointer */
{
/* Abort program if maximum reference number is reached. */
if ((bytecode_p->status_flags >> ECMA_BYTECODE_REF_SHIFT) >= 0x3ff)
if (bytecode_p->refs >= 0xffff)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use UINT16_MAX

@zherczeg zherczeg force-pushed the byte_code_size branch 2 times, most recently from 58ff53e to 7a46bea Compare March 10, 2016 13:22
@LaszloLango
Copy link
Contributor

@zherczeg, please rebase to the current master

@zherczeg
Copy link
Member Author

Raspberry Pi2 results:

                  Benchmark |          RSS(+ is better) |                   Perf(+ is better)
                 3d-cube.js |   80k ->   76k :  +5.000% |   2.530s ->   2.552s :  -0.870%  (+-0.892%) : [≈]
     access-binary-trees.js |   80k ->   76k :  +5.000% |   1.334s ->   1.348s :  -1.049%  (+-1.098%) : [≈]
         access-fannkuch.js |   40k ->   32k : +20.000% |   7.546s ->   7.596s :  -0.663%  (+-1.132%) : [≈]
            access-nbody.js |   40k ->   36k : +10.000% |   2.914s ->   2.920s :  -0.206%  (+-0.806%) : [≈]
bitops-3bit-bits-in-byte.js |   28k ->   24k : +14.286% |   1.666s ->   1.684s :  -1.080%  (+-0.963%) : [-]
     bitops-bits-in-byte.js |   28k ->   24k : +14.286% |   2.420s ->   2.472s :  -2.149%  (+-0.932%) : [-]
      bitops-bitwise-and.js |   32k ->   24k : +25.000% |   3.290s ->   3.260s :  +0.912%  (+-1.318%) : [≈]
      bitops-nsieve-bits.js |  152k ->  148k :  +2.632% |  20.138s ->  20.132s :  +0.030%  (+-1.035%) : [≈]
   controlflow-recursive.js |  140k ->  136k :  +2.857% |   0.926s ->   0.932s :  -0.648%  (+-1.578%) : [≈]
              crypto-aes.js |  108k ->  104k :  +3.704% |   3.544s ->   3.502s :  +1.185%  (+-0.575%) : [+]
              crypto-md5.js |  176k ->  164k :  +6.818% |   7.128s ->   7.180s :  -0.730%  (+-1.304%) : [≈]
             crypto-sha1.js |  124k ->  116k :  +6.452% |   3.884s ->   3.890s :  -0.154%  (+-0.712%) : [≈]
       date-format-tofte.js |   64k ->   60k :  +6.250% |   2.152s ->   2.168s :  -0.743%  (+-1.488%) : [≈]
       date-format-xparb.js |   56k ->   52k :  +7.143% |   0.856s ->   0.848s :  +0.935%  (+-2.399%) : [≈]
             math-cordic.js |   36k ->   32k : +11.111% |   2.898s ->   2.880s :  +0.621%  (+-1.360%) : [≈]
       math-partial-sums.js |   32k ->   24k : +25.000% |   1.584s ->   1.574s :  +0.631%  (+-1.004%) : [≈]
      math-spectral-norm.js |   36k ->   32k : +11.111% |   1.430s ->   1.432s :  -0.140%  (+-0.644%) : [≈]
           string-base64.js |  156k ->  152k :  +2.564% |  30.992s ->  30.930s :  +0.200%  (+-0.876%) : [≈]
            string-fasta.js |   40k ->   36k : +10.000% |   3.504s ->   3.386s :  +3.368%  (+-1.091%) : [+]
            Geometric mean: |    RSS reduction: 10.232% |                Speed up: -0.022% (+-0.273%) : [≈]

Binary sizes

master: 183,388
branch: 182,948

@LaszloLango
Copy link
Contributor

I'd use the 'u'suffix for number constants for unsigned variables and return types. LGTM besides that. Nice improvement.

__extension__ uint32_t is_run_global : 1; /**< flag, indicating whether the snapshot
* was dumped as 'Global scope'-mode code (true)
* or as eval-mode code (false) */
uint8_t is_run_global; /**< flag, indicating whether the snapshot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not simply use the type bool here? I don't see any alignment requirements or need for multiple flags that would necessitate an integer type with fixed bit-length.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like using bools in structures because its storage size is unclear to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First I thought that exact size does not matter here since the struct is not used that often. Now I realize that it does matter, since this header becomes part of the snapshot buffer and will be written to disk, so we will need to know the exact offsets of fields and data, etc. That's really a point against bool.

However, there might still be some issues with uint_8 and also with the way the snapshot buffer is constructed (in jerry.c). The size of a struct may differ from machine to machine, AFAIK it's up to the compiler to define alignments, padding, whatnot. Would this struct be packed, then its size would be 13 bytes, on a regular 32 bit machine it would most probably be 16 bytes in unpacked form, but actually there is no guarantee for that (and there may be exotic architectures where we might get a different result). Making the last field uint32_t makes things a bit more portable IMHO.

As for my just-realized concerns with jerry.c, I'll make a comment there.

@zherczeg
Copy link
Member Author

Issues fixed.

@zherczeg zherczeg force-pushed the byte_code_size branch 2 times, most recently from 6745833 to b263619 Compare March 11, 2016 15:14
@@ -165,7 +165,7 @@ util_free_literal (lexer_literal_t *literal_p) /**< literal */
{
if (!(literal_p->status_flags & LEXER_FLAG_SOURCE_PTR))
{
PARSER_FREE ((uint8_t *) literal_p->u.char_p);
mem_heap_free_block_size_stored ((uint8_t *) literal_p->u.char_p);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better cast this to void *

@akosthekiss
Copy link
Member

After the casts are fixed, LGTM (FWIW)

memory consumption, because the new allocator uses less memory if
the size as available when a block is freed. Snapshot generation
is also simplified.

JerryScript-DCO-1.0-Signed-off-by: Zoltan Herczeg zherczeg.u-szeged@partner.samsung.com
@zherczeg
Copy link
Member Author

I hope I fixed everything.

@LaszloLango
Copy link
Contributor

LGTM

@zherczeg zherczeg merged commit d190ca4 into jerryscript-project:master Mar 16, 2016
@zherczeg zherczeg deleted the byte_code_size branch March 16, 2016 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An improvement memory consumption Affects memory consumption
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants