resize strings after parsing #260

Merged
flori merged 2 commits into ruby:master from tenderlove:memuse on Jun 11, 2016
Conversation

@tenderlove
Member

The parser uses rb_str_buf_new to allocate new strings.
rb_str_buf_new has a minimum size of 128 bytes and is not an embedded
string (https://github.com/ruby/ruby/blob/9949407fd90c1c5bfe332141c75db995a9b867aa/string.c#L1119-L1135).
This causes applications that parse JSON to allocate extra memory when parsing short strings.
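
As a rough illustration of the per-string overhead (a sketch, not part of the patch), the snippet below compares the reported size of a short string produced by the parser with an equivalent string literal. The sample input `["abc"]` is an assumption chosen for brevity, and the numbers depend on the Ruby and json versions in use; with a json build that already includes this change, the two sizes may well be identical.

```ruby
require 'objspace'
require 'json'

# A short string that comes out of the JSON parser.
parsed = JSON.parse('["abc"]').first

# The same content as a plain literal, which Ruby can embed in the object slot.
literal = "abc"

# memsize_of reports the bytes attributed to a single object.
parsed_size  = ObjectSpace.memsize_of(parsed)
literal_size = ObjectSpace.memsize_of(literal)

puts "parsed:  #{parsed_size} bytes"
puts "literal: #{literal_size} bytes"
```

A large gap between the two numbers suggests the parsed string is carrying an oversized heap buffer rather than being embedded.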

For a real-world example, we can use the mime-types gem. The mime-types
gem stores all mime types inside a JSON file and parses them when you
require the gem.

Here is a sample program:

require 'objspace'
require 'mime-types'

GC.start
GC.start

p ObjectSpace.memsize_of_all String

The example program loads the mime-types gem and outputs the total space
used by all strings. Here are the results of the program before and
after this patch:

** Before **

[aaron@TC json (memuse)]$ ruby test.rb
5497494
[aaron@TC json (memuse)]$

** After **

[aaron@TC json (memuse)]$ ruby -I lib:ext test.rb
3335862
[aaron@TC json (memuse)]$

This change results in a ~40% reduction of memory use for strings in the
mime-types gem.
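
For reference, the percentage can be checked directly from the two measurements above. This is only the arithmetic, using the numbers printed by the test program:

```ruby
# Totals reported by ObjectSpace.memsize_of_all(String), in bytes.
before = 5_497_494
after  = 3_335_862

saved     = before - after
reduction = saved.to_f / before * 100

puts "saved #{saved} bytes"
puts format("reduction: %.1f%%", reduction)  # prints "reduction: 39.3%"
```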

Thanks @matthewd for finding the problem, and @nobu for the patch!

@tenderlove
Member Author

I should have GC'd first, but luckily the percent savings are about the same so I'll update the commit message:

[aaron@TC json (memuse)]$ cat test.rb
require 'objspace'
require 'mime-types'

GC.start
GC.start

p ObjectSpace.memsize_of_all String
[aaron@TC json (memuse)]$ ruby test.rb
5497494
[aaron@TC json (memuse)]$ ruby -I lib:ext test.rb
3335862
[aaron@TC json (memuse)]$
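
For anyone who wants to reproduce the measurement without installing mime-types, here is a minimal sketch along the same lines. The helper name `string_bytes_after_gc` and the synthetic document of 10,000 short strings are assumptions made for illustration; they stand in for the mime-types data and are not taken from the gem.

```ruby
require 'objspace'
require 'json'

# Hypothetical helper: run GC twice (as in test.rb) so that garbage strings
# are not counted, then report the total size of all live Strings.
def string_bytes_after_gc
  GC.start
  GC.start
  ObjectSpace.memsize_of_all(String)
end

baseline = string_bytes_after_gc

# Synthetic stand-in for a JSON file full of short strings.
doc    = JSON.generate((1..10_000).map { |i| "type-#{i}" })
parsed = JSON.parse(doc)

retained = string_bytes_after_gc - baseline
puts "parsed #{parsed.length} strings"
puts "String bytes retained since baseline: #{retained}"
```

Running it once against the installed json and once with `-I lib:ext` from a checkout of this branch should give a comparable before/after pair, though the absolute numbers will differ from the mime-types results.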

@tenderlove
Member Author

bump / @flori

@tenderlove force-pushed the memuse branch 2 times, most recently from f458204 to f1795d9 on March 3, 2016 22:49
@tenderlove
Member Author

@flori I've rebased this against master and separated the .rl file change from the generated parser change. How can I help get this merged?

@tenderlove
Member Author

@flori I've rebased this against master again. How can I help get this merged? Is there something more I need to do? Thanks.
