html_escape: Avoid buffer allocation for strings with no escapable character #87

noteflakes · 2025-10-11T11:29:41Z

The existing html_escape implementation always allocates buffer space (6 times
the length of the input string), even when the input string does not contain any
character that needs to be escaped.

This PR modifies the implementation of optimized_escape_html so that it does
not pre-allocate an output buffer, but instead allocates it on the first
occurrence of a character that needs escaping. In addition, instead of copying
non-escaped characters one by one to the output buffer, contiguous segments of
non-escaped characters are copied using memcpy.
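The two changes described above can be illustrated with a minimal standalone sketch. It is not the code from this PR: it uses plain malloc instead of Ruby's allocation and string APIs, the names `escape_for` and `escape_html_lazy` are hypothetical, and the five-character escape table is an assumption based on what `html_escape` is commonly documented to escape.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Longest replacement below is 6 bytes ("&quot;"), so 6x the input
 * length is the worst-case output size. */
#define ESCAPE_MAX_LEN 6

/* Assumed escape table: returns the replacement for c and stores its
 * length in *len, or returns NULL if c needs no escaping. */
static const char *escape_for(char c, size_t *len) {
    switch (c) {
    case '&':  *len = 5; return "&amp;";
    case '<':  *len = 4; return "&lt;";
    case '>':  *len = 4; return "&gt;";
    case '"':  *len = 6; return "&quot;";
    case '\'': *len = 5; return "&#39;";
    default:   return NULL;
    }
}

/* Hypothetical helper. Returns NULL (and performs no allocation) when
 * src contains nothing to escape; otherwise returns a malloc'd,
 * NUL-terminated escaped copy that the caller must free. */
static char *escape_html_lazy(const char *src, size_t len, size_t *out_len) {
    char *buf = NULL;      /* allocated on the first escapable character */
    char *dst = NULL;      /* write cursor into buf */
    size_t seg_start = 0;  /* start of the pending unescaped segment */

    for (size_t i = 0; i < len; i++) {
        size_t rep_len;
        const char *rep = escape_for(src[i], &rep_len);
        if (!rep) continue;  /* keep extending the unescaped segment */

        if (!buf) {
            buf = malloc(len * ESCAPE_MAX_LEN + 1);
            if (!buf) abort();  /* sketch only: no graceful OOM handling */
            dst = buf;
        }

        /* Flush the unescaped segment in one memcpy, then the replacement. */
        memcpy(dst, src + seg_start, i - seg_start);
        dst += i - seg_start;
        memcpy(dst, rep, rep_len);
        dst += rep_len;
        seg_start = i + 1;
    }

    if (!buf) {  /* fast path: nothing was escaped, nothing was allocated */
        *out_len = len;
        return NULL;
    }

    /* Copy the trailing unescaped segment, if any. */
    memcpy(dst, src + seg_start, len - seg_start);
    dst += len - seg_start;
    assert((size_t)(dst - buf) <= len * ESCAPE_MAX_LEN);
    *dst = '\0';
    *out_len = (size_t)(dst - buf);
    return buf;
}
```

In this sketch a NULL return means "return the input unchanged", which is where the saved allocation comes from; a caller would check for NULL before building a new string from the buffer.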

A synthetic benchmark employing the input strings used in the test_html_escape
method in test/test_erb.rb shows the modified implementation to be about 35%
faster than the original:

ruby 3.5.0preview1 (2025-04-18 master d06ec25be4) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
          escape old   273.773k i/100ms
          escape new   369.558k i/100ms
Calculating -------------------------------------
          escape old      2.766M (± 1.6%) i/s  (361.48 ns/i) -     13.962M in   5.048625s
          escape new      3.765M (± 2.0%) i/s  (265.58 ns/i) -     18.847M in   5.007869s

Comparison:
          escape old:  2766396.0 i/s
          escape new:  3765317.7 i/s - 1.36x  faster

…aracter

This change reduces allocations and makes `html_escape` ~35% faster in a benchmark using the input strings from the `test_html_escape` test in `test/test_erb.rb`.

- Perform buffer allocation on first instance of escapable character.
- Instead of copying characters one at a time, copy unescaped segments using `memcpy`.
