⚡ Faster SequenceSet#normalize when frozen #556
Merged
Base automatically changed from `sequence_set/drop-normalized-string` to `master` November 25, 2025 18:31
Calling `SequenceSet#normalize` on a frozen set can be more than 4x
faster, by simply re-parsing `@string` and scanning its elements, rather
than fully generating a new string and comparing it with `@string`.
```
normal
  reparse and check:       20449.2 i/s
  generate and compare:    20267.2 i/s - 1.01x slower
  v0.5.12:                  3090.2 i/s - 6.62x slower
frozen and normal
  generate and compare: 19328485.2 i/s
  reparse and check:    17455122.3 i/s - 1.11x slower
  v0.5.12:                  3730.0 i/s - 5181.95x slower
unsorted
  reparse and check:       16936.2 i/s
  generate and compare:    16872.9 i/s - 1.00x slower
  v0.5.12:                  2583.6 i/s - 6.56x slower
abnormal
  generate and compare:    17610.8 i/s
  reparse and check:       16596.1 i/s - 1.06x slower
  v0.5.12:                  2560.3 i/s - 6.88x slower
frozen unsorted
  reparse and check:       10089.5 i/s
  v0.5.12:                  2333.7 i/s - 4.32x slower
  generate and compare:     2093.1 i/s - 4.82x slower
frozen abnormal
  reparse and check:       10392.1 i/s
  v0.5.12:                  2354.5 i/s - 4.41x slower
  generate and compare:     2124.3 i/s - 4.89x slower
```
Please note that these results vary with benchmark settings, e.g. the size of
the sequence set.
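To illustrate the difference between the two strategies named in the benchmark, here is a minimal, self-contained sketch. It is not the code from this PR or from net-imap: the method names `normalized_by_scan?` and `normalized_by_generate?` are hypothetical, and the grammar is simplified to comma-separated `n` or `n:m` entries of positive integers (no `*`). The first method parses the string and stops at the first entry that is out of order, mergeable, or not in normal form. The second builds the full normalized string and compares it.

```ruby
# Hypothetical helpers for illustration only; not net-imap's implementation.
# Assumes entries are "n" or "n:m" with positive decimal integers (no "*").

# "Reparse and check": scan entries left to right and bail out at the first
# one that is not in normal form. No normalized string is built.
def normalized_by_scan?(string)
  prev_max = -1
  string.split(",").all? do |entry|
    first, sep, last = entry.partition(":")
    min = Integer(first, 10)
    max = sep.empty? ? min : Integer(last, 10)
    # "5:5" should be written "5"; "9:7" should be written "7:9".
    next false if !sep.empty? && min >= max
    # Entries must ascend with a gap between them: "1:3,4" should be "1:4".
    next false unless min > prev_max + 1
    prev_max = max
    true
  end
end

# "Generate and compare": sort and merge every range, render the normalized
# string, then compare it with the input.
def normalized_by_generate?(string)
  ranges = string.split(",").map do |entry|
    a, b = entry.split(":", 2).map { |n| Integer(n, 10) }
    [a, b || a].minmax
  end
  merged = ranges.sort.each_with_object([]) do |(min, max), acc|
    if acc.any? && min <= acc.last[1] + 1
      acc.last[1] = [acc.last[1], max].max # overlapping or adjacent: merge
    else
      acc << [min, max]
    end
  end
  generated = merged.map { |min, max| min == max ? min.to_s : "#{min}:#{max}" }
  generated.join(",") == string
end

%w[1:3,5,7:9 5,1:3 1:3,4 9:7].each do |s|
  puts "#{s}: scan=#{normalized_by_scan?(s)} generate=#{normalized_by_generate?(s)}"
end
```

In this sketch the scan can return early on abnormal input and never allocates an output string. That is one plausible reason a check-only path could win when the set itself cannot be updated in place, though the actual gains depend on the real implementation.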
4d0f345 to 0560cce
Calling `SequenceSet#normalize` on a frozen set can be more than 4x faster, by simply re-parsing `@string` and scanning its elements, rather than fully generating a new string and comparing it with `@string`.

Please note that these results vary significantly based on benchmark settings (e.g. the size of the sequence set) and randomized factors (e.g. how early in the string the first out-of-order or abnormal entry appears).
Also, I manually adjusted the benchmark in order to compare prior unreleased commits in this branch vs this PR, because #554 also provides a significant performance boost. So "generate and compare" includes #554, and "reparse and check" represents this PR.