@mpkorstanje
Contributor

@mpkorstanje mpkorstanje commented Jul 30, 2025

🤔 What's changed?

Improve performance by matching step and title keywords using a generated matcher class.

Before (main @ b3413a9)

Benchmark                          Mode  Cnt     Score    Error  Units
GherkinParserBenchmarkTest.parse  thrpt    5  1940.849 ± 17.126  ops/s

After

Benchmark                          Mode  Cnt     Score    Error  Units
GherkinParserBenchmarkTest.parse  thrpt    5  2213.691 ± 11.385  ops/s

⚡️ What's your motivation?

In #443, @jkronegg shows that unrolling the loops over the keywords speeds up Gherkin parsing significantly. I'm not sure about the theoretical underpinnings, but the effect is there. This PR refines that solution by making the unrolling work for an arbitrary number of keywords in each language.
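
For readers unfamiliar with the idea, here is a minimal sketch of what "unrolling the loop over the keywords" can look like. It is not the generated code from this PR: the class name `KeywordMatcherSketch`, both method names, and the convention of returning the matched keyword's length (or -1) are assumptions made for illustration. Only the English step keywords are real Gherkin.

```java
import java.util.List;

// Illustrative only: hypothetical names, not the matcher generated by this PR.
class KeywordMatcherSketch {

    // English Gherkin step keywords, including their trailing space.
    static final List<String> STEP_KEYWORDS =
            List.of("Given ", "When ", "Then ", "And ", "But ", "* ");

    // Baseline: iterate over the keyword list and test each entry.
    // Returns the length of the matched keyword, or -1 if none matches.
    static int matchWithLoop(List<String> keywords, String line) {
        for (String keyword : keywords) {
            if (line.startsWith(keyword)) {
                return keyword.length();
            }
        }
        return -1;
    }

    // Unrolled: the kind of code a generator could emit for one language.
    // Every keyword becomes a constant comparison with a constant result,
    // so no list iteration happens at parse time.
    static int matchUnrolled(String line) {
        if (line.startsWith("Given ")) return 6;
        if (line.startsWith("When ")) return 5;
        if (line.startsWith("Then ")) return 5;
        if (line.startsWith("And ")) return 4;
        if (line.startsWith("But ")) return 4;
        if (line.startsWith("* ")) return 2;
        return -1;
    }

    public static void main(String[] args) {
        String[] lines = {"Given a step", "Then done", "Feature: x"};
        for (String line : lines) {
            // Both variants should agree on every input line.
            System.out.println(line + " -> loop=" + matchWithLoop(STEP_KEYWORDS, line)
                    + ", unrolled=" + matchUnrolled(line));
        }
    }
}
```

Generating the unrolled variant per language, rather than writing it by hand, is what lets the approach cover languages with any number of keywords.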

🏷️ What kind of change is this?

  • 🏦 Refactoring/debt/DX (improvement to code design, tooling, etc. without changing behaviour)

♻️ Anything particular you want feedback on?

📋 Checklist:

  • I agree to respect and uphold the Cucumber Community Code of Conduct
  • I've changed the behaviour of the code
    • I have added/updated tests to cover my changes.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
  • Users should know about my change
    • I have added an entry to the "Unreleased" section of the CHANGELOG, linking to this pull request.

mpkorstanje added a commit that referenced this pull request Jul 30, 2025
Aside from being good practice, extensibility if not designed for should
be prohibited, this will help isolate some of the effects of #445.
@mpkorstanje mpkorstanje force-pushed the generated-matcher-with-keyword-type-and-length branch from 71452c8 to 3551231 July 30, 2025 08:51
@mpkorstanje mpkorstanje requested a review from jkronegg July 30, 2025 09:51
@mpkorstanje mpkorstanje changed the title from "java: Generate keyword matchers" to "java: Improve performance with a generated keyword matcher" Jul 30, 2025
@mpkorstanje mpkorstanje marked this pull request as ready for review July 30, 2025 10:13
Contributor

@jkronegg jkronegg left a comment


Nice job! The generated code looks good and gives confidence in the loop unrolling.
On my real-world project with 740 BDD scenarios, GherkinMessagesFeatureParser.parse is 25% faster than with the main branch (100 ms -> 75 ms).

@mpkorstanje mpkorstanje changed the title from "java: Improve performance with a generated keyword matcher" to "java: Use a generated keyword matcher to improve performance" Jul 30, 2025
@mpkorstanje mpkorstanje merged commit c279678 into main Jul 30, 2025
4 checks passed
@mpkorstanje mpkorstanje deleted the generated-matcher-with-keyword-type-and-length branch July 30, 2025 13:13
@mpkorstanje
Contributor Author

> On my real-world project with 740 BDD scenarios, GherkinMessagesFeatureParser.parse is 25% faster than with the main branch (100 ms -> 75 ms).

That is incredible. If you have more ideas I'd be happy to see them. 👍

@mpkorstanje mpkorstanje mentioned this pull request Jul 30, 2025