reuse regex matcher in non FST index LIKE queries #8261
richardstartin merged 2 commits into apache:master
Conversation
Jackie-Jiang
left a comment
Let's apply the same improvement to the raw value based evaluator.
...ava/org/apache/pinot/core/operator/filter/predicate/RegexpLikePredicateEvaluatorFactory.java
Codecov Report
@@ Coverage Diff @@
## master #8261 +/- ##
=============================================
- Coverage 64.06% 30.77% -33.29%
=============================================
Files 1586 1619 +33
Lines 83399 85080 +1681
Branches 12641 12834 +193
=============================================
- Hits 53427 26184 -27243
- Misses 26129 56566 +30437
+ Partials 3843 2330 -1513
}

public Pattern getPattern() {
  if (_pattern == null) {
Do this lazily because it's an overhead when an FST index is available
The pattern can be accessed by multiple threads, so might it be better to make it volatile or use an atomic swap?
It doesn't matter: the operation is idempotent, so the worst that can happen is that the pattern gets compiled more than once.
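The lazy initialization discussed in this thread can be sketched roughly as follows. This is only an illustration, not the code in the PR: the class name `LazyPatternHolder` and the `_regex` field are hypothetical, and only `getPattern()` and `_pattern` appear in the diff. The field is left non-volatile to mirror the "compiled more than once at worst" argument; declaring it `volatile` is the more conservative option raised by the reviewer.

```java
import java.util.regex.Pattern;

/** Hypothetical holder class for illustration; not the actual Pinot predicate class. */
final class LazyPatternHolder {
  private final String _regex;
  // Not volatile in this sketch: racing threads may each compile their own
  // equivalent Pattern. Marking the field volatile is the conservative alternative.
  private Pattern _pattern;

  LazyPatternHolder(String regex) {
    _regex = regex;
  }

  Pattern getPattern() {
    // Read the field once into a local so the null check and the return
    // operate on the same reference.
    Pattern pattern = _pattern;
    if (pattern == null) {
      // Compiled only on first use, so code paths that never call
      // getPattern() never pay the compilation cost.
      pattern = Pattern.compile(_regex);
      _pattern = pattern;
    }
    return pattern;
  }
}
```

On a single thread, repeated calls return the same compiled instance, so the compilation cost is paid at most once per holder.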
The profile below was taken from one of our customer's deployments - very high allocation rates are observed in no-index LIKE queries, because of matcher construction.

This PR simply reuses the `Matcher`, as it will never be used across threads at the segment level. This decreases allocation significantly (2.5x) and may slightly decrease average query time.
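As a rough illustration of the reuse described above (a minimal sketch, not the code in this PR), a single `Matcher` can be created once and re-pointed at each value with `Matcher.reset(CharSequence)`. The class name `ReusableRegexEvaluator` and the method `applySV` are hypothetical, and using `find()` for partial-match semantics is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical evaluator for illustration; not the actual Pinot evaluator class. */
final class ReusableRegexEvaluator {
  // Matcher is not thread-safe, so this sketch assumes each evaluator
  // instance is only ever used by one thread at a time.
  private final Matcher _matcher;

  ReusableRegexEvaluator(String regex) {
    // Build the matcher once against an empty input; each call to applySV
    // re-points it at the value under test.
    _matcher = Pattern.compile(regex).matcher("");
  }

  boolean applySV(String value) {
    // reset(CharSequence) replaces the input and clears matcher state,
    // avoiding a new Matcher allocation per value.
    return _matcher.reset(value).find();
  }
}
```

Compared with calling `pattern.matcher(value)` for every value, this keeps one `Matcher` per evaluator, which is where the allocation saving would come from.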
before:
after: