fix: UseAnchoredLiteral regression in FindIndices #110

Merged
kolkov merged 1 commit into main from hotfix/anchored-literal-regression on Feb 1, 2026

kolkov commented on Feb 1, 2026

Summary

  • Add missing UseAnchoredLiteral case to FindIndices, FindIndicesAt, and findIndicesAtWithState
  • Add findIndicesAnchoredLiteral and findIndicesAnchoredLiteralAt methods
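
To illustrate the kind of bug this addresses, here is a minimal, self-contained sketch of a strategy switch that lacks one case. The `Strategy` type, the `UseNFA` and `UseDFA` constants, and both `dispatch*` functions are hypothetical stand-ins invented for this example, not the library's actual code; only the name `UseAnchoredLiteral` comes from this PR.

```go
package main

import "fmt"

// Strategy is a hypothetical stand-in for the engine's strategy enum.
type Strategy int

const (
	UseNFA Strategy = iota
	UseDFA
	UseAnchoredLiteral
)

// dispatchBuggy has no UseAnchoredLiteral case, so that strategy
// silently lands in the default (slow) branch.
func dispatchBuggy(s Strategy) string {
	switch s {
	case UseDFA:
		return "dfa"
	default:
		return "nfa"
	}
}

// dispatchFixed adds the missing case so the fast path is taken.
func dispatchFixed(s Strategy) string {
	switch s {
	case UseDFA:
		return "dfa"
	case UseAnchoredLiteral:
		return "anchored-literal"
	default:
		return "nfa"
	}
}

func main() {
	fmt.Println(dispatchBuggy(UseAnchoredLiteral)) // prints "nfa"
	fmt.Println(dispatchFixed(UseAnchoredLiteral)) // prints "anchored-literal"
}
```

Because Go does not require a `switch` to be exhaustive, a missing case like this compiles cleanly and only shows up as a performance change, which matches the behaviour described under Problem.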

Problem

The pattern ^/.*[\w-]+\.php$ fell through to the slow NFA path because the UseAnchoredLiteral case was missing from the switch statements in the FindIndices family of methods.

Regression: 0.01ms → 408ms (roughly 40,000x slower)

Fix

After fix: 408ms → 0.5ms (O(1) anchored literal matching restored)
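
As a rough illustration of why an anchored-literal strategy can be so much cheaper than a general NFA simulation, the sketch below hand-codes the checks implied by ^/.*[\w-]+\.php$ and cross-checks them against the standard library's `regexp`. The function names (`findIndicesAnchoredPHP`, `isWordOrDash`, `agreesWithStdlib`) are hypothetical and this is not the implementation added by this PR. The prefix, suffix, and character-class checks are constant-time; the newline scan is an extra step in this sketch to stay faithful to `.` not matching `\n`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Reference pattern, compiled with the standard library for comparison only.
var stdlibRe = regexp.MustCompile(`^/.*[\w-]+\.php$`)

// isWordOrDash reports whether c belongs to the ASCII class [\w-].
func isWordOrDash(c byte) bool {
	return c == '-' || c == '_' ||
		(c >= '0' && c <= '9') ||
		(c >= 'a' && c <= 'z') ||
		(c >= 'A' && c <= 'Z')
}

// findIndicesAnchoredPHP is a hypothetical fast path for ^/.*[\w-]+\.php$.
// Both anchors are present, so any match must span the whole input.
func findIndicesAnchoredPHP(s string) (start, end int, ok bool) {
	const prefix, suffix = "/", ".php"
	// Need the prefix, at least one [\w-] character, and the suffix.
	if len(s) < len(prefix)+1+len(suffix) {
		return 0, 0, false
	}
	if !strings.HasPrefix(s, prefix) || !strings.HasSuffix(s, suffix) {
		return 0, 0, false
	}
	body := s[:len(s)-len(suffix)]
	// [\w-]+ must end immediately before the literal ".php".
	if !isWordOrDash(body[len(body)-1]) {
		return 0, 0, false
	}
	// '.' does not match newline by default, so reject inputs containing one.
	if strings.IndexByte(body, '\n') >= 0 {
		return 0, 0, false
	}
	return 0, len(s), true
}

// agreesWithStdlib reports whether the sketch and regexp give the same answer.
func agreesWithStdlib(s string) bool {
	start, end, ok := findIndicesAnchoredPHP(s)
	loc := stdlibRe.FindStringIndex(s)
	if loc == nil {
		return !ok
	}
	return ok && loc[0] == start && loc[1] == end
}

func main() {
	inputs := []string{"/index.php", "/a/b-c.php", "/.php", "index.php", "/a\nb.php"}
	for _, in := range inputs {
		_, _, ok := findIndicesAnchoredPHP(in)
		fmt.Printf("%q match=%v agrees=%v\n", in, ok, agreesWithStdlib(in))
	}
}
```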

Test plan

  • All tests pass
  • Linter passes
  • Local benchmark confirms fix: anchored_php: 566µs

Fixes #107

FindIndices*, FindIndicesAt*, and findIndicesAtWithState were
missing the UseAnchoredLiteral strategy case, causing patterns
like ^/.*[\w-]+\.php$ to fall through to slow NFA path.

Regression: 0.01ms -> 408ms (40,000x slower)
Fix: 408ms -> 0.5ms (back to O(1) anchored literal matching)

Fixes #107

github-actions bot commented Feb 1, 2026

Benchmark Comparison

Comparing main → PR #110

Summary: geomean 139.0n 138.8n -0.11%

⚠️ Potential regressions detected:

Benchmark                                    main (sec/op)    PR #110 (sec/op)   vs main
AnchoredLiteralVsStdlib/stdlib_no_match-4    600.0n ± ∞ ¹     610.1n ± ∞ ¹          +1.68% (p=0.032 n=5)
BranchDispatch_Coregex/Hex32-4               7.569n ± ∞ ¹     7.620n ± ∞ ¹          +0.67% (p=0.008 n=5)
IPRegex_Find/coregex_1KB_sparse-4            2.426µ ± ∞ ¹     4.405µ ± ∞ ¹         +81.57% (p=0.008 n=5)
IPRegex_Find/stdlib_1MB_sparse-4             153.4µ ± ∞ ¹    1763.8µ ± ∞ ¹       +1049.94% (p=0.008 n=5)
IPRegex_Find/coregex_6MB_sparse-4            3.332µ ± ∞ ¹     4.924µ ± ∞ ¹         +47.78% (p=0.008 n=5)
IPRegex_Find/stdlib_1MB_dense-4              1.748µ ± ∞ ¹    68.276µ ± ∞ ¹       +3805.95% (p=0.008 n=5)

Full results available in workflow artifacts. CI runners have ~10-20% variance.
For accurate benchmarks, run locally: ./scripts/bench.sh --compare
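
For reading the table above, the delta column is the relative change from main to the PR, and the stated CI variance gives a rough noise floor. The sketch below reproduces that arithmetic for two rows; `pctChange` and `exceedsNoise` are hypothetical helper names, and the 20% threshold is simply the upper end of the variance range quoted in the bot comment, not a project rule.

```go
package main

import "fmt"

// pctChange returns the relative change from before to after, in percent.
func pctChange(before, after float64) float64 {
	return (after - before) / before * 100
}

// exceedsNoise reports whether a slowdown is larger than the assumed
// run-to-run noise of the machine (noisePct, in percent).
func exceedsNoise(before, after, noisePct float64) bool {
	return pctChange(before, after) > noisePct
}

func main() {
	// AnchoredLiteralVsStdlib/stdlib_no_match-4: 600.0n -> 610.1n
	fmt.Printf("%+.2f%% noisy=%v\n", pctChange(600.0, 610.1), exceedsNoise(600.0, 610.1, 20)) // +1.68% noisy=false

	// IPRegex_Find/coregex_1KB_sparse-4: 2.426µ -> 4.405µ
	fmt.Printf("%+.2f%% noisy=%v\n", pctChange(2.426, 4.405), exceedsNoise(2.426, 4.405, 20)) // +81.57% noisy=true
}
```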

kolkov merged commit f0f527d into main on Feb 1, 2026
15 checks passed
kolkov deleted the hotfix/anchored-literal-regression branch on February 1, 2026 at 20:40

Development

Successfully merging this pull request may close these issues.

perf: char_class and word_repeat regressions in v0.11.5
