Releases · coregx/coregex

06 Feb 01:15

kolkov

v0.12.0

a30fd70

v0.12.0: Rust-inspired optimizations (Latest)

Performance

Anti-quadratic guard for reverse suffix/inner/suffix-set searches: prevents O(n²) degradation on workloads with many false-positive suffix candidates by falling back to PikeVM when quadratic behavior is detected
Lazy DFA 4x loop unrolling: processes 4 state transitions per inner-loop iteration and checks for special states between batches
Prefilter IsFast() gate: skips reverse-search optimizations when a fast SIMD-backed prefix prefilter already exists
DFA cache clear & continue: on cache overflow, clears the cache and falls back to PikeVM for the current search instead of permanently disabling the DFA
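
The 4x unrolling idea can be pictured with a toy DFA. This is a minimal sketch, not coregex's implementation: the transition function `next`, the `specialBit` tag, and the pattern `ab` are hypothetical, and it assumes that a batch containing a special state is simply replayed one byte at a time.

```go
package main

import "fmt"

// specialBit is a hypothetical tag that marks match states in a state ID.
const specialBit uint32 = 1 << 31

// next is a hand-written transition function for the toy pattern "ab":
// state 0 = start, state 1 = saw 'a', state 2 (tagged) = match.
func next(s uint32, b byte) uint32 {
	switch {
	case b == 'a':
		return 1
	case b == 'b' && s == 1:
		return 2 | specialBit
	default:
		return 0
	}
}

// findEnd returns the offset just past the first "ab" in hay, or -1.
func findEnd(hay []byte) int {
	s, i := uint32(0), 0
	// Fast path: take 4 transitions, then test all 4 results with one check.
	for i+4 <= len(hay) {
		s1 := next(s, hay[i])
		s2 := next(s1, hay[i+1])
		s3 := next(s2, hay[i+2])
		s4 := next(s3, hay[i+3])
		if (s1|s2|s3|s4)&specialBit != 0 {
			break // a special state is inside this batch; replay it below
		}
		s, i = s4, i+4
	}
	// Slow path: one transition per iteration, checking every state.
	for ; i < len(hay); i++ {
		s = next(s, hay[i])
		if s&specialBit != 0 {
			return i + 1
		}
	}
	return -1
}

func main() {
	fmt.Println(findEnd([]byte("xxxxxab")))
	fmt.Println(findEnd([]byte("aaaa")))
}
```

The batch is only committed when none of its four states is tagged, so the fast path never skips over a match.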

Fixed

OnePass DFA capture limit: tightened from 17 to 16 capture groups (the uint32 slot mask has 32 bits, i.e. 2 slots for each of 16 groups)
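
The arithmetic behind the limit can be sketched as follows. The names `slotsPerGroup`, `maskBits`, `maxGroups`, and `slotBit` are hypothetical, and the sketch assumes each capture group uses one start slot and one end slot in the mask; whether coregex counts the implicit whole-match group toward the 16 is not stated in these notes.

```go
package main

import "fmt"

const (
	slotsPerGroup = 2  // assumed: one start offset and one end offset per group
	maskBits      = 32 // width of a uint32 slot mask
)

// maxGroups is the number of capture groups whose slots fit in the mask.
func maxGroups() int {
	return maskBits / slotsPerGroup
}

// slotBit returns the mask bit for a slot index, or false if it does not fit.
func slotBit(slot int) (uint32, bool) {
	if slot < 0 || slot >= maskBits {
		return 0, false
	}
	return uint32(1) << uint(slot), true
}

func main() {
	fmt.Println("groups that fit:", maxGroups())
	// A 17th group would need slots 32 and 33, which a uint32 cannot hold.
	_, ok := slotBit(32)
	fmt.Println("slot 32 fits:", ok)
}
```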

Benchmark (AMD EPYC, regex-bench)

Pattern	coregex	vs stdlib	vs Rust
suffix	0.91ms	257x	1.4x faster
email	0.70ms	383x	1.9x faster
ip	2.19ms	225x	5.5x faster
uri	0.76ms	340x	1.2x faster
multiline_php	0.60ms	171x	1.2x faster
anchored_php	0.03ms	~1x	12.0x faster

01 Feb 21:33

kolkov

v0.11.9

8b528fa

v0.11.9: Fix missing first-byte prefilter in FindAll

Fixed

Missing first-byte prefilter in FindAll state-reusing path (#107)
- findIndicesBoundedBacktrackerAtWithState was missing anchoredFirstBytes O(1) check
- Pattern ^/.*[\w-]+\.php (without $) took 377ms instead of 40µs on 6MB input
- Fix: 377ms → 40µs (9000x improvement for non-matching anchored patterns)
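
The effect of an O(1) first-byte check on a start-anchored pattern can be sketched with the standard library. This is an illustration under stated assumptions, not the coregex code path: `isMatch` and `phpRe` are hypothetical names, stdlib `regexp` stands in for the slower engine, and the check relies on the pattern being unable to match unless the input starts with `/`.

```go
package main

import (
	"fmt"
	"regexp"
)

// phpRe is the anchored pattern from the release note (without the trailing $).
var phpRe = regexp.MustCompile(`^/.*[\w-]+\.php`)

// isMatch rejects inputs in constant time when the first byte cannot start
// a match, and only then runs the full engine.
func isMatch(hay []byte) bool {
	if len(hay) == 0 || hay[0] != '/' {
		return false // anchored at offset 0, so no other position can match
	}
	return phpRe.Match(hay)
}

func main() {
	fmt.Println(isMatch([]byte("/index.php")))
	fmt.Println(isMatch([]byte("GET /index.php")))
}
```

For a non-matching input the cost no longer depends on input length, which is consistent with the drop from milliseconds to microseconds reported above.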

Full Changelog: v0.11.8...v0.11.9

01 Feb 20:41

kolkov

v0.11.8

f0f527d

v0.11.8: Fix UseAnchoredLiteral regression

Fixed

Critical regression in UseAnchoredLiteral strategy (#107)
- FindIndices* and findIndicesAtWithState were missing UseAnchoredLiteral case
- Pattern ^/.*[\w-]+\.php$ fell through to slow NFA path
- Regression: 0.01ms → 408ms (40,000x slower)
- Fix: 408ms → 0.5ms (O(1) anchored literal matching restored)

Full Changelog: v0.11.7...v0.11.8

01 Feb 19:50

kolkov

v0.11.7

1480f40

v0.11.7: FindAll optimization - 1.08x faster than stdlib

Fixed

FindAll now uses the optimized state-reusing path
- FindAll was using a slow per-match loop instead of the optimized findAllIndicesStreaming
- Results for (\w{2,8})+ on 6MB: 2179ms → 834ms (2.6x faster)
- Now 1.08x faster than stdlib (was 2.4x slower in regex-bench)

Full Changelog: see CHANGELOG.md

01 Feb 18:56

kolkov

v0.11.6

fce1691

v0.11.6: PikeVM 6MB optimization - 1.68x faster than stdlib

Performance

Major PikeVM optimization achieving 1.68x speedup over stdlib for large inputs (was 2.2x slower).

Key Changes

Windowed BoundedBacktracker (V12): Search in 914KB windows before PikeVM fallback
SlotTable architecture: Rust-style per-state slot storage
Dynamic slot sizing: 0 (IsMatch), 2 (Find), full (Captures)
Lightweight searchThread: 16 bytes (was 40+ bytes)
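
The dynamic slot sizing rule can be written down in a few lines. This is a sketch with hypothetical names (`searchKind`, `slotsNeeded`); it assumes "full" means two slots per group including the implicit whole-match group.

```go
package main

import (
	"fmt"
	"regexp"
)

// searchKind is a hypothetical enum for the three kinds of search.
type searchKind int

const (
	kindIsMatch searchKind = iota
	kindFind
	kindCaptures
)

// slotsNeeded returns how many offset slots a search has to track.
// numGroups is the number of explicit capture groups in the pattern.
func slotsNeeded(kind searchKind, numGroups int) int {
	switch kind {
	case kindIsMatch:
		return 0 // only a yes/no answer is needed
	case kindFind:
		return 2 // start and end of the overall match
	default:
		return 2 * (numGroups + 1) // every group plus the implicit group 0
	}
}

func main() {
	re := regexp.MustCompile(`(\w{2,8})+`)
	n := re.NumSubexp()
	fmt.Println(slotsNeeded(kindIsMatch, n), slotsNeeded(kindFind, n), slotsNeeded(kindCaptures, n))
}
```

Tracking fewer slots means less data copied per thread, which is the motivation the release gives for the smaller searchThread.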

Benchmark Results

Pattern (\w{2,8})+ vs stdlib:

Size	Speedup
10KB	1.68x faster
50KB	1.88x faster
100KB	2.04x faster
1MB	1.67x faster
6MB	1.68x faster

6MB improvement: 1900ms → 628ms (3x faster)

Full Changelog: see CHANGELOG.md

01 Feb 09:46

kolkov

v0.11.5

de173be

v0.11.5: Fix checkHasWordBoundary catastrophic slowdown

Summary

Fixes a catastrophic performance regression in patterns with \w{n,m} quantifiers (Issue #105).

Before: 3 minutes 22 seconds on 79KB input (7,000,000x slower than stdlib)
After: 3.6 µs on 79KB input (8.6x faster than stdlib)

Changes

Fixed

checkHasWordBoundary catastrophic slowdown (Issue #105)
- Root cause: O(N*M) complexity from scanning all NFA states per byte
- Fix: Use NewBuilderWithWordBoundary(), add hasWordBoundary guards, anchored prefilter verification

Performance

DFA state lookup: map → slice — 42% CPU time eliminated
Literal extraction from capture/repeat groups — better prefilters
- =($\w...){2} now extracts =$ (2 bytes) instead of just =
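
The map → slice change for DFA state lookup can be illustrated as follows. This is a sketch assuming state IDs are dense integers assigned in creation order; `dfaState` and `stateCache` are hypothetical types, not the coregex ones.

```go
package main

import "fmt"

// dfaState is a hypothetical cached DFA state.
type dfaState struct {
	accepting bool
}

// stateCache stores states in a slice indexed by their dense integer ID,
// so a lookup is a bounds check and an index instead of a map hash.
type stateCache struct {
	states []*dfaState
}

// add appends a state and returns its ID.
func (c *stateCache) add(s *dfaState) int {
	c.states = append(c.states, s)
	return len(c.states) - 1
}

// get returns the state for id, or nil if the ID is unknown.
func (c *stateCache) get(id int) *dfaState {
	if id < 0 || id >= len(c.states) {
		return nil
	}
	return c.states[id]
}

func main() {
	c := &stateCache{}
	start := c.add(&dfaState{})
	match := c.add(&dfaState{accepting: true})
	fmt.Println(start, match, c.get(match).accepting, c.get(99) == nil)
}
```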

Benchmarks (79KB input)

Stage	Time	vs stdlib
Before fix	3m 22s	7,000,000x slower
After fix	3.6 µs	8.6x faster

Credits

@danslo for root cause analysis and fix suggestions

Full Changelog: v0.11.4...v0.11.5

Contributors

danslo

16 Jan 15:59

kolkov

v0.11.4

8baa0ef

v0.11.4: FindAll multiline optimization

Fixed

FindAll/FindAllIndex now use UseMultilineReverseSuffix strategy (Issue #102)
- FindIndicesAt() was missing dispatch for UseMultilineReverseSuffix
- IsMatch/Find were fast (1µs), but FindAll was slow (78ms) — 100x gap vs Rust
- After fix: FindAll on 6MB with 2000 matches: ~1ms (was 78ms)

Performance

Operation	Before	After	Improvement
FindAll (6MB, 2000 matches)	78ms	~1ms	78x faster
vs Rust gap	100x slower	~1.3x slower	Near parity!

Changed

Updated golang.org/x/sys v0.39.0 → v0.40.0

Full Changelog: v0.11.3...v0.11.4

16 Jan 14:45

kolkov

v0.11.3

43efbbd

v0.11.3: Prefix fast path 319-552x speedup

Performance

Pattern (?m)^/.*\.php now 319-552x faster than stdlib (was 3.5-5.7x in v0.11.1)

Operation	coregex	stdlib	Speedup
IsMatch	182 ns	100 µs	552x
Find	240 ns	81 µs	338x
CountAll	58 µs	18.7 ms	319x

Algorithm

Suffix prefilter finds .php candidates (SIMD memmem)
SIMD backward scan to find line start (bytes.LastIndexByte)
O(1) prefix byte check (/ at line start)
Skip-to-next-line on mismatch (avoids O(n²) worst case)
DFA fallback for complex patterns without extractable prefix
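
The first four steps above can be sketched for the single pattern (?m)^/.*\.php using only the standard `bytes` package. This is an illustration, not the MultilineReverseSuffixSearcher code: `findPHP` is a hypothetical function, the suffix and prefix byte are hard-coded, and the DFA fallback step is omitted.

```go
package main

import (
	"bytes"
	"fmt"
)

// findPHP returns the [start, end) span of the first match of
// (?m)^/.*\.php in hay, or (-1, -1) if there is none.
func findPHP(hay []byte) (int, int) {
	suffix := []byte(".php")
	pos := 0
	for pos < len(hay) {
		// Step 1: the suffix prefilter finds the next candidate.
		rel := bytes.Index(hay[pos:], suffix)
		if rel < 0 {
			return -1, -1
		}
		cand := pos + rel
		// Step 2: scan backward to the start of the candidate's line.
		lineStart := bytes.LastIndexByte(hay[:cand], '\n') + 1
		// Step 3: O(1) check of the required prefix byte.
		if hay[lineStart] == '/' {
			// Greedy .* extends to the last ".php" on the same line.
			lineEnd := len(hay)
			if nl := bytes.IndexByte(hay[cand:], '\n'); nl >= 0 {
				lineEnd = cand + nl
			}
			last := bytes.LastIndex(hay[lineStart:lineEnd], suffix)
			return lineStart, lineStart + last + len(suffix)
		}
		// Step 4: no match can start on this line, so skip to the next one.
		nl := bytes.IndexByte(hay[cand:], '\n')
		if nl < 0 {
			return -1, -1
		}
		pos = cand + nl + 1
	}
	return -1, -1
}

func main() {
	start, end := findPHP([]byte("foo.php\n/a/b.php x"))
	fmt.Println(start, end)
}
```

Skipping the rest of a rejected line is what keeps the scan linear: each line is examined at most once even when it contains many `.php` candidates.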

Changes

MultilineReverseSuffixSearcher.prefixBytes for O(1) verification
SetPrefixLiterals() extracts prefix from pattern
findLineStart() uses SIMD bytes.LastIndexByte
Skip-to-next-line: on prefix mismatch, jump to next \n position

Fixes #99

16 Jan 13:46

kolkov

v0.11.2

fe0bc83

v0.11.2: DFA verification for UseMultilineReverseSuffix

Performance Improvement

Replace O(n*m) PikeVM verification with O(n) DFA verification for multiline suffix patterns.

Issue: #99 (Rust regex 84x faster on (?m)^/.*\.php)

Benchmark Results

Case	Before	After	Speedup
No-match (2KB)	1136 ns	108 ns	10.5x
Long no-match	25937 ns	197 ns	131x
Large input (6MB)	66 ms	~5-10 ms	10-30x (expected)

Changes

MultilineReverseSuffixSearcher.forwardDFA replaces pikevm field
Uses lazy.DFA.SearchAtAnchored() for linear-time anchored matching
lazy.CompileWithConfig() creates forward DFA with proper config

Research Insight

Analysis of Rust regex-automata revealed that the hybrid (lazy) DFA does NOT use per-state acceleration — only the dense (pre-compiled) DFA does. The real performance difference comes from using DFA vs NFA/PikeVM for verification.

coregex already has partial state acceleration in dfa/lazy/. The main fix was switching from PikeVM to DFA verification.

Full Changelog: v0.11.1...v0.11.2

15 Jan 21:55

kolkov

v0.11.1

6552443

v0.11.1: UseMultilineReverseSuffix 3.5-5.7x speedup

What's New

Adds an 18th strategy, UseMultilineReverseSuffix, for multiline suffix patterns like (?m)^/.*\.php.

Performance (Issue #97)

Before: coregex was 24% slower than stdlib
After: coregex is 3.5-5.7x faster than stdlib

Operation	coregex	stdlib	Speedup
IsMatch (0.5MB)	20.6 µs	72.2 µs	3.5x
Find (0.5MB)	15.3 µs	68.7 µs	4.5x
CountAll (200 matches)	2.56 ms	14.6 ms	5.7x
No-match (small)	90 ns	1.1 µs	12x
No-match (2KB)	184 ns	24 µs	130x

Algorithm

Suffix prefilter finds .php candidates
Backward scan to line start (\n or pos 0)
Forward PikeVM verification

Files

meta/reverse_suffix_multiline.go (NEW)
meta/reverse_suffix_multiline_test.go (NEW)

Full Changelog: v0.11.0...v0.11.1
