Regengo is a compile-time finite state machine generator for regular expressions. It converts regex patterns into optimized Go code, leveraging the Go compiler's optimizations for type-safe, pattern-specific code generation.
- High Performance: 2-15x faster than Go's regexp, including capture group extraction
- Compile-Time Safety: Invalid capture group references fail at Go compilation, not runtime
- Smart Engine Selection: Automatically chooses Thompson NFA, DFA, or TDFA based on pattern analysis
- Fast Replacers: Pre-compiled replacement templates, 2-3x faster than stdlib
- Efficient Streaming: Match patterns over io.Reader with constant memory and cross-boundary support
- Zero Allocations: FindStringReuse, FindAllStringAppend, and ReplaceAllBytesAppend for hot paths
- Rigorously Tested: Over 2,000 generated tests across 250 patterns verify correctness against Go stdlib
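
To make the idea of pattern-specific code concrete, here is a hand-written sketch of a matcher for `\d{4}-\d{2}-\d{2}`. It is not Regengo's actual output; the function names `matchDateAt` and `containsDate` are made up for illustration. It only shows the kind of straight-line, allocation-free code that a generator can emit once the pattern is known ahead of time.

```go
package main

import "fmt"

// isDigit reports whether b is an ASCII digit, which is what \d means in Go's regexp.
func isDigit(b byte) bool { return b >= '0' && b <= '9' }

// matchDateAt reports whether text of the form \d{4}-\d{2}-\d{2} starts at s[i].
// Each position k plays the role of one state of a small state machine.
func matchDateAt(s string, i int) bool {
	if len(s)-i < 10 {
		return false
	}
	for k := 0; k < 10; k++ {
		c := s[i+k]
		if k == 4 || k == 7 {
			if c != '-' {
				return false
			}
		} else if !isDigit(c) {
			return false
		}
	}
	return true
}

// containsDate is an unanchored search: it tries every start position in turn.
func containsDate(s string) bool {
	for i := 0; i+10 <= len(s); i++ {
		if matchDateAt(s, i) {
			return true
		}
	}
	return false
}

func main() {
	for _, in := range []string{"2024-12-25", "due on 2024-12-25!", "2024-1-25"} {
		fmt.Printf("%q -> %v\n", in, containsDate(in))
	}
}
```

Because the pattern is baked into the code, there is no regex program to interpret at runtime, which is the general reason generated matchers can be faster than a general-purpose engine.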
- Highlights
- Installation
- Quick Start
- Generated Methods
- Capture Groups
- Replace API
- Performance
- Streaming API
- Transform API
- CLI Reference
- Documentation
- API Comparison
- License
Install the CLI:

    go install github.com/KromDaniel/regengo/cmd/regengo@latest

Generate code from the command line:

    regengo -pattern '(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})' \
        -name Date \
        -output date.go \
        -package main

Or generate it from Go:

    import "github.com/KromDaniel/regengo"

    err := regengo.Compile(regengo.Options{
        Pattern:    `(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`,
        Name:       "Date",
        OutputFile: "date.go",
        Package:    "main",
    })
    if err != nil {
        log.Fatal(err)
    }

Use the generated code:

    // Match
    if CompiledDate.MatchString("2024-12-25") {
        fmt.Println("Valid date!")
    }

    // Find with captures
    result, ok := CompiledDate.FindString("2024-12-25")
    if ok {
        fmt.Printf("Year: %s, Month: %s, Day: %s\n", result.Year, result.Month, result.Day)
    }

    // Find all
    matches := CompiledDate.FindAllString("Dates: 2024-01-15 and 2024-12-25", -1)
    for _, m := range matches {
        fmt.Println(m.Match)
    }

The generated file exposes the following API:

    type Date struct{}
    var CompiledDate = Date{}

    // Matching
    func (Date) MatchString(input string) bool
    func (Date) MatchBytes(input []byte) bool

    // Finding (with captures)
    func (Date) FindString(input string) (*DateResult, bool)
    func (Date) FindStringReuse(input string, reuse *DateResult) (*DateResult, bool)
    func (Date) FindBytes(input []byte) (*DateBytesResult, bool)
    func (Date) FindBytesReuse(input []byte, reuse *DateBytesResult) (*DateBytesResult, bool)

    // Finding all
    func (Date) FindAllString(input string, n int) []*DateResult
    func (Date) FindAllStringAppend(input string, n int, s []*DateResult) []*DateResult
    func (Date) FindAllBytes(input []byte, n int) []*DateBytesResult
    func (Date) FindAllBytesAppend(input []byte, n int, s []*DateBytesResult) []*DateBytesResult

    // Streaming (for large files/network)
    func (Date) FindReader(r io.Reader, cfg stream.Config, onMatch func(stream.Match[*DateBytesResult]) bool) error
    func (Date) FindReaderCount(r io.Reader, cfg stream.Config) (int64, error)
    func (Date) FindReaderFirst(r io.Reader, cfg stream.Config) (*DateBytesResult, int64, error)

    // Transform (io.Reader-based streaming transformation)
    func (Date) NewTransformReader(r io.Reader, cfg stream.TransformConfig, onMatch func(*DateBytesResult, func([]byte))) io.Reader
    func (Date) ReplaceReader(r io.Reader, template string) io.Reader
    func (Date) SelectReader(r io.Reader, pred func(*DateBytesResult) bool) io.Reader
    func (Date) RejectReader(r io.Reader, pred func(*DateBytesResult) bool) io.Reader

    // Replace (runtime template parsing)
    func (Date) ReplaceAllString(input string, template string) string
    func (Date) ReplaceAllBytes(input []byte, template string) []byte
    func (Date) ReplaceAllBytesAppend(input []byte, template string, buf []byte) []byte
    func (Date) ReplaceFirstString(input string, template string) string
    func (Date) ReplaceFirstBytes(input []byte, template string) []byte

    // Replace precompiled (when using -replacer flag, N = 0, 1, 2...)
    func (Date) ReplaceAllStringN(input string) string
    func (Date) ReplaceAllBytesN(input []byte) []byte
    func (Date) ReplaceAllBytesAppendN(input []byte, buf []byte) []byte
    func (Date) ReplaceFirstStringN(input string) string
    func (Date) ReplaceFirstBytesN(input []byte) []byte

    // Utility
    func (Date) MatchLengthInfo() (minLen, maxLen int)

Regengo automatically generates a _test.go file with correctness tests and benchmarks. See Auto-Generated Tests for details.
Named capture groups become typed struct fields:

    // Pattern: (?P<user>\w+)@(?P<domain>\w+)
    type EmailResult struct {
        Match  string
        User   string // from (?P<user>...)
        Domain string // from (?P<domain>...)
    }

    result, ok := CompiledEmail.FindString("user@example.com")
    if ok {
        fmt.Println(result.User, result.Domain) // user example (\w+ stops at the dot)
    }

For hot paths, reuse result structs to eliminate allocations:

    // Single match reuse
    var reuse EmailResult
    for _, input := range inputs {
        result, ok := CompiledEmail.FindStringReuse(input, &reuse)
        if ok {
            process(result.User, result.Domain)
        }
    }

    // FindAll with append reuse
    var results []*DateResult
    for _, input := range inputs {
        results = CompiledDate.FindAllStringAppend(input, -1, results[:0])
        for _, r := range results {
            process(r.Year, r.Month, r.Day)
        }
    }

Replace matches using capture group references. Supports both runtime templates and pre-compiled templates for maximum performance.
Compile-time safety: Pre-compiled replacer templates are validated during code generation. References to non-existent capture groups (e.g., $invalid or $3 when only 2 groups exist) cause a compile error—not a runtime surprise.

    // Generate with pre-compiled replacer:
    // regengo -pattern '(?P<user>\w+)@(?P<domain>\w+)' -name Email -replacer '$user@HIDDEN' -output email.go
    input := "Contact alice@example.com or bob@test.org"

    // Pre-compiled (fastest) - template: "$user@HIDDEN"
    masked := CompiledEmail.ReplaceAllString0(input)
    // Result: "Contact alice@HIDDEN.com or bob@HIDDEN.org"
    // (\w+ does not match ".", so ".com" and ".org" lie outside the match)

    // Runtime (flexible) - any template at runtime
    bracketed := CompiledEmail.ReplaceAllString(input, "[$0]")
    // Result: "Contact [alice@example].com or [bob@test].org"

| Syntax | Description |
|---|---|
| `$0` | Full match |
| `$1`, `$2` | Capture by index |
| `$name` | Capture by name |
| `$$` | Literal `$` |

See Replace API Guide for complete documentation.
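
The standard library's `regexp` package accepts the same four template forms, so it can serve as a reference for what each one expands to. The sketch below uses only stdlib `regexp`, not Regengo's generated code; the helper name `expand` is made up for this example, and it assumes Regengo's expansion agrees with the stdlib for these forms.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailRe is the same pattern used in the replacer example above.
var emailRe = regexp.MustCompile(`(?P<user>\w+)@(?P<domain>\w+)`)

// expand replaces every match in input with the expanded template,
// using the standard library's template rules.
func expand(input, template string) string {
	return emailRe.ReplaceAllString(input, template)
}

func main() {
	input := "alice@example or bob@test"
	templates := []string{
		"[$0]",         // full match
		"$2:$1",        // captures by index
		"$user@HIDDEN", // capture by name
		"$$",           // literal dollar sign
	}
	for _, tmpl := range templates {
		fmt.Printf("%-16q -> %s\n", tmpl, expand(input, tmpl))
	}
}
```

One stdlib detail worth knowing: a reference takes the longest run of letters, digits, and underscores as its name, so `$1x` is read as the group named `1x`, not group 1 followed by `x`. Whether Regengo's runtime templates follow the same rule is not stated here, so check the Replace API Guide before relying on it.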
Regengo consistently outperforms Go's standard regexp package:
| Pattern | Method | stdlib | regengo | Speedup |
|---|---|---|---|---|
| Date `\d{4}-\d{2}-\d{2}` | FindString | 105 ns | 7 ns | 14x faster |
| Multi-date extraction | FindAllString | 431 ns | 49 ns | 8.9x faster |
| Email validation | MatchString | 1554 ns | 507 ns | 3x faster |
| Log parser | FindString | 399 ns | 121 ns | 3.3x faster |
Memory: 50-100% fewer allocations. Zero allocations with Reuse variants.
See Detailed Benchmarks for complete results.
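
The zero-allocation claim for the Reuse and Append variants rests on a common Go idiom: pass in a slice truncated with `[:0]` so its backing array is reused. The sketch below shows that idiom with a hypothetical word splitter, `appendWords`, using only the standard library. It is not Regengo code and makes no claim about Regengo's internals.

```go
package main

import (
	"fmt"
	"testing"
)

// appendWords appends each space-separated word of s to dst and returns the
// extended slice. The words are substrings of s, so no string data is copied.
func appendWords(dst []string, s string) []string {
	start := -1 // index where the current word began, or -1 if between words
	for i := 0; i < len(s); i++ {
		if s[i] == ' ' {
			if start >= 0 {
				dst = append(dst, s[start:i])
				start = -1
			}
		} else if start < 0 {
			start = i
		}
	}
	if start >= 0 {
		dst = append(dst, s[start:])
	}
	return dst
}

func main() {
	inputs := []string{"alpha beta gamma", "delta epsilon"}

	// Reuse: words[:0] keeps the backing array, so append rarely needs to grow it.
	var words []string
	reused := testing.AllocsPerRun(100, func() {
		for _, in := range inputs {
			words = appendWords(words[:0], in)
		}
	})

	// No reuse: every call starts from a nil slice.
	var sink []string
	fresh := testing.AllocsPerRun(100, func() {
		for _, in := range inputs {
			sink = appendWords(nil, in)
		}
	})
	_ = sink

	fmt.Printf("allocs/run with reuse: %v, without: %v\n", reused, fresh)
}
```

The same shape applies to `FindAllStringAppend(input, -1, results[:0])`: the caller owns the slice and decides how long its backing array lives.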
Process any io.Reader with constant memory. Unlike Go's regexp.FindReaderIndex which only finds the first match, Regengo finds all matches in a stream—handling buffering and cross-boundary matches automatically. Matches are delivered via callback, avoiding slice allocations and enabling true streaming semantics.

    file, err := os.Open("server.log")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    err = CompiledDate.FindReader(file, stream.Config{}, func(m stream.Match[*DateBytesResult]) bool {
        fmt.Printf("Found at offset %d: %s\n", m.StreamOffset, m.Result.Match)
        return true // continue
    })
    if err != nil {
        log.Fatal(err)
    }

See Streaming API Guide for details.
Transform streams by replacing, filtering, or modifying pattern matches. Returns an io.Reader for standard Go composition with io.Copy, io.MultiReader, HTTP handlers, etc.
Memory-efficient: Process arbitrarily large files with constant memory usage.

    // Redact all emails in a stream
    file, _ := os.Open("data.log")
    masked := CompiledEmail.ReplaceReader(file, "[REDACTED]")
    io.Copy(os.Stdout, masked)

    // Chain multiple transformations
    // (a separate example: start from a freshly opened file, since io.Copy above drains it)
    var r io.Reader = file
    r = CompiledEmail.ReplaceReader(r, "[EMAIL]")
    r = CompiledIP.ReplaceReader(r, "[IP]")
    r = stream.LineFilter(r, func(line []byte) bool {
        return !bytes.HasPrefix(line, []byte("DEBUG"))
    })
    io.Copy(os.Stdout, r)

| Method | Description |
|---|---|
| `ReplaceReader(r, template)` | Replace matches with template (`$name`, `$1`, `$0`) |
| `SelectReader(r, pred)` | Output only matches where predicate returns true |
| `RejectReader(r, pred)` | Remove matches where predicate returns true |
| `NewTransformReader(r, cfg, fn)` | Full control: emit 0, 1, or N outputs per match |
See Transform API Guide for complete documentation.

    Required:
      -pattern string     Regex pattern to compile
      -name string        Name for generated struct
      -output string      Output file path

    Basic:
      -package string     Package name (default "main")
      -test-inputs        Comma-separated test inputs
      -no-test            Disable test file generation
      -no-pool            Disable sync.Pool (pool enabled by default for 0 allocs)
      -replacer string    Pre-compiled replacement template (can repeat)

    Analysis:
      -analyze            Output pattern analysis as JSON (no code generation)
      -verbose            Print analysis decisions

    Engine Control:
      -force-thompson     Force Thompson NFA (prevents ReDoS)
      -force-tnfa         Force Tagged NFA for captures
      -force-tdfa         Force Tagged DFA for captures
      -tdfa-threshold     Max DFA states before fallback (default: 500)

    Info:
      -version            Print version information
      -help               Show help message

- API Comparison - Full regengo vs stdlib reference
- Replace API - String replacement with captures
- Streaming API - Processing large files and streams
- Transform API - Stream transformation with io.Reader composition
- Analysis & Complexity - Engine selection and guarantees
- Unicode Support - Unicode character classes
- Detailed Benchmarks - Complete performance data
- Auto-Generated Tests - Generated correctness tests and benchmarks
Regengo returns typed structs with named fields instead of []string slices—access result.Year instead of match[1].

| stdlib regexp | regengo | Notes |
|---|---|---|
| `MatchString(s)` | `MatchString(s)` | Identical |
| `Match(b)` | `MatchBytes(b)` | Same result, different name |
| `FindStringSubmatch(s)` | `FindString(s)` | `[]string` → `*Result` |
| `FindSubmatch(b)` | `FindBytes(b)` | `[][]byte` → `*BytesResult` |
| `FindAllStringSubmatch(s, n)` | `FindAllString(s, n)` | `[][]string` → `[]*Result` |
| `FindAllSubmatch(b, n)` | `FindAllBytes(b, n)` | `[][][]byte` → `[]*BytesResult` |
| `FindReaderIndex(r)` | `FindReader(r, cfg, cb)` | First match → all matches |
| - | `FindReaderCount(r, cfg)` | Count matches in stream |
| - | `FindReaderFirst(r, cfg)` | First match with captures |
| - | `Find*Reuse(...)` | Zero-alloc result reuse |
| - | `FindAll*Append(...)` | Append to existing slice |
| `ReplaceAllString(s, t)` | `ReplaceAllString(s, t)` | Runtime template |
| `ReplaceAllString(s, t)` | `ReplaceAllString0(s)` | Pre-compiled (3x faster) |
| - | `ReplaceAllBytesAppend(...)` | Zero-alloc replace |
| - | `ReplaceReader(r, t)` | Stream transform |
| - | `SelectReader(r, pred)` | Extract matches from stream |
| - | `RejectReader(r, pred)` | Remove matches from stream |

See Full API Comparison for complete reference with examples.
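
For contrast, this is roughly what the stdlib side of the comparison looks like when you want named access to captures. The sketch uses only the standard library; the `DateResult` type and `findDate` helper are written by hand here to mirror the shape of the generated result, and are not Regengo's generated code.

```go
package main

import (
	"fmt"
	"regexp"
)

var dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`)

// DateResult is a hand-written stand-in for a typed result struct.
type DateResult struct {
	Match, Year, Month, Day string
}

// findDate wraps FindStringSubmatch and copies the []string slots into named
// fields. Group names are looked up at runtime, so a typo in a name is only
// discovered when the code runs (SubexpIndex returns -1 for unknown names).
func findDate(s string) (*DateResult, bool) {
	m := dateRe.FindStringSubmatch(s)
	if m == nil {
		return nil, false
	}
	return &DateResult{
		Match: m[0],
		Year:  m[dateRe.SubexpIndex("year")],
		Month: m[dateRe.SubexpIndex("month")],
		Day:   m[dateRe.SubexpIndex("day")],
	}, true
}

func main() {
	if r, ok := findDate("released 2024-12-25"); ok {
		fmt.Printf("Year: %s, Month: %s, Day: %s\n", r.Year, r.Month, r.Day)
	}
}
```

With generated code, the field names are checked by the Go compiler instead of being resolved by string lookup at runtime.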
MIT License - see LICENSE for details.

