Commit 0f7c75b
Run benchmarks on pushes and pull requests
- Run weekly scheduled benchmarks
- Compare results against previous runs
- Alert on performance regressions (>10% slower)
1 parent: 88d7d77

File tree: 8 files changed, +232 −203 lines

.github/workflows/benchmark.yml

Lines changed: 67 additions & 67 deletions
```diff
@@ -2,81 +2,81 @@ name: Performance Benchmarks
 
 on:
   push:
-    branches: [ main, develop ]
+    branches: [main, develop]
   pull_request:
-    branches: [ main, develop ]
+    branches: [main, develop]
   # Schedule benchmarks to run weekly
   schedule:
-    - cron: '0 0 * * 0' # Run at midnight on Sundays
+    - cron: "0 0 * * 0" # Run at midnight on Sundays
 
 jobs:
   benchmark:
     runs-on: ubuntu-latest
     steps:
-    - uses: actions/checkout@v3
-      with:
-        fetch-depth: 0 # Fetch all history for proper comparison
+      - uses: actions/checkout@v3
+        with:
+          fetch-depth: 0 # Fetch all history for proper comparison
 
-    - name: Set up Python
-      uses: actions/setup-python@v4
-      with:
-        python-version: '3.10'
-        cache: 'pip'
-
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install -e .
-        pip install -r requirements-dev.txt
-        pip install pytest-benchmark
-
-    - name: Restore benchmark data
-      uses: actions/cache@v3
-      with:
-        path: .benchmarks
-        key: benchmark-${{ runner.os }}-${{ hashFiles('**/requirements*.txt') }}
-        restore-keys: |
-          benchmark-${{ runner.os }}-
-
-    - name: Run benchmarks and save baseline
-      run: |
-        # Run benchmarks and save results
-        pytest tests/benchmark_text_service.py -v --benchmark-autosave
-
-    - name: Check for performance regression
-      run: |
-        # Compare against the previous benchmark if available
-        # Fail if performance degrades by more than 10%
-        if [ -d ".benchmarks" ]; then
-          BASELINE=$(ls -t .benchmarks/Linux-CPython-3.10-64bit | head -n 2 | tail -n 1)
-          CURRENT=$(ls -t .benchmarks/Linux-CPython-3.10-64bit | head -n 1)
-          if [ -n "$BASELINE" ] && [ "$BASELINE" != "$CURRENT" ]; then
-            # Set full paths to the benchmark files
-            BASELINE_FILE="$benchmark_dir/$BASELINE"
-            CURRENT_FILE="$benchmark_dir/$CURRENT"
-
-            echo "Comparing current run ($CURRENT) against baseline ($BASELINE)"
-            # First just show the comparison
-            pytest tests/benchmark_text_service.py --benchmark-compare
-
-            # Then check for significant regressions
-            echo "Checking for performance regressions (>10% slower)..."
-            # Use our Python script for benchmark comparison
-            python scripts/compare_benchmarks.py "$BASELINE_FILE" "$CURRENT_FILE"
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.10"
+          cache: "pip"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -e .
+          pip install -r requirements-dev.txt
+          pip install pytest-benchmark
+
+      - name: Restore benchmark data
+        uses: actions/cache@v3
+        with:
+          path: .benchmarks
+          key: benchmark-${{ runner.os }}-${{ hashFiles('**/requirements*.txt') }}
+          restore-keys: |
+            benchmark-${{ runner.os }}-
+
+      - name: Run benchmarks and save baseline
+        run: |
+          # Run benchmarks and save results
+          pytest tests/benchmark_text_service.py -v --benchmark-autosave
+
+      - name: Check for performance regression
+        run: |
+          # Compare against the previous benchmark if available
+          # Fail if performance degrades by more than 10%
+          if [ -d ".benchmarks" ]; then
+            BASELINE=$(ls -t .benchmarks/Linux-CPython-3.10-64bit | head -n 2 | tail -n 1)
+            CURRENT=$(ls -t .benchmarks/Linux-CPython-3.10-64bit | head -n 1)
+            if [ -n "$BASELINE" ] && [ "$BASELINE" != "$CURRENT" ]; then
+              # Set full paths to the benchmark files
+              BASELINE_FILE="$benchmark_dir/$BASELINE"
+              CURRENT_FILE="$benchmark_dir/$CURRENT"
+
+              echo "Comparing current run ($CURRENT) against baseline ($BASELINE)"
+              # First just show the comparison
+              pytest tests/benchmark_text_service.py --benchmark-compare
+
+              # Then check for significant regressions
+              echo "Checking for performance regressions (>10% slower)..."
+              # Use our Python script for benchmark comparison
+              python scripts/compare_benchmarks.py "$BASELINE_FILE" "$CURRENT_FILE"
+            else
+              echo "No previous benchmark found for comparison or only one benchmark exists"
+            fi
           else
-            echo "No previous benchmark found for comparison or only one benchmark exists"
+            echo "No benchmarks directory found"
           fi
-        else
-          echo "No benchmarks directory found"
-        fi
-
-    - name: Upload benchmark results
-      uses: actions/upload-artifact@v3
-      with:
-        name: benchmark-results
-        path: .benchmarks/
-
-    - name: Alert on regression
-      if: failure()
-      run: |
-        echo "::warning::Performance regression detected! Check benchmark results."
+
+      - name: Upload benchmark results
+        uses: actions/upload-artifact@v3
+        with:
+          name: benchmark-results
+          path: .benchmarks/
+
+      - name: Alert on regression
+        if: failure()
+        run: |
+          echo "::warning::Performance regression detected! Check benchmark results."
```
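The regression check above shells out to `scripts/compare_benchmarks.py`, which is not part of this diff (note also that `$benchmark_dir` is never assigned in the step, so it would need to be set, e.g. to `.benchmarks/Linux-CPython-3.10-64bit`, for `BASELINE_FILE` and `CURRENT_FILE` to resolve). A minimal sketch of what such a comparison script could look like, assuming the JSON layout written by pytest-benchmark's `--benchmark-autosave` (a top-level `benchmarks` list whose entries carry a `stats` dict with a `mean` runtime in seconds):

```python
"""Hypothetical sketch of scripts/compare_benchmarks.py; the real script is
not shown in this commit."""
import json
import sys

THRESHOLD = 1.10  # fail when a benchmark is >10% slower than the baseline


def load_means(path):
    """Map each benchmark's fullname to its mean runtime in seconds."""
    with open(path) as f:
        data = json.load(f)
    return {b["fullname"]: b["stats"]["mean"] for b in data["benchmarks"]}


def main(baseline_path, current_path):
    baseline = load_means(baseline_path)
    current = load_means(current_path)
    regressions = []
    for name, mean in current.items():
        base = baseline.get(name)
        if base and mean > base * THRESHOLD:
            regressions.append(
                f"{name}: {base:.6f}s -> {mean:.6f}s ({mean / base - 1:+.1%})"
            )
    if regressions:
        print("Performance regressions detected:")
        print("\n".join(regressions))
        sys.exit(1)  # non-zero exit fails the CI step
    print("No regressions above the 10% threshold.")


if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```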

README.md

Lines changed: 5 additions & 4 deletions
```diff
@@ -346,13 +346,14 @@ auto_service = TextService() # engine="auto" is the default
 
 Benchmark tests show that the regex engine is significantly faster than spaCy for PII detection:
 
-| Engine | Processing Time (10KB text) | Entities Detected |
-|--------|------------------------------|-------------------|
+| Engine | Processing Time (10KB text) | Entities Detected                                    |
+| ------ | --------------------------- | ---------------------------------------------------- |
 | Regex  | ~0.004 seconds              | EMAIL, PHONE, SSN, CREDIT_CARD, IP_ADDRESS, DOB, ZIP |
-| SpaCy  | ~0.48 seconds               | PERSON, ORG, GPE, CARDINAL, FAC |
-| Auto   | ~0.004 seconds              | Same as regex when patterns are found |
+| SpaCy  | ~0.48 seconds               | PERSON, ORG, GPE, CARDINAL, FAC                      |
+| Auto   | ~0.004 seconds              | Same as regex when patterns are found                |
 
 **Key findings:**
+
 - The regex engine is approximately **123x faster** than spaCy for processing the same text
 - The auto engine provides the best balance between speed and comprehensiveness
 - Uses fast regex patterns first
```
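For context, here is how the engine selection described in that README section might look in use — a hypothetical snippet built from the identifiers visible in this commit (`TextService`, `engine`, `annotate`); the import path is an assumption:

```python
# Hypothetical usage; the import path and exact return shapes may differ
# in the actual package.
from text_service import TextService

text = "Contact jane.doe@example.com or call 555-867-5309."

regex_service = TextService(engine="regex")  # fast pattern matching only
spacy_service = TextService(engine="spacy")  # NER model, slower but broader
auto_service = TextService()                 # engine="auto" is the default

# "auto" tries the regex pass first and falls back to spaCy when it finds nothing
annotations = auto_service.annotate(text)
print(annotations)  # e.g. {"EMAIL": ["jane.doe@example.com"], "PHONE": ["555-867-5309"]}
```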

notes/story-1.3-tkt.md

Lines changed: 14 additions & 10 deletions
````diff
@@ -1,5 +1,3 @@
-
-
 ## **Story 1.3 – Integrate Regex Annotator into `TextService`**
 
 > **Goal:** Allow `TextService` to support a pluggable engine via `engine="regex" | "spacy" | "auto"`.
@@ -8,6 +6,7 @@
 ---
 
 ### 📂 0. **Preconditions**
+
 - [ ] Confirm `RegexAnnotator` is implemented and returns both:
   - `Dict[str, List[str]]` for legacy compatibility
   - `AnnotationResult` for structured output
@@ -18,6 +17,7 @@
 ### 🔨 1. Add `engine` Parameter to `TextService`
 
 #### Code:
+
 ```python
 class TextService:
     def __init__(self, engine: str = "auto", ...):
@@ -33,6 +33,7 @@ class TextService:
 Add branching logic to support all three modes.
 
 #### Pseudocode:
+
 ```python
 def annotate(self, text: str, structured: bool = False):
     if self.engine == "regex":
````
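The hunk above cuts off mid-function. A sketch of how the full branching could look — hypothetical, extrapolated from the ticket's description of the three modes rather than taken from the actual implementation; the `RegexAnnotator` import path and the `_spacy_annotate` helper are assumptions:

```python
# Hypothetical completion of the truncated pseudocode above.
from regex_annotator import RegexAnnotator  # assumed import path


class TextService:
    def __init__(self, engine: str = "auto"):
        if engine not in ("regex", "spacy", "auto"):
            raise ValueError(f"engine must be 'regex', 'spacy' or 'auto', got {engine!r}")
        self.engine = engine
        self.regex_annotator = RegexAnnotator()

    def annotate(self, text: str, structured: bool = False):
        if self.engine == "regex":
            return self.regex_annotator.annotate(text, structured=structured)
        if self.engine == "spacy":
            return self._spacy_annotate(text, structured=structured)
        # engine == "auto": run the cheap regex pass first and fall back to
        # spaCy only when regex finds nothing (the fallback threshold that
        # section 5 asks to document)
        result = self.regex_annotator.annotate(text, structured=structured)
        hits = result.values() if isinstance(result, dict) else result
        if any(hits):
            return result
        return self._spacy_annotate(text, structured=structured)

    def _spacy_annotate(self, text: str, structured: bool = False):
        ...  # spaCy-backed annotation, elided here
```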
```diff
@@ -51,28 +52,32 @@ def annotate(self, text: str, structured: bool = False):
 ### 🧪 3. Write Integration Tests
 
 #### 3.1 Happy Path (Regex Only)
+
 - [ ] `test_engine_regex_detects_simple_entities()`
-  Inputs: email, phone
-  Asserts: `TextService(engine="regex").annotate(text)` returns expected dict
+      Inputs: email, phone
+      Asserts: `TextService(engine="regex").annotate(text)` returns expected dict
 
 #### 3.2 Fallback (Auto → SpaCy)
+
 - [ ] `test_engine_auto_fallbacks_to_spacy()`
-  Inputs: Named entities or tricky patterns regex misses
-  Asserts: spaCy is invoked if regex finds nothing
+      Inputs: Named entities or tricky patterns regex misses
+      Asserts: spaCy is invoked if regex finds nothing
 
 #### 3.3 Explicit SpaCy
+
 - [ ] `test_engine_spacy_only()`
-  Asserts: spaCy is always used regardless of regex hits
+      Asserts: spaCy is always used regardless of regex hits
 
 #### 3.4 Structured Return
+
 - [ ] `test_structured_annotation_output()`
-  Asserts: `structured=True` returns list of `Span` objects with label/start/end/text
+      Asserts: `structured=True` returns list of `Span` objects with label/start/end/text
 
 ---
 
 ### 📏 4. Performance Budget (Optional But Valuable)
 
-- [ ] Add benchmarking test to compare `regex` vs `spacy` on a 10 KB text
+- [ ] Add benchmarking test to compare `regex` vs `spacy` on a 10 KB text
 - [ ] Log and confirm regex is ≥5× faster than spaCy in most scenarios
 
 ---
```
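A sketch of what the tests in 3.1 and 3.4 might look like — hypothetical; the import path, entity labels, and `Span` attribute names are assumptions based on this ticket and the README:

```python
# Hypothetical versions of the tests described above.
from text_service import TextService  # assumed import path


def test_engine_regex_detects_simple_entities():
    text = "Reach me at jane.doe@example.com or 555-867-5309."
    result = TextService(engine="regex").annotate(text)

    # Legacy Dict[str, List[str]] shape from the preconditions in section 0
    assert "jane.doe@example.com" in result["EMAIL"]
    assert "555-867-5309" in result["PHONE"]


def test_structured_annotation_output():
    text = "Reach me at jane.doe@example.com."
    spans = TextService(engine="regex").annotate(text, structured=True)

    # Structured mode returns Span objects with label/start/end/text
    assert spans, "expected at least one span"
    span = spans[0]
    assert span.label == "EMAIL"
    assert text[span.start : span.end] == span.text
```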
```diff
@@ -84,4 +89,3 @@
 - [ ] Add a comment near the `auto` logic explaining fallback threshold
 
 ---
-
```
