fix: migrate 3 plugin notification files from gpt-4 to gpt-4.1-mini #4691
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Code Review
This pull request successfully migrates three plugin notification files from gpt-4 to the more cost-effective gpt-4.1-mini model, which is a good improvement. However, the changes highlight a significant maintainability issue: model names are hardcoded as string literals in multiple places. This practice is error-prone and makes future updates difficult, as evidenced by these files being missed in a previous migration. I've added comments to each file recommending the use of constants for model names, ideally in a centralized configuration, to improve maintainability and prevent similar issues in the future. While the current changes are correct, addressing this underlying structural issue would be highly beneficial.
```diff
 response = client.chat.completions.create(
-    model="gpt-4",
+    model="gpt-4.1-mini",
```
While changing the model to gpt-4.1-mini is correct, hardcoding the model name as a string literal here and in other plugin files makes maintenance difficult. For example, if you need to update the model again, you'll have to find and replace it in multiple locations, which is error-prone. This is likely why these files were missed in the previous migration (PR #4675).
To improve maintainability, I recommend defining model names as constants in a centralized place, perhaps a shared config.py for all plugins, or at least at the top of each file.
Example:

```python
# At the top of the file
DRINKING_INTENT_MODEL = "gpt-4.1-mini"

# In analyze_drinking_intent()
...
response = client.chat.completions.create(
    model=DRINKING_INTENT_MODEL,
    ...
)
```

This would make future updates much safer and easier.
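Going one step further, here is one possible shape for the shared `config.py` suggested above. This is a sketch, not the repository's actual layout: the module path, the env-var override mechanism, and `DEFAULT_MODEL` are all assumptions; the per-plugin constant names follow the ones used in this review.

```python
# Hypothetical plugins/example/config.py -- a single source of truth
# for model names across all plugin notification files.
import os

# Fallback used when no per-plugin override is set in the environment.
DEFAULT_MODEL = "gpt-4.1-mini"

# Each plugin gets its own constant so models can diverge later if needed;
# env vars allow per-environment overrides without code changes.
DRINKING_INTENT_MODEL = os.getenv("DRINKING_INTENT_MODEL", DEFAULT_MODEL)
OMI_RESPONSE_MODEL = os.getenv("OMI_RESPONSE_MODEL", DEFAULT_MODEL)
TOPIC_EXTRACTION_MODEL = os.getenv("TOPIC_EXTRACTION_MODEL", DEFAULT_MODEL)
```

Each plugin would then import its constant (e.g. `from config import OMI_RESPONSE_MODEL`) instead of repeating the string literal, so the next migration touches one file.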
```diff
 response = client.chat.completions.create(
-    model="gpt-4",
+    model="gpt-4.1-mini",
```
Similar to the other plugin files in this PR, hardcoding the model name gpt-4.1-mini as a string literal here can lead to maintenance issues. Centralizing model definitions, for instance as constants at the top of the file or in a shared configuration, would make future updates easier and less error-prone. The fact that these files were missed in a previous migration highlights the risk of scattered, hardcoded configuration values.
A better approach would be:
```python
# At the top of the file
OMI_RESPONSE_MODEL = "gpt-4.1-mini"

# In get_openai_response()
...
response = client.chat.completions.create(
    model=OMI_RESPONSE_MODEL,
    ...
)
```

```diff
 try:
     response = client.chat.completions.create(
-        model="gpt-4",
+        model="gpt-4.1-mini",
```
Hardcoding the model name gpt-4.1-mini here presents a maintainability risk. As seen with this PR, when model names are scattered as string literals across multiple files, it's easy to miss some during an update. This can lead to inconsistent model usage and unexpected costs.
To prevent this in the future, I recommend defining the model name as a constant. A centralized configuration would be ideal, but even a constant at the top of this file would be a significant improvement.
Example:

```python
# At the top of the file
TOPIC_EXTRACTION_MODEL = "gpt-4.1-mini"

# In extract_topics()
...
response = client.chat.completions.create(
    model=TOPIC_EXTRACTION_MODEL,
    ...
)
```
**Smoke Test Results — Local Dev Environment**

Tested all 3 plugin functions against the live OpenAI API using dev environment credentials.

**Results**

All 3 functions confirmed working on `gpt-4.1-mini`.

🤖 Generated with Claude Code
**Smoke Test Results — Local Plugin Services**

Started all 3 plugin services locally with dev environment credentials and hit actual HTTP webhook endpoints.

**Service Setup**
**Test Results**

**Endpoint Health Checks**
All 3 plugin services start, accept requests, and make successful `gpt-4.1-mini` API calls.

🤖 Generated with Claude Code
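As a faster offline complement to the live smoke tests above, the migration can also be verified by stubbing the OpenAI client and asserting on the model argument the plugin code passes. The helper below is a simplified stand-in for the plugin's `get_openai_response()`, not the actual implementation from `hey_omi.py`:

```python
from unittest.mock import MagicMock

# Stubbed client: no network, no credentials needed.
client = MagicMock()

def get_openai_response(client, question):
    # Simplified stand-in for the plugin helper of the same name;
    # the real implementation lives in hey_omi.py.
    return client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[{"role": "user", "content": question}],
    )

get_openai_response(client, "what is 2 plus 2?")

# Inspect what the plugin code actually sent to the (stubbed) API.
called_model = client.chat.completions.create.call_args.kwargs["model"]
assert called_model == "gpt-4.1-mini"
```

This kind of check catches a missed `gpt-4` literal in CI without spending API credits.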
**Smoke Test Results — Full Plugins Service**
| Plugin | Endpoint | Result | Model |
|---|---|---|---|
| mentor | `POST /notification/mentor/webhook` | 202 (buffered). `extract_topics()` → `["machine learning", "data science", "career development"]` | gpt-4.1-mini |
| hey_omi | `POST /notifications/webhook` | Trigger "hey omi" detected → question aggregated → OpenAI answered "2 plus 2 is 4." | gpt-4.1-mini |
| drinking_app | standalone Flask (not in `main.py`) | Correctly detected drinking intent (YES) | gpt-4.1-mini |
**Setup-Status Endpoints**

- `GET /notification/mentor/webhook/setup-status` → `{"is_setup_completed": true}`
- `GET /notifications/webhook/setup-status` → `{"is_setup_completed": true}`
The gpt-4 → gpt-4.1-mini migration is confirmed working on the full plugins service.
🤖 Generated with Claude Code
**GPT-5.1 Judge Eval — hey_omi model migration (gpt-4 → gpt-4.1-mini)**

12 test cases × 3 runs = 36 evaluations, judged by gpt-5.1.

**Aggregate Scores**
Per-Test Breakdown
**Verdict: PASS (4.31/5)**

gpt-4.1-mini is suitable for hey_omi. Lower scores on how-to questions (#2, #6, #8) are due to …

🤖 Generated with Claude Code
**A/B Eval — gpt-4 vs gpt-4.1-mini (hey_omi) | Judged by gpt-5.1**

12 test cases × 3 runs × 2 models = 72 model calls, all judged by gpt-5.1.

**Criteria Comparison**
Per-Test Comparison (Overall score, avg of 3 runs)
**Verdict: PASS**

gpt-4.1-mini scores +0.06 higher than gpt-4 overall. The only regression is on the carbonara recipe (-0.7), which both models struggle with equally due to …

Safe to migrate — no quality regression.

🤖 Generated with Claude Code
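For readers unfamiliar with how such an A/B verdict is aggregated, here is a minimal sketch of the arithmetic: average each test case over its runs, then average across cases per model, then take the difference. The scores below are illustrative placeholders on the eval's 1–5 scale, not the actual judge data from this PR:

```python
from statistics import mean

# Illustrative judge scores: {test_case: [run1, run2, run3]} per model.
# Placeholder numbers -- NOT the eval's real results.
scores = {
    "gpt-4": {
        "math": [5.0, 5.0, 5.0],
        "carbonara recipe": [3.0, 2.5, 3.0],
    },
    "gpt-4.1-mini": {
        "math": [5.0, 5.0, 5.0],
        "carbonara recipe": [2.5, 2.0, 2.5],
    },
}

def overall(model):
    # Average each test case over its 3 runs, then average across cases.
    return mean(mean(runs) for runs in scores[model].values())

# Positive delta means the new model scores higher overall.
delta = overall("gpt-4.1-mini") - overall("gpt-4")
```

Averaging per test case first (rather than pooling all 36 runs) keeps each test case equally weighted regardless of run count.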
Your evals look naive, but they are good first steps. lgtm
**Summary**

- `model="gpt-4"` → `model="gpt-4.1-mini"` in 3 plugin notification files

**Files changed**

- `plugins/example/notifications/mentor/main.py` — `extract_topics()` — highest volume
- `plugins/example/notifications/hey_omi.py` — `get_openai_response()`
- `plugins/example/notifications/drinking_app.py`

Fixes #4690
**Test plan**

- `gpt-4` hardcodes in notifications dir
- `plugins/example/main.py` service via `uvicorn main:app` with dev env
- `gpt-4.1-mini` API calls
- `extract_topics()` → valid JSON topic array
- `gpt-4.1-mini` → `YES`

🤖 Generated with Claude Code
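The first test-plan item (checking for leftover `gpt-4` hardcodes) is easy to automate, which would also catch files missed in future migrations. A minimal sketch, assuming the notifications directory path from this PR's file list:

```python
import pathlib
import re

# Matches model="gpt-4" exactly -- the closing quote in the pattern
# prevents false positives on model="gpt-4.1-mini".
OLD_MODEL = re.compile(r'model\s*=\s*"gpt-4"')

def find_stale_models(root):
    """Return .py files under root that still hardcode the old model."""
    return [
        path
        for path in pathlib.Path(root).rglob("*.py")
        if OLD_MODEL.search(path.read_text())
    ]

# Usage (path assumed from this PR's file list):
# find_stale_models("plugins/example/notifications")
```

Run as a CI step, a non-empty result fails the build, so a scattered literal can never silently survive a migration again.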