
Prompt Optimizer

Python 3.13+ | License: MIT | OpenRouter | GitHub Issues

🚀 Stop struggling with prompt engineering. Let AI optimize your prompts automatically.


Generate accurate extraction prompts directly from your labeled data. Improve your existing prompts and eliminate manual prompt tuning.

Prompt Optimizer uses a mentor-agent architecture to automatically generate, refine, and optimize prompts for your specific use case. Simply provide your labeled examples, define your output schema, and let the system discover the optimal prompt through iterative learning.

✨ Why Use This?

  • 🎯 Automatic Prompt Discovery: Don't know where to start? The system generates an initial prompt based on your data
  • 📈 Continuous Improvement: Each iteration learns from mistakes and produces better prompts
  • ⚡ Token Efficiency: When accuracy is tied, the shortest (cheapest) prompt wins
  • 🔄 Works for Any Domain: From NER to calculations, entity extraction to data transformation

🔧 How It Works

┌─────────────────────────────────────────────────────────────────┐
│                    OPTIMIZATION LOOP                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────┐     ┌─────────┐     ┌───────────┐     ┌─────────┐  │
│  │  Data   │────▶│  Agent  │────▶│ Evaluator │────▶│ Mentor  │  │
│  │ Samples │     │  Model  │     │           │     │  Model  │  │
│  └─────────┘     └─────────┘     └───────────┘     └────┬────┘  │
│                       ▲                                 │       │
│                       │         Improved Prompt         │       │
│                       └─────────────────────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
  1. Agent Model: Processes input data with the current prompt and extracts structured information
  2. Evaluator: Compares extractions against ground truth, calculating accuracy and identifying errors
  3. Mentor Model: Analyzes failed predictions and generates an improved prompt
  4. Loop: Repeats for a configurable number of iterations, or until optimal accuracy is achieved
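The loop above can be pictured with a short Python sketch. This is not the project's actual implementation (that lives in src/prompt_optimizer/core); the agent and mentor callables and the field-level exact-match scoring are assumptions made for illustration.

from typing import Callable


def optimization_loop(
    prompt: str,
    samples: list[dict],
    loops: int,
    agent: Callable[[str, str], dict],   # (prompt, source_text) -> extracted fields
    mentor: Callable[[str, list], str],  # (prompt, failures) -> improved prompt
) -> tuple[float, str]:
    """Return (best_accuracy, best_prompt) after `loops` iterations."""
    best_accuracy, best_prompt = 0.0, prompt
    for _ in range(loops):
        failures, correct, total = [], 0, 0
        for sample in samples:
            predicted = agent(prompt, sample["source_text"])       # 1. Agent Model extracts
            for key, expected in sample["target_result"].items():  # 2. Evaluator scores each field
                total += 1
                if str(predicted.get(key)) == str(expected):
                    correct += 1
                else:
                    failures.append((sample["source_text"], key, predicted.get(key), expected))
        accuracy = correct / total if total else 0.0
        if accuracy > best_accuracy:
            best_accuracy, best_prompt = accuracy, prompt
        prompt = mentor(prompt, failures)                          # 3. Mentor Model rewrites the prompt
    return best_accuracy, best_prompt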

🚀 Quick Start

Option 1: Local Installation

# Clone and install
git clone https://github.com/ademakdogan/prompt-optimizer.git
cd prompt-optimizer
uv sync --all-extras

# Configure
echo "OPENROUTER_API_KEY=your-api-key" > .env

# Run
uv run python -m prompt_optimizer --data resources/test_mapping.json --samples 5

Option 2: Docker

# Clone and configure
git clone https://github.com/ademakdogan/prompt-optimizer.git
cd prompt-optimizer
echo "OPENROUTER_API_KEY=your-api-key" > .env

# Build and run
make build
make optimize DATA=resources/test_mapping.json SAMPLES=5 LOOPS=3

Makefile Commands

  • make build: Build Docker image
  • make run: Run container (shows help)
  • make optimize: Run optimization with defaults
  • make test: Run tests in Docker
  • make shell: Open shell in container
  • make clean: Remove container and image
  • make help: Show all available commands

Custom optimization:

make optimize DATA=resources/my_data.json SAMPLES=10 LOOPS=5

πŸ“ Example Datasets

The project includes two example datasets demonstrating different problem types:

1. Named Entity Recognition (test_pii.json)

This dataset demonstrates entity extraction from unstructured text. The AI must identify and extract specific pieces of information (names, emails, coordinates, etc.) from natural language.

{
  "source_text": "Dear Mr. Vandervort, congratulations on turning 68! Your unique ID is 0Zr2bcG1X9Ub. Visit us at 609 Gorczany Pass.",
  "target_result": {
    "prefix": "Mr.",
    "lastname": "Vandervort", 
    "age": "68",
    "username": "0Zr2bcG1X9Ub",
    "street": "609 Gorczany Pass"
  }
}

Characteristics:

  • Text is unstructured natural language
  • Fields must be recognized and extracted from context
  • Values are copied verbatim from source text
  • Typical for: NER, PII detection, document parsing, CV/resume extraction

2. Calculation & Mapping (test_mapping.json)

This dataset demonstrates data transformation where the AI must not only extract but also calculate derived values from input fields.

{
  "source_text": "\"name\": 'TechSolutions Inc', \"gross\": 1000, \"commission_rate\": 0.1, \"vat\": 180",
  "target_result": {
    "client_name": "TechSolutions Inc",
    "total_gross": 1180,
    "total_mid_gross": 1280
  }
}

Characteristics:

  • Input is semi-structured (key-value pairs)
  • Fields require mathematical calculations (verified in the snippet after this list):
    • total_gross = gross + vat
    • total_mid_gross = total_gross + (commission_rate × gross)
  • The AI must learn the formulas from examples
  • Typical for: Financial calculations, data transformation, ETL pipelines
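The two formulas can be checked against the example record above; the snippet below just reproduces the arithmetic in plain Python.

# Reproduce the example record's arithmetic to confirm the two formulas.
gross, commission_rate, vat = 1000, 0.1, 180

total_gross = gross + vat                                # 1000 + 180 = 1180
total_mid_gross = total_gross + commission_rate * gross  # 1180 + 0.1 * 1000 = 1280.0

assert total_gross == 1180
assert total_mid_gross == 1280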

πŸ“ How to Use with Your Own Data

Step 1: Prepare Your Dataset

Create a JSON file with your labeled examples in the following format:

[
  {
    "source_text": "Your input text here...",
    "target_result": {
      "field1": "expected_value1",
      "field2": "expected_value2"
    }
  },
  // ... more examples
]

Save it to resources/your_dataset.json.
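If your labeled examples currently live somewhere else (a spreadsheet, a database, an annotation tool), a few lines of Python are enough to write them in this shape; the field names below are placeholders from the template above.

import json

# Placeholder records in the expected {"source_text": ..., "target_result": {...}} shape.
examples = [
    {
        "source_text": "Your input text here...",
        "target_result": {"field1": "expected_value1", "field2": "expected_value2"},
    },
    # ... more examples
]

with open("resources/your_dataset.json", "w", encoding="utf-8") as f:
    json.dump(examples, f, indent=2, ensure_ascii=False)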

Step 2: Define Your Schema

Open src/prompt_optimizer/models/agent_model.py and update the ExtractionSchema class to match your target_result fields:

class ExtractionSchema(BaseModel):
    # ═══════════════════════════════════════════════════════════════
    # 🔧 USER CUSTOMIZATION SECTION - MODIFY THIS FOR YOUR DATASET
    # ═══════════════════════════════════════════════════════════════
    
    # Define fields matching your target_result keys:
    field1: str  # Required field
    field2: Optional[str] = None  # Optional field
    field3: Optional[float] = None  # Optional numeric field

Tips:

  • Use str for required fields, Optional[str] for optional ones
  • Use float or int for numeric values
  • Add Field(description="...") to provide hints to the AI
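As a concrete illustration, a schema matching the target_result keys of test_mapping.json could look like the sketch below. The field descriptions are only suggestions, not part of the shipped code.

from typing import Optional

from pydantic import BaseModel, Field


class ExtractionSchema(BaseModel):
    """Example schema for resources/test_mapping.json (illustrative descriptions)."""

    client_name: str = Field(description="Client or company name from the record")
    total_gross: float = Field(description="gross + vat")
    total_mid_gross: Optional[float] = Field(
        default=None, description="total_gross + (commission_rate * gross)"
    )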

Step 3: Run Optimization

uv run python -m prompt_optimizer \
    --data resources/your_dataset.json \
    --samples 10 \
    --loops 5

Step 4: Get Your Optimized Prompt

After optimization completes, find your prompt in:

  • final_prompt.txt: The best-performing prompt
  • mentor_prompts.txt: Full history of all iterations

βš™οΈ Configuration

Environment Variables

  • OPENROUTER_API_KEY (required): Your OpenRouter API key
  • AGENT_MODEL (default: openai/gpt-4.1-nano): Model for data extraction
  • MENTOR_MODEL (default: openai/gpt-4.1-nano): Model for prompt improvement
  • WINDOW_SIZE (default: 2): History iterations shown to mentor
  • LOOP_COUNT (default: 3): Maximum optimization iterations
  • LOG_LEVEL (default: INFO): Logging verbosity
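For example, to run with a stronger mentor model, set the two model variables before the optimizer starts. The snippet below assumes the settings are resolved from the environment (or .env) when prompt_optimizer is imported, so values exported beforehand take effect; the chosen models are just examples from the Supported Models table.

import os

# Assumption: settings are read from the environment at import time.
os.environ["AGENT_MODEL"] = "openai/gpt-4.1-nano"
os.environ["MENTOR_MODEL"] = "openai/gpt-4.1-mini"

from prompt_optimizer.core import PromptOptimizer  # import after the overrides are set

optimizer = PromptOptimizer(window_size=2, loop_count=3)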

CLI Parameters

  • --data (string, default: resources/test_data.json): Path to your dataset
  • --samples (int, default: 5): Number of samples to use
  • --prompt (string, default: none): Initial prompt (auto-generated if not provided)
  • --loops (int, default: from settings): Number of optimization iterations
  • --window-size (int, default: from settings): History window for mentor
  • --output (string, default: none): Save results to JSON file
  • --log-level (string, default: from settings): DEBUG/INFO/WARNING/ERROR

🐍 Python API

from prompt_optimizer.core import PromptOptimizer
from prompt_optimizer.data import load_test_data

# Load your labeled data
data = load_test_data("resources/your_dataset.json", limit=10)

# Create optimizer
optimizer = PromptOptimizer(window_size=2, loop_count=5)

# Option 1: Let the system generate initial prompt
results = optimizer.optimize(data=data)

# Option 2: Improve an existing prompt
my_prompt = """
Extract client information and calculate totals.
Return as JSON with: client_name, total_gross, total_mid_gross
"""
results = optimizer.optimize(data=data, initial_prompt=my_prompt)

# The best prompt is automatically saved to final_prompt.txt
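After a run, the saved prompt can be read back from disk; this assumes the default output location in the project root.

from pathlib import Path

# Load the best prompt written by the last optimization run.
best_prompt = Path("final_prompt.txt").read_text(encoding="utf-8")
print(best_prompt)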

📊 Optimization Metrics

After each run, you'll see a detailed metrics table:

╔════════════════════════════════════════════════════════════╗
║ OPTIMIZATION METRICS                                       ║
╠════════════════════════════════════════════════════════════╣
║  Total iterations:      4                                  ║
║  Best accuracy:         93.33% (iter 3, ~199 tokens)       ║
║  Final accuracy:        56.67%                             ║
║  Accuracy improvement:  +18.33%                            ║
║  Average accuracy:      56.67%                             ║
╠════════════════════════════════════════════════════════════╣
║  ITERATION DETAILS                                         ║
╠════════════╦══════════════╦════════════════════════════════╣
║ Iteration  ║   Accuracy   ║ Progress                       ║
╠════════════╬══════════════╬════════════════════════════════╣
║     1      ║      38.3%   ║ ███████░░░░░░░░░░░░░           ║
║     2      ║      38.3%   ║ ███████░░░░░░░░░░░░░           ║
║     3      ║      93.3%   ║ ██████████████████░░           ║
║     4      ║      56.7%   ║ ███████████░░░░░░░░░           ║
╚════════════╩══════════════╩════════════════════════════════╝

Smart Best Result Selection

When multiple iterations achieve the same accuracy, the optimizer automatically selects the prompt with the fewest tokens. This ensures:

  • Lower API costs
  • Faster inference
  • Reduced context window usage
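A minimal sketch of that tie-breaking rule, assuming each iteration is summarized as an (accuracy, token_count, prompt) tuple; the numbers below are illustrative, loosely echoing the metrics example above.

def select_best(iterations: list[tuple[float, int, str]]) -> str:
    """Highest accuracy wins; on a tie, the prompt with the fewest tokens."""
    accuracy, tokens, prompt = max(iterations, key=lambda it: (it[0], -it[1]))
    return prompt


history = [
    (0.383, 240, "prompt v1"),
    (0.933, 231, "prompt v3, longer wording"),
    (0.933, 199, "prompt v3, tightened wording"),
]
print(select_best(history))  # -> "prompt v3, tightened wording"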

📂 Project Structure

prompt-optimizer/
├── src/prompt_optimizer/
│   ├── api/              # OpenRouter client, agent, mentor
│   ├── config/           # Settings management
│   ├── core/             # Optimizer loop and evaluator
│   ├── data/             # Data loading utilities
│   ├── models/           # Pydantic schemas (edit agent_model.py!)
│   └── utils/            # Logging, metrics, persistence
├── resources/            # Example datasets
│   ├── test_pii.json     # NER/Entity extraction example
│   └── test_mapping.json # Calculation/mapping example
├── tests/                # Unit, integration, e2e tests
├── final_prompt.txt      # Best prompt from last run
└── mentor_prompts.txt    # Full optimization history

🧪 Testing

# Run all tests
uv run pytest tests/ -v

# Run with coverage
uv run pytest tests/ --cov=prompt_optimizer --cov-report=html

🤖 Supported Models

Any model on OpenRouter can be used:

  • openai/gpt-4.1-nano: fast, good quality, low cost
  • openai/gpt-4.1-mini: medium speed, better quality, medium cost
  • google/gemini-2.5-flash-lite: fast, good quality, low cost
  • anthropic/claude-3-haiku: fast, good quality, low cost

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
