Optimize with GEPA

Learn how to optimize your DSPy programs using GEPA (Genetic Pareto).

Overview

GEPA (Genetic Pareto) is a powerful optimization technique that evolves prompts to improve your DSPy program's performance.

Prerequisites

A DSPy program to optimize
Training data in JSONL format
Connected language model

Step 1: Create Your Program

First, create a DSPy program:

dspy-code
/init
/connect ollama llama3.1:8b

Create a sentiment analyzer module

/save sentiment_analyzer.py

Step 2: Generate Training Data

Create training examples:

/generate data 50 for sentiment analysis

This creates a training_data.jsonl file with examples like:

{"text": "I love this product!", "sentiment": "positive"}
{"text": "This is terrible.", "sentiment": "negative"}

Step 3: Optimize with GEPA

/optimize sentiment_analyzer.py training_data.jsonl

Optimization Cost (Cloud & Local)

Cloud providers (OpenAI, Anthropic, Gemini): GEPA may perform many optimization calls. Only run /optimize if you're aware of the potential API cost and have a billing plan/quotas that can support it.
Local hardware: For smoother optimization with local models, we recommend at least 32 GB RAM.

What Happens During Optimization

Initialization - GEPA creates an initial population of prompt variations
Evaluation - Each variation is tested on training data
Selection - Best-performing prompts are selected
Evolution - New prompts are created through mutation and crossover
Iteration - Process repeats for multiple generations
Result - Best prompt is returned

Understanding the Output

Generation 1: Best fitness = 0.75
Generation 2: Best fitness = 0.82
Generation 3: Best fitness = 0.88
...
Final: Best fitness = 0.92

Advanced Optimization

Custom Metrics

Define your own metric:

def my_metric(example, pred, trace=None):
    """Custom evaluation metric."""
    if pred.sentiment == example.sentiment:
        return 1.0
    return 0.0

Optimization Parameters

from gepa import GEPAOptimizer

optimizer = GEPAOptimizer(
    population_size=50,      # Larger population
    generations=20,          # More generations
    mutation_rate=0.15,      # Higher mutation
    crossover_rate=0.7       # Crossover probability
)

Best Practices

Quality Data - Use high-quality training examples
Sufficient Data - At least 20-50 examples
Diverse Examples - Cover various cases
Validation Set - Keep some data for validation
Patience - Optimization takes time

Example Workflow

# 1. Create program
dspy-code
> Create a sentiment analyzer

# 2. Save it
/save sentiment_analyzer.py

# 3. Generate training data
/generate data 50 for sentiment analysis

# 4. Optimize
/optimize sentiment_analyzer.py training_data.jsonl

# 5. Test optimized version
/run sentiment_analyzer_optimized.py --input text="This is great!"

Troubleshooting

Low Performance

Check training data quality
Increase number of examples
Adjust optimization parameters
Try different model

Slow Optimization

Reduce population size
Reduce generations
Use faster model
Optimize on subset first

Next Steps

Evaluate on test set
Deploy optimized program
Monitor performance
Iterate and improve

For more details, see Optimization