DSPy Optimizers API Reference
This document provides the API reference for using DSPy optimizers in SuperOptiX through the DSPyOptimizerFactory and related classes.
DSPyOptimizerFactory
The DSPyOptimizerFactory provides a unified interface for creating and configuring all supported DSPy optimizers.
Methods
create_optimizer(optimizer_name, params=None, lm_config=None)
Creates a DSPy optimizer instance with the specified configuration.
Parameters:
- optimizer_name (str): Name of the optimizer to create
- params (dict, optional): Optimizer-specific parameters
- lm_config (dict, optional): Language model configuration
Returns:
- Configured DSPy optimizer instance
Example:
from superoptix.core.optimizer_factory import DSPyOptimizerFactory
optimizer = DSPyOptimizerFactory.create_optimizer(
    optimizer_name="GEPA",
    params={
        "metric": "answer_exact_match",
        "auto": "light",
        "reflection_lm": "qwen3:8b"
    },
    lm_config={
        "model": "llama3.1:8b",
        "provider": "ollama"
    }
)
create_tier_optimized_optimizer(tier, training_data_size, optimizer_config=None)
Creates an optimizer with settings optimized for the specified agent tier and data size.
Parameters:
- tier (str): Agent tier ("oracles", "genies", "protocols", "superagents")
- training_data_size (int): Number of training examples
- optimizer_config (dict, optional): Override default configuration
Returns:
- Configured optimizer tuned for the tier
Example:
optimizer = DSPyOptimizerFactory.create_tier_optimized_optimizer(
tier="oracles",
training_data_size=5
)
get_supported_optimizers()
Returns a list of all supported optimizer names.
Returns:
- list: Available optimizer names
Example:
optimizers = DSPyOptimizerFactory.get_supported_optimizers()
# Returns: ['GEPA', 'SIMBA', 'MIPROv2', 'BootstrapFewShot', ...]
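Because optimizer names are plain strings, it can help to check a requested name against this list before calling create_optimizer. The following is a minimal sketch under stated assumptions: `resolve_optimizer_name` is a hypothetical helper, not part of SuperOptiX, and the hard-coded list only stands in for the value returned by get_supported_optimizers().

```python
def resolve_optimizer_name(requested, supported):
    """Return the canonical spelling of `requested` from `supported`.

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace; a ValueError is raised when nothing matches.
    """
    by_lower = {name.lower(): name for name in supported}
    try:
        return by_lower[requested.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown optimizer {requested!r}; expected one of {sorted(supported)}"
        ) from None


# Stand-in for DSPyOptimizerFactory.get_supported_optimizers()
supported = ["GEPA", "SIMBA", "MIPROv2", "BootstrapFewShot", "BetterTogether"]
print(resolve_optimizer_name("miprov2", supported))  # prints "MIPROv2"
```

Normalizing the name up front keeps a typo from surfacing only after models have been loaded.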
Optimizer Configurations
GEPA (Genetic-Pareto)
Reflective prompt evolution: a reflection LM reads the textual feedback returned by the metric and proposes improved prompts.
Configuration Parameters
gepa_params = {
# Core settings
"metric": str, # Evaluation metric function name
"auto": str, # Budget control: "minimal", "light", "medium", "heavy"
# Reflection settings
"reflection_lm": str, # Model for reflection analysis
"reflection_minibatch_size": int, # Examples per reflection batch
# Optimization controls
"max_full_evals": int, # Maximum full evaluations
"skip_perfect_score": bool, # Skip if score is perfect
"add_format_failure_as_feedback": bool, # Include format errors in feedback
# Advanced settings
"predictor_level_feedback": bool, # Component-level feedback
"format_failure_feedback": bool # Format-specific feedback
}
Available Metrics
- answer_exact_match: Basic exact string matching
- advanced_math_feedback: Mathematical reasoning with step validation
- multi_component_enterprise_feedback: Business document multi-aspect analysis
- vulnerability_detection_feedback: Security analysis with remediation
- privacy_preservation_feedback: Data privacy compliance assessment
- medical_accuracy_feedback: Healthcare safety validation
- legal_analysis_feedback: Legal compliance verification
Budget Settings
| Budget | Description | Time | Evaluations | Use Case |
|---|---|---|---|---|
| minimal | Fastest optimization | 1-2 min | ~50 calls | Quick testing |
| light | Balanced speed/quality | 3-5 min | ~400 calls | Development |
| medium | Quality focused | 8-12 min | ~800 calls | Production prep |
| heavy | Maximum quality | 15-30 min | ~1600 calls | Critical applications |
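One way to use the budget table is to pick the largest budget whose upper time estimate still fits an available time window. The sketch below assumes exactly that policy; `pick_budget` and `BUDGET_MAX_MINUTES` are illustrative names, and the numbers are the upper bounds of the Time column above.

```python
# Upper time bounds in minutes, copied from the budget table.
BUDGET_MAX_MINUTES = {"minimal": 2, "light": 5, "medium": 12, "heavy": 30}


def pick_budget(time_limit_minutes):
    """Return the largest budget whose upper time estimate fits the limit."""
    chosen = None
    # Dicts preserve insertion order, so budgets are visited small to large.
    for budget, max_minutes in BUDGET_MAX_MINUTES.items():
        if max_minutes <= time_limit_minutes:
            chosen = budget
    if chosen is None:
        raise ValueError("time limit is below the smallest budget estimate")
    return chosen


print(pick_budget(10))  # prints "light"
```

Actual run time depends on the models and hardware in use, so the estimates are best treated as rough guides.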
Example Configuration
gepa_config = {
"name": "GEPA",
"params": {
"metric": "advanced_math_feedback",
"auto": "light",
"reflection_lm": "qwen3:8b",
"reflection_minibatch_size": 3,
"skip_perfect_score": True,
"add_format_failure_as_feedback": True
}
}
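A params dictionary like this can be checked against the types listed under Configuration Parameters before it is handed to the factory. This is only a sketch: `GEPA_PARAM_TYPES` and `check_params` are hypothetical names, and the factory's own validation (which raises InvalidParameterError) may behave differently.

```python
# Expected types, transcribed from the GEPA parameter listing.
GEPA_PARAM_TYPES = {
    "metric": str,
    "auto": str,
    "reflection_lm": str,
    "reflection_minibatch_size": int,
    "max_full_evals": int,
    "skip_perfect_score": bool,
    "add_format_failure_as_feedback": bool,
    "predictor_level_feedback": bool,
    "format_failure_feedback": bool,
}


def check_params(params, schema):
    """Return a list of problems; an empty list means the params look valid."""
    problems = []
    for key, value in params.items():
        expected = schema.get(key)
        if expected is None:
            problems.append(f"unknown parameter: {key}")
        elif key == "metric" and callable(value):
            continue  # a metric may also be passed as a function
        elif expected is int and isinstance(value, bool):
            # bool is a subclass of int, so reject it explicitly for int fields
            problems.append(f"{key}: expected int, got bool")
        elif not isinstance(value, expected):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


print(check_params({"auto": "light", "invalid_param": "value"}, GEPA_PARAM_TYPES))
```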
SIMBA (Stochastic Introspective Mini-Batch Ascent)
Advanced optimization with mini-batch processing and introspective analysis.
Configuration Parameters
simba_params = {
"metric": str, # Evaluation metric
"bsize": int, # Batch size for optimization
"num_candidates": int, # Number of candidates to generate
"max_steps": int, # Maximum optimization steps
"verbose": bool # Enable detailed logging
}
Example Configuration
simba_config = {
"name": "SIMBA",
"params": {
"metric": "answer_exact_match",
"bsize": 8,
"num_candidates": 5,
"max_steps": 10
}
}
MIPROv2 (Multiprompt Instruction Proposal Optimizer, version 2)
Jointly optimizes the instructions and few-shot demonstrations of each predictor in a program.
Configuration Parameters
miprov2_params = {
"metric": str, # Evaluation metric
"num_candidates": int, # Candidates per iteration
"init_temperature": float, # Initial sampling temperature
"verbose": bool # Enable detailed logging
}
BootstrapFewShot
Creates few-shot examples through bootstrapping with self-generated demonstrations.
Configuration Parameters
bootstrap_params = {
"metric": str, # Evaluation metric
"max_bootstrapped_demos": int, # Maximum bootstrapped examples
"max_rounds": int, # Bootstrap rounds
"max_errors": int # Error tolerance
}
BetterTogether
Ensemble approach combining multiple few-shot strategies.
Configuration Parameters
better_together_params = {
"metric": str, # Evaluation metric
"max_bootstrapped_demos": int, # Bootstrapped examples
"max_labeled_demos": int, # Labeled examples
"max_rounds": int # Optimization rounds
}
Usage Patterns
Basic Optimization Workflow
from superoptix.core.optimizer_factory import DSPyOptimizerFactory
# 1. Create optimizer
optimizer = DSPyOptimizerFactory.create_optimizer(
optimizer_name="GEPA",
params={"metric": "answer_exact_match", "auto": "light"}
)
# 2. Compile with training data
optimized_pipeline = optimizer.compile(
student=base_pipeline,
trainset=training_examples
)
# 3. Use optimized pipeline
result = optimized_pipeline(input_data)
Tier-Based Optimization
# Automatic tier optimization
optimizer = DSPyOptimizerFactory.create_tier_optimized_optimizer(
tier="oracles", # Oracle-tier agent
training_data_size=10, # Small training set
optimizer_config={ # Optional overrides
"auto": "medium"
}
)
Custom Metric Integration
def custom_domain_feedback(example, pred, trace=None):
    """Custom feedback metric for domain-specific evaluation."""
    from dspy.primitives import Prediction
    # Implement domain-specific scoring logic
    # (evaluate_prediction and generate_improvement_feedback are user-supplied helpers)
    score = evaluate_prediction(example, pred)
    feedback = generate_improvement_feedback(example, pred)
    return Prediction(score=score, feedback=feedback)
# Pass the metric function directly instead of a registered metric name
optimizer = DSPyOptimizerFactory.create_optimizer(
    optimizer_name="GEPA",
    params={
        "metric": custom_domain_feedback,  # Pass function directly
        "auto": "light"
    }
)
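To make the shape of a feedback metric concrete, here is a self-contained sketch that runs without DSPy installed. `FeedbackResult` is a stand-in for the `Prediction(score=..., feedback=...)` object used above, and the `answer` field on both objects is an assumption made for illustration.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class FeedbackResult:
    """Stand-in for a Prediction carrying a score and textual feedback."""
    score: float
    feedback: str


def exact_match_with_feedback(example, pred, trace=None):
    """Score 1.0 on a case-insensitive exact match, else 0.0 with a hint."""
    expected = str(example.answer).strip().lower()
    actual = str(pred.answer).strip().lower()
    if expected == actual:
        return FeedbackResult(1.0, "Answer matches the reference.")
    return FeedbackResult(
        0.0, f"Expected {example.answer!r} but got {pred.answer!r}."
    )


example = SimpleNamespace(question="2 + 2?", answer="4")
result = exact_match_with_feedback(example, SimpleNamespace(answer="5"))
print(result.score, result.feedback)
```

The feedback string is what a reflective optimizer can draw on, so it is worth saying what went wrong rather than only that the answer was wrong.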
Error Handling
Common Exceptions
OptimizerNotSupportedError
Raised when requesting an unsupported optimizer.
try:
    optimizer = DSPyOptimizerFactory.create_optimizer("InvalidOptimizer")
except OptimizerNotSupportedError as e:
    print(f"Optimizer not supported: {e}")
InvalidParameterError
Raised when optimizer parameters are invalid.
try:
    optimizer = DSPyOptimizerFactory.create_optimizer(
        "GEPA",
        params={"invalid_param": "value"}
    )
except InvalidParameterError as e:
    print(f"Invalid parameter: {e}")
Best Practices for Error Handling
import logging
logger = logging.getLogger(__name__)
def safe_optimize(optimizer_config, pipeline, trainset):
    """Optimize with fallbacks for unsupported optimizers and invalid parameters."""
    try:
        # optimizer_config uses the {"name": ..., "params": ...} layout,
        # so map its keys onto create_optimizer's argument names explicitly
        optimizer = DSPyOptimizerFactory.create_optimizer(
            optimizer_name=optimizer_config["name"],
            params=optimizer_config.get("params")
        )
        return optimizer.compile(student=pipeline, trainset=trainset)
    except OptimizerNotSupportedError:
        # Fallback to default optimizer
        fallback = DSPyOptimizerFactory.create_optimizer("BootstrapFewShot")
        return fallback.compile(student=pipeline, trainset=trainset)
    except InvalidParameterError as e:
        # Log error and use default parameters
        logger.warning(f"Invalid parameters: {e}. Using defaults.")
        optimizer = DSPyOptimizerFactory.create_optimizer(
            optimizer_config["name"]
        )
        return optimizer.compile(student=pipeline, trainset=trainset)
Performance Considerations
Memory Usage Guidelines
| Optimizer | Memory Usage | Recommended RAM | Notes |
|---|---|---|---|
| GEPA | High (2 models) | 16GB+ | Uses reflection LM |
| SIMBA | Medium | 8GB+ | Single model optimization |
| MIPROv2 | Medium | 8GB+ | Instruction refinement |
| BootstrapFewShot | Low | 4GB+ | Lightweight bootstrapping |
| BetterTogether | Medium | 8GB+ | Ensemble approach |
Optimization Time Estimates
| Optimizer | Light Config | Medium Config | Heavy Config |
|---|---|---|---|
| GEPA | 3-5 minutes | 8-12 minutes | 15-30 minutes |
| SIMBA | 1-2 minutes | 3-5 minutes | 5-10 minutes |
| MIPROv2 | 2-4 minutes | 5-8 minutes | 10-15 minutes |
| BootstrapFewShot | 30-60 seconds | 1-2 minutes | 2-5 minutes |
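These estimates are relevant when choosing a --timeout for super agent optimize, whose documented default is 300 seconds: several rows in the table exceed it. The sketch below converts the upper estimate into a padded timeout; `suggest_timeout_seconds` and the 1.5x safety margin are assumptions for illustration, not SuperOptiX behavior.

```python
# Upper time estimates in minutes, copied from the table above.
MAX_MINUTES = {
    "GEPA": {"light": 5, "medium": 12, "heavy": 30},
    "SIMBA": {"light": 2, "medium": 5, "heavy": 10},
    "MIPROv2": {"light": 4, "medium": 8, "heavy": 15},
    "BootstrapFewShot": {"light": 1, "medium": 2, "heavy": 5},
}


def suggest_timeout_seconds(optimizer, config, margin=1.5):
    """Upper estimate in seconds, padded by an assumed safety margin."""
    return int(MAX_MINUTES[optimizer][config] * 60 * margin)


print(suggest_timeout_seconds("GEPA", "medium"))  # prints 1080
```

The result could then be passed on the command line, for example as --timeout 1080.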
Scaling Recommendations
# For large training sets (>50 examples)
if len(trainset) > 50:
    optimizer_name = "BootstrapFewShot"  # More efficient for large data
else:
    optimizer_name = "GEPA"  # Better for small, high-quality data
# For resource-constrained environments
if available_memory_gb < 8:
    optimizer_config["auto"] = "minimal"
elif available_memory_gb < 16:
    optimizer_config["auto"] = "light"
else:
    optimizer_config["auto"] = "medium"
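The two rules above can be folded into one function so that they are easy to test. This is a sketch that uses the same thresholds; `recommend_settings` is a hypothetical name, and the returned budget only applies to optimizers that accept an "auto" parameter.

```python
def recommend_settings(trainset_size, available_memory_gb):
    """Return (optimizer_name, budget) using the thresholds shown above."""
    # More than 50 examples: prefer the lighter bootstrapping optimizer.
    optimizer_name = "BootstrapFewShot" if trainset_size > 50 else "GEPA"

    # Scale the budget with available memory.
    if available_memory_gb < 8:
        budget = "minimal"
    elif available_memory_gb < 16:
        budget = "light"
    else:
        budget = "medium"
    return optimizer_name, budget


print(recommend_settings(trainset_size=10, available_memory_gb=12))
```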
Integration with SuperSpec
DSPy optimizers can be declared directly in SuperSpec agent playbooks, under spec.optimization.optimizer:
# agent_playbook.yaml
spec:
  optimization:
    optimizer:
      name: GEPA
      params:
        metric: advanced_math_feedback
        auto: light
        reflection_lm: qwen3:8b
        reflection_minibatch_size: 3
The factory automatically reads these configurations during agent compilation:
# Automatic configuration from playbook
from superoptix.core.pipeline_utils import create_optimized_pipeline
# Reads optimization config from playbook
optimized_agent = create_optimized_pipeline(
playbook_path="agent_playbook.yaml",
trainset=training_data
)
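Conceptually, reading the playbook amounts to mapping the spec.optimization.optimizer block onto the arguments of create_optimizer. The sketch below shows that mapping on an already-parsed playbook (a plain dictionary); `optimizer_kwargs_from_playbook` is a hypothetical helper and does not describe how SuperOptiX implements this internally.

```python
def optimizer_kwargs_from_playbook(playbook):
    """Map spec.optimization.optimizer onto create_optimizer-style kwargs."""
    optimizer = (
        playbook.get("spec", {}).get("optimization", {}).get("optimizer")
    )
    if not optimizer or "name" not in optimizer:
        raise KeyError("playbook has no spec.optimization.optimizer.name")
    return {
        "optimizer_name": optimizer["name"],
        "params": dict(optimizer.get("params") or {}),
    }


# The playbook shown above, as it would look after YAML parsing.
playbook = {
    "spec": {
        "optimization": {
            "optimizer": {
                "name": "GEPA",
                "params": {
                    "metric": "advanced_math_feedback",
                    "auto": "light",
                    "reflection_lm": "qwen3:8b",
                    "reflection_minibatch_size": 3,
                },
            }
        }
    }
}
kwargs = optimizer_kwargs_from_playbook(playbook)
print(kwargs["optimizer_name"], kwargs["params"]["auto"])  # prints "GEPA light"
```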
CLI Reference
Agent Optimization Commands
super agent optimize
Optimizes an agent using the configured optimizer from its playbook.
Syntax:
super agent optimize <agent_name> [OPTIONS]
Parameters:
- agent_name: Name of the compiled agent to optimize
Options:
- --timeout SECONDS: Override default timeout (default: 300)
- --output-dir PATH: Directory to save optimized artifacts
- --force: Force re-optimization even if optimized version exists
- --verbose: Enable detailed optimization logs
Examples:
# Basic optimization with default settings
super agent optimize math_agent
# Optimization with extended timeout (10 minutes)
super agent optimize math_agent --timeout 600
# Verbose optimization with custom output directory
super agent optimize math_agent --verbose --output-dir ./optimized_agents
# Force re-optimization of previously optimized agent
super agent optimize math_agent --force
super agent compile
Compiles an agent from its playbook, preparing it for optimization.
Syntax:
super agent compile <agent_name> [OPTIONS]
Options:
- --playbook PATH: Path to playbook file (auto-detected if not specified)
- --output-dir PATH: Directory to save compiled pipeline
- --validate: Validate playbook before compilation
Example:
# Compile agent with validation
super agent compile math_agent --validate
super agent evaluate
Evaluates agent performance using configured metrics.
Syntax:
super agent evaluate <agent_name> [OPTIONS]
Options:
- --dataset PATH: Custom evaluation dataset
- --metrics LIST: Comma-separated list of metrics to use
- --output FORMAT: Output format (json, csv, table)
Example:
# Evaluate with custom metrics
super agent evaluate math_agent --metrics answer_exact_match,advanced_math_feedback
super model install
Installs required models for optimization.
Syntax:
super model install <model_name> [OPTIONS]
Options:
- --provider PROVIDER: Model provider (ollama, huggingface, etc.)
- --backend BACKEND: Compute backend (cpu, gpu, mps)
Examples:
# Install models for GEPA optimization
super model install llama3.1:8b --provider ollama
super model install qwen3:8b --provider ollama
# Install lightweight model for resource-constrained systems
super model install llama3.2:1b --provider ollama
Configuration Commands
super optimizer list
Lists available optimizers and their parameters.
Syntax:
super optimizer list [OPTIMIZER_NAME]
Example:
# List all optimizers
super optimizer list
# Get details for specific optimizer
super optimizer list GEPA
Sample Output:
Available DSPy Optimizers:
==========================
GEPA - Graph Enhanced Prompting Algorithm
Description: Reflective prompt evolution with advanced feedback
Parameters:
- metric (required): Evaluation metric function
- auto: Budget control (minimal, light, medium, heavy)
- reflection_lm (required): Reflection model name
- reflection_minibatch_size: Examples per reflection batch
- max_full_evals: Maximum full evaluations
- skip_perfect_score: Skip if perfect score achieved
SIMBA - Stochastic Introspective Mini-Batch Ascent
Description: Advanced optimization with mini-batch processing
Parameters:
- metric (required): Evaluation metric function
- bsize: Batch size for optimization
- num_candidates: Number of candidates to generate
- max_steps: Maximum optimization steps
super agent status
Shows optimization status and performance metrics.
Syntax:
super agent status <agent_name>
Example Output:
Agent: math_agent
Status: Optimized
Optimizer: GEPA
Last Optimization: 2024-01-15 14:30:22
Performance Metrics:
- Baseline Accuracy: 60.0%
- Optimized Accuracy: 95.0%
- Improvement: +35.0%
- Optimization Time: 4m 32s
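The Improvement figure in this sample is the absolute difference between the two accuracies (95.0 - 60.0 = 35.0 percentage points), which differs from the relative gain over the baseline. A short sketch of both calculations, with `improvement` as an illustrative helper name:

```python
def improvement(baseline_pct, optimized_pct):
    """Return (percentage-point gain, relative gain in percent)."""
    points = optimized_pct - baseline_pct
    relative = points / baseline_pct * 100
    return points, relative


points, relative = improvement(60.0, 95.0)
print(f"+{points:.1f} points, +{relative:.1f}% relative")
# prints "+35.0 points, +58.3% relative"
```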
Troubleshooting Commands
super logs optimizer
Shows optimization logs for debugging.
Syntax:
super logs optimizer <agent_name> [OPTIONS]
Options:
- --lines N: Number of log lines to show (default: 50)
- --level LEVEL: Log level filter (debug, info, warning, error)
- --follow: Follow log output in real-time
Examples:
# Show recent optimization logs
super logs optimizer math_agent --lines 100
# Follow optimization in real-time
super logs optimizer math_agent --follow
# Show only error logs
super logs optimizer math_agent --level error
API Reference Extensions
Environment Variables
Configure DSPy optimizers using environment variables:
# Default optimizer settings
export SUPEROPTIX_DEFAULT_OPTIMIZER="GEPA"
export SUPEROPTIX_OPTIMIZER_TIMEOUT="300"
# GEPA-specific settings
export GEPA_AUTO_BUDGET="light"
export GEPA_REFLECTION_LM="qwen3:8b"
export GEPA_MINIBATCH_SIZE="3"
# Model configuration
export OLLAMA_HOST="localhost:11434"
export SUPEROPTIX_MODEL_CACHE_DIR="$HOME/.superoptix/models"  # use $HOME: the shell does not expand ~ inside double quotes
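For scripts that need the same settings, the variables can be read with the standard library. This is a sketch only: `load_optimizer_env` is a hypothetical helper, and the fallback values are assumptions taken from the example exports above rather than documented SuperOptiX defaults.

```python
import os


def load_optimizer_env(env=None):
    """Collect optimizer settings from environment variables.

    Fallback values mirror the example exports and are assumptions.
    """
    env = os.environ if env is None else env
    return {
        "optimizer": env.get("SUPEROPTIX_DEFAULT_OPTIMIZER", "GEPA"),
        "timeout": int(env.get("SUPEROPTIX_OPTIMIZER_TIMEOUT", "300")),
        "auto": env.get("GEPA_AUTO_BUDGET", "light"),
        "reflection_lm": env.get("GEPA_REFLECTION_LM"),
        "reflection_minibatch_size": int(env.get("GEPA_MINIBATCH_SIZE", "3")),
        # expanduser resolves a leading "~" that the shell left unexpanded
        "model_cache_dir": os.path.expanduser(
            env.get("SUPEROPTIX_MODEL_CACHE_DIR", "~/.superoptix/models")
        ),
    }


settings = load_optimizer_env({"SUPEROPTIX_OPTIMIZER_TIMEOUT": "600"})
print(settings["optimizer"], settings["timeout"])  # prints "GEPA 600"
```

Passing a dictionary instead of reading os.environ directly keeps the function easy to test.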
Programmatic Configuration
Agent Factory with Optimizer
from superoptix.core.agent_factory import AgentFactory
from superoptix.core.optimizer_factory import DSPyOptimizerFactory
# Create agent with pre-configured optimizer
agent_config = {
"name": "math_solver",
"optimizer": {
"name": "GEPA",
"params": {
"metric": "advanced_math_feedback",
"auto": "light",
"reflection_lm": "qwen3:8b"
}
}
}
agent = AgentFactory.create_from_config(agent_config)
optimized_agent = agent.optimize(training_data)
Optimizer Comparison
from superoptix.core.optimizer_factory import DSPyOptimizerFactory
from superoptix.evaluation.benchmarks import OptimizationBenchmark
# Compare multiple optimizers
optimizers = {
"gepa": DSPyOptimizerFactory.create_optimizer("GEPA", {"auto": "light"}),
"simba": DSPyOptimizerFactory.create_optimizer("SIMBA", {"bsize": 8}),
"bootstrap": DSPyOptimizerFactory.create_optimizer("BootstrapFewShot")
}
benchmark = OptimizationBenchmark(
pipeline=base_pipeline,
trainset=training_data,
testset=test_data,
optimizers=optimizers
)
results = benchmark.run()
print(benchmark.comparison_report(results))
Custom Metric Registration
from superoptix.core.optimizer_factory import DSPyOptimizerFactory
# Register custom metric globally
def custom_domain_metric(example, pred, trace=None):
    # Implementation here
    pass
DSPyOptimizerFactory.register_metric("custom_domain_feedback", custom_domain_metric)
# Use in optimizer creation
optimizer = DSPyOptimizerFactory.create_optimizer(
"GEPA",
params={"metric": "custom_domain_feedback"}
)
Batch Operations
Bulk Agent Optimization
from superoptix.core.batch_optimizer import BatchOptimizer
# Optimize multiple agents
agents = ["math_agent", "legal_agent", "security_agent"]
optimizer_configs = {
"math_agent": {"name": "GEPA", "params": {"auto": "medium"}},
"legal_agent": {"name": "GEPA", "params": {"auto": "heavy"}},
"security_agent": {"name": "SIMBA", "params": {"bsize": 4}}
}
batch_optimizer = BatchOptimizer(
agents=agents,
optimizer_configs=optimizer_configs,
parallel_workers=2
)
results = batch_optimizer.optimize_all()
Performance Monitoring
from superoptix.core.optimizer_factory import DSPyOptimizerFactory
from superoptix.monitoring.optimizer_monitor import OptimizerMonitor
# Monitor optimization performance
monitor = OptimizerMonitor()
with monitor.track_optimization("math_agent") as tracker:
    optimizer = DSPyOptimizerFactory.create_optimizer("GEPA")
    # Use keyword arguments, matching the other compile() calls in this reference
    optimized_pipeline = optimizer.compile(student=base_pipeline, trainset=trainset)
    # Get real-time metrics
    metrics = tracker.get_current_metrics()
    print(f"Progress: {metrics['progress']:.1%}")
    print(f"Current Score: {metrics['score']:.3f}")