Agent Optimization: Beyond Prompts
What is Agent Optimization?
Agent optimization is the process of improving an AI agent's performance across all layers of the agentic pipeline, not just prompts. While traditional approaches focus solely on prompt engineering, SuperOptiX optimizes the entire stack: prompts, RAG retrieval, tool usage, memory selection, protocol handling, and dataset-driven learning.
Key Insight: A production-ready agent requires optimization at every layer. Optimizing only prompts while leaving other layers unoptimized is like tuning a car's engine while ignoring the transmission, brakes, and steering.
The Problem with Prompt-Only Optimization
Traditional AI optimization focuses exclusively on prompt engineering:
Traditional Approach
What Gets Optimized: prompts only.
What's Ignored: RAG retrieval, tool usage, memory selection, protocol handling, and datasets.
Result: Suboptimal performance, with manual tuning required for each layer.
Example Problem: An agent might have a perfect prompt for code review, but if it doesn't know WHEN to search security documentation or WHICH analysis tools to use, it will still produce mediocre results.
The SuperOptiX Full-Stack Approach
SuperOptiX optimizes 6 distinct layers of the agentic pipeline:
| Layer | Traditional Approach | SuperOptiX Approach | Impact |
|---|---|---|---|
| Prompts | Manual tuning | GEPA learns optimal instructions | High |
| RAG | Fixed retrieval | GEPA learns retrieval strategy | High |
| Tools | Hardcoded selection | GEPA learns tool selection | Medium |
| Memory | All memories included | GEPA optimizes context selection | High |
| Protocols (MCP) | Static patterns | GEPA adapts protocol usage | Medium |
| Datasets | Small manual examples | GEPA trains on 100s-1000s of examples | Very High |
How GEPA Optimizes Each Layer
GEPA (Genetic-Pareto) doesn't just optimize prompts. It learns strategies for each layer through reflection and iteration:
The Optimization Process
```mermaid
graph TD
    A[Agent Playbook] --> B[Initial Evaluation]
    B --> C[GEPA Reflection]
    C --> D{Which Layer Needs Improvement?}
    D -->|Prompts| E[Optimize Instructions]
    D -->|RAG| F[Optimize Retrieval Strategy]
    D -->|Tools| G[Optimize Tool Selection]
    D -->|Memory| H[Optimize Context Selection]
    D -->|Protocols| I[Optimize Protocol Usage]
    D -->|All| J[Optimize Integration]
    E --> K[Re-Evaluate]
    F --> K
    G --> K
    H --> K
    I --> K
    J --> K
    K --> L{Improved?}
    L -->|Yes| M[Next Iteration]
    L -->|No| N[Try Different Strategy]
    M --> C
    N --> C
```
Key Feature: GEPA automatically identifies which layer needs improvement and applies targeted optimizations.
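The loop can be pictured as a toy Python routine. Everything here is illustrative, not the real SuperOptiX/GEPA API: the "agent" is just a dict of per-layer scores, reflection picks the weakest layer, and mutation nudges it.

```python
# Toy model of the evaluate/reflect/mutate loop (NOT the real GEPA API):
# an "agent" is a dict of per-layer scores in [0, 1].

def evaluate(agent):
    """Overall score: mean of the per-layer scores."""
    return sum(agent.values()) / len(agent)

def weakest_layer(agent):
    """Reflection step: find the layer dragging the score down."""
    return min(agent, key=agent.get)

def mutate(agent, layer, step=0.2):
    """Try an improved strategy for one layer (capped at 1.0)."""
    candidate = dict(agent)
    candidate[layer] = min(1.0, candidate[layer] + step)
    return candidate

def optimize(agent, iterations=5):
    """Evaluate, target the weakest layer, keep only improvements."""
    best = evaluate(agent)
    for _ in range(iterations):
        candidate = mutate(agent, weakest_layer(agent))
        score = evaluate(candidate)
        if score > best:  # the "Improved?" branch of the loop
            agent, best = candidate, score
    return agent, best

agent = {"prompts": 0.8, "rag": 0.3, "tools": 0.5, "memory": 0.4}
tuned, score = optimize(agent)
```

Each iteration mirrors the diagram: reflect (find the weakest layer), mutate it, re-evaluate, and keep the candidate only if it improved.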
Quick Results Preview
Here's what full-stack optimization looks like in practice:
Use Case: Code Review Agent
| Metric | Before Optimization | After Optimization | Improvement |
|---|---|---|---|
| Overall Accuracy | 37.5% | 87.5% | +50 points |
| Security Detection | 33% | 100% | +67 points |
| RAG Relevance | Random | Strategic | Contextual |
| Tool Usage | Wrong/none | Correct tools | 100% accurate |
| Memory Efficiency | All memories (overflow) | Optimized selection | 60% fewer tokens |
| Response Quality | Vague suggestions | Actionable solutions | Includes code examples |
Key Observation: The compound effect of optimizing all layers produces production-ready results.
The Six Optimization Layers
Click on any layer to learn how GEPA optimizes it:
Prompt Optimization
What Gets Optimized:
- Persona and role definition
- Task instructions
- Reasoning patterns
- Response formatting
GEPA Learns:
- How to structure clear instructions
- When to use chain-of-thought
- How to format comprehensive responses
Example: "Provide thorough, actionable code reviews" → "Analyze code for security (SQL injection, XSS), performance (O(n²) complexity), and maintainability (cyclomatic complexity > 4). Provide specific solutions with code examples."
RAG Optimization
What Gets Optimized:
- When to search the knowledge base
- Which documents to retrieve
- How to integrate context
- Relevance scoring
GEPA Learns:
- "Search security docs BEFORE analyzing SQL queries"
- "Retrieve performance patterns for loop analysis"
- "Check best practices for naming conventions"
Example: From random document retrieval → strategic, issue-specific retrieval with 85% relevance
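A minimal sketch of what a learned retrieval strategy can look like: route each detected issue type to a matching knowledge-base document instead of retrieving at random. The issue names and document paths below are hypothetical.

```python
# Hypothetical learned routing table: issue type -> relevant docs.
# GEPA would learn this mapping; the entries here are illustrative.
ISSUE_TO_DOCS = {
    "sql_query": ["security/sql-injection.md"],
    "nested_loop": ["performance/complexity-patterns.md"],
    "naming": ["style/best-practices.md"],
}

def retrieve(issues):
    """Return only the documents relevant to the issues found in the code."""
    docs = []
    for issue in issues:
        docs.extend(ISSUE_TO_DOCS.get(issue, []))
    return docs
```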
Tool Optimization
What Gets Optimized:
- Tool selection for scenarios
- Tool invocation order
- Output combination
- Multi-tool orchestration
GEPA Learns:
- "Use complexity_calculator for nested conditions"
- "Run security_scanner on string concatenation"
- "Combine findings from multiple tools"
Example: From no tool usage → correct tool selection 100% of the time
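Learned tool selection can be pictured as a dispatch from code findings to tools. This sketch uses the tool names mentioned above (complexity_calculator, security_scanner); the finding keys are assumptions, not a real SuperOptiX interface.

```python
# Illustrative learned dispatch: code findings -> analysis tools.
def select_tools(findings):
    """Pick analysis tools based on what the reviewed code contains."""
    tools = []
    if findings.get("nested_conditions"):
        tools.append("complexity_calculator")   # complexity check for deep nesting
    if findings.get("string_concatenation_in_sql"):
        tools.append("security_scanner")        # possible SQL injection
    return tools
```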
Memory Optimization
What Gets Optimized:
- Context selection
- Relevance scoring
- Token budgeting
- Summarization strategies
GEPA Learns:
- "Include similar past security findings"
- "Prioritize recent review patterns"
- "Summarize older memories to save tokens"
Example: From all memories (context overflow) → optimized selection (60% fewer tokens, better relevance)
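Optimized context selection can be approximated by scoring memories for relevance and greedily packing the best ones into a token budget. The whitespace token estimate is a deliberate simplification, not SuperOptiX internals.

```python
# Sketch of budgeted memory selection: highest-relevance first,
# stop when the token budget is exhausted.
def select_memories(memories, budget_tokens):
    """memories: list of (text, relevance) pairs; returns texts that fit."""
    chosen, used = [], 0
    for text, relevance in sorted(memories, key=lambda m: -m[1]):
        cost = len(text.split())  # crude stand-in for a real token count
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return chosen
```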
Protocol Optimization (MCP)
What Gets Optimized:
- MCP tool selection
- Protocol invocation patterns
- Result processing
- Error handling
GEPA Learns:
- When to use MCP tools vs. built-in tools
- How to structure protocol calls
- How to handle tool errors gracefully
Example: From generic protocol usage → optimized MCP-specific patterns
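One graceful-degradation pattern an optimized agent might settle on: try the MCP tool first and fall back to a built-in equivalent on error. The function signatures here are hypothetical stand-ins, not a real MCP client API.

```python
# Illustrative error-handling pattern for protocol (MCP) tool calls.
def call_with_fallback(name, payload, mcp_call, builtin_tools):
    """Prefer the MCP tool; degrade gracefully to a built-in equivalent."""
    try:
        return mcp_call(name, payload)
    except Exception:
        fallback = builtin_tools.get(name)
        if fallback is None:
            raise                      # nothing to fall back to
        return fallback(payload)       # handled gracefully
```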
Dataset-Driven Optimization
What Gets Optimized:
- Pattern recognition from 100s-1000s of examples
- Edge case handling
- Real-world solution phrasing
- Domain-specific knowledge
GEPA Learns:
- Common security vulnerability patterns
- Typical code smell indicators
- Effective recommendation phrasing from real examples
Example: From 5 manual scenarios → training on 100 real GitHub code reviews
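On the ingestion side, dataset-driven training amounts to loading labelled examples from a file and capping them with a limit, as the playbook's datasets block does. This is a generic sketch; the column names (code, review) are assumptions.

```python
import csv
import io

# Sketch of CSV dataset loading with a row limit (column names assumed).
def load_examples(csv_text, limit=100):
    """Parse (code, review) training pairs, honoring `limit`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for _, row in zip(range(limit), reader)]
```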
Compound Effect: Why Full-Stack Matters
Each layer optimization compounds with others:
Example Scenario: Code with SQL injection
- Prompts (optimized): Agent knows to check for security issues
- RAG (optimized): Retrieves SQL injection documentation
- Tools (optimized): Runs security_scanner to confirm
- Memory (optimized): Recalls similar past findings
- Datasets (optimized): Uses phrasing from real GitHub reviews
Result: Comprehensive, actionable review with specific solutions
vs. Prompt-Only Optimization:
- Prompts (optimized): Agent knows to check security
- RAG (not optimized): Retrieves random docs
- Tools (not optimized): Doesn't use security_scanner
- Memory (not optimized): Includes irrelevant memories
- Datasets (not optimized): Generic responses
Result: Vague "check your SQL" response
Getting Started
1. Understand Each Layer
Read through each layer's guide to understand what GEPA optimizes:
- Prompt Optimization
- RAG Optimization
- Tool Optimization
- Memory Optimization
- Protocol Optimization
- Dataset-Driven Optimization
2. See the Full-Stack Example
Check out the comprehensive Full-Stack Example showing all layers working together.
3. Apply to Your Agents
Enable optimization layers in your playbook:
```yaml
spec:
  # Prompts (always optimized)
  persona:
    role: Your agent role
    goal: Your agent goal

  # RAG optimization
  rag:
    enabled: true
    knowledge_base:
      - ./knowledge/**/*.md

  # Tool optimization
  tools:
    enabled: true
    categories:
      - your_category

  # Memory optimization
  memory:
    enabled: true
    enable_context_optimization: true

  # Dataset optimization
  datasets:
    - name: training_data
      source: ./data/examples.csv
      limit: 100

  # GEPA optimizes ALL layers
  optimization:
    optimizer:
      name: GEPA
      params:
        auto: medium
```
4. Run Optimization
```bash
super agent compile your_agent
super agent evaluate your_agent
super agent optimize your_agent --auto medium
```
GEPA will automatically optimize all enabled layers!
Key Takeaways
- Full-Stack > Prompts Only: Optimizing all layers produces compound improvements
- GEPA Learns Strategies: Not just prompt text, but WHEN, WHICH, and HOW for each layer
- Production-Ready Results: 37.5% → 87.5% accuracy through comprehensive optimization
- Automatic: GEPA handles all layers; you just enable them in the playbook
- Framework-Agnostic: Works across DSPy, OpenAI, CrewAI, and all supported frameworks
Next Steps
- Deep Dive: Read individual layer guides
- See Example: Check out the Full-Stack Example
- Try It: Enable layers in your agent and run optimization
- Learn More: GEPA Optimizer Guide
Related Guides: