
Agent Optimization: Beyond Prompts

What is Agent Optimization?

Agent optimization is the process of improving an AI agent's performance across every layer of the agentic pipeline, not just prompts. While traditional approaches focus solely on prompt engineering, SuperOptiX optimizes the entire stack: prompts, retrieval (RAG), tool usage, memory selection, protocol handling, and dataset-driven learning.

Key Insight: A production-ready agent requires optimization at every layer. Optimizing only prompts while leaving other layers unoptimized is like tuning a car's engine while ignoring the transmission, brakes, and steering.


The Problem with Prompt-Only Optimization

Traditional AI optimization focuses exclusively on prompt engineering:

āŒ Traditional Approach

What Gets Optimized:

  • Prompt instructions only

What's Ignored:

  • When to retrieve knowledge (RAG)
  • Which tools to use
  • How to select relevant memories
  • How to combine information

Result: Suboptimal performance, manual tuning required for each layer

Example Problem: An agent might have a perfect prompt for code review, but if it doesn't know WHEN to search security documentation or WHICH analysis tools to use, it will still produce mediocre results.


The SuperOptiX Full-Stack Approach

SuperOptiX optimizes 6 distinct layers of the agentic pipeline:

| Layer | Traditional Approach | SuperOptiX Approach | Impact |
|---|---|---|---|
| 💬 Prompts | Manual tuning | GEPA learns optimal instructions | High |
| 🔍 RAG | Fixed retrieval | GEPA learns retrieval strategy | High |
| 🛠️ Tools | Hardcoded selection | GEPA learns tool selection | Medium |
| 🧠 Memory | All memories included | GEPA optimizes context selection | High |
| 🔌 Protocols (MCP) | Static patterns | GEPA adapts protocol usage | Medium |
| 📊 Datasets | Small manual examples | GEPA trains on 100s-1000s of examples | Very High |

How GEPA Optimizes Each Layer

GEPA (Genetic-Pareto) doesn't just optimize prompts. It learns strategies for each layer through reflection and iteration:

The Optimization Process

graph TD
    A[Agent Playbook] --> B[Initial Evaluation]
    B --> C[GEPA Reflection]
    C --> D{Which Layer Needs Improvement?}
    D -->|Prompts| E[Optimize Instructions]
    D -->|RAG| F[Optimize Retrieval Strategy]
    D -->|Tools| G[Optimize Tool Selection]
    D -->|Memory| H[Optimize Context Selection]
    D -->|Protocols| I[Optimize Protocol Usage]
    D -->|All| J[Optimize Integration]
    E --> K[Re-Evaluate]
    F --> K
    G --> K
    H --> K
    I --> K
    J --> K
    K --> L{Improved?}
    L -->|Yes| M[Next Iteration]
    L -->|No| N[Try Different Strategy]
    M --> C
    N --> C

Key Feature: GEPA automatically identifies which layer needs improvement and applies targeted optimizations.
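
The loop in the diagram can be sketched in a few lines of Python. This is an illustration only, not GEPA's actual implementation: the `optimize_layers` function, the `evaluate`, `reflect`, and `propose` callables, and the idea of a configuration as one strategy string per layer are all assumptions made for this example. The sketch also swaps in a simple greedy acceptance rule (keep a candidate only if it scores higher) in place of whatever candidate selection GEPA really uses.

```python
from typing import Callable, Dict

Config = Dict[str, str]  # hypothetical: one strategy string per layer


def optimize_layers(
    config: Config,
    evaluate: Callable[[Config], float],
    reflect: Callable[[Config], str],
    propose: Callable[[Config, str], Config],
    iterations: int = 10,
) -> Config:
    """Greedy reflect-and-retry loop mirroring the diagram (illustrative only)."""
    best, best_score = dict(config), evaluate(config)  # initial evaluation
    for _ in range(iterations):
        layer = reflect(best)             # which layer needs improvement?
        candidate = propose(best, layer)  # targeted change to that layer
        score = evaluate(candidate)       # re-evaluate
        if score > best_score:            # improved: keep it and iterate
            best, best_score = candidate, score
        # not improved: keep the previous best and reflect again
    return best


# Toy stand-ins so the loop can run: the score is the share of "tuned" layers.
def toy_evaluate(cfg: Config) -> float:
    return sum(v == "tuned" for v in cfg.values()) / len(cfg)


def toy_reflect(cfg: Config) -> str:
    return next((k for k, v in cfg.items() if v != "tuned"), "prompts")


def toy_propose(cfg: Config, layer: str) -> Config:
    return {**cfg, layer: "tuned"}


start = {"prompts": "default", "rag": "default", "tools": "default"}
result = optimize_layers(start, toy_evaluate, toy_reflect, toy_propose, iterations=5)
print(result)
```

In a real run, `evaluate` would execute the agent against its evaluation scenarios and `reflect` would be driven by a language model reading the failures; the toy functions here exist only to make the control flow visible.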


Quick Results Preview

Here's what full-stack optimization looks like in practice:

Use Case: Code Review Agent

| Metric | Before Optimization | After Optimization | Improvement |
|---|---|---|---|
| Overall Accuracy | 37.5% | 87.5% | +50 percentage points |
| Security Detection | 33% | 100% | +67 percentage points |
| RAG Relevance | Random | Strategic | Contextual |
| Tool Usage | Wrong/None | Correct Tools | 100% Accurate |
| Memory Efficiency | All memories (overflow) | Optimized selection | -60% tokens |
| Response Quality | Vague suggestions | Actionable solutions | With code examples |

Key Observation: The compound effect of optimizing all layers produces production-ready results.
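
The accuracy and security gains in the table are differences between the before and after rates, so they are best read as percentage points rather than relative changes. A small sketch of the arithmetic for the accuracy row (the `gain` helper is a hypothetical name used only here):

```python
from typing import Tuple


def gain(before_pct: float, after_pct: float) -> Tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    points = after_pct - before_pct
    relative = points / before_pct * 100
    return points, relative


points, relative = gain(37.5, 87.5)
print(f"+{points:.0f} points, +{relative:.0f}% relative")
# prints "+50 points, +133% relative"
```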


The Six Optimization Layers

The sections below summarize how GEPA optimizes each layer:

💬 Prompt Optimization

What Gets Optimized:

  • Persona and role definition
  • Task instructions
  • Reasoning patterns
  • Response formatting

GEPA Learns:

  • How to structure clear instructions
  • When to use chain-of-thought
  • How to format comprehensive responses

Example: "Provide thorough, actionable code reviews" → "Analyze code for security (SQL injection, XSS), performance (O(n²) complexity), and maintainability (cyclomatic complexity > 4). Provide specific solutions with code examples."


🔍 RAG Optimization

What Gets Optimized:

  • When to search knowledge base
  • Which documents to retrieve
  • How to integrate context
  • Relevance scoring

GEPA Learns:

  • "Search security docs BEFORE analyzing SQL queries"
  • "Retrieve performance patterns for loop analysis"
  • "Check best practices for naming conventions"

Example: From random document retrieval → Strategic, issue-specific retrieval with 85% relevance
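
One way to picture a learned retrieval strategy is as a set of trigger rules that decide when to search and which topic to search for. The sketch below is an assumption-laden illustration, not SuperOptiX's retrieval API: the `RETRIEVAL_RULES` table, the topic names, the regular expressions, and the `plan_retrieval` function are all hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical learned rules: (pattern in the code under review, knowledge-base topic).
RETRIEVAL_RULES: List[Tuple[str, str]] = [
    (r"\b(SELECT|INSERT|UPDATE|DELETE)\b", "security/sql-injection"),
    (r"\b(for|while)\b", "performance/loops"),
    (r"\bdef\s+[a-z]\w*[A-Z]", "best-practices/naming"),  # camelCase function name
]


def plan_retrieval(code: str) -> List[str]:
    """Return the topics to search before analysis; an empty list means skip retrieval."""
    return [topic for pattern, topic in RETRIEVAL_RULES if re.search(pattern, code)]


snippet = 'query = "SELECT * FROM users WHERE id = " + user_id'
print(plan_retrieval(snippet))
# prints ['security/sql-injection']
```

Returning an empty list for code that matches no rule captures the "when to retrieve" part of the strategy: retrieval is skipped rather than filled with unrelated documents.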


🛠️ Tool Optimization

What Gets Optimized:

  • Tool selection for scenarios
  • Tool invocation order
  • Output combination
  • Multi-tool orchestration

GEPA Learns:

  • "Use complexity_calculator for nested conditions"
  • "Run security_scanner on string concatenation"
  • "Combine findings from multiple tools"

Example: From no tool usage → Correct tool selection 100% of the time
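
The two learned rules quoted above can be sketched as predicate-to-tool mappings. Only the tool names `complexity_calculator` and `security_scanner` come from this page; the detection heuristics, the `SELECTION_RULES` table, and `select_tools` are assumptions for illustration, and the real tools are not invoked here.

```python
import re
from typing import Callable, List, Tuple


def has_nested_condition(code: str) -> bool:
    """True if an `if` is indented deeper than the `if` before it."""
    indents = [len(m.group(1)) for m in re.finditer(r"^([ \t]*)if\b", code, re.M)]
    return any(later > earlier for earlier, later in zip(indents, indents[1:]))


def has_string_concat(code: str) -> bool:
    """True if a quoted string literal is joined with `+`."""
    return re.search(r"""["']\s*\+|\+\s*["']""", code) is not None


# Hypothetical learned rules: (predicate over the code, tool to run).
SELECTION_RULES: List[Tuple[Callable[[str], bool], str]] = [
    (has_nested_condition, "complexity_calculator"),
    (has_string_concat, "security_scanner"),
]


def select_tools(code: str) -> List[str]:
    """Return the names of the tools whose trigger matches, in rule order."""
    return [tool for check, tool in SELECTION_RULES if check(code)]


print(select_tools('q = "SELECT * FROM t WHERE id=" + uid'))
# prints ['security_scanner']
```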


🧠 Memory Optimization

What Gets Optimized:

  • Context selection
  • Relevance scoring
  • Token budgeting
  • Summarization strategies

GEPA Learns:

  • "Include similar past security findings"
  • "Prioritize recent review patterns"
  • "Summarize older memories to save tokens"

Example: From all memories (context overflow) → Optimized selection (60% fewer tokens, better relevance)
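
Context selection under a token budget can be sketched as a greedy ranking problem. Everything in this example is an assumption: the `Memory` fields, the word-count token estimate, and the scoring formula (relevance discounted by age) are stand-ins, and the summarization step mentioned above is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    relevance: float  # hypothetical 0..1 similarity to the current task
    age_days: int


def estimate_tokens(text: str) -> int:
    """Rough estimate: one token per whitespace-separated word."""
    return len(text.split())


def select_memories(memories: List[Memory], budget: int) -> List[Memory]:
    """Greedily keep the highest-scoring memories that still fit in the budget."""
    # Score: relevance discounted by age (halved at 30 days old).
    ranked = sorted(
        memories, key=lambda m: m.relevance / (1 + m.age_days / 30), reverse=True
    )
    chosen, used = [], 0
    for memory in ranked:
        cost = estimate_tokens(memory.text)
        if used + cost <= budget:
            chosen.append(memory)
            used += cost
    return chosen


memories = [
    Memory("SQL injection found in login query builder", 0.9, 60),
    Memory("Team prefers snake_case names", 0.2, 1),
    Memory("Recent review flagged string concatenation in SQL", 0.8, 3),
]
selected = select_memories(memories, budget=12)
print([m.text for m in selected])
```

With a budget of 12 estimated tokens, the recent SQL finding is kept first, the older finding no longer fits, and the short naming note fills the remaining space.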


🔌 Protocol Optimization (MCP)

What Gets Optimized:

  • MCP tool selection
  • Protocol invocation patterns
  • Result processing
  • Error handling

GEPA Learns:

  • When to use MCP tools vs built-in tools
  • How to handle tool errors gracefully
  • How to structure protocol calls

Example: From generic protocol usage → Optimized MCP-specific patterns
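
Graceful error handling and the choice between MCP and built-in tools can be sketched as a retry-then-fallback wrapper. This example deliberately uses plain Python callables instead of any MCP client library; `call_with_fallback`, its parameters, and the result dictionary shape are hypothetical.

```python
from typing import Any, Callable, Dict


def call_with_fallback(
    mcp_tool: Callable[[str], Any],
    builtin_tool: Callable[[str], Any],
    payload: str,
    retries: int = 1,
) -> Dict[str, Any]:
    """Try the MCP tool first; after repeated errors, fall back to the built-in tool."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return {"source": "mcp", "result": mcp_tool(payload)}
        except Exception as exc:  # sketch only: real code should catch narrower errors
            last_error = exc
    return {
        "source": "builtin",
        "result": builtin_tool(payload),
        "mcp_error": str(last_error),
    }


def failing_mcp_tool(payload: str) -> str:
    raise ConnectionError("server unavailable")


outcome = call_with_fallback(failing_mcp_tool, len, "abc")
print(outcome)
# prints {'source': 'builtin', 'result': 3, 'mcp_error': 'server unavailable'}
```

Recording the MCP error alongside the fallback result keeps the failure visible to later evaluation instead of silently hiding it.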


📊 Dataset-Driven Optimization

What Gets Optimized:

  • Pattern recognition from 100s-1000s of examples
  • Edge case handling
  • Real-world solution phrasing
  • Domain-specific knowledge

GEPA Learns:

  • Common security vulnerability patterns
  • Typical code smell indicators
  • Effective recommendation phrasing from real examples

Example: From 5 manual scenarios → Training on 100 real GitHub code reviews
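
Preparing a larger example set usually means reading rows from a file, capping them at a limit, and holding some back for validation. The sketch below uses only the standard library and is not how SuperOptiX loads datasets; the `load_examples` function, the `code` and `review` column names, and the 80/20 split are assumptions.

```python
import csv
import io
import random
from typing import Dict, List, TextIO, Tuple

Row = Dict[str, str]


def load_examples(
    csv_file: TextIO, limit: int, val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Read up to `limit` rows, shuffle them, and split into (train, validation)."""
    rows = list(csv.DictReader(csv_file))[:limit]
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_val = int(len(rows) * val_fraction)
    return rows[n_val:], rows[:n_val]


# In-memory stand-in for a file such as ./data/examples.csv (hypothetical columns).
sample = io.StringIO(
    "code,review\n" + "".join(f"snippet_{i},comment_{i}\n" for i in range(10))
)
train, val = load_examples(sample, limit=5)
print(len(train), len(val))
# prints "4 1"
```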


Compound Effect: Why Full-Stack Matters

Each layer's optimization compounds with the others:

Example Scenario: Code with SQL injection

  1. Prompts (optimized): Agent knows to check for security issues
  2. RAG (optimized): Retrieves SQL injection documentation
  3. Tools (optimized): Runs security_scanner to confirm
  4. Memory (optimized): Recalls similar past findings
  5. Datasets (optimized): Uses phrasing from real GitHub reviews

Result: Comprehensive, actionable review with specific solutions

vs. Prompt-Only Optimization:

  • Prompts (optimized): Agent knows to check security
  • RAG (not optimized): Retrieves random docs
  • Tools (not optimized): Doesn't use security_scanner
  • Memory (not optimized): Includes irrelevant memories
  • Datasets (not optimized): Generic responses

Result: Vague "check your SQL" response


Getting Started

1. Understand Each Layer

Read through each layer's guide to understand what GEPA optimizes:

2. See the Full-Stack Example

Check out the comprehensive 🎯 Full-Stack Example showing all layers working together.

3. Apply to Your Agents

Enable optimization layers in your playbook:

spec:
  # Prompts (always optimized)
  persona:
    role: Your agent role
    goal: Your agent goal

  # RAG optimization
  rag:
    enabled: true
    knowledge_base:
      - ./knowledge/**/*.md

  # Tool optimization
  tools:
    enabled: true
    categories:
      - your_category

  # Memory optimization
  memory:
    enabled: true
    enable_context_optimization: true

  # Dataset optimization
  datasets:
    - name: training_data
      source: ./data/examples.csv
      limit: 100

  # GEPA optimizes ALL layers
  optimization:
    optimizer:
      name: GEPA
      params:
        auto: medium
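
Before running optimization, it can help to confirm which layers a playbook actually enables. The sketch below works on a plain dictionary that mirrors the `spec` above; it is a hypothetical pre-flight check, not part of SuperOptiX, and the `enabled_layers` function is an assumed name.

```python
from typing import Any, Dict, List


def enabled_layers(spec: Dict[str, Any]) -> List[str]:
    """List the layers a playbook spec turns on (prompts are always included)."""
    layers = ["prompts"]
    for key in ("rag", "tools", "memory"):
        if spec.get(key, {}).get("enabled"):
            layers.append(key)
    if spec.get("datasets"):  # any non-empty dataset list counts as enabled
        layers.append("datasets")
    return layers


# Dictionary mirroring the playbook spec shown above.
spec = {
    "persona": {"role": "Your agent role", "goal": "Your agent goal"},
    "rag": {"enabled": True, "knowledge_base": ["./knowledge/**/*.md"]},
    "tools": {"enabled": True, "categories": ["your_category"]},
    "memory": {"enabled": True, "enable_context_optimization": True},
    "datasets": [
        {"name": "training_data", "source": "./data/examples.csv", "limit": 100}
    ],
}
print(enabled_layers(spec))
# prints ['prompts', 'rag', 'tools', 'memory', 'datasets']
```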

4. Run Optimization

super agent compile your_agent
super agent evaluate your_agent
super agent optimize your_agent --auto medium

GEPA will automatically optimize all enabled layers!


Key Takeaways

  1. Full-Stack > Prompts Only: Optimizing all layers produces compound improvements
  2. GEPA Learns Strategies: Not just prompt text, but WHEN, WHICH, and HOW for each layer
  3. Production-Ready Results: 37.5% → 87.5% accuracy through comprehensive optimization
  4. Automatic: GEPA handles all layers, you just enable them in the playbook
  5. Framework-Agnostic: Works across DSPy, OpenAI, CrewAI, and all supported frameworks

Next Steps


Related Guides: