RAG (Retrieval-Augmented Generation) Guide

SuperOptiX provides comprehensive RAG support that integrates seamlessly with DSPy pipelines and ReAct agents. This guide covers how to set up, configure, and use RAG with each of the supported vector database backends.

Overview

RAG enhances AI agents by providing them with access to external knowledge sources. Instead of relying solely on pre-trained knowledge, agents can retrieve relevant information from documents, databases, or other sources to provide more accurate and up-to-date responses.
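
Conceptually, every RAG query follows the same retrieve-then-generate loop. The sketch below is schematic only; the function and object names are hypothetical, not SuperOptiX APIs:

Python
# Schematic RAG loop (illustrative; names are hypothetical)
def rag_answer(query: str, retriever, llm, top_k: int = 5) -> str:
    # 1. Retrieve the top-k chunks most relevant to the query
    chunks = retriever.search(query, top_k=top_k)
    # 2. Augment the prompt with the retrieved context
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    # 3. Generate a response grounded in that context
    return llm.generate(prompt)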

Quick Start

Basic RAG Setup

Python
from agile.agents.qa.pipelines.qa_pipeline import QaPipeline

# Initialize pipeline with RAG
pipeline = QaPipeline()

# Add documents to RAG system
documents = [
    {
        'content': 'DSPy is a framework for optimizing LM prompts and weights.',
        'metadata': {'source': 'docs', 'topic': 'framework'}
    },
    {
        'content': 'Software testing ensures quality and reliability of applications.',
        'metadata': {'source': 'docs', 'topic': 'testing'}
    }
]

pipeline.add_documents(documents)

# Query with RAG-enhanced responses (await requires an async context, e.g. asyncio.run)
result = await pipeline.forward("What is DSPy?")
print(result['response'])

Check RAG Status

Python
status = pipeline.get_rag_status()
print(f"RAG Enabled: {status['enabled']}")
print(f"Vector DB Type: {status['vector_db_type']}")

RAG Configuration

RAG is configured through the playbook YAML file. Here's the structure:

YAML
spec:
  rag:
    enabled: true
    retriever_type: chroma  # or vector_database: chroma
    config:
      top_k: 5
      chunk_size: 512
      chunk_overlap: 50
    vector_store:
      embedding_model: sentence-transformers/all-MiniLM-L6-v2
      collection_name: knowledge_base

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| enabled | bool | true | Enable/disable RAG |
| retriever_type | string | - | Vector database type |
| top_k | int | 5 | Number of documents to retrieve |
| chunk_size | int | 512 | Document chunk size in characters |
| chunk_overlap | int | 50 | Overlap between adjacent chunks in characters |
| embedding_model | string | all-MiniLM-L6-v2 | Sentence-transformers embedding model |

Supported Vector Databases

1. ChromaDB

ChromaDB is the default and recommended option for local development. It's lightweight, runs in-process with no external services, and automatically downloads embedding models.
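
To see what happens underneath, here's a minimal sketch using the chromadb package directly, outside SuperOptiX (collection and document names are illustrative):

Python
import chromadb

# In-memory client; use chromadb.PersistentClient(path="./chroma") to persist to disk
client = chromadb.Client()
collection = client.get_or_create_collection("knowledge_base")

# ChromaDB embeds documents with its default embedding model automatically
collection.add(
    documents=["DSPy is a framework for optimizing LM prompts and weights."],
    metadatas=[{"source": "docs", "topic": "framework"}],
    ids=["doc-1"],
)

results = collection.query(query_texts=["What is DSPy?"], n_results=1)
print(results["documents"])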

Configuration

YAML
spec:
  rag:
    enabled: true
    retriever_type: chroma
    config:
      top_k: 5
      chunk_size: 512
      chunk_overlap: 50
    vector_store:
      embedding_model: sentence-transformers/all-MiniLM-L6-v2
      collection_name: my_knowledge_base

Example Usage

Python
from agile.agents.qa.pipelines.qa_pipeline import QaPipeline

# Initialize with ChromaDB
pipeline = QaPipeline()

# Add documents
documents = [
    {
        'content': 'ChromaDB is a lightweight vector database for AI applications.',
        'metadata': {'source': 'chroma_docs', 'version': '1.0'}
    }
]

pipeline.add_documents(documents)

# Query
result = await pipeline.forward("What is ChromaDB?")
print(result['response'])

2. LanceDB

LanceDB is a modern vector database built on Apache Arrow, offering high performance and easy integration.
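
For reference, a minimal sketch using the lancedb package directly; the vectors are supplied by hand here, whereas in SuperOptiX the configured embedding model produces them:

Python
import lancedb

db = lancedb.connect("./data/lancedb")

# Each row stores its vector alongside arbitrary payload columns
table = db.create_table(
    "knowledge_table",
    data=[{"vector": [0.1, 0.2, 0.3], "text": "LanceDB is built on Apache Arrow."}],
    mode="overwrite",
)

# Nearest-neighbour search against a query vector
hits = table.search([0.1, 0.2, 0.3]).limit(5).to_list()
print(hits[0]["text"])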

Configuration

YAML
spec:
  rag:
    enabled: true
    retriever_type: lancedb
    config:
      top_k: 5
      chunk_size: 512
    vector_store:
      embedding_model: sentence-transformers/all-MiniLM-L6-v2
      table_name: knowledge_table
      database_path: ./data/lancedb

Example Usage

Python
from agile.agents.qa.pipelines.qa_pipeline import QaPipeline

# Initialize with LanceDB
pipeline = QaPipeline()

# Add documents
documents = [
    {
        'content': 'LanceDB provides fast vector search with Apache Arrow.',
        'metadata': {'source': 'lancedb_docs', 'category': 'database'}
    }
]

pipeline.add_documents(documents)

# Query
result = await pipeline.forward("What is LanceDB?")
print(result['response'])

3. Weaviate

Weaviate is a vector database with a rich ecosystem and cloud offerings.

Configuration

YAML
spec:
  rag:
    enabled: true
    retriever_type: weaviate
    config:
      top_k: 5
      chunk_size: 512
    vector_store:
      embedding_model: sentence-transformers/all-MiniLM-L6-v2
      collection_name: knowledge_collection
      weaviate_url: http://localhost:8080
      api_key: your_api_key  # Optional

Example Usage

Python
from agile.agents.qa.pipelines.qa_pipeline import QaPipeline

# Initialize with Weaviate
pipeline = QaPipeline()

# Add documents
documents = [
    {
        'content': 'Weaviate is a vector database with a rich ecosystem.',
        'metadata': {'source': 'weaviate_docs', 'category': 'database'}
    }
]

pipeline.add_documents(documents)

# Query
result = await pipeline.forward("What is Weaviate?")
print(result['response'])

4. Qdrant

Qdrant is a high-performance vector database with advanced filtering capabilities.

Configuration

YAML
spec:
  rag:
    enabled: true
    retriever_type: qdrant
    config:
      top_k: 5
      chunk_size: 512
    vector_store:
      embedding_model: sentence-transformers/all-MiniLM-L6-v2
      collection_name: knowledge_collection
      qdrant_url: http://localhost:6333
      api_key: your_api_key  # Optional
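
Example Usage

Usage mirrors the other backends; a sketch assuming the playbook above is active and a Qdrant server is reachable at the configured URL:

Python
import asyncio

from agile.agents.qa.pipelines.qa_pipeline import QaPipeline

async def main():
    # Initialize with Qdrant
    pipeline = QaPipeline()

    # Add documents
    pipeline.add_documents([
        {
            'content': 'Qdrant is a high-performance vector database with advanced filtering.',
            'metadata': {'source': 'qdrant_docs', 'category': 'database'}
        }
    ])

    # Query
    result = await pipeline.forward("What is Qdrant?")
    print(result['response'])

asyncio.run(main())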

5. Milvus

Milvus is a scalable vector database designed for production use.

Configuration

YAML
spec:
  rag:
    enabled: true
    retriever_type: milvus
    config:
      top_k: 5
      chunk_size: 512
    vector_store:
      embedding_model: sentence-transformers/all-MiniLM-L6-v2
      collection_name: knowledge_collection
      milvus_host: localhost
      milvus_port: 19530
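
Example Usage

Usage follows the same pattern; a sketch assuming a Milvus instance is reachable at the configured host and port:

Python
import asyncio

from agile.agents.qa.pipelines.qa_pipeline import QaPipeline

async def main():
    # Initialize with Milvus
    pipeline = QaPipeline()

    # Add documents
    pipeline.add_documents([
        {
            'content': 'Milvus is a scalable vector database designed for production use.',
            'metadata': {'source': 'milvus_docs', 'category': 'database'}
        }
    ])

    # Query
    result = await pipeline.forward("What is Milvus?")
    print(result['response'])

asyncio.run(main())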

Document Ingestion

Adding Documents

Python
# Single document
pipeline.add_document({
    'content': 'Your document content here.',
    'metadata': {
        'source': 'manual',
        'category': 'general',
        'date': '2024-01-01'
    }
})

# Multiple documents
documents = [
    {
        'content': 'First document content.',
        'metadata': {'source': 'file1', 'category': 'technical'}
    },
    {
        'content': 'Second document content.',
        'metadata': {'source': 'file2', 'category': 'business'}
    }
]
pipeline.add_documents(documents)

Document Chunking

Documents are automatically chunked for better retrieval:

YAML
spec:
  rag:
    config:
      chunk_size: 512      # Characters per chunk
      chunk_overlap: 50    # Overlap between chunks
      chunk_strategy: "recursive"  # recursive, fixed, semantic
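
To make the interplay of chunk_size and chunk_overlap concrete, here is a minimal fixed-size chunker; it sketches the idea rather than SuperOptiX's internal implementation:

Python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    # Each chunk starts chunk_size - chunk_overlap characters after the previous one,
    # so consecutive chunks share chunk_overlap characters of context.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("A" * 1000)
print([len(c) for c in chunks])  # [512, 512, 76] -- three overlapping chunks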

Advanced Configuration

Custom Embedding Models

YAML
spec:
  rag:
    vector_store:
      embedding_model: sentence-transformers/all-mpnet-base-v2  # Better quality
      # or
      embedding_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2  # Multilingual
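
These names refer to models from the sentence-transformers library, so you can compare candidates directly before committing one to the playbook:

Python
from sentence_transformers import SentenceTransformer

# Swap the model name to trade embedding quality against speed
model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
embeddings = model.encode(["DSPy optimizes LM prompts and weights."])
print(embeddings.shape)  # (1, 768) -- all-mpnet-base-v2 produces 768-dim vectors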

Retrieval Strategies

YAML
spec:
  rag:
    config:
      top_k: 5
      similarity_threshold: 0.7
      rerank: true  # Enable re-ranking for better results
      hybrid_search: true  # Combine dense and sparse retrieval
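
A sketch of what thresholding and re-ranking do to raw retrieval hits; the cross-encoder shown is one common re-ranking choice, not necessarily what SuperOptiX uses internally, and the hit format is assumed:

Python
from sentence_transformers import CrossEncoder

def refine_hits(query: str, hits: list[dict], similarity_threshold: float = 0.7) -> list[dict]:
    # Drop candidates whose retrieval score falls below the threshold
    kept = [h for h in hits if h["score"] >= similarity_threshold]
    if not kept:
        return []
    # Re-score the survivors with a cross-encoder for finer-grained ordering
    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = reranker.predict([(query, h["content"]) for h in kept])
    return [h for _, h in sorted(zip(scores, kept), key=lambda pair: -pair[0])]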

Filtering and Metadata

Python
# Query with metadata filters
result = await pipeline.forward(
    "What is DSPy?",
    filters={
        'category': 'framework',
        'source': 'docs'
    }
)

RAG with Tools

RAG can be combined with tools for enhanced capabilities:

YAML
spec:
  rag:
    enabled: true
    retriever_type: chroma
  tool_calling:
    enabled: true
    available_tools: ["web_search", "file_reader"]

Best Practices

1. Document Quality

  • Use high-quality, relevant documents
  • Ensure proper formatting and structure
  • Include comprehensive metadata
  • Update and maintain documents regularly

2. Chunking Strategy

  • Choose appropriate chunk sizes (512-1024 characters)
  • Use overlap to maintain context
  • Consider semantic boundaries
  • Test different strategies for your use case

3. Embedding Models

  • Use appropriate models for your domain
  • Consider multilingual requirements
  • Balance quality vs. performance
  • Update models regularly

4. Vector Database Selection

  • ChromaDB: Development and prototyping
  • LanceDB: High-performance local applications
  • Weaviate: Rich ecosystem and cloud features
  • Qdrant: Advanced filtering requirements
  • Milvus: Large-scale production deployments

5. Performance Optimization

  • Monitor retrieval latency
  • Optimize chunk sizes
  • Use appropriate top_k values
  • Implement caching strategies (see the sketch after this list)
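
For the caching point, the cheapest win is usually memoizing embeddings of repeated queries. A minimal sketch with functools.lru_cache (the model choice is illustrative):

Python
from functools import lru_cache

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

@lru_cache(maxsize=1024)
def embed_query(query: str) -> tuple[float, ...]:
    # lru_cache needs hashable values, so return a tuple rather than an ndarray
    return tuple(model.encode(query))

embed_query("What is DSPy?")  # computed
embed_query("What is DSPy?")  # served from the cache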

Troubleshooting

Common Issues

RAG Not Working

Python
# Check RAG status
status = pipeline.get_rag_status()
print(f"RAG enabled: {status['enabled']}")
print(f"Documents loaded: {status['document_count']}")

Poor Retrieval Quality

YAML
# Adjust configuration
spec:
  rag:
    config:
      top_k: 10  # Increase for more candidates
      similarity_threshold: 0.5  # Lower threshold
    vector_store:
      embedding_model: sentence-transformers/all-mpnet-base-v2  # Better model

Performance Issues

YAML
# Optimize for performance
spec:
  rag:
    config:
      top_k: 3  # Reduce for faster retrieval
      chunk_size: 256  # Smaller chunks
    vector_store:
      embedding_model: sentence-transformers/all-MiniLM-L6-v2  # Faster model

Integration Examples

RAG + Memory Integration

YAML
spec:
  rag:
    enabled: true
    retriever_type: chroma
  memory:
    enabled: true
    long_term:
      enable_embeddings: true

RAG + Optimization Integration

YAML
spec:
  rag:
    enabled: true
    retriever_type: chroma
  optimization:
    enabled: true
    strategy: bootstrap_few_shot