🧠 Model Management Guide

Master SuperOptiX's current model management capabilities and tier system


🎯 Overview

SuperOptiX provides a unified model management system that handles local language models across multiple backends. This guide covers the current implementation and how to effectively manage models within the SuperOptiX ecosystem.

🧠 Current Capabilities

  • 📋 Model Listing: List and filter installed models
  • 🔍 Model Discovery: Get guidance on available models and backends
  • 📦 Model Installation: Install models across different backends
  • ℹ️ Model Information: Get detailed model information
  • 🖥️ Server Management: Start local model servers
  • 🧠 DSPy Integration: Create DSPy clients for models
  • 🔧 Backend Status: Check backend availability and status

๐Ÿ—๏ธ Model Tier System

SuperOptiX uses a progressive tier system that determines agent capabilities and features:

🎭 Oracle Tier (Free)

  • Basic question-answering with Chain of Thought reasoning
  • Simple evaluation (exact match, F1)
  • Basic optimization (BootstrapFewShot)
  • Sequential task orchestration only
  • No tools, memory, or RAG

🧞 Genie Tier (Free)

  • All Oracle capabilities plus:
  • Tool integration and ReAct reasoning
  • RAG (knowledge retrieval) capabilities
  • Agent memory (short-term and episodic)
  • Basic streaming responses
  • Sequential orchestration only

🚀 Higher Tiers (Enterprise)

  • Parallel execution strategies
  • Kubernetes-style orchestration
  • Advanced enterprise features
  • Production-grade scaling

📋 Model Listing & Discovery

1. List Installed Models

Bash
# List all installed models
super model list

# List all available models (including uninstalled)
super model list --all

# Filter by backend
super model list --backend ollama
super model list --backend mlx
super model list --backend huggingface
super model list --backend lmstudio

# Filter by size
super model list --size tiny
super model list --size small
super model list --size medium
super model list --size large

# Filter by task
super model list --task chat
super model list --task code
super model list --task reasoning
super model list --task embedding

# Combine filters
super model list --backend ollama --size small --task chat

# Verbose information
super model list --verbose

2. Model Discovery

Bash
# Get comprehensive discovery guide
super model discover

# Get detailed installation guide
super model guide

📦 Model Installation

1. Install Models by Backend

Bash
# Install Ollama models (default backend)
super model install llama3.2:3b
super model install llama3.1:8b
super model install llama3.1:70b

# Install MLX models
super model install -b mlx mlx-community/phi-2
super model install -b mlx mlx-community/Llama-3.2-3B-Instruct-4bit

# Install HuggingFace models
super model install -b huggingface microsoft/Phi-4
super model install -b huggingface microsoft/DialoGPT-small

# Install LM Studio models
super model install -b lmstudio llama-3.2-1b-instruct
super model install -b lmstudio your-model-name

# Force reinstall if needed
super model install llama3.2:3b --force

2. Backend-Specific Setup

🦙 Ollama (Default)

Bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama (it usually starts automatically; run this if it isn't already running)
ollama serve

# Install models
super model install llama3.2:3b
super model install llama3.1:8b
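
Once Ollama is installed, you can confirm the local server is actually answering before pointing SuperOptiX at it. The sketch below uses only the Python standard library and assumes Ollama's default REST endpoint at http://localhost:11434.

Python
# Sanity check: is the Ollama server answering on its default port?
import json
import urllib.request

try:
    # /api/tags lists the models Ollama has pulled locally
    with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=5) as resp:
        models = json.load(resp).get("models", [])
    print(f"Ollama is up with {len(models)} installed model(s):")
    for model in models:
        print(" -", model.get("name"))
except OSError as exc:
    print(f"Ollama is not reachable; try 'ollama serve' first ({exc})")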

๐ŸŽ MLX (Apple Silicon)

Bash
# Install MLX dependencies
pip install mlx-lm

# Or install with SuperOptiX
pip install "superoptix[mlx]"

# Install models
super model install -b mlx mlx-community/phi-2
super model install -b mlx mlx-community/Llama-3.2-3B-Instruct-4bit
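
MLX models only run on Apple Silicon, so it can save a failed install to check the environment first. This is a minimal sketch using the standard library, with an optional check for the mlx-lm package:

Python
# Verify this machine can run MLX models before installing any
import importlib.util
import platform
import sys

is_apple_silicon = sys.platform == "darwin" and platform.machine() == "arm64"
has_mlx_lm = importlib.util.find_spec("mlx_lm") is not None

print(f"Apple Silicon Mac: {is_apple_silicon}")
print(f"mlx-lm installed:  {has_mlx_lm}")

if not is_apple_silicon:
    print("MLX is unsupported on this machine; use the Ollama backend instead.")
elif not has_mlx_lm:
    print("Install dependencies first: pip install mlx-lm (or superoptix[mlx])")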

🤗 HuggingFace (Advanced Users)

Bash
# Install HuggingFace dependencies
pip install transformers torch fastapi uvicorn

# Or install with SuperOptiX
pip install "superoptix[huggingface]"

# Install models
super model install -b huggingface microsoft/Phi-4
super model install -b huggingface microsoft/DialoGPT-small
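
Because the HuggingFace backend pulls in the heaviest dependencies, it helps to confirm that transformers and torch import cleanly and to see what hardware acceleration is available. A short check, assuming the packages above are already installed:

Python
# Report HuggingFace backend prerequisites and available accelerators
import torch
import transformers

print(f"transformers {transformers.__version__}, torch {torch.__version__}")

if torch.cuda.is_available():
    print("CUDA GPU available:", torch.cuda.get_device_name(0))
elif torch.backends.mps.is_available():
    print("Apple MPS acceleration available")
else:
    print("No accelerator detected; large models will run slowly on CPU")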

🎮 LM Studio (Windows Users)

Bash
# Download LM Studio from https://lmstudio.ai
# Install and launch LM Studio
# Download a model through the interface
# Start the server (default port: 1234)

# Models are managed through LM Studio app
super model install -b lmstudio your-model-name
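
LM Studio's local server speaks an OpenAI-compatible API, so you can verify it is running (and see which models it serves) before wiring it into SuperOptiX. A minimal reachability check with the standard library, assuming the default port 1234:

Python
# Confirm the LM Studio server is up and list the models it exposes
import json
import urllib.request

try:
    with urllib.request.urlopen("http://localhost:1234/v1/models", timeout=5) as resp:
        data = json.load(resp)
    print("LM Studio is up; models:", [m["id"] for m in data.get("data", [])])
except OSError as exc:
    print(f"LM Studio server not reachable; start it from the app ({exc})")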

๐Ÿ–ฅ๏ธ Server Management

1. Start Model Servers

Bash
# Start MLX server
super model server mlx phi-2 --port 8000
super model server mlx mlx-community/Llama-3.2-3B-Instruct-4bit --port 8000

# Start HuggingFace server
super model server huggingface microsoft/Phi-4 --port 8001
super model server huggingface microsoft/DialoGPT-small --port 8001

# Start LM Studio server
super model server lmstudio llama-3.2-1b-instruct --port 1234
super model server lmstudio your-model-name --port 1234

2. Server Backend Details

Backend           Server Required   Default Port   Platform         Auto-Start
🦙 Ollama         No                11434          All              Yes
🍎 MLX            Yes               8000           Apple Silicon    No
🤗 HuggingFace    Yes               8001           All              No
🎮 LM Studio      Yes               1234           Windows/macOS    No
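
Whichever backend you use, a quick way to confirm that the right server is listening is a plain TCP probe against the default ports in the table above. This backend-agnostic sketch uses only the standard library:

Python
# Probe the default ports used by each backend's local server
import socket

DEFAULT_PORTS = {
    "ollama": 11434,
    "mlx": 8000,
    "huggingface": 8001,
    "lmstudio": 1234,
}

for backend, port in DEFAULT_PORTS.items():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)
        status = "listening" if sock.connect_ex(("localhost", port)) == 0 else "not running"
    print(f"{backend:<12} port {port:<6} {status}")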

🧠 DSPy Integration

1. Create DSPy Clients

Bash
# Create Ollama DSPy client
super model dspy ollama/llama3.2:3b
super model dspy ollama/llama3.1:8b

# Create MLX DSPy client
super model dspy mlx/phi-2
super model dspy mlx-community/Llama-3.2-3B-Instruct-4bit

# Create HuggingFace DSPy client
super model dspy huggingface/microsoft/Phi-4
super model dspy microsoft/Phi-4

# Create LM Studio DSPy client
super model dspy lmstudio/llama-3.2-1b-instruct
super model dspy lmstudio/your-model-name

# DSPy client with custom parameters
super model dspy ollama/llama3.2:3b --temperature 0.7 --max-tokens 2048

2. DSPy Integration Examples

Python
# In your agent playbook or pipeline
import dspy
from superoptix.models.backends.ollama import OllamaClient

# Create the client
client = OllamaClient(
    model="llama3.2:3b",
    temperature=0.7,
    max_tokens=2048
)

# Register the client as the language model DSPy should use
dspy.configure(lm=client)

# DSPy modules take a signature, not the client itself
predictor = dspy.Predict("question -> answer")

# Example usage
response = predictor(question="Explain quantum computing in simple terms")
print(response.answer)
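
If you would rather configure DSPy directly instead of going through a SuperOptiX client class, recent DSPy versions route all providers through dspy.LM. A minimal sketch for the Ollama model used above (adjust the model string and endpoint to your setup):

Python
# Plain DSPy setup against a local Ollama server
import dspy

lm = dspy.LM(
    "ollama_chat/llama3.2:3b",          # LiteLLM-style model identifier
    api_base="http://localhost:11434",  # Ollama's default endpoint
    api_key="",                         # local servers need no key
    temperature=0.7,
    max_tokens=2048,
)
dspy.configure(lm=lm)

qa = dspy.Predict("question -> answer")
print(qa(question="Explain quantum computing in simple terms").answer)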

📊 Model Information

1. Get Model Details

Bash
# Get model information
super model info llama3.2:3b
super model info mlx-community/phi-2
super model info microsoft/Phi-4
super model info llama-3.2-1b-instruct

2. Check Backend Status

Bash
# Check all backends
super model backends

🔧 Configuration Management

1. Model Configuration in Playbooks

YAML
# In your agent playbook
spec:
  language_model:
    provider: "ollama"  # or "mlx", "huggingface", "lmstudio"
    model: "llama3.2:3b"
    temperature: 0.7
    max_tokens: 2048
    api_base: "http://localhost:11434"  # for MLX/HuggingFace servers

2. Backend-Specific Configuration

Ollama Configuration

YAML
language_model:
  provider: "ollama"
  model: "llama3.2:3b"
  temperature: 0.7
  max_tokens: 2048
  # No api_base needed - uses default localhost:11434

MLX Configuration

YAML
language_model:
  provider: "mlx"
  model: "mlx-community/phi-2"
  temperature: 0.7
  max_tokens: 2048
  api_base: "http://localhost:8000"  # MLX server port

HuggingFace Configuration

YAML
language_model:
  provider: "huggingface"
  model: "microsoft/Phi-4"
  temperature: 0.7
  max_tokens: 2048
  api_base: "http://localhost:8001"  # HuggingFace server port

LM Studio Configuration

YAML
language_model:
  provider: "lmstudio"
  model: "llama-3.2-1b-instruct"
  temperature: 0.7
  max_tokens: 2048
  api_base: "http://localhost:1234"  # LM Studio server port

🚨 Troubleshooting

Common Issues

Model Not Found

Bash
# Check available models
super model list --all

# Use correct model name
super model install llama3.2:3b  # ✅ Correct
super model install llama3.2     # ❌ Wrong (missing size tag)

Server Connection Failed

Bash
# Check if server is running
# For Ollama: ollama serve
# For MLX: super model server mlx phi-2 --port 8000
# For LM Studio: Start in LM Studio app
# For HuggingFace: super model server huggingface model --port 8001

Port Already in Use

Bash
# Use different port
super model server mlx phi-2 --port 8001
super model server huggingface microsoft/Phi-4 --port 8002

Apple Silicon Required

Bash
# MLX requires an Apple Silicon Mac; on other hardware, use Ollama instead
super model install llama3.2:3b
super model dspy ollama/llama3.2:3b

Missing Python Packages

Bash
# Install MLX dependencies
pip install mlx-lm

# Install HuggingFace dependencies
pip install transformers torch fastapi uvicorn

# Or install with SuperOptiX extras
pip install "superoptix[mlx]"
pip install "superoptix[huggingface]"

🎯 Best Practices

1. Model Selection

  • Start with Ollama: Easiest for beginners, works on all platforms
  • Use MLX on Apple Silicon: Best performance for Apple Silicon Macs
  • Choose HuggingFace for advanced use: Maximum flexibility and model variety
  • Use LM Studio on Windows: Good GUI interface for Windows users

2. Server Management

  • Ollama: No manual server management needed
  • MLX/HuggingFace: Start servers when needed, use different ports
  • LM Studio: Manage through the application interface
  • Monitor resources: Keep an eye on memory usage

3. DSPy Integration

  • Test models first: Use super model info to verify installation
  • Start with simple prompts: Test basic functionality before complex tasks
  • Monitor performance: Check response times and quality
  • Use appropriate parameters: Adjust temperature and max_tokens for your use case

4. Tier Compliance

  • Oracle agents: Use any model, no special requirements
  • Genie agents: Models should support tool calling and reasoning
  • Higher tiers: Enterprise features require specific model capabilities


Ready to manage your models effectively? Start with super model discover to explore available models and backends! 🚀