LLM Setup Guide
Welcome to SuperOptiX's LLM Setup Guide! This guide will help you configure and use local language models for your AI agents. We focus on local models for privacy, speed, and cost-effectiveness.
Quick Start
New to local models? Start with Ollama - it's the easiest option for beginners!
Overview
SuperOptiX supports multiple local model backends, each optimized for different use cases:
| Backend | Best For | Platform | Ease of Use | Performance |
|---|---|---|---|---|
| Ollama | Beginners, all platforms | Cross-platform | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| MLX | Apple Silicon users | macOS only | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| LM Studio | Windows users | Windows/macOS | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| HuggingFace | Advanced users | All platforms | ⭐⭐ | ⭐⭐⭐⭐⭐ |
Production Inference Engines
vLLM, SGLang, and TGI are not included in the current version of SuperOptiX. These production-grade inference engines are part of our enterprise offering.
Ollama (Recommended)
Ollama is the easiest way to run local models on any platform. Perfect for beginners!
Quick Setup
Install Models with SuperOptiX
# Install recommended models by tier
super model install llama3.2:1b # Oracles tier - Small tasks, fast responses
super model install llama3.1:8b # Genies tier - Complex reasoning, tools, memory
super model install llama3.2:3b # Alternative small model
super model install qwen2.5:7b # Great all-rounder
Show Output
SuperOptiX Model Intelligence - Installing llama3.2:3b
Pulling model llama3.2:3b from Ollama...
This may take a few minutes depending on your internet connection and model size.
pulling manifest
pulling dde5aa3fc5ff: 100% ████████████████████████████ 2.0 GB
pulling 966de95ca8a6: 100% ████████████████████████████ 1.4 KB
pulling fcc5a6bec9da: 100% ████████████████████████████ 7.7 KB
pulling a70ff7e570d9: 100% ████████████████████████████   96 B
pulling 34bb5ab01051: 100% ████████████████████████████  561 B
pulling 56bb8bd477a5: 100% ████████████████████████████ 6.0 KB
verifying sha256 digest
writing manifest
success
✅ Model pulled successfully!
You can now use it with SuperOptiX:
    super model dspy ollama/llama3.2:3b
Model details:
    • Size: small
    • Task: chat
    • Parameters: 3B
Installation completed successfully!
Ollama running on http://localhost:11434 is ready to use with SuperOptiX!
Server Management
Important: Ollama starts its server automatically when you first use a model (ollama serve starts it explicitly). You don't need to start the server manually unless you want custom configuration.
# Start Ollama server (runs on port 11434 by default)
ollama serve
# Or simply use a model - server starts automatically
ollama run llama3.2:1b
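To confirm Ollama is reachable before pointing SuperOptiX at it, you can query its local HTTP API; the /api/tags endpoint lists the models you have pulled:
# Quick sanity check that Ollama is serving on the default port
curl http://localhost:11434/api/tags
# Expected: a JSON object listing your locally installed models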
Custom Configuration: Only start the server manually if you need:
- Different port: OLLAMA_HOST=0.0.0.0:8080 ollama serve
- Custom model path: OLLAMA_MODELS=/custom/path ollama serve
- GPU configuration: OLLAMA_GPU_LAYERS=35 ollama serve
✅ Automatic Detection: SuperOptiX automatically detects and connects to Ollama running on the default port (11434). No additional configuration needed!
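Once a model is installed, an agent playbook can reference it directly. A minimal sketch, following the same language_model layout shown in the MLX and LM Studio examples later in this guide (the exact provider string, and whether api_base is required when Ollama is auto-detected, are assumptions here):
language_model:
  provider: ollama                     # assumed provider name for the Ollama backend
  model: llama3.2:3b
  api_base: http://localhost:11434     # default Ollama port; may be optional with auto-detection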
Manage Ollama Models
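The listing below is the kind of output produced by the list command, here assumed to be filtered to the Ollama backend:
super model list --backend ollama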
Example Output:
SuperOptiX Model Intelligence - 3 models

| Model                   | Backend | Status       | Size    | Task      |
|-------------------------|---------|--------------|---------|-----------|
| llama3.1:8b             | ollama  | ✅ installed | medium  | chat      |
| llama3.2:1b             | ollama  | ✅ installed | tiny    | chat      |
| nomic-embed-text:latest | ollama  | ✅ installed | Unknown | embedding |
# Get model information
super model info llama3.2:3b
# List all available models
super model list --all
MLX (Apple Silicon)
MLX is Apple's native machine learning framework, offering blazing-fast inference on Apple Silicon Macs.
Apple Silicon Only
MLX only works on Apple Silicon Macs (M1, M2, M3, and later). If you're on an Intel Mac, use Ollama instead.
Setup MLX
# Install MLX dependencies
pip install mlx-lm
# Or install with SuperOptiX
pip install "superoptix[mlx]"
Install MLX Models
# Install popular MLX models
super model install -b mlx mlx-community/phi-2
super model install -b mlx mlx-community/Llama-3.2-3B-Instruct-4bit
super model install -b mlx mlx-community/Mistral-7B-Instruct-v0.2-4bit
Start MLX Servers
# Start MLX server on specific port
super model server mlx phi-2 --port 8000
super model server mlx mlx-community/Llama-3.2-3B-Instruct-4bit --port 8000
Example Output:
MLX Local Server
Starting MLX server for mlx-community_Llama-3.2-3B-Instruct-4bit on port 8000...
Starting MLX server...
Server will be available at: http://localhost:8000
Use this URL in your playbook's api_base configuration

Manual server startup command:
    python -m mlx_lm.server --model mlx-community_Llama-3.2-3B-Instruct-4bit --port 8000

Example playbook configuration:
    language_model:
      provider: mlx
      model: mlx-community_Llama-3.2-3B-Instruct-4bit
      api_base: http://localhost:8000

Executing: /path/to/python -m mlx_lm.server --model mlx-community_Llama-3.2-3B-Instruct-4bit --port 8000
Server is starting... (Press Ctrl+C to stop)
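Once the server reports it is running, you can sanity-check it from another terminal. mlx_lm.server exposes an OpenAI-compatible HTTP API, so a minimal chat request looks roughly like this (a sketch; exact request fields can vary between mlx-lm versions):
# Send a test chat completion to the local MLX server
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hello in one sentence."}], "max_tokens": 50}'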
Manage MLX Models
Example Output:
SuperOptiX Model Intelligence - 1 models

| Model                                    | Backend | Status       | Size  | Task |
|------------------------------------------|---------|--------------|-------|------|
| mlx-community_Llama-3.2-3B-Instruct-4bit | mlx     | ✅ installed | small | chat |
# Get model information
super model info mlx-community/phi-2
super model info mlx-community_Llama-3.2-3B-Instruct-4bit
# Models are ready to use with SuperOptiX agents
LM Studio
LM Studio provides a user-friendly interface for running local models, especially popular on Windows.
Setup LM Studio
- Download LM Studio from https://lmstudio.ai
- Install and launch LM Studio
- Download a model through the interface
- Start the server (default port: 1234)
Install Models with SuperOptiX
# Install models (use the name from LM Studio)
super model install -b lmstudio llama-3.2-1b-instruct
super model install -b lmstudio llama-3.2-3b
super model install -b lmstudio your-model-name
Start LM Studio Servers
# Start server with specific model
super model server lmstudio llama-3.2-1b-instruct --port 1234
super model server lmstudio llama-3.2-3b --port 1234
Example Output:
LM Studio Local Server
Starting LM Studio server for llama-3.2-1b-instruct on port 1234...
Starting LM Studio server...
Server will be available at: http://localhost:1234
Use this URL in your playbook's api_base configuration

Manual server startup command:
    # Start server in LM Studio app first, then connect

Example playbook configuration:
    language_model:
      provider: lmstudio
      model: llama-3.2-1b-instruct
      api_base: http://localhost:1234

Server is starting... (Press Ctrl+C to stop)
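LM Studio's local server speaks the OpenAI-compatible API, so you can verify it is serving your model from the command line (the model must already be loaded in the LM Studio app):
# List the models LM Studio is currently serving
curl http://localhost:1234/v1/models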
Manage LM Studio Models
Example Output:
SuperOptiX Model Intelligence - 3 models

| Model                          | Backend  | Status       | Size   | Task |
|--------------------------------|----------|--------------|--------|------|
| llama-3.2-1b-instruct          | lmstudio | ✅ installed | small  | chat |
| llama-3.3-70b-instruct         | lmstudio | ✅ installed | large  | chat |
| llama-4-scout-17b-16e-instruct | lmstudio | ✅ installed | medium | chat |
# Get model information
super model info llama-3.2-1b-instruct
# Models are ready to use with SuperOptiX agents
HuggingFace
HuggingFace offers access to thousands of models, perfect for advanced users who want maximum flexibility.
Setup HuggingFace
# Install HuggingFace dependencies
pip install transformers torch fastapi uvicorn
# Or install with SuperOptiX
pip install "superoptix[huggingface]"
Install HuggingFace Models
# Install popular models
super model install -b huggingface microsoft/Phi-4
super model install -b huggingface microsoft/DialoGPT-small
super model install -b huggingface microsoft/DialoGPT-medium
super model install -b huggingface meta-llama/Llama-2-7b-chat-hf
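Some of these models are gated on the HuggingFace Hub (meta-llama/Llama-2-7b-chat-hf in particular); you typically need to accept the model's license on huggingface.co and authenticate locally before the download will succeed:
# Authenticate with your HuggingFace account (needed for gated models)
# The huggingface-cli tool ships with huggingface_hub, which is installed alongside transformers
huggingface-cli login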
Start HuggingFace Servers
# Start server with specific model
super model server huggingface microsoft/Phi-4 --port 8001
super model server huggingface microsoft/DialoGPT-small --port 8001
super model server huggingface microsoft/DialoGPT-medium --port 8001
Example Output:
HuggingFace Local Server
Starting HuggingFace server for microsoft/DialoGPT-small on port 8002...
Starting HuggingFace server...
Server will be available at: http://localhost:8002
Use this URL in your playbook's api_base configuration

Manual server startup command:
    python -m superoptix.models.backends.huggingface_server microsoft/DialoGPT-small --port 8002

Example playbook configuration:
    language_model:
      provider: huggingface
      model: microsoft/DialoGPT-small
      api_base: http://localhost:8002

Device set to use cpu
INFO:     Started server process [4652]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8002 (Press CTRL+C to quit)
Manage HuggingFace Models
Example Output:
SuperOptiX Model Intelligence - 2 models

| Model                    | Backend     | Status       | Size  | Task |
|--------------------------|-------------|--------------|-------|------|
| microsoft/DialoGPT-small | huggingface | ✅ installed | small | chat |
| microsoft/Phi-4          | huggingface | ✅ installed | small | chat |
# Get model information
super model info microsoft/Phi-4
super model info microsoft/DialoGPT-small
# Models are ready to use with SuperOptiX agents
Model Management Commands
Server Commands
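The help text below comes from the server subcommand's built-in help (using the --help flag shown in its usage line):
super model server --help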
Example Output:
usage: super model server [-h] [--port PORT] {mlx,huggingface,lmstudio} model_name

Start local model servers for MLX, HuggingFace, or LM Studio. Examples:
  super model server mlx mlx-community/Llama-3.2-3B-Instruct-4bit
  super model server huggingface microsoft/DialoGPT-small --port 8001
  super model server lmstudio llama-3.2-1b-instruct

Backends:
  mlx          Apple Silicon optimized (default: port 8000)
  huggingface  Transformers models (default: port 8001)
  lmstudio     Desktop app models (default: port 1234)

Note: Ollama servers use 'ollama serve' command separately.

positional arguments:
  {mlx,huggingface,lmstudio}  Backend type
  model_name                  Model name to start server for

options:
  -h, --help            show this help message and exit
  --port PORT, -p PORT  Port to run server on
List and Explore Models
Example Output:
SuperOptiX Model Intelligence - 9 models

| Model                                    | Backend     | Status       | Size    | Task      |
|------------------------------------------|-------------|--------------|---------|-----------|
| llama-3.2-1b-instruct                    | lmstudio    | ✅ installed | small   | chat      |
| llama-3.3-70b-instruct                   | lmstudio    | ✅ installed | large   | chat      |
| llama-4-scout-17b-16e-instruct           | lmstudio    | ✅ installed | medium  | chat      |
| llama3.1:8b                              | ollama      | ✅ installed | medium  | chat      |
| llama3.2:1b                              | ollama      | ✅ installed | tiny    | chat      |
| microsoft/DialoGPT-small                 | huggingface | ✅ installed | small   | chat      |
| microsoft/Phi-4                          | huggingface | ✅ installed | small   | chat      |
| mlx-community_Llama-3.2-3B-Instruct-4bit | mlx         | ✅ installed | small   | chat      |
| nomic-embed-text:latest                  | ollama      | ✅ installed | Unknown | embedding |

Discover more models: super model discover
Install a model: super model install <model_name>
# List all available models (including uninstalled)
super model list --all
# Filter by backend
super model list --backend ollama
super model list --backend mlx
super model list --backend lmstudio
super model list --backend huggingface
# Verbose information
super model list --verbose
Get Model Information
# Get detailed model info
super model info llama3.2:3b
super model info mlx-community/phi-2
super model info microsoft/Phi-4
super model info llama-3.2-1b-instruct
Choose Your Setup
Beginner (Recommended)
# 1. Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# 2. Install SuperOptiX
pip install superoptix
# 3. Install a model
super model install llama3.2:3b
# 4. Models are ready to use with SuperOptiX agents
Apple Silicon User
# 1. Install MLX dependencies
pip install mlx-lm
# 2. Install SuperOptiX
pip install superoptix
# 3. Install MLX model
super model install -b mlx mlx-community/phi-2
# 4. Start server
super model server mlx phi-2 --port 8000
# 5. Models are ready to use with SuperOptiX agents
Windows User
# 1. Install LM Studio from https://lmstudio.ai
# 2. Download a model in LM Studio
# 3. Start server in LM Studio
# 4. Install SuperOptiX
pip install superoptix
# 5. Connect to LM Studio
super model server lmstudio your-model-name --port 1234
# 6. Models are ready to use with SuperOptiX agents
Advanced User
# 1. Install HuggingFace dependencies
pip install transformers torch fastapi uvicorn
# 2. Install SuperOptiX
pip install superoptix
# 3. Install HuggingFace model
super model install -b huggingface microsoft/Phi-4
# 4. Start server
super model server huggingface microsoft/Phi-4 --port 8001
# 5. Models are ready to use with SuperOptiX agents
Advanced Configuration
Multiple Servers
Run multiple models simultaneously:
# Terminal 1: Ollama model (serves on the default port 11434)
ollama serve

# Terminal 2: MLX model (Apple Silicon)
super model server mlx phi-2 --port 8000

# Terminal 3: HuggingFace model
super model server huggingface microsoft/Phi-4 --port 8001

# Terminal 4: LM Studio model
super model server lmstudio llama-3.2-1b-instruct --port 1234
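Each agent then selects its backend in its own playbook, so several agents can run against different servers at the same time. A sketch reusing the api_base values from the terminals above (the provider and model fields follow the earlier playbook examples; the exact model identifiers are assumptions):
# Agent A playbook - talks to the MLX server on port 8000
language_model:
  provider: mlx
  model: mlx-community/phi-2
  api_base: http://localhost:8000

# Agent B playbook - talks to the HuggingFace server on port 8001
language_model:
  provider: huggingface
  model: microsoft/Phi-4
  api_base: http://localhost:8001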
Troubleshooting
Common Issues
Error: Model not found or Model does not exist
Solution: Install the model first (e.g. super model install llama3.2:3b) and check the exact name with super model list --all.

Error: Connection refused or Cannot connect to server
Solution: Make sure the model server is actually running (ollama serve, or super model server <backend> <model>) and that the api_base in your playbook matches the server's host and port.

Error: Address already in use
Solution: Another process is already using that port. Start the server on a different port with --port and update api_base accordingly, or stop the process holding the port.

Error: MLX requires Apple Silicon
Solution: MLX only runs on Apple Silicon Macs (M1, M2, M3, and later). On Intel Macs or other platforms, use Ollama instead.

Error: ModuleNotFoundError: No module named 'mlx_lm' or ModuleNotFoundError: No module named 'transformers'
Solution: Install the missing backend dependencies: pip install mlx-lm for MLX, or pip install transformers torch fastapi uvicorn for HuggingFace (alternatively, use the pip install "superoptix[mlx]" or "superoptix[huggingface]" extras).

Error: Command 'ollama' not found or Command 'lms' not found
Solution: Install the backend first: Ollama via curl -fsSL https://ollama.ai/install.sh | sh, and LM Studio (which provides the lms CLI) from https://lmstudio.ai.

Error: 401 Unauthorized or Repository Not Found
Solution: The HuggingFace model is likely gated or private (e.g. meta-llama/Llama-2-7b-chat-hf). Accept the model's license on huggingface.co, authenticate with huggingface-cli login, and retry the install.
Next Steps
Now that you have your local models set up:
- Quick Start Guide - Build your first agent with local models
- Create Your First Genies Agent - Step-by-step tutorial
- Marketplace - Discover pre-built agents
- Model Intelligence Guide - Advanced model management
Need Help?
- Documentation - Comprehensive guides
- Support Portal - Report bugs
Ready to Run Local Models?