🤖 LLM Setup Guide
Welcome to SuperOptiX's LLM Setup Guide! This guide will help you configure and use local language models for your AI agents. We focus on local models for privacy, speed, and cost-effectiveness.
🚀 Quick Start
New to local models? Start with Ollama - it's the easiest option for beginners!
🎯 Overview
SuperOptiX supports multiple local model backends, each optimized for different use cases:
Backend | Best For | Platform | Ease of Use | Performance |
---|---|---|---|---|
🦙 Ollama | Beginners, All platforms | Cross-platform | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
🤖 GPT-OSS | Advanced reasoning, Agentic tasks | Cross-platform | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
🍎 MLX | Apple Silicon users | macOS only | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
🎮 LM Studio | Windows users | Windows/macOS | ⭐⭐⭐ | ⭐⭐⭐⭐ |
🤗 HuggingFace | Advanced users | All platforms | ⭐⭐ | ⭐⭐⭐⭐⭐ |
Production Inference Engines
vLLM, SGLang, and TGI are not included in the current version of SuperOptiX. These production-grade inference engines are part of our enterprise offering.
🦙 Ollama (Recommended)
Ollama is the easiest way to run local models on any platform. Perfect for beginners!
🚀 Quick Setup
📦 Install Models with SuperOptiX
# Install recommended models by tier
super model install llama3.2:1b   # Oracles tier - Small tasks, fast responses
super model install llama3.1:8b   # Genies tier - Complex reasoning, tools, memory
super model install llama3.2:3b   # Alternative small model
super model install qwen2.5:7b    # Great all-rounder
Show Output
🚀 SuperOptiX Model Intelligence - Installing llama3.2:3b
📦 Pulling model llama3.2:3b from Ollama...
⏳ This may take a few minutes depending on your internet connection and model size.
pulling manifest
pulling dde5aa3fc5ff: 100% ████████████████████████████████████████ 2.0 GB
pulling 966de95ca8a6: 100% ████████████████████████████████████████ 1.4 KB
pulling fcc5a6bec9da: 100% ████████████████████████████████████████ 7.7 KB
pulling a70ff7e570d9: 100% ████████████████████████████████████████ 6.0 KB
pulling 56bb8bd477a5: 100% ████████████████████████████████████████ 96 B
pulling 34bb5ab01051: 100% ████████████████████████████████████████ 561 B
verifying sha256 digest
writing manifest
success
✅ Model pulled successfully!
💡 You can now use it with SuperOptiX:
   super model dspy ollama/llama3.2:3b
📊 Model details:
   • Size: small
   • Task: chat
   • Parameters: 3B
🎉 Installation completed successfully!
🦙 Ollama running on http://localhost:11434 ready to use with SuperOptiX!
🖥️ Server Management
💡 Important: Ollama starts its server automatically the first time you run or pull a model. You only need to run ollama serve manually if you want a custom configuration.
# Start Ollama server (runs on port 11434 by default)
ollama serve
# Or simply use a model - server starts automatically
ollama run llama3.2:1b
🔧 Custom Configuration: Only start the server manually if you need:
- Different port: OLLAMA_HOST=0.0.0.0:8080 ollama serve
- Custom model path: OLLAMA_MODELS=/custom/path ollama serve
- GPU configuration: OLLAMA_GPU_LAYERS=35 ollama serve
✅ Automatic Detection: SuperOptiX automatically detects and connects to Ollama running on the default port (11434). No additional configuration needed!
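If you want to double-check that connection, you can query Ollama's local REST API directly (a quick sketch; on a default install, the /api/tags route lists the models you have pulled):

```bash
# Confirm Ollama is listening on the default port and see installed models
curl http://localhost:11434/api/tags
```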
📋 Manage Ollama Models
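To produce a listing like the one below, filter the SuperOptiX model list by backend (the same command appears under Model Management Commands later in this guide):

```bash
super model list --backend ollama
```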
Example Output:
🚀 SuperOptiX Model Intelligence - 3 models
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ Model                   ┃ Backend   ┃ Status       ┃ Size    ┃ Task      ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━┩
│ llama3.1:8b             │ 🦙 ollama │ ✅ installed │ medium  │ chat      │
│ llama3.2:1b             │ 🦙 ollama │ ✅ installed │ tiny    │ chat      │
│ nomic-embed-text:latest │ 🦙 ollama │ ✅ installed │ Unknown │ embedding │
└─────────────────────────┴───────────┴──────────────┴─────────┴───────────┘
# Get model information
super model info llama3.2:3b
# List all available models
super model list --all
🤖 GPT-OSS Models (OpenAI's Open Source)
GPT-OSS models are OpenAI's latest open-weight language models designed for powerful reasoning, agentic tasks, and versatile developer use cases. SuperOptiX now supports both GPT-OSS-20B and GPT-OSS-120B models with native Apple Silicon support!
🍎 Apple Silicon Support
MLX-LM v0.26.3 now provides native Apple Silicon support for GPT-OSS models, resolving the mixed precision issues that previously prevented these models from running on Apple Silicon.
Backend | Model | Status | Performance | Apple Silicon | Recommendation |
---|---|---|---|---|---|
🦙 Ollama | gpt-oss:20b | ✅ Works | 19.7 t/s | ✅ Optimized format | ⭐ RECOMMENDED |
🍎 MLX-LM | openai_gpt-oss-20b | ✅ Works | 5.2 t/s | ✅ Native support | Apple Silicon only |
🤗 HuggingFace | openai/gpt-oss-20b | ❌ Broken | N/A | ❌ Mixed precision errors | ❌ Avoid on Apple Silicon |
🎯 GPT-OSS Model Overview
Model | Parameters | Active Parameters | Best For | Hardware Requirements |
---|---|---|---|---|
GPT-OSS-20B | 21B | 3.6B | Lower latency, local/specialized use cases | 16GB+ RAM |
GPT-OSS-120B | 117B | 5.1B | Production, general purpose, high reasoning | Single H100 GPU |
🏆 Recommended: Use Ollama for GPT-OSS Models
For the best performance and reliability with GPT-OSS models, we recommend using Ollama:
- ✅ Best Performance: 19.7 t/s vs 5.2 t/s (MLX) vs N/A (HuggingFace)
- ✅ Cross-Platform: Works on all platforms (Windows, macOS, Linux)
- ✅ Easy Setup: Simple installation and model management
- ✅ Optimized Format: GGUF format optimized for local inference
- ✅ No Server Required: Direct model execution
Install and use GPT-OSS with Ollama:
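These are the same commands covered in detail in the installation section below:

```bash
# Install the 20B model and run a quick prompt through the Ollama backend
super model install gpt-oss:20b
super model run gpt-oss:20b "Explain quantum computing" --backend ollama
```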
🔑 Key Features
- 📜 Apache 2.0 License: Build freely without copyleft restrictions
- ⚡ Native MXFP4 Quantization: Optimized for efficient inference
- 🍎 Apple Silicon Native: No more mixed precision issues
📦 Install GPT-OSS Models
Via Ollama (Cross-Platform - RECOMMENDED)
# Install GPT-OSS models via Ollama (Best Performance)
super model install gpt-oss:20b
super model install gpt-oss:120b
# Or use direct Ollama commands
ollama pull gpt-oss:20b
ollama pull gpt-oss:120b
# Run with Ollama backend
super model run gpt-oss:20b "Your prompt" --backend ollama
Via MLX-LM (Apple Silicon - Native Support)
# Install the MLX build of GPT-OSS (Apple Silicon)
super model install lmstudio-community/gpt-oss-20b-MLX-8bit --backend mlx
# Start the MLX server
super model server mlx lmstudio-community/gpt-oss-20b-MLX-8bit --port 8000
# Run with MLX backend
super model run lmstudio-community/gpt-oss-20b-MLX-8bit "Your prompt" --backend mlx
Show Ollama Installation Output
🚀 SuperOptiX Model Intelligence - Installing gpt-oss:20b
📦 Pulling model gpt-oss:20b from Ollama...
⏳ This may take a few minutes depending on your internet connection and model size.
pulling manifest
pulling 8f7b3c2a1d4e: 100% ████████████████████████████████████████ 12.5 GB
pulling 9a2b4c6d8e0f: 100% ████████████████████████████████████████ 1.2 KB
verifying sha256 digest
writing manifest
success
✅ Model pulled successfully!
💡 You can now use it with SuperOptiX:
   super model dspy ollama/gpt-oss:20b
📊 Model details:
   • Size: large
   • Task: chat
   • Parameters: 21B (3.6B active)
🎉 Installation completed successfully!
🦙 Ollama running on http://localhost:11434 ready to use with SuperOptiX!
Via HuggingFace
# Install GPT-OSS models via HuggingFace
super model install openai/gpt-oss-20b --backend huggingface
super model install openai/gpt-oss-120b --backend huggingface
# Start HuggingFace server
super model server huggingface openai/gpt-oss-20b --port 8001
super model server huggingface openai/gpt-oss-120b --port 8002
Show HuggingFace Installation Output
🚀 SuperOptiX Model Intelligence - Installing openai/gpt-oss-20b
🤗 Downloading model from HuggingFace...
⏳ This may take several minutes depending on your internet connection and model size.
Downloading model files...
   • config.json: 100% ████████████████████████████████████████ 2.1 KB
   • model.safetensors: 100% ████████████████████████████████████████ 12.5 GB
   • tokenizer.json: 100% ████████████████████████████████████████ 1.8 MB
   • tokenizer_config.json: 100% ████████████████████████████████████████ 1.2 KB
✅ Model downloaded successfully!
💡 You can now use it with SuperOptiX:
   super model server huggingface openai/gpt-oss-20b --port 8001
📊 Model details:
   • Size: large
   • Task: chat
   • Parameters: 21B (3.6B active)
   • License: Apache 2.0
🎉 Installation completed successfully!
🎯 Using GPT-OSS with SuperOptiX
1. Configure Playbook for GPT-OSS
# Example playbook configuration for GPT-OSS
language_model:
  provider: mlx  # or ollama or huggingface
  model: lmstudio-community/gpt-oss-20b-MLX-8bit  # for MLX-LM
  # model: gpt-oss:20b  # for Ollama
  # model: openai/gpt-oss-20b  # for HuggingFace
  api_base: http://localhost:11434  # for Ollama
  # api_base: http://localhost:8001  # for HuggingFace
  temperature: 0.7
  max_tokens: 2048
GPT-OSS Language Model Configuration Examples

**🦙 Ollama Backend (Cross-platform - RECOMMENDED):**

```yaml
language_model:
  provider: ollama
  model: gpt-oss:20b
  api_base: http://localhost:11434
  temperature: 0.7
  max_tokens: 4096
```

**🍎 MLX Backend (Apple Silicon - Native Support):**

```yaml
language_model:
  provider: mlx
  model: lmstudio-community/gpt-oss-20b-MLX-8bit
  api_base: http://localhost:8000
  temperature: 0.7
  max_tokens: 4096
```

**🤗 HuggingFace Backend (Limited on Apple Silicon):**

```yaml
language_model:
  provider: huggingface
  model: openai/gpt-oss-20b
  api_base: http://localhost:8001
  temperature: 0.7
  max_tokens: 4096
```
🚀 Starting MLX Server for GPT-OSS
Before using GPT-OSS with MLX in your playbook, start the MLX server:
# Start MLX server for GPT-OSS model
super model server mlx lmstudio-community/gpt-oss-20b-MLX-8bit --port 8000
# Or start on a different port
super model server mlx lmstudio-community/gpt-oss-20b-MLX-8bit --port 9000
Server Output:
🍎 MLX Local Server
Starting MLX server for lmstudio-community/gpt-oss-20b-MLX-8bit on port 8000...
🚀 Starting MLX server...
python -m mlx_lm.server --model lmstudio-community/gpt-oss-20b-MLX-8bit --port 8000
✅ MLX server is running on http://localhost:8000
Note: Keep the server running while using GPT-OSS models in your playbooks.
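As a quick smoke test, you can send one request with curl (a sketch that assumes the MLX server exposes the usual OpenAI-compatible chat completions route, as mlx_lm.server does):

```bash
# Send a single chat request to the local MLX server
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "lmstudio-community/gpt-oss-20b-MLX-8bit",
        "messages": [{"role": "user", "content": "Say hello"}],
        "max_tokens": 50
      }'
```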
2. Test GPT-OSS Models
# Test with MLX-LM backend (Apple Silicon - Native)
super model run lmstudio-community/gpt-oss-20b-MLX-8bit "Explain quantum computing with detailed reasoning" --backend mlx
# Test with Ollama backend (Cross-platform - Best Performance)
super model run gpt-oss:20b "Explain quantum computing with detailed reasoning" --backend ollama
# Test with HuggingFace backend (Limited on Apple Silicon)
super model run openai/gpt-oss-20b "Write a Python function to solve the traveling salesman problem" --backend huggingface
3. Basic Usage Examples
# MLX-LM (Apple Silicon - Native support)
super model run lmstudio-community/gpt-oss-20b-MLX-8bit "What is 2+2?" --backend mlx
super model run lmstudio-community/gpt-oss-20b-MLX-8bit "Explain machine learning" --backend mlx
super model run lmstudio-community/gpt-oss-20b-MLX-8bit "Design a distributed system architecture" --backend mlx
# Ollama (Cross-platform - Best performance)
super model run gpt-oss:20b "What is 2+2?" --backend ollama
super model run gpt-oss:20b "Explain machine learning" --backend ollama
super model run gpt-oss:20b "Design a distributed system architecture" --backend ollama
📋 Manage GPT-OSS Models
# List installed GPT-OSS models
super model list | grep gpt-oss
# Get detailed information
super model info gpt-oss:20b
super model info openai/gpt-oss-120b
# Test model performance
super model test gpt-oss:20b "Hello, how are you?"
🎯 Performance Recommendations
Use Case | Recommended Model | Hardware |
---|---|---|
Quick responses | GPT-OSS-20B | 16GB+ RAM |
Complex tasks | GPT-OSS-120B | H100 GPU |
Local development | GPT-OSS-20B | 16GB+ RAM |
🔧 Troubleshooting GPT-OSS
Error: error: 'mps.matmul' op detected operation with both F16 and BF16 operands which is not supported
Solution:
```bash
# Use MLX-LM backend (native Apple Silicon support)
super model run lmstudio-community/gpt-oss-20b-MLX-8bit "prompt" --backend mlx

# Or use Ollama backend (optimized format)
super model run gpt-oss:20b "prompt" --backend ollama
```
Error: Model not found
or Model does not exist
Solution:
```bash
# For MLX-LM (Apple Silicon)
super model install lmstudio-community/gpt-oss-20b-MLX-8bit --backend mlx

# For Ollama
ollama pull gpt-oss:20b
ollama pull gpt-oss:120b

# For HuggingFace
super model install openai/gpt-oss-20b --backend huggingface
super model install openai/gpt-oss-120b --backend huggingface
```
Error: CUDA out of memory
or Not enough memory
Solution:
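A reasonable first step, based on the hardware requirements table above, is to use the smaller GPT-OSS-20B model (16GB+ RAM) rather than GPT-OSS-120B, or to rely on the quantized Ollama build:

```bash
# GPT-OSS-120B expects datacenter-class GPUs; the 20B model fits in 16GB+ RAM
super model run gpt-oss:20b "prompt" --backend ollama
```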
📚 Resources
- GPT-OSS-120B Model - HuggingFace repository
- GPT-OSS-20B Model - HuggingFace repository
- Ollama Library - Ollama model library
- SuperOptiX Documentation - Complete framework documentation
- DSPy Framework - Foundation framework
🍎 MLX (Apple Silicon)
MLX is Apple's native machine learning framework, offering blazing-fast inference on Apple Silicon Macs. MLX-LM v0.26.3 now provides native support for GPT-OSS models!
Apple Silicon Only
MLX only works on Apple Silicon Macs (M1, M2, M3). If you're on an Intel Mac, use Ollama instead.
🚀 Setup MLX
# Install MLX dependencies
pip install mlx-lm==0.26.3
# Or install with SuperOptiX
pip install "superoptix[mlx]"
📦 Install MLX Models
# Install GPT-OSS models (native Apple Silicon support)
super model install openai/gpt-oss-20b --backend mlx
super model install openai/gpt-oss-120b --backend mlx
# Install popular MLX models
super model install -b mlx mlx-community/phi-2
super model install -b mlx mlx-community/Llama-3.2-3B-Instruct-4bit
super model install -b mlx mlx-community/Mistral-7B-Instruct-v0.2-4bit
super model install -b mlx lmstudio-community/gpt-oss-20b-MLX-8bit
🖥️ Start MLX Servers
# Start MLX server on specific port
super model server mlx phi-2 --port 8000
super model server mlx mlx-community/Llama-3.2-3B-Instruct-4bit --port 8000
Example Output:
🍎 MLX Local Server
Starting MLX server for mlx-community_Llama-3.2-3B-Instruct-4bit on port 8000...
🚀 Starting MLX server...
💡 Server will be available at: http://localhost:8000
💡 Use this URL in your playbook's api_base configuration
🔧 Manual server startup command:
python -m mlx_lm.server --model mlx-community_Llama-3.2-3B-Instruct-4bit --port 8000
📝 Example playbook configuration:
language_model:
  provider: mlx
  model: mlx-community_Llama-3.2-3B-Instruct-4bit
  api_base: http://localhost:8000
🚀 Executing: /path/to/python -m mlx_lm.server --model mlx-community_Llama-3.2-3B-Instruct-4bit --port 8000
⏳ Server is starting... (Press Ctrl+C to stop)
📋 Manage MLX Models
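To produce the listing below, filter the model list by the MLX backend:

```bash
super model list --backend mlx
```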
Example Output:
🚀 SuperOptiX Model Intelligence - 1 models
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━┓
┃ Model                                    ┃ Backend ┃ Status       ┃ Size  ┃ Task ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━┩
│ mlx-community_Llama-3.2-3B-Instruct-4bit │ 🍎 mlx  │ ✅ installed │ small │ chat │
└──────────────────────────────────────────┴─────────┴──────────────┴───────┴──────┘
# Get model information
super model info mlx-community/phi-2
super model info mlx-community_Llama-3.2-3B-Instruct-4bit
# Models are ready to use with SuperOptiX agents
🎮 LM Studio
LM Studio provides a user-friendly interface for running local models, especially popular on Windows.
🚀 Setup LM Studio
- Download LM Studio from https://lmstudio.ai
- Install and launch LM Studio
- Download a model through the interface
- Start the server (default port: 1234)
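Once the server is running, you can verify it from a terminal (a quick check that assumes LM Studio's default OpenAI-compatible endpoint on port 1234):

```bash
# List the models LM Studio is currently serving
curl http://localhost:1234/v1/models
```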
📦 Install Models with SuperOptiX
# Install models (use the name from LM Studio)
super model install -b lmstudio llama-3.2-1b-instruct
super model install -b lmstudio llama-3.2-3b
super model install -b lmstudio your-model-name
🖥️ Start LM Studio Servers
# Start server with specific model
super model server lmstudio llama-3.2-1b-instruct --port 1234
super model server lmstudio llama-3.2-3b --port 1234
Example Output:
🎮 LM Studio Local Server
Starting LM Studio server for llama-3.2-1b-instruct on port 1234...
🚀 Starting LM Studio server...
💡 Server will be available at: http://localhost:1234
💡 Use this URL in your playbook's api_base configuration
🔧 Manual server startup command:
# Start server in LM Studio app first, then connect
📝 Example playbook configuration:
language_model:
  provider: lmstudio
  model: llama-3.2-1b-instruct
  api_base: http://localhost:1234
⏳ Server is starting... (Press Ctrl+C to stop)
📋 Manage LM Studio Models
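To produce the listing below, filter the model list by the LM Studio backend:

```bash
super model list --backend lmstudio
```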
Example Output:
🚀 SuperOptiX Model Intelligence - 3 models
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━┓
┃ Model                          ┃ Backend     ┃ Status       ┃ Size   ┃ Task ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━┩
│ llama-3.2-1b-instruct          │ 🎮 lmstudio │ ✅ installed │ small  │ chat │
│ llama-3.3-70b-instruct         │ 🎮 lmstudio │ ✅ installed │ large  │ chat │
│ llama-4-scout-17b-16e-instruct │ 🎮 lmstudio │ ✅ installed │ medium │ chat │
└────────────────────────────────┴─────────────┴──────────────┴────────┴──────┘
# Get model information
super model info llama-3.2-1b-instruct
# Models are ready to use with SuperOptiX agents
🤗 HuggingFace
HuggingFace offers access to thousands of models, perfect for advanced users who want maximum flexibility.
🚀 Setup HuggingFace
# Install HuggingFace dependencies
pip install transformers torch fastapi uvicorn
# Or install with SuperOptiX
pip install "superoptix[huggingface]"
📦 Install HuggingFace Models
# Install popular models
super model install -b huggingface microsoft/Phi-4
super model install -b huggingface microsoft/DialoGPT-small
super model install -b huggingface microsoft/DialoGPT-medium
super model install -b huggingface meta-llama/Llama-2-7b-chat-hf
🖥️ Start HuggingFace Servers
# Start server with specific model
super model server huggingface microsoft/Phi-4 --port 8001
super model server huggingface microsoft/DialoGPT-small --port 8001
super model server huggingface microsoft/DialoGPT-medium --port 8001
Example Output:
🤗 HuggingFace Local Server
Starting HuggingFace server for microsoft/DialoGPT-small on port 8002...
🚀 Starting HuggingFace server...
💡 Server will be available at: http://localhost:8002
💡 Use this URL in your playbook's api_base configuration
🔧 Manual server startup command:
python -m superoptix.models.backends.huggingface_server microsoft/DialoGPT-small --port 8002
📝 Example playbook configuration:
language_model:
  provider: huggingface
  model: microsoft/DialoGPT-small
  api_base: http://localhost:8002
Device set to use cpu
INFO: Started server process [4652]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8002 (Press CTRL+C to quit)
📋 Manage HuggingFace Models
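To produce the listing below, filter the model list by the HuggingFace backend:

```bash
super model list --backend huggingface
```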
Example Output:
🚀 SuperOptiX Model Intelligence - 2 models
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━┓
┃ Model                    ┃ Backend        ┃ Status       ┃ Size  ┃ Task ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━┩
│ microsoft/DialoGPT-small │ 🤗 huggingface │ ✅ installed │ small │ chat │
│ microsoft/Phi-4          │ 🤗 huggingface │ ✅ installed │ small │ chat │
└──────────────────────────┴────────────────┴──────────────┴───────┴──────┘
# Get model information
super model info microsoft/Phi-4
super model info microsoft/DialoGPT-small
# Models are ready to use with SuperOptiX agents
🎯 Model Management Commands
🖥️ Server Commands
Example Output:
usage: super model server [-h] [--port PORT] {mlx,huggingface,lmstudio} model_name
🚀 Start local model servers for MLX, HuggingFace, or LM Studio. Examples:
super model server mlx mlx-community/Llama-3.2-3B-Instruct-4bit
super model server huggingface microsoft/DialoGPT-small --port 8001
super model server lmstudio llama-3.2-1b-instruct
Backends:
mlx Apple Silicon optimized (default: port 8000)
huggingface Transformers models (default: port 8001)
lmstudio Desktop app models (default: port 1234)
Note: Ollama servers use 'ollama serve' command separately.
positional arguments:
{mlx,huggingface,lmstudio} Backend type
model_name Model name to start server for
options:
-h, --help show this help message and exit
--port PORT, -p PORT Port to run server on
📋 List and Explore Models
Example Output:
🚀 SuperOptiX Model Intelligence - 9 models
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ Model                                    ┃ Backend        ┃ Status       ┃ Size    ┃ Task      ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━┩
│ llama-3.2-1b-instruct                    │ 🎮 lmstudio    │ ✅ installed │ small   │ chat      │
│ llama-3.3-70b-instruct                   │ 🎮 lmstudio    │ ✅ installed │ large   │ chat      │
│ llama-4-scout-17b-16e-instruct           │ 🎮 lmstudio    │ ✅ installed │ medium  │ chat      │
│ llama3.1:8b                              │ 🦙 ollama      │ ✅ installed │ medium  │ chat      │
│ llama3.2:1b                              │ 🦙 ollama      │ ✅ installed │ tiny    │ chat      │
│ microsoft/DialoGPT-small                 │ 🤗 huggingface │ ✅ installed │ small   │ chat      │
│ microsoft/Phi-4                          │ 🤗 huggingface │ ✅ installed │ small   │ chat      │
│ mlx-community_Llama-3.2-3B-Instruct-4bit │ 🍎 mlx         │ ✅ installed │ small   │ chat      │
│ nomic-embed-text:latest                  │ 🦙 ollama      │ ✅ installed │ Unknown │ embedding │
└──────────────────────────────────────────┴────────────────┴──────────────┴─────────┴───────────┘
🔍 Discover more models: super model discover
📥 Install a model: super model install <model_name>
# List all available models (including uninstalled)
super model list --all
# Filter by backend
super model list --backend ollama
super model list --backend mlx
super model list --backend lmstudio
super model list --backend huggingface
# Verbose information
super model list --verbose
📊 Get Model Information
# Get detailed model info
super model info llama3.2:3b
super model info mlx-community/phi-2
super model info microsoft/Phi-4
super model info llama-3.2-1b-instruct
🎯 Choose Your Setup
🚀 Beginner (Recommended)
# 1. Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# 2. Install SuperOptiX
pip install superoptix
# 3. Install a model
super model install llama3.2:3b
# 4. Models are ready to use with SuperOptiX agents
🍎 Apple Silicon User
# 1. Install MLX dependencies
pip install mlx-lm
# 2. Install SuperOptiX
pip install superoptix
# 3. Install MLX model
super model install -b mlx mlx-community/phi-2
# 4. Start server
super model server mlx phi-2 --port 8000
# 5. Models are ready to use with SuperOptiX agents
🎮 Windows User
# 1. Install LM Studio from https://lmstudio.ai
# 2. Download a model in LM Studio
# 3. Start server in LM Studio
# 4. Install SuperOptiX
pip install superoptix
# 5. Connect to LM Studio
super model server lmstudio your-model-name --port 1234
# 6. Models are ready to use with SuperOptiX agents
🤗 Advanced User
# 1. Install HuggingFace dependencies
pip install transformers torch fastapi uvicorn
# 2. Install SuperOptiX
pip install superoptix
# 3. Install HuggingFace model
super model install -b huggingface microsoft/Phi-4
# 4. Start server
super model server huggingface microsoft/Phi-4 --port 8001
# 5. Models are ready to use with SuperOptiX agents
🔧 Advanced Configuration
🚀 Multiple Servers
Run multiple models simultaneously:
# Terminal 1: Ollama model (Ollama serves on port 11434 automatically - no extra command needed)
# Models are ready to use with SuperOptiX agents
# Terminal 2: MLX model (Apple Silicon)
super model server mlx phi-2 --port 8000
# Models are ready to use with SuperOptiX agents
# Terminal 3: HuggingFace model
super model server huggingface microsoft/Phi-4 --port 8001
# Models are ready to use with SuperOptiX agents
# Terminal 4: LM Studio model
super model server lmstudio llama-3.2-1b-instruct --port 1234
# Models are ready to use with SuperOptiX agents
🚨 Troubleshooting
Common Issues
Error: Model not found
or Model does not exist
Solution:
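A minimal fix, using the install and list commands shown earlier in this guide (substitute your backend and model name):

```bash
# Install the missing model, then confirm it shows up
super model install <model_name>
super model list --all
```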
Error: Connection refused
or Cannot connect to server
Solution:
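This usually means the backend server isn't running yet. Start the server for whichever backend your playbook points at (commands from the sections above):

```bash
# Ollama (default port 11434)
ollama serve

# MLX / HuggingFace / LM Studio
super model server mlx mlx-community/Llama-3.2-3B-Instruct-4bit --port 8000
super model server huggingface microsoft/DialoGPT-small --port 8001
super model server lmstudio llama-3.2-1b-instruct --port 1234
```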
Error: Address already in use
Solution:
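Either start the server on a different port (every server command accepts --port) or free the port first. The lsof call below is one common way to find the conflicting process on macOS/Linux; adjust for your system:

```bash
# See what is holding port 8000, then pick a free port instead
lsof -i :8000
super model server mlx mlx-community/Llama-3.2-3B-Instruct-4bit --port 9000
```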
Error: MLX requires Apple Silicon
Solution:
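As noted in the MLX section, MLX only runs on Apple Silicon; on Intel Macs (or any other hardware), switch to the Ollama backend instead:

```bash
# Use Ollama on non-Apple-Silicon machines
curl -fsSL https://ollama.ai/install.sh | sh
super model install llama3.2:3b
```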
Error: ModuleNotFoundError: No module named 'mlx_lm'
or ModuleNotFoundError: No module named 'transformers'
Solution:
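Install the missing backend dependencies (the same packages listed in the setup sections above):

```bash
# MLX backend
pip install mlx-lm

# HuggingFace backend
pip install transformers torch fastapi uvicorn

# Or pull in everything via the SuperOptiX extras
pip install "superoptix[mlx]" "superoptix[huggingface]"
```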
Error: Command 'ollama' not found
or Command 'lms' not found
Solution:
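Install the missing CLI. The Ollama one-liner is the same as in the beginner setup above; for the lms command, install the LM Studio desktop app, which ships its CLI:

```bash
# Install Ollama (provides the ollama command)
curl -fsSL https://ollama.ai/install.sh | sh

# For the lms command, install LM Studio from https://lmstudio.ai
```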
Error: 401 Unauthorized
or Repository Not Found
Solution:
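This typically happens with gated HuggingFace models (for example, the meta-llama repos). A sketch of the usual fix: accept the model's license on its HuggingFace page, then authenticate locally before retrying the install:

```bash
# Log in with a HuggingFace access token, then retry the install
huggingface-cli login
super model install meta-llama/Llama-2-7b-chat-hf --backend huggingface
```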
🎉 Next Steps
Now that you have your local models set up:
- 🚀 Quick Start Guide - Build your first agent with local models
- 🤖 Create Your First Genies Agent - Step-by-step tutorial
- 🏪 Marketplace - Discover pre-built agents
- 📚 Model Intelligence Guide - Advanced model management
💬 Need Help?
- 📚 Documentation - Comprehensive guides
- 🐛 Support Portal - Report bugs
🤖 Ready to Run Local Models?