Three Execution Modes

SuperQode supports three distinct execution modes for connecting to AI models and agents. Each mode has different capabilities and use cases.


Overview

Mode   Description            Capabilities                                    Best For
ACP    Agent Client Protocol  File editing, shell, MCP                        Advanced automation
BYOK   Bring Your Own Key     Chat, streaming, analysis                       Cloud providers
Local  Local/Self-hosted      Chat + streaming (+ tool calling if supported)  Privacy, cost control

┌──────────────────────────────────────────────────────────────┐
│                    EXECUTION MODES                           │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           │
│  │     ACP     │  │    BYOK     │  │    LOCAL    │           │
│  ├─────────────┤  ├─────────────┤  ├─────────────┤           │
│  │ Agent       │  │ Your API    │  │ Self-hosted │           │
│  │ Protocol    │  │ Keys        │  │ Models      │           │
│  ├─────────────┤  ├─────────────┤  ├─────────────┤           │
│  │ OpenCode    │  │ LiteLLM     │  │ Ollama      │           │
│  │ Claude Code │  │ Gateway     │  │ vLLM        │           │
│  │ Aider       │  │             │  │ LM Studio   │           │
│  └─────────────┘  └─────────────┘  └─────────────┘           │
│                                                              │
└──────────────────────────────────────────────────────────────┘

ACP Mode (Agent Client Protocol)

What is ACP?

ACP mode connects to full-featured coding agents that can edit files, run shell commands, and use MCP tools. The agent manages its own LLM interactions.

How It Works

  1. SuperQode connects to an ACP-compatible agent
  2. The agent process is spawned and handles all LLM communication itself (see the protocol sketch below)
  3. The agent brings full coding capabilities: files, shell, and tools
  4. SuperQode orchestrates the session and displays the results
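
Under the hood, ACP agents speak JSON-RPC over the agent process's stdin/stdout. The sketch below is illustrative only: SuperQode performs this handshake for you, and the exact agent invocation is hypothetical, so check your agent's docs for its ACP entry point.

# Illustrative ACP handshake over stdio (SuperQode does this internally)
printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":1}}' \
  | opencode acp   # hypothetical invocation; consult the agent's documentation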

Capabilities

Capability          Supported
Chat completion     ✓
Streaming           ✓
Tool calling        ✓
File editing        ✓
Shell execution     ✓
MCP tools           ✓
Extended thinking   ✓
Multi-file changes  ✓

Supported Agents

Agent        Status       Capabilities
OpenCode     Supported    File editing, shell, MCP, 75+ providers
Claude Code  Coming Soon  Native Claude integration
Aider        Coming Soon  Git-integrated pair programming
Cursor       Planned      IDE integration

Usage

# Install OpenCode first
npm i -g opencode-ai

# Connect via TUI
:connect acp opencode

# Connect via CLI
superqode connect acp opencode

Configuration

# superqode.yaml
default:
  mode: acp
  coding_agent: opencode

agents:
  opencode:
    description: "OpenCode coding agent"
    protocol: acp
    command: opencode
    capabilities:
      - file_editing
      - shell_execution
      - mcp_tools

Agent Capabilities

ACP agents can:

  • Edit Files: Create, modify, and delete files
  • Run Commands: Execute shell commands with streaming output
  • Use MCP Tools: Access Model Context Protocol tools
  • Multi-file Operations: Make coordinated changes across files
  • Extended Thinking: Show reasoning process

BYOK Mode (Bring Your Own Key)

What is BYOK?

BYOK mode lets you use cloud AI providers with your own API keys. SuperQode never stores your keys; they are read from environment variables at runtime.

How It Works

  1. You set API keys as environment variables
  2. SuperQode connects via LiteLLM gateway
  3. Direct API calls to the provider
  4. Responses streamed back to you
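
Because keys are read only from environment variables, it helps to fail fast when one is missing. A minimal shell guard, using the same provider and model as the usage example below:

# Abort with a clear error if the key is not exported
: "${GOOGLE_API_KEY:?GOOGLE_API_KEY is not set}"
superqode connect byok google gemini-3-pro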

Capabilities

Capability         Supported
Chat completion    ✓
Streaming          ✓
Tool calling       ✓ (if model supports)
File editing       ✗ (no agent)
Shell execution    ✗ (no agent)
MCP tools          ✗ (no agent)
Extended thinking  ✓ (Claude)
Cost tracking      ✓

Supported Providers

Provider    Models                                     Free Tier
Google AI   Gemini 3 Pro, Gemini 3, Gemini 2.5 Flash   Yes
Anthropic   Claude Opus 4.5, Sonnet 4.5, Haiku 4.5     No
OpenAI      GPT-5.2, GPT-4o, o1                        No
xAI         Grok 3, Grok 2                             No
Mistral AI  Mistral Large, Codestral                   No
Groq        Llama 3.3, Mixtral                         Yes

Provider  Models
Zhipu     GLM-4, GLM-4V
Alibaba   Qwen-Max, Qwen-Plus
Deepseek  Deepseek-V3, Deepseek-R1

Provider      Models
OpenRouter    95+ models
Together AI   200+ open models
Fireworks AI  Optimized inference
Replicate     Community models

Usage

# Set API key
export GOOGLE_API_KEY=your-api-key-here

# Connect via TUI
:connect byok google gemini-3-pro

# Connect via CLI
superqode connect byok google gemini-3-pro

Configuration

# superqode.yaml
default:
  mode: byok
  provider: google
  model: gemini-3-pro

providers:
  google:
    api_key_env: GOOGLE_API_KEY
    recommended_models:
      - gemini-3-pro
      - gemini-3
      - gemini-2.5-flash
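
Additional providers follow the same pattern. A minimal sketch for Anthropic (ANTHROPIC_API_KEY is the conventional variable name; the model id is illustrative, so check your provider's exact naming):

# superqode.yaml (illustrative additional provider)
providers:
  anthropic:
    api_key_env: ANTHROPIC_API_KEY
    recommended_models:
      - claude-sonnet-4.5   # illustrative id; use the provider's exact model name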

Local Mode

What is Local Mode?

Local mode connects to self-hosted LLM servers running on your own infrastructure. No API keys are required; models run entirely locally.

How It Works

  1. You run a local model server (Ollama, vLLM, etc.)
  2. SuperQode connects to the local endpoint
  3. All inference happens on your hardware
  4. Complete privacy: no data leaves your machine
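
You can smoke-test the endpoint before connecting. The example below assumes Ollama on its default port; Ollama exposes an OpenAI-compatible API under /v1:

# Minimal request to the local OpenAI-compatible endpoint
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3:8b","messages":[{"role":"user","content":"ping"}]}'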

Capabilities

Capability       Supported
Chat completion  ✓
Streaming        ✓
Tool calling     ✓ (if model supports)
File editing     ✗ (no agent)
Shell execution  ✗ (no agent)
MCP tools        ✗ (no agent)
Cost tracking    ✗ (free)

Supported Providers

Provider   Default Port  Description
Ollama     11434         Easy local deployment
LM Studio  1234          GUI-based local models
vLLM       8000          High-performance inference
SGLang     30000         Structured generation
MLX-LM     8000          Apple Silicon optimized
TGI        80            Text Generation Inference
llama.cpp  8080          C++ inference
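
Each of these servers exposes an OpenAI-compatible endpoint on the port listed above. As one example, a minimal vLLM launch might look like this (the model id is illustrative; recent vLLM releases ship the vllm serve entry point):

# Serve a model with vLLM's OpenAI-compatible server on port 8000
vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000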

Usage

# Start Ollama
ollama serve

# Pull a model
ollama pull qwen3:8b

# Connect via TUI
:connect local ollama qwen3:8b

# Connect via CLI
superqode connect local ollama qwen3:8b

Configuration

# superqode.yaml
default:
  mode: local
  provider: ollama
  model: qwen3:8b

providers:
  ollama:
    base_url: http://localhost:11434
    type: openai-compatible
    recommended_models:
      - qwen3:8b
      - llama3.2:latest
      - codellama:13b

  vllm:
    base_url: http://localhost:8000
    type: openai-compatible

For QE tasks, these models work well:

Model                Size  Good For
qwen3:8b             8B    General QE, fast
llama3.2:8b          8B    General purpose
codellama:13b        13B   Code analysis
deepseek-coder:6.7b  6.7B  Code generation
mistral:7b           7B    Fast inference
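
To keep several of these on hand, you can pull them up front (tags taken from the table above; downloads can be several GB each):

# Pre-pull recommended models for offline use
for m in qwen3:8b codellama:13b deepseek-coder:6.7b; do
  ollama pull "$m"
done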

Mode Comparison

Feature Matrix

Feature            BYOK  ACP  Local
Chat completion    ✓     ✓    ✓
Streaming          ✓     ✓    ✓
Tool calling       ✓*    ✓    ✓*
File editing       ✗     ✓    ✗
Shell execution    ✗     ✓    ✗
MCP tools          ✗     ✓    ✗
Extended thinking  ✓*    ✓    ✓*
Cost tracking      ✓     ✗    ✗
Privacy            ✗     ✗    ✓
No API key needed  ✗     ✓**  ✓

*  Model-dependent
** Agent handles its own auth

When to Use Each Mode

Use ACP when:

  • You need full coding agent capabilities
  • File editing and shell execution are required
  • You want to use MCP tools
  • You need multi-file coordinated changes
  • You want the agent to handle its own LLM

Use BYOK when:

  • You need cloud model capabilities
  • You want to use specific providers (Google Gemini, Anthropic, OpenAI)
  • You need extended thinking (Claude/Gemini)
  • Cost tracking is important
  • You don't need file editing in QE

Use Local when:

  • Privacy is paramount
  • You want to avoid API costs
  • You have sufficient local compute
  • Internet connectivity is limited
  • You're running in an air-gapped environment

Mixing Modes

You can configure different roles to use different modes:

team:
  modes:
    qe:
      roles:
        security_tester:
          mode: acp  # Agent for comprehensive security testing
          coding_agent: opencode

        api_tester:
          mode: byok  # Cloud model for API analysis
          provider: google
          model: gemini-3-pro

        unit_tester:
          mode: local  # Local model for cost-effective testing
          provider: ollama
          model: qwen3:8b

OpenResponses Gateway

For advanced local model usage, SuperQode supports the OpenResponses specification:

superqode:
  gateway:
    type: openresponses
    openresponses:
      base_url: http://localhost:11434
      reasoning_effort: medium
      truncation: auto
      enable_apply_patch: true
      enable_code_interpreter: true

OpenResponses provides:

  • Unified API across providers
  • 45+ streaming event types
  • Built-in tools (apply_patch, code_interpreter)
  • Reasoning/thinking content support
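
As a rough sketch, a Responses-style request against such a gateway could look like the following. The /v1/responses route and field names are assumed from the Responses-style API that OpenResponses mirrors; verify them against your gateway before relying on this:

# Illustrative Responses-style request (route and fields assumed)
curl -s http://localhost:11434/v1/responses \
  -H "Content-Type: application/json" \
  -d '{"model":"qwen3:8b","input":"Review this diff for missing tests.","reasoning":{"effort":"medium"}}'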

Next Steps