Skip to content

Architecture Overview

SuperClaw follows a modular architecture designed for extensibility and integration with the Superagentic AI ecosystem.

High-Level Architecture

flowchart TB
    subgraph CLI["CLI Layer"]
        A[superclaw CLI]
    end

    subgraph Core["Core Engine"]
        B[Attack Engine]
        C[Behavior Engine]
        D[Bloom Integration]
    end

    subgraph Adapters["Agent Adapters"]
        E[OpenClaw Adapter]
        F[ACP Adapter]
        G[Custom Adapters]
    end

    subgraph Integration["Integrations"]
        H[CodeOptiX]
        J[Reporting]
    end

    A --> B
    A --> C
    A --> D
    B --> E
    B --> F
    B --> G
    C --> H
    D --> H
    H --> J
    %% External integrations removed

Module Structure

superclaw/
โ”œโ”€โ”€ attacks/          # Attack implementations
โ”‚   โ”œโ”€โ”€ base.py       # Attack abstract base class
โ”‚   โ”œโ”€โ”€ prompt_injection.py
โ”‚   โ”œโ”€โ”€ encoding.py
โ”‚   โ”œโ”€โ”€ jailbreaks.py
โ”‚   โ”œโ”€โ”€ tool_bypass.py
โ”‚   โ””โ”€โ”€ multi_turn.py
โ”‚
โ”œโ”€โ”€ behaviors/        # Security behavior specs
โ”‚   โ”œโ”€โ”€ base.py       # BehaviorSpec base class
โ”‚   โ”œโ”€โ”€ injection_resistance.py
โ”‚   โ”œโ”€โ”€ tool_policy.py
โ”‚   โ”œโ”€โ”€ sandbox_isolation.py
โ”‚   โ”œโ”€โ”€ session_boundary.py
โ”‚   โ”œโ”€โ”€ config_drift.py
โ”‚   โ””โ”€โ”€ protocol_security.py
โ”‚
โ”œโ”€โ”€ adapters/         # Agent communication
โ”‚   โ”œโ”€โ”€ base.py       # AgentAdapter base class
โ”‚   โ””โ”€โ”€ openclaw.py   # OpenClaw WebSocket adapter
โ”‚
โ”œโ”€โ”€ bloom/            # Scenario generation
โ”‚   โ”œโ”€โ”€ scenarios.py  # Template-based scenarios
โ”‚   โ”œโ”€โ”€ ideation.py   # LLM-powered ideation
โ”‚   โ”œโ”€โ”€ rollout.py    # Scenario execution
โ”‚   โ””โ”€โ”€ judgment.py   # LLM-as-judge evaluation
โ”‚
โ”œโ”€โ”€ codeoptix/        # CodeOptiX integration
โ”‚   โ”œโ”€โ”€ adapter.py    # Behavior adapter bridge
โ”‚   โ”œโ”€โ”€ evaluator.py  # Multi-modal evaluator
โ”‚   โ””โ”€โ”€ engine.py     # Evaluation engine
โ”‚
โ”œโ”€โ”€ reporting/        # Report generation
โ”‚   โ”œโ”€โ”€ html.py       # Styled HTML reports
โ”‚   โ”œโ”€โ”€ json_report.py
โ”‚   โ””โ”€โ”€ sarif.py      # GitHub Code Scanning
โ”‚
โ”œโ”€โ”€ config/           # Configuration
โ”‚   โ”œโ”€โ”€ settings.py   # Runtime settings
โ”‚   โ””โ”€โ”€ schemas.py    # Pydantic models
โ”‚
โ””โ”€โ”€ cli.py            # Typer CLI application

Data Flow

sequenceDiagram
    participant CLI
    participant Attack as Attack Engine
    participant Adapter as Agent Adapter
    participant Agent as Target Agent
    participant Behavior as Behavior Spec
    participant Report as Reporter

    CLI->>Attack: run_attack(target, behaviors)
    Attack->>Attack: generate_payloads()

    loop For each payload
        Attack->>Adapter: send_prompt(payload)
        Adapter->>Agent: WebSocket/ACP message
        Agent-->>Adapter: Response
        Adapter-->>Attack: AgentOutput

        Attack->>Behavior: evaluate(output)
        Behavior-->>Attack: BehaviorResult
    end

    Attack->>Report: generate_report(results)
    Report-->>CLI: HTML/JSON/SARIF

Key Classes

Attack Base

class Attack(ABC):
    attack_type: str
    description: str

    @abstractmethod
    def generate_payloads(self) -> list[str]:
        """Return the payloads to execute for this attack."""
        raise NotImplementedError

    @abstractmethod
    def evaluate_response(self, payload, response) -> AttackResult:
        """Score a response for the given payload."""
        raise NotImplementedError

BehaviorSpec Base

class BehaviorSpec(ABC):
    default_severity: Severity

    @abstractmethod
    def get_name(self) -> str:
        """Unique behavior identifier used in registries and reports."""
        raise NotImplementedError

    @abstractmethod
    def get_description(self) -> str:
        """Human-readable description of the security behavior."""
        raise NotImplementedError

    @abstractmethod
    def get_contract(self) -> BehaviorContract:
        """Structured security behavior contract."""
        raise NotImplementedError

    @abstractmethod
    def evaluate(self, agent_output, context) -> BehaviorResult:
        """Return a BehaviorResult based on the agent output."""
        raise NotImplementedError

AgentAdapter Base

class AgentAdapter(ABC):
    @abstractmethod
    async def connect(self) -> bool:
        """Establish a connection to the target agent."""
        raise NotImplementedError

    @abstractmethod
    async def disconnect(self) -> None:
        """Cleanly close the connection to the agent."""
        raise NotImplementedError

    @abstractmethod
    async def send_prompt(self, prompt, context) -> AgentOutput:
        """Send a prompt and return the agent output."""
        raise NotImplementedError

Integration Points

CodeOptiX Integration

SuperClaw behaviors can be registered with CodeOptiX for multi-modal evaluation:

from superclaw.codeoptix import register_superclaw_behaviors

# Registers as 'security-prompt-injection-resistance', etc.
register_superclaw_behaviors()

CI/CD Integration

SARIF output enables GitHub Code Scanning integration:

- name: Run SuperClaw Security Scan
  run: |
    superclaw audit openclaw --report-format sarif --output results.sarif

- name: Upload SARIF
  uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: results.sarif

Evidence Ledger

Adapters emit a normalized evidence ledger used by evaluators and reports:

  • messages: prompt/response pairs
  • tool_calls: tool invocations
  • tool_results: tool outputs
  • artifacts: files touched, URLs accessed, etc.
  • secrets_detected: detected secrets/patterns

Extension Points

  1. Custom Attacks - Extend Attack base class
  2. Custom Behaviors - Extend BehaviorSpec base class
  3. Custom Adapters - Extend AgentAdapter for new agents
  4. Custom Reporters - Extend ReportGenerator base class