Architecture Overview¶
SuperClaw follows a modular architecture designed for extensibility and integration with the Superagentic AI ecosystem.
High-Level Architecture¶
flowchart TB
subgraph CLI["CLI Layer"]
A[superclaw CLI]
end
subgraph Core["Core Engine"]
B[Attack Engine]
C[Behavior Engine]
D[Bloom Integration]
end
subgraph Adapters["Agent Adapters"]
E[OpenClaw Adapter]
F[ACP Adapter]
G[Custom Adapters]
end
subgraph Integration["Integrations"]
H[CodeOptiX]
J[Reporting]
end
A --> B
A --> C
A --> D
B --> E
B --> F
B --> G
C --> H
D --> H
H --> J
%% External integrations removed
Module Structure¶
superclaw/
โโโ attacks/ # Attack implementations
โ โโโ base.py # Attack abstract base class
โ โโโ prompt_injection.py
โ โโโ encoding.py
โ โโโ jailbreaks.py
โ โโโ tool_bypass.py
โ โโโ multi_turn.py
โ
โโโ behaviors/ # Security behavior specs
โ โโโ base.py # BehaviorSpec base class
โ โโโ injection_resistance.py
โ โโโ tool_policy.py
โ โโโ sandbox_isolation.py
โ โโโ session_boundary.py
โ โโโ config_drift.py
โ โโโ protocol_security.py
โ
โโโ adapters/ # Agent communication
โ โโโ base.py # AgentAdapter base class
โ โโโ openclaw.py # OpenClaw WebSocket adapter
โ
โโโ bloom/ # Scenario generation
โ โโโ scenarios.py # Template-based scenarios
โ โโโ ideation.py # LLM-powered ideation
โ โโโ rollout.py # Scenario execution
โ โโโ judgment.py # LLM-as-judge evaluation
โ
โโโ codeoptix/ # CodeOptiX integration
โ โโโ adapter.py # Behavior adapter bridge
โ โโโ evaluator.py # Multi-modal evaluator
โ โโโ engine.py # Evaluation engine
โ
โโโ reporting/ # Report generation
โ โโโ html.py # Styled HTML reports
โ โโโ json_report.py
โ โโโ sarif.py # GitHub Code Scanning
โ
โโโ config/ # Configuration
โ โโโ settings.py # Runtime settings
โ โโโ schemas.py # Pydantic models
โ
โโโ cli.py # Typer CLI application
Data Flow¶
sequenceDiagram
participant CLI
participant Attack as Attack Engine
participant Adapter as Agent Adapter
participant Agent as Target Agent
participant Behavior as Behavior Spec
participant Report as Reporter
CLI->>Attack: run_attack(target, behaviors)
Attack->>Attack: generate_payloads()
loop For each payload
Attack->>Adapter: send_prompt(payload)
Adapter->>Agent: WebSocket/ACP message
Agent-->>Adapter: Response
Adapter-->>Attack: AgentOutput
Attack->>Behavior: evaluate(output)
Behavior-->>Attack: BehaviorResult
end
Attack->>Report: generate_report(results)
Report-->>CLI: HTML/JSON/SARIF
Key Classes¶
Attack Base¶
class Attack(ABC):
attack_type: str
description: str
@abstractmethod
def generate_payloads(self) -> list[str]:
"""Return the payloads to execute for this attack."""
raise NotImplementedError
@abstractmethod
def evaluate_response(self, payload, response) -> AttackResult:
"""Score a response for the given payload."""
raise NotImplementedError
BehaviorSpec Base¶
class BehaviorSpec(ABC):
default_severity: Severity
@abstractmethod
def get_name(self) -> str:
"""Unique behavior identifier used in registries and reports."""
raise NotImplementedError
@abstractmethod
def get_description(self) -> str:
"""Human-readable description of the security behavior."""
raise NotImplementedError
@abstractmethod
def get_contract(self) -> BehaviorContract:
"""Structured security behavior contract."""
raise NotImplementedError
@abstractmethod
def evaluate(self, agent_output, context) -> BehaviorResult:
"""Return a BehaviorResult based on the agent output."""
raise NotImplementedError
AgentAdapter Base¶
class AgentAdapter(ABC):
@abstractmethod
async def connect(self) -> bool:
"""Establish a connection to the target agent."""
raise NotImplementedError
@abstractmethod
async def disconnect(self) -> None:
"""Cleanly close the connection to the agent."""
raise NotImplementedError
@abstractmethod
async def send_prompt(self, prompt, context) -> AgentOutput:
"""Send a prompt and return the agent output."""
raise NotImplementedError
Integration Points¶
CodeOptiX Integration¶
SuperClaw behaviors can be registered with CodeOptiX for multi-modal evaluation:
from superclaw.codeoptix import register_superclaw_behaviors
# Registers as 'security-prompt-injection-resistance', etc.
register_superclaw_behaviors()
CI/CD Integration¶
SARIF output enables GitHub Code Scanning integration:
- name: Run SuperClaw Security Scan
run: |
superclaw audit openclaw --report-format sarif --output results.sarif
- name: Upload SARIF
uses: github/codeql-action/upload-sarif@v2
with:
sarif_file: results.sarif
Evidence Ledger¶
Adapters emit a normalized evidence ledger used by evaluators and reports:
messages: prompt/response pairstool_calls: tool invocationstool_results: tool outputsartifacts: files touched, URLs accessed, etc.secrets_detected: detected secrets/patterns
Extension Points¶
- Custom Attacks - Extend
Attackbase class - Custom Behaviors - Extend
BehaviorSpecbase class - Custom Adapters - Extend
AgentAdapterfor new agents - Custom Reporters - Extend
ReportGeneratorbase class