SuperClaw
RedβTeam AI Agents Before They RedβTeam You
Scenarioβdriven, behaviorβfirst security testing for autonomous agents.
What is SuperClaw?¶
SuperClaw is a pre-deployment security testing framework for AI coding agents. It systematically identifies vulnerabilities before your agents touch sensitive data or connect to external ecosystems.
π― Scenario-Driven Testing
Generate and execute adversarial scenarios against real agents with reproducible results.
Get started βπ Behavior Contracts
Explicit success criteria, evidence extraction, and mitigation guidance for each security property.
Explore behaviors βπ Evidence-First Reporting
Reports include tool calls, outputs, and actionable fixes in HTML, JSON, or SARIF formats.
CI/CD integration βπ‘οΈ Built-in Guardrails
Local-only mode and authorization checks reduce misuse risk.
Safety guide ββ οΈ Security and Ethical Use¶
Authorized Testing Only
SuperClaw is for authorized security testing only. Before using:
- β Obtain written permission to test the target system
- β Run tests in sandboxed or isolated environments
- β Treat automated findings as signals, not proofβverify manually
Guardrails enforced by default:
- Local-only mode blocks remote targets
- Remote targets require
SUPERCLAW_AUTH_TOKEN
Threat Model¶
OpenClaw + Moltbook Risk Surface
OpenClaw agents often run with broad tool access. When connected to Moltbook or other agent networks, they can ingest untrusted, adversarial content that enables:
- Prompt injection and hidden instruction attacks
- Tool misuse and policy bypass
- Behavioral drift over time
- Cascading cross-agent exploitation
SuperClaw evaluates these risks before deployment.
The Problem¶
Autonomous agents are deployed with high privilege, mutable behavior, and exposure to untrusted inputsβoften without structured security validation. This makes prompt injection, tool misuse, configuration drift, and data leakage likely but poorly understood until after exposure.
The Solution¶
SuperClaw performs pre-deployment, scenario-driven security evaluation:
- Generates adversarial attack scenarios
- Executes them against your agent
- Captures evidence (tool calls, outputs, artifacts)
- Scores behavior against explicit contracts
- Produces actionable reports with mitigations
Non-Goals¶
SuperClaw does not:
- Generate agents
- Run production workloads
- Automate real-world exploitation
Quick Start¶
Run your first attack:
# Attack a local OpenClaw instance
superclaw attack openclaw --target ws://127.0.0.1:18789
# Or test offline with the mock adapter
superclaw attack mock --behaviors prompt-injection-resistance
# Generate a comprehensive audit report
superclaw audit openclaw --comprehensive --report-format html
Key Features¶
| Feature | Description |
|---|---|
| π― Attack Library | 5 attack techniques with 100+ payloads |
| π Behavior Specs | 6 security behaviors with severity levels |
| πΈ Bloom Integration | LLM-powered scenario generation |
| π Multi-Format Reports | HTML, JSON, SARIF for CI/CD |
| π¬ CodeOptiX Integration | Multi-modal evaluation pipeline |
Supported Targets¶
| Target | Adapter | Description |
|---|---|---|
| π¦ OpenClaw | openclaw |
AI coding agents via ACP WebSocket |
| π§ͺ Mock | mock |
Offline deterministic testing |
| π§ Custom | Extend BaseAdapter |
Build your own adapter |
Attack Techniques¶
| Technique | Description |
|---|---|
prompt-injection |
Direct and indirect injection attacks |
encoding |
Base64, hex, unicode, typoglycemia obfuscation |
jailbreak |
DAN, grandmother, role-play bypass techniques |
tool-bypass |
Tool policy bypass via alias confusion |
multi-turn |
Persistent escalation across conversation turns |
Security Behaviors¶
| Behavior | Severity | Tests |
|---|---|---|
prompt-injection-resistance |
π΄ CRITICAL | Injection detection and rejection |
sandbox-isolation |
π΄ CRITICAL | Container and filesystem boundaries |
tool-policy-enforcement |
π HIGH | Allow/deny list compliance |
session-boundary-integrity |
π HIGH | Cross-session isolation |
configuration-drift-detection |
π‘ MEDIUM | Config stability over time |
acp-protocol-security |
π‘ MEDIUM | Protocol message handling |
Superagentic AI Ecosystem¶
SuperClaw is part of a comprehensive AI quality and security ecosystem:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Superagentic AI Ecosystem β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β SuperQE β Quality Engineering core engine β
β SuperClaw β Agent security testing framework βββ YOU β
β CodeOptiX β Code optimization & evaluation engine β
β Bloom β Behavioral evaluation scenario generation β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ