Benchmark Commands¶
Run coding harness benchmarks across multiple agent targets.
benchmark run¶
Run benchmark tasks across one or more targets.
superqode benchmark run <tasks.json> [OPTIONS]
Arguments¶
| Argument | Description |
|---|---|
tasks.json | Path to a JSON file defining benchmark tasks |
Options¶
| Option | Description |
|---|---|
--target | Target to benchmark (repeatable, e.g., --target superqode --target opencode) |
Examples¶
superqode benchmark run tasks.json --target superqode --target opencode --target pi --target deepagents
Tasks File Format¶
The tasks file is a JSON array of task definitions:
[
{
"id": "task-001",
"prompt": "Implement a fibonacci function in Python"
},
{
"id": "task-002",
"prompt": "Write a markdown parser in TypeScript"
}
]
Each task contains a unique id and a prompt sent to each target. Results are compared across targets by task ID, producing a side-by-side report of completion rates, time, and output quality.
Supported Targets¶
| Target | Description |
|---|---|
superqode | SuperQode harness |
opencode | OpenCode ACP agent |
pi | pi.ai coding agent |
deepagents | DeepAgents harness |