CodeMode Integration
This page documents the implemented CodeMode strategy in HarnessRunner.
Scope
CodeMode is a harness strategy (strategy=codemode) for MCP-backed tool chaining.
It is not a replacement for default harness planning. The baseline remains strategy=tool_call.
Layer Separation
CodeMode in RLM is intentionally split into three layers:
| Layer | Owned by | Responsibility | Requirement in RLM |
|---|---|---|---|
| Harness strategy | RLM Code (HarnessRunner) | Prompting, guardrails, orchestration, telemetry | Always present |
| MCP bridge contract | External MCP server | Expose bridge tools consumed by harness | Must expose search_tools and call_tool_chain |
| Bridge implementation | Provider specific (UTCP, Cloudflare-based, custom) | How bridge is implemented/deployed | Any implementation is acceptable if contract matches |
This means RLM does not bind to Cloudflare-only or UTCP-only internals. RLM binds to MCP tool names and schemas.
Strategy Matrix
| Strategy | Planner shape | Execution shape | Default |
|---|---|---|---|
| `tool_call` | Iterative JSON action loop (`tool` / `final`) | Multiple tool calls across steps | Yes |
| `codemode` | Single generated JS/TS program | One guarded chain execution via MCP | No |
End-to-End Lifecycle
sequenceDiagram
    participant User as User (/harness run)
    participant HR as HarnessRunner
    participant MCP as MCP Server
    participant LLM as Model
    User->>HR: strategy=codemode, mcp=on
    HR->>MCP: search_tools(task_description, limit=10)
    MCP-->>HR: discovered tool interfaces
    HR->>LLM: task + typed tool surface prompt
    LLM-->>HR: {"code": "..."}
    HR->>HR: guardrail validation
    HR->>MCP: call_tool_chain(code, timeout, max_output_size)
    MCP-->>HR: execution payload
    HR-->>User: final response

Implementation path is in rlm_code/harness/runner.py:
- `HarnessRunner.run(..., strategy="codemode")`
- `_run_codemode_strategy(...)`
- `_validate_codemode_code(...)`
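The sequence above can be sketched end to end with stand-in objects. This is a minimal illustration, not the code in `runner.py`: `FakeBridge`, `fake_model`, and `run_codemode_sketch` are hypothetical names, and the payload keys other than `code` are assumptions. Only the tool names, their argument names, and the `{"code": "..."}` planner reply come from this page.

```python
import json
from typing import Any, Callable, Optional


class FakeBridge:
    """Hypothetical stand-in for an MCP server exposing the bridge contract."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def call(self, tool: str, args: dict[str, Any]) -> dict[str, Any]:
        self.calls.append(tool)
        if tool == "search_tools":
            # Assumed payload shape: a list of tools with TypeScript interfaces.
            return {"tools": [{"name": "echo",
                               "typescript_interface": "declare function echo(s: string): string;"}]}
        if tool == "call_tool_chain":
            return {"success": True, "result": "executed"}
        raise KeyError(tool)


def fake_model(prompt: str) -> str:
    """Hypothetical planner that always replies with a one-line program."""
    return json.dumps({"code": "return echo('hi');"})


def run_codemode_sketch(
    task: str,
    bridge: FakeBridge,
    model: Callable[[str], str],
    validate: Callable[[str], Optional[str]],
) -> dict[str, Any]:
    # 1. Discover tool interfaces on the bridge.
    found = bridge.call("search_tools", {"task_description": task, "limit": 10})
    surface = "\n".join(t["typescript_interface"] for t in found["tools"])
    # 2. Ask the model for a single program.
    reply = json.loads(model(f"Task: {task}\n\nTools:\n{surface}"))
    code = reply["code"]
    # 3. Guardrails run before anything is executed.
    problem = validate(code)
    if problem is not None:
        return {"blocked": True, "reason": problem}
    # 4. One guarded chain execution.
    return bridge.call(
        "call_tool_chain",
        {"code": code, "timeout": 30000, "max_output_size": 200000},
    )
```

The `validate` callable returns `None` when the code passes and a reason string otherwise, so a failed check never reaches `call_tool_chain`.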
Required MCP Tools
CodeMode requires both tools on the selected MCP server:
- `search_tools`
- `call_tool_chain`

In strict MCP mode, harness defaults to this allowlist:
- `search_tools`
- `list_tools`
- `tools_info`
- `get_required_keys_for_tool`
- `call_tool_chain`

This list is defined as HarnessRunner.STRICT_MCP_TOOL_ALLOWLIST.
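A pre-flight check for the contract above might look like the following sketch. The constant and function names are hypothetical and the tuples are copied from this page, not imported from `HarnessRunner`. The input is assumed to be a plain list of tool names already fetched from the selected server.

```python
# Copied from this page for illustration; the real allowlist lives on HarnessRunner.
REQUIRED_CODEMODE_TOOLS = ("search_tools", "call_tool_chain")
STRICT_ALLOWLIST_COPY = (
    "search_tools",
    "list_tools",
    "tools_info",
    "get_required_keys_for_tool",
    "call_tool_chain",
)


def missing_codemode_tools(available: list[str]) -> list[str]:
    """Return required bridge tools that the server does not expose."""
    exposed = set(available)
    return [name for name in REQUIRED_CODEMODE_TOOLS if name not in exposed]


def filter_strict(available: list[str]) -> list[str]:
    """Keep only tools on the strict allowlist, preserving server order."""
    return [name for name in available if name in STRICT_ALLOWLIST_COPY]
```

An empty result from `missing_codemode_tools` means the server satisfies the CodeMode bridge contract as far as tool names go. Input schemas would still need to be compatible.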
Provider note:
- Cloudflare CodeMode packages and UTCP bridges are implementation choices.
- RLM harness only requires MCP tools with compatible names/inputs.
- If a provider stack does not expose this bridge contract directly, add an adapter MCP server.
Typed Tool Surface Generation
CodeMode creates a prompt surface from two sources:
- `search_tools` discovery payload (`tools[].typescript_interface` when available)
- MCP tool schemas already visible to harness (`input_schema`)
The planner then receives:
- task
- target server
- typed declarations for MCP tools on that server
- constraints blocking import/network/process/fs/eval APIs
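One way to merge the two sources into typed declarations is sketched below. The precedence rule (use `typescript_interface` when present, otherwise derive a declaration from `input_schema`), the `name` key on discovered tools, and the helper names are assumptions made for illustration. The harness may build its surface differently.

```python
from typing import Any

# Rough JSON Schema -> TypeScript mapping; anything unlisted becomes `unknown`.
_TS_TYPES = {
    "string": "string",
    "integer": "number",
    "number": "number",
    "boolean": "boolean",
    "array": "unknown[]",
    "object": "Record<string, unknown>",
}


def _ts_type(prop: dict[str, Any]) -> str:
    return _TS_TYPES.get(prop.get("type", ""), "unknown")


def build_tool_surface(
    discovered: list[dict[str, Any]],
    schemas: dict[str, dict[str, Any]],
) -> str:
    """Return newline-joined TypeScript declarations for the planner prompt."""
    blocks: list[str] = []
    seen: set[str] = set()
    # Source 1: interfaces supplied by the search_tools discovery payload.
    for tool in discovered:
        iface = tool.get("typescript_interface")
        if iface:
            blocks.append(iface.strip())
            seen.add(tool.get("name", ""))
    # Source 2: fall back to input_schema for tools without an interface.
    for name, schema in sorted(schemas.items()):
        if name in seen:
            continue
        required = set(schema.get("required", []))
        fields = "; ".join(
            f"{key}{'' if key in required else '?'}: {_ts_type(prop)}"
            for key, prop in schema.get("properties", {}).items()
        )
        blocks.append(f"declare function {name}(args: {{ {fields} }}): Promise<unknown>;")
    return "\n".join(blocks)
```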
Guardrail Validation
Before execution, generated code is validated for:
- non-empty snippet
- max code length
- blocked API patterns
- max inferred tool-call count
Blocked patterns include:
- module loading: `import`, `require(...)`
- network APIs: `fetch`, `XMLHttpRequest`, `WebSocket`, `http`, `https`, `net`, `dns`, `tls`
- process APIs: `process.`, `child_process`, `spawn`, `exec`
- filesystem APIs: `fs`, `path`, `readFile`, `writeFile`
- dynamic eval: `eval`, `new Function`
If validation fails, harness returns a guardrail block and does not execute call_tool_chain.
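The four checks can be approximated with a few regular expressions. This sketch is not `_validate_codemode_code`: the patterns are a guess at how the blocked identifiers might be matched, and counting `await name(` expressions is an assumed way to infer tool calls. Word-boundary matching is deliberately conservative, so a harmless variable named `path` would also be rejected.

```python
import re
from typing import Optional

# Assumed regexes for the blocked categories listed above.
BLOCKED_PATTERNS = {
    "module loading": r"\bimport\b|\brequire\s*\(",
    "network": r"\b(?:fetch|XMLHttpRequest|WebSocket|https?|net|dns|tls)\b",
    "process": r"\bprocess\s*\.|\b(?:child_process|spawn|exec)\b",
    "filesystem": r"\b(?:fs|path|readFile|writeFile)\b",
    "dynamic eval": r"\beval\s*\(|\bnew\s+Function\b",
}

# Assumption: each `await something(` counts as one inferred tool call.
TOOL_CALL = re.compile(r"\bawait\s+[A-Za-z_$][\w$.]*\s*\(")


def validate_code(
    code: str,
    max_chars: int = 12000,
    max_tool_calls: int = 30,
) -> Optional[str]:
    """Return None if the code passes, otherwise a short block reason."""
    if not code.strip():
        return "empty code"
    if len(code) > max_chars:
        return f"code too long: {len(code)} > {max_chars}"
    for label, pattern in BLOCKED_PATTERNS.items():
        if re.search(pattern, code):
            return f"blocked API: {label}"
    calls = len(TOOL_CALL.findall(code))
    if calls > max_tool_calls:
        return f"too many tool calls: {calls} > {max_tool_calls}"
    return None
```

Returning a reason string instead of raising keeps the caller free to turn any failure into a guardrail block without executing the chain.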
Runtime Controls
HarnessRunner.run() supports these CodeMode controls:
| Parameter | Default | Meaning |
|---|---|---|
| `codemode_timeout_ms` | 30000 | `call_tool_chain` timeout |
| `codemode_max_output_chars` | 200000 | max output bytes/chars returned from MCP chain |
| `codemode_max_code_chars` | 12000 | max generated code length allowed |
| `codemode_max_tool_calls` | 30 | max inferred tool-call expressions in code |

The current slash-command surface does not expose these knobs directly; it uses the defaults.
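Since the slash commands always use the defaults, overriding a limit means calling `HarnessRunner.run()` from Python. One possible way to keep overrides in one place is a small dataclass whose fields mirror the table. `CodeModeLimits` is a hypothetical helper, and the positivity check is an added assumption rather than documented harness behavior.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CodeModeLimits:
    """Hypothetical container; field names and defaults mirror the table above."""

    codemode_timeout_ms: int = 30000
    codemode_max_output_chars: int = 200000
    codemode_max_code_chars: int = 12000
    codemode_max_tool_calls: int = 30

    def as_run_kwargs(self) -> dict:
        """Return the limits as keyword arguments for HarnessRunner.run()."""
        values = asdict(self)
        for name, value in values.items():
            # Assumption: a zero or negative limit is never meaningful.
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")
        return values
```

A call could then look like `runner.run(task=..., strategy="codemode", **CodeModeLimits(codemode_timeout_ms=10000).as_run_kwargs())`.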
Output Semantics
After chain execution:
- if payload contains `success=false` + `error`, harness returns a failure message.
- if payload contains `nonMcpContentResults`, harness returns that content.
- otherwise harness returns raw chain output text.
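The three branches above map onto a short function. This is a sketch under assumptions: the payload is taken to be an already-parsed dict, the failure message wording is invented, and serializing non-string `nonMcpContentResults` with `json.dumps` is one plausible choice, not confirmed behavior. `interpret_chain_payload` is a hypothetical name.

```python
import json
from typing import Any


def interpret_chain_payload(payload: dict[str, Any], raw_text: str) -> str:
    """Pick the final response text from a call_tool_chain result."""
    # Branch 1: explicit failure reported together with an error.
    if payload.get("success") is False and payload.get("error"):
        return f"CodeMode chain failed: {payload['error']}"
    # Branch 2: the bridge returned non-MCP content; surface it directly.
    if "nonMcpContentResults" in payload:
        content = payload["nonMcpContentResults"]
        return content if isinstance(content, str) else json.dumps(content)
    # Branch 3: fall back to the raw chain output text.
    return raw_text
```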
CLI Usage
/harness run "implement and test" steps=8 mcp=on strategy=codemode mcp_server=codemode
/rlm bench preset=dynamic_web_filtering mode=harness strategy=codemode mcp=on mcp_server=codemode
Strategy coercions in slash commands:
- `strategy=codemode` with `mcp=off` auto-enables MCP with warning.
- `tools=...` allowlist is ignored for `strategy=codemode`.
- `strategy` only applies to `/rlm bench` when `mode=harness`.
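These coercion rules can be expressed as a small normalization step over parsed `key=value` arguments. The function below is a hypothetical illustration, not the actual slash-command parser: the argument dict shape, the warning text, and dropping `tools` silently are all assumptions.

```python
def coerce_strategy_args(command: str, args: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Apply the CodeMode coercions; returns (normalized args, warnings)."""
    out = dict(args)
    warnings: list[str] = []
    # strategy only applies to /rlm bench when mode=harness.
    if command == "/rlm bench" and out.get("mode") != "harness":
        out.pop("strategy", None)
        return out, warnings
    if out.get("strategy") == "codemode":
        # CodeMode needs MCP, so an explicit mcp=off is overridden with a warning.
        if out.get("mcp") == "off":
            out["mcp"] = "on"
            warnings.append("strategy=codemode requires MCP; enabling mcp=on")
        # The tools allowlist has no effect for codemode (dropped silently here).
        out.pop("tools", None)
    return out, warnings
```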
Programmatic API Example
from rlm_code.harness.runner import HarnessRunner

# llm_connector, mcp_manager, and workdir are assumed to be created elsewhere.
runner = HarnessRunner(llm_connector=llm_connector, mcp_manager=mcp_manager, workdir=workdir)
result = runner.run(
    task="implement task and add tests",
    max_steps=8,
    include_mcp=True,
    strategy="codemode",
    mcp_server="codemode",
    # The four limits below repeat the defaults from Runtime Controls.
    codemode_timeout_ms=30000,
    codemode_max_output_chars=200000,
    codemode_max_code_chars=12000,
    codemode_max_tool_calls=30,
)