CodeMode Guardrails¶

This page is the release safety spec for harness strategy=codemode.

Threat Model¶

CodeMode asks a model to generate executable JavaScript/TypeScript statements. The primary risks are:

CodeMode mitigates these with strict MCP tool policy, static guardrail checks, and runtime caps.

Boundary	Trusted	Untrusted
Planner output	Harness validator	Raw model-generated code
Tool execution	MCP bridge + allowlisted tools	Any API outside MCP tool path
Server selection	Explicit `mcp_server` (recommended)	Ambiguous auto-selection across multiple servers

Sandbox boundary note:

In CodeMode, execution happens in the MCP bridge runtime behind call_tool_chain.
RLM /sandbox controls do not automatically isolate that external bridge process.
Bridge deployment hardening is required for production isolation.

Validation is implemented in HarnessRunner._validate_codemode_code().

Category	Example patterns
Module loading	`import`, `require(...)`
Network APIs	`fetch`, `XMLHttpRequest`, `WebSocket`
Process APIs	`process.`, `child_process`, `spawn`, `exec`
Filesystem APIs	`fs`, `path`, `readFile`, `writeFile`
Dynamic eval	`eval(...)`, `new Function(...)`
Low-level network modules	`http`, `https`, `net`, `dns`, `tls`

If any check fails, chain execution is blocked and run status remains incomplete.

When MCP strict mode is active (default in harness benchmark path), only this MCP allowlist is exposed:

CodeMode additionally requires:

on the selected server.

Execution is passed to call_tool_chain with explicit bounds:

Condition	Observed behavior
MCP disabled	`Code-mode strategy requires mcp=on.`
No resolvable server	explicit failure describing `call_tool_chain/search_tools` requirement
Required tool missing	explicit failure listing missing required tools
Guardrail violation	`Code-mode guardrail blocked execution: ...`
Chain failure payload	surfaced as `Code-mode execution failed: <error>`

Harness benchmark case payloads include:

Use these fields to verify that CodeMode runs stayed on the intended MCP path and did not regress safety behavior.

Keep default strategy as tool_call.
Run side-by-side benchmark comparison on same preset/case set.
Require zero unexpected guardrail bypass signals (review codemode_guardrail_blocked and failures).
Use /rlm bench validate ... with explicit thresholds.
Roll back to strategy=tool_call immediately on policy or reliability regression.

Rollback is operationally simple because strategy remains opt-in:

/harness run "..." strategy=tool_call
/rlm bench preset=... mode=harness strategy=tool_call