Python Backends
PyFlue can run Python code through a dedicated Python backend.
This is separate from the shell sandbox. A shell sandbox is for commands such
as git, pytest, or npm. A Python backend is for safe execution of Python
snippets generated by an agent or provided by your application.
Monty
The first Python backend is monty, powered by pydantic-monty.
Install it:
Enable it:
agent = await init(
    model="openai:gpt-5.5",
    python_backend="monty",  # select the Monty backend
)
session = await agent.session("analysis")
Run Python:
result = await session.run_python(
    '{"total": sum(items), "count": len(items)}',
    inputs={"items": [1, 2, 3]},  # exposed to the snippet as `items`
)
print(result.result)
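For orientation, the expression passed to run_python is ordinary Python. The sketch below evaluates the same expression in plain CPython, without PyFlue or Monty, to show the shape of the value to expect. It assumes that result.result holds the value of the snippet's final expression, which the example above suggests but does not state.

```python
# Plain-Python equivalent of the snippet above (no PyFlue or Monty involved).
items = [1, 2, 3]  # same value that was passed through inputs={"items": ...}

# Assumption: the snippet's final expression is the value that comes back.
expected = {"total": sum(items), "count": len(items)}
print(expected)  # {'total': 6, 'count': 3}
```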
Typed Results
run_python can validate the Python result with Pydantic:
from pydantic import BaseModel

class Metrics(BaseModel):
    total: int
    count: int

metrics = await session.run_python(
    '{"total": sum(items), "count": len(items)}',
    inputs={"items": [1, 2, 3]},
    result=Metrics,  # validate the returned value against this model
)
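To make the validation step concrete, here is a minimal sketch of what checking a raw result against a Pydantic model looks like on its own. It uses only Pydantic v2 (model_validate) and does not call PyFlue; whether run_python performs exactly this call internally is an assumption, and the raw dictionaries are stand-ins for values a snippet might return.

```python
from pydantic import BaseModel, ValidationError


class Metrics(BaseModel):
    total: int
    count: int


# Stand-in for the raw value a Python snippet might return.
raw = {"total": 6, "count": 3}
metrics = Metrics.model_validate(raw)  # Pydantic v2 API
print(metrics.total, metrics.count)

# A result with the wrong shape is rejected rather than passed through.
try:
    Metrics.model_validate({"total": "six", "count": 3})
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

The benefit of a typed result is that downstream code can rely on attribute access (metrics.total) instead of re-checking dictionary keys and types.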
External Functions
Monty code cannot access host APIs directly. You must expose approved functions explicitly:
async def double(value: int) -> int:
    return value * 2

result = await session.run_python(
    "await double(value=21)",
    external_functions={"double": double},  # only these names are callable
)
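The allowlist idea behind external_functions can be sketched in plain asyncio. This is an illustration of the pattern only, not how PyFlue or Monty implement it: APPROVED and call_external are hypothetical names introduced for the example.

```python
import asyncio


async def double(value: int) -> int:
    return value * 2


# Hypothetical allowlist: only names in this mapping can be reached.
APPROVED = {"double": double}


async def call_external(name: str, **kwargs):
    """Hypothetical dispatcher: look up an approved function and await it."""
    if name not in APPROVED:
        raise PermissionError(f"{name!r} is not an approved external function")
    return await APPROVED[name](**kwargs)


async def main():
    allowed = await call_external("double", value=21)
    try:
        # Anything outside the allowlist is refused before it runs.
        await call_external("open", file="/etc/passwd")
        blocked = False
    except PermissionError:
        blocked = True
    return allowed, blocked


allowed, blocked = asyncio.run(main())
print(allowed, blocked)  # 42 True
```

Keeping the mapping explicit means the set of host capabilities available to generated code can be reviewed in one place.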
State Persistence
Monty keeps REPL state for the PyFlue agent process, so a name assigned in one run_python call can be read in a later call. You can also serialize that state when you need to checkpoint it yourself:
await session.run_python("value = 37")
state = session.dump_python_state()
await session.run_python("value = 0")
session.load_python_state(state)
result = await session.run_python("value")
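The checkpoint-and-restore flow above can be mimicked with a plain dictionary and pickle. This is only an analogy: the format returned by dump_python_state is not described here, and pickle is a stand-in chosen for the sketch. By the same logic, the final run_python("value") call above would be expected to see the restored value rather than 0.

```python
import pickle

# Stand-in for the interpreter's variable namespace.
namespace = {"value": 37}

# Checkpoint: serialize the namespace to bytes.
checkpoint = pickle.dumps(namespace)

# Later code overwrites the variable.
namespace["value"] = 0

# Restore: replace the live namespace with the checkpointed copy.
namespace = pickle.loads(checkpoint)
print(namespace["value"])  # 37
```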
Dataclasses
Register dataclasses before running code that needs to return or reuse those types:
from dataclasses import dataclass

@dataclass
class Row:
    name: str
    score: int

session.register_python_dataclass(Row)
Resource Limits
Use resource_limits to pass Monty resource controls:
result = await session.run_python(
    "sum(items)",
    inputs={"items": [1, 2, 3]},
    resource_limits={
        "max_duration_secs": 5,
        "max_memory": 20_000_000,
    },
)
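If the same limits are reused across calls, a small helper can keep them in one place. The sketch below is hypothetical: build_limits is not part of PyFlue, and it assumes max_memory is expressed in bytes, which the example value suggests but the text does not confirm. Only the two keys shown above are used.

```python
def build_limits(seconds: float, megabytes: int) -> dict:
    """Hypothetical helper that builds a resource_limits dictionary."""
    return {
        "max_duration_secs": seconds,
        # Assumption: max_memory is a byte count (1 MB taken as 1_000_000 bytes).
        "max_memory": megabytes * 1_000_000,
    }


limits = build_limits(seconds=5, megabytes=20)
print(limits)  # {'max_duration_secs': 5, 'max_memory': 20000000}
```

The resulting dictionary could then be passed as resource_limits=limits in place of the inline literal.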
Model-Callable Code Tool
When python_backend="monty" is configured, the default DeepAgents harness
receives a run_code tool. This lets the agent use Monty for calculations and
small Python programs during session.prompt(...).
Files
When used with the virtual sandbox, PyFlue mounts current sandbox files into Monty's in-memory filesystem:
await session.write_file("data.txt", "hello")
result = await session.run_python(
    "from pathlib import Path\nPath('/data.txt').read_text()",
)
When To Use Monty
Use Monty for:
- data wrangling
- loops and aggregations
- safe calculations
- programmatic tool calling
- agent-generated Python that should not run through the host shell
Use a remote sandbox provider instead when you need:
- git
- pip
- pytest
- browsers
- system packages
- third-party Python libraries inside the executed code