Python Backends

PyFlue can run Python code through a dedicated Python backend.

This is separate from the shell sandbox. A shell sandbox is for commands such as git, pytest, or npm. A Python backend is for safe execution of Python snippets generated by an agent or provided by your application.

Monty

The first Python backend is monty, powered by pydantic-monty.

Install it with uv:

uv add "pyflue[monty]"

Or with pip:

pip install "pyflue[monty]"

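Enable it by passing python_backend when you initialize the agent:

# Run inside an async function; top-level await needs an async context.
agent = await init(
    model="openai:gpt-5.5",
    python_backend="monty",  # select the Monty backend
)
session = await agent.session("analysis")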

Run a Python snippet. Each key in inputs becomes a variable the snippet can read:

result = await session.run_python(
    '{"total": sum(items), "count": len(items)}',
    inputs={"items": [1, 2, 3]},
)

print(result.result)
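To see what value the snippet above computes, the sketch below evaluates the same expression with the built-in eval. This is an illustration of how inputs bind to names only: it is not how Monty executes code, and plain eval has none of Monty's isolation.

```python
# Illustration only: eval() offers no sandboxing and is not Monty's mechanism.
snippet = '{"total": sum(items), "count": len(items)}'
inputs = {"items": [1, 2, 3]}

# Each key in `inputs` is visible to the expression as a local variable.
value = eval(snippet, {}, dict(inputs))
print(value)  # {'total': 6, 'count': 3}
```

Assuming Monty evaluates this expression the way standard Python does, result.result should hold the same dictionary.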

Typed Results

Pass a Pydantic model as result, and run_python validates the snippet's return value against it:

from pydantic import BaseModel


class Metrics(BaseModel):
    total: int
    count: int


metrics = await session.run_python(
    '{"total": sum(items), "count": len(items)}',
    inputs={"items": [1, 2, 3]},
    result=Metrics,
)

External Functions

Monty code cannot access host APIs directly. You must expose approved functions explicitly:

async def double(value: int) -> int:
    return value * 2


result = await session.run_python(
    "await double(value=21)",
    external_functions={"double": double},
)
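The idea behind external_functions is an allowlist: a name is callable only if the host listed it. The sketch below shows that pattern with the standard library alone. APPROVED and call_external are hypothetical names made up for this illustration, not part of PyFlue or Monty.

```python
import asyncio


async def double(value: int) -> int:
    return value * 2


# Hypothetical host-side registry: only names listed here are callable.
APPROVED = {"double": double}


async def call_external(name: str, **kwargs):
    """Dispatch a call by name, refusing anything not explicitly approved."""
    if name not in APPROVED:
        raise PermissionError(f"function {name!r} is not exposed")
    return await APPROVED[name](**kwargs)


answer = asyncio.run(call_external("double", value=21))
print(answer)  # 42
```

Keeping the registry small and explicit makes it easy to audit exactly what sandboxed code can reach on the host.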

State Persistence

Monty keeps REPL state for the lifetime of the PyFlue agent process, so a variable assigned in one run_python call is still defined in the next. You can also serialize that state when you need to checkpoint it yourself:

await session.run_python("value = 37")
state = session.dump_python_state()

await session.run_python("value = 0")
session.load_python_state(state)

result = await session.run_python("value")
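The checkpoint-and-restore flow above can be modeled with a plain dictionary. The sketch below is a toy stand-in that uses pickle. The real format of the dumped state is whatever dump_python_state returns, which this example does not assume.

```python
import pickle

# Toy stand-in for the REPL's variables (not PyFlue's actual state object).
namespace = {}

namespace["value"] = 37              # like run_python("value = 37")
snapshot = pickle.dumps(namespace)   # like dump_python_state()

namespace["value"] = 0               # like run_python("value = 0")
namespace = pickle.loads(snapshot)   # like load_python_state(state)

print(namespace["value"])  # 37
```

If the real backend follows the same snapshot semantics, the final run_python("value") call in the example above should yield 37 rather than 0.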

Dataclasses

Register dataclasses before running code that needs to return or reuse those types:

from dataclasses import dataclass


@dataclass
class Row:
    name: str
    score: int


session.register_python_dataclass(Row)

Resource Limits

Pass resource_limits to cap how long a snippet may run and how much memory it may use:

result = await session.run_python(
    "sum(items)",
    inputs={"items": [1, 2, 3]},
    resource_limits={
        "max_duration_secs": 5,
        "max_memory": 20_000_000,
    },
)
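If several calls share the same limits, it can help to build the dictionary in one place. The sketch below uses a hypothetical helper, make_limits, and assumes max_memory is a byte count, which the 20_000_000 value above suggests but this page does not state.

```python
# Assumption: max_memory is measured in bytes.
BYTES_PER_MB = 1_000_000


def make_limits(seconds: float, megabytes: int) -> dict:
    """Build a resource_limits dict using the two keys shown above."""
    if seconds <= 0 or megabytes <= 0:
        raise ValueError("limits must be positive")
    return {
        "max_duration_secs": seconds,
        "max_memory": megabytes * BYTES_PER_MB,
    }


DEFAULT_LIMITS = make_limits(seconds=5, megabytes=20)
print(DEFAULT_LIMITS)
```

The result can then be passed as resource_limits=DEFAULT_LIMITS in each run_python call.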

Model-Callable Code Tool

When python_backend="monty" is configured, the default DeepAgents harness receives a run_code tool. This lets the agent use Monty for calculations and small Python programs during session.prompt(...).

Files

When used with the virtual sandbox, PyFlue mounts current sandbox files into Monty's in-memory filesystem. In the example below, the file written as data.txt is read back from /data.txt:

await session.write_file("data.txt", "hello")

result = await session.run_python(
    "from pathlib import Path\nPath('/data.txt').read_text()",
)

When To Use Monty

Use Monty for:

data wrangling
loops and aggregations
safe calculations
programmatic tool calling
agent-generated Python that should not run through the host shell
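As a rough picture of the kind of snippet that fits this list, the sketch below aggregates scores per name with a plain loop and no imports. It is shown as ordinary Python. The rows data is made up for illustration, and since Monty supports a subset of Python, it is worth confirming that the constructs you rely on are available.

```python
# Host-side data that would be passed as inputs={"rows": rows}.
rows = [
    {"name": "ada", "score": 90},
    {"name": "bob", "score": 70},
    {"name": "ada", "score": 80},
]

# The snippet body: a loop and an aggregation, no imports needed.
totals = {}
for row in rows:
    totals[row["name"]] = totals.get(row["name"], 0) + row["score"]

print(totals)  # {'ada': 170, 'bob': 70}
```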

Use a remote sandbox provider instead when you need:

git
pip
pytest
browsers
system packages
third-party Python libraries inside the executed code