Skip to content

MCP (Model Context Protocol) Support

PyFlue provides comprehensive MCP support, enabling agents to connect to external tools and services through the Model Context Protocol.

Overview

MCP servers provide tools that agents can use to perform actions. When an MCP server exposes many tools, it can consume significant context tokens. PyFlue addresses this with two operation modes.

Modes

Direct Mode (Default)

In direct mode, all MCP tools are exposed directly to the agent. Each tool becomes available to the LLM with its name, description, and parameters.

from pyflue import init

agent = await init(
    mcp_servers={
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
        }
    }
)

Use Direct mode when: - The MCP server exposes fewer than 50 tools - You need simple, predictable tool access - Context token usage is not a concern

Search + Execute Mode (Opt-in)

In search_execute mode, PyFlue exposes only two tools that give the agent the ability to dynamically discover and call any MCP tool:

  • mcp_search(query) - Search for relevant MCP tools using keywords or semantic similarity
  • mcp_execute(server, tool, arguments) - Execute a specific tool on a specific server

The agent first searches to find the right tool, then executes it. This approach keeps context usage fixed at ~2 tools regardless of how many tools the MCP server actually provides.

from pyflue import init

agent = await init(
    mcp_servers={
        "my-api": {
            "url": "http://localhost:3000/mcp",
            "transport": "streamable-http"
        }
    },
    mcp_mode="search_execute",
    mcp_search_limit=10,
    mcp_search_backend="bm25"
)

Use Search + Execute mode when: - The MCP server exposes many tools (100+) - You connect to multiple MCP servers - Context window limits are a concern - You want progressive tool discovery

Search Backends

BM25 (Default)

BM25 is a keyword-based ranking algorithm that scores tools based on term frequency and document length. It requires no external dependencies and works out of the box.

agent = await init(
    mcp_servers={"...": "..."},
    mcp_mode="search_execute",
    mcp_search_backend="bm25"
)

Semantic search uses embeddings to find tools that are conceptually similar to the query, even if they don't share exact keywords. Requires the sentence-transformers package:

pip install sentence-transformers
agent = await init(
    mcp_servers={"...": "..."},
    mcp_mode="search_execute",
    mcp_search_backend="semantic"
)

MCP Server Configuration

Stdio Servers

Run MCP servers as local processes:

mcp_servers={
    "local-tools": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
        "env": {"NODE_ENV": "production"},
        "transport": "stdio"
    }
}

The same configuration can be written in pyflue.toml:

[mcp]
mode = "direct"

[mcp.servers.local-tools]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
env = { NODE_ENV = "production" }

HTTP Servers (streamable-http)

Connect to MCP servers over HTTP using the modern streamable-http transport:

mcp_servers={
    "remote-api": {
        "url": "http://localhost:3000/mcp",
        "transport": "streamable-http",
        "headers": {"Authorization": "Bearer token123"}
    }
}

TOML form:

[mcp.servers.remote-api]
url = "http://localhost:3000/mcp"
transport = "streamable-http"

[mcp.servers.remote-api.headers]
Authorization = "Bearer token123"

SSE Servers (Legacy)

For older MCP servers that use Server-Sent Events:

mcp_servers={
    "legacy-server": {
        "url": "http://localhost:8080/sse",
        "transport": "sse"
    }
}

Using MCPClient Directly

For programmatic access to MCP servers, use the MCPClient class directly:

from pyflue.mcp import MCPClient

# Create client with server configuration
client = MCPClient({
    "my-server": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
    }
})

# List all available tools
tools = client.list_tools()
print(f"Found {len(tools)} tools")

# Search for relevant tools using BM25
results = client.search_tools(query="read file contents", limit=5)
for tool in results:
    print(f"{tool.name} (score: {tool.score})")

# Call a specific tool
result = client.call_tool(
    server="my-server",
    tool="read_file",
    arguments={"path": "/tmp/test.txt"}
)

API Reference

init() Parameters

Parameter Type Description
mcp_servers dict Map of server name to server configuration
mcp_mode "direct" or "search_execute" How to expose MCP tools
mcp_search_limit int Max tools to return in search (default: 10)
mcp_search_backend "bm25" or "semantic" Search algorithm to use

Call await agent.destroy() when your process is shutting down to close direct MCP connections cleanly.

MCPClient Methods

Method Description
list_tools() Get all available tools from all servers
list_tools_async() Async version of list_tools
search_tools(query, limit, server) Search tools by query
call_tool(server, tool, arguments) Execute a tool
call_tool_async() Async version of call_tool
load_index() Pre-load tool index for faster searches

Choosing a Mode

Scenario Mode
MCP server with <50 tools Direct
MCP server with 100+ tools Search + Execute
Multiple MCP servers Search + Execute
Simple, predictable tool access Direct
Limited context window Search + Execute
Need semantic tool matching Search + Execute + semantic backend