MCP (Model Context Protocol) Support
PyFlue provides comprehensive MCP support, enabling agents to connect to external tools and services through the Model Context Protocol.
Overview
MCP servers provide tools that agents can call to perform actions. When a server exposes many tools, their names, descriptions, and parameter definitions can consume a significant share of the context window. PyFlue addresses this with two operation modes.
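To make the context cost concrete, the following rough sketch estimates how many tokens a set of tool definitions might occupy once serialized. The tool definitions are made up, `make_tool` and `estimate_tokens` are hypothetical helpers, and the four-characters-per-token ratio is only a common rule of thumb, not how PyFlue or any particular model counts tokens.

```python
import json


def make_tool(i: int) -> dict:
    """Build a made-up tool definition with a name, description, and parameters."""
    return {
        "name": f"tool_{i}",
        "description": f"Hypothetical tool number {i} that performs an example action.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Target path"}},
            "required": ["path"],
        },
    }


def estimate_tokens(tools: list[dict], chars_per_token: float = 4.0) -> int:
    """Approximate the token cost of the serialized tool definitions."""
    serialized = json.dumps(tools)
    return int(len(serialized) / chars_per_token)


# Compare a fixed pair of tools with a large tool set exposed all at once.
small = estimate_tokens([make_tool(i) for i in range(2)])
large = estimate_tokens([make_tool(i) for i in range(200)])
print(f"2 tools: ~{small} tokens, 200 tools: ~{large} tokens")
```

The absolute numbers are not meaningful, but the cost grows roughly linearly with the number of exposed tools, which is the problem the two modes trade off.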
Modes
Direct Mode (Default)
In direct mode, all MCP tools are exposed directly to the agent. Each tool becomes available to the LLM with its name, description, and parameters.
    from pyflue import init

    # `await` needs an async context, for example inside an `async def` function.
    agent = await init(
        mcp_servers={
            "filesystem": {
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
            }
        }
    )
Use Direct mode when:

- The MCP server exposes fewer than 50 tools
- You need simple, predictable tool access
- Context token usage is not a concern
Search + Execute Mode (Opt-in)
In search_execute mode, PyFlue exposes only two tools that give the agent the ability to dynamically discover and call any MCP tool:

- mcp_search(query) - Search for relevant MCP tools using keywords or semantic similarity
- mcp_execute(server, tool, arguments) - Execute a specific tool on a specific server
The agent first searches to find the right tool, then executes it. This approach keeps context usage fixed at ~2 tools regardless of how many tools the MCP server actually provides.
    from pyflue import init

    agent = await init(
        mcp_servers={
            "my-api": {
                "url": "http://localhost:3000/mcp",
                "transport": "streamable-http"
            }
        },
        mcp_mode="search_execute",
        mcp_search_limit=10,
        mcp_search_backend="bm25"
    )
Use Search + Execute mode when:

- The MCP server exposes many tools (100+)
- You connect to multiple MCP servers
- Context window limits are a concern
- You want progressive tool discovery
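The search-then-execute flow can be sketched without any MCP server at all. In the example below, `REGISTRY` is a hypothetical in-memory stand-in for real servers, and the two functions only borrow the names and argument lists of `mcp_search` and `mcp_execute`; the word-overlap scoring is a placeholder, not PyFlue's actual implementation.

```python
from typing import Any, Callable

# Hypothetical registry standing in for real MCP servers:
# (server, tool) -> (description, implementation)
REGISTRY: dict[tuple[str, str], tuple[str, Callable[..., Any]]] = {
    ("files", "read_file"): ("Read the contents of a file", lambda path: f"<contents of {path}>"),
    ("files", "list_dir"): ("List entries in a directory", lambda path: ["a.txt", "b.txt"]),
    ("math", "add"): ("Add two numbers", lambda a, b: a + b),
}


def mcp_search(query: str, limit: int = 10) -> list[dict]:
    """Return tools whose name or description shares at least one word with the query."""
    words = set(query.lower().split())
    hits = []
    for (server, tool), (description, _) in REGISTRY.items():
        text = set(f"{tool.replace('_', ' ')} {description}".lower().split())
        overlap = len(words & text)
        if overlap:
            hits.append({"server": server, "tool": tool,
                         "description": description, "score": overlap})
    hits.sort(key=lambda hit: hit["score"], reverse=True)
    return hits[:limit]


def mcp_execute(server: str, tool: str, arguments: dict) -> Any:
    """Look up a tool by server and name, then call it with the given arguments."""
    _, implementation = REGISTRY[(server, tool)]
    return implementation(**arguments)


# Step 1: the agent searches. Step 2: it executes the best match.
best = mcp_search("read file contents")[0]
result = mcp_execute(best["server"], best["tool"], {"path": "/tmp/test.txt"})
print(best["tool"], "->", result)
```

Only the two entry points need to be described to the model; the registry behind them can grow without changing what the model sees up front.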
Search Backends
BM25 (Default)
BM25 is a keyword-based ranking algorithm that scores tools by how often the query terms appear in each tool's text, how rare those terms are across all tools, and how long the text is. It requires no external dependencies and works out of the box.
    agent = await init(
        mcp_servers={"...": "..."},
        mcp_mode="search_execute",
        mcp_search_backend="bm25"
    )
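For readers unfamiliar with BM25, here is a minimal pure-Python sketch of Okapi BM25 applied to tool descriptions. The function name `bm25_rank`, the whitespace tokenization, and the parameters `k1=1.5` and `b=0.75` are assumptions chosen for illustration (they are common defaults); PyFlue's own index may tokenize and weight differently.

```python
import math
from collections import Counter


def bm25_rank(query: str, docs: dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Rank documents (tool name -> description) against a query with Okapi BM25."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    # Document frequency: in how many descriptions each term appears.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))

    scores = []
    for name, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Longer-than-average descriptions are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((name, score))
    return sorted(scores, key=lambda item: item[1], reverse=True)


tools = {
    "read_file": "read the contents of a file from disk",
    "write_file": "write text to a file on disk",
    "list_directory": "list the entries in a directory",
}
ranked = bm25_rank("read file contents", tools)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Because scoring depends only on shared terms, a query that uses different vocabulary from the tool descriptions scores zero, which is the gap the semantic backend is meant to cover.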
Semantic Search
Semantic search uses embeddings to find tools that are conceptually similar to the query, even if they share no exact keywords. This backend requires the sentence-transformers package:
    agent = await init(
        mcp_servers={"...": "..."},
        mcp_mode="search_execute",
        mcp_search_backend="semantic"
    )
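The idea behind embedding-based ranking can be shown with cosine similarity over toy vectors. Everything in this sketch is made up for illustration: the three-dimensional vectors are written by hand, whereas a real semantic backend would obtain high-dimensional vectors from an embedding model, and the tool names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy embeddings (hand-written). Dimensions loosely mean: files, network, math.
tool_vectors = {
    "read_file": [0.9, 0.1, 0.0],
    "http_get": [0.1, 0.9, 0.0],
    "add_numbers": [0.0, 0.1, 0.9],
}

# Pretend embedding of the query "open a document": it shares no keyword with
# "read_file" but sits close to it in this toy vector space.
query_vector = [0.8, 0.2, 0.1]

ranked_tools = sorted(
    tool_vectors,
    key=lambda name: cosine(query_vector, tool_vectors[name]),
    reverse=True,
)
print(ranked_tools)
```

A keyword ranker would find nothing in common between "open a document" and "read_file"; similarity in vector space is what lets the semantic backend still rank it first.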
MCP Server Configuration
Stdio Servers
Run MCP servers as local processes:
    mcp_servers={
        "local-tools": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
            "env": {"NODE_ENV": "production"},
            "transport": "stdio"
        }
    }
The same configuration can be written in pyflue.toml:
    [mcp]
    mode = "direct"

    [mcp.servers.local-tools]
    command = "npx"
    args = ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
    env = { NODE_ENV = "production" }
HTTP Servers (streamable-http)
Connect to MCP servers over HTTP using the modern streamable-http transport:
    mcp_servers={
        "remote-api": {
            "url": "http://localhost:3000/mcp",
            "transport": "streamable-http",
            "headers": {"Authorization": "Bearer token123"}
        }
    }
TOML form:
    [mcp.servers.remote-api]
    url = "http://localhost:3000/mcp"
    transport = "streamable-http"

    [mcp.servers.remote-api.headers]
    Authorization = "Bearer token123"
SSE Servers (Legacy)
For older MCP servers that use Server-Sent Events:
Using MCPClient Directly
For programmatic access to MCP servers, use the MCPClient class directly:
    from pyflue.mcp import MCPClient

    # Create client with server configuration
    client = MCPClient({
        "my-server": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
        }
    })

    # List all available tools
    tools = client.list_tools()
    print(f"Found {len(tools)} tools")

    # Search for relevant tools using BM25
    results = client.search_tools(query="read file contents", limit=5)
    for tool in results:
        print(f"{tool.name} (score: {tool.score})")

    # Call a specific tool
    result = client.call_tool(
        server="my-server",
        tool="read_file",
        arguments={"path": "/tmp/test.txt"}
    )
API Reference
init() Parameters

| Parameter | Type | Description |
|---|---|---|
| mcp_servers | dict | Map of server name to server configuration |
| mcp_mode | "direct" or "search_execute" | How to expose MCP tools |
| mcp_search_limit | int | Max tools to return in search (default: 10) |
| mcp_search_backend | "bm25" or "semantic" | Search algorithm to use |

Call await agent.destroy() when your process is shutting down to close direct
MCP connections cleanly.
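One way to make sure the cleanup call always runs is a `try`/`finally` block. The sketch below uses `StubAgent` and `stub_init`, hypothetical stand-ins defined only so the example runs without PyFlue installed; in real code the agent would come from `await init(...)`, and only the `destroy()` method name is taken from this page.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for a PyFlue agent; only destroy() mirrors this page."""

    def __init__(self) -> None:
        self.closed = False

    async def destroy(self) -> None:
        # A real agent would close its MCP connections here.
        self.closed = True


async def stub_init() -> StubAgent:
    """Stand-in for `await init(...)`."""
    return StubAgent()


async def main() -> StubAgent:
    agent = await stub_init()
    try:
        # ... use the agent here ...
        pass
    finally:
        # Runs on normal exit and when an exception propagates.
        await agent.destroy()
    return agent


agent = asyncio.run(main())
print(agent.closed)
```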
MCPClient Methods

| Method | Description |
|---|---|
| list_tools() | Get all available tools from all servers |
| list_tools_async() | Async version of list_tools |
| search_tools(query, limit, server) | Search tools by query |
| call_tool(server, tool, arguments) | Execute a tool |
| call_tool_async() | Async version of call_tool |
| load_index() | Pre-load tool index for faster searches |

Choosing a Mode
| Scenario | Mode |
|---|---|
| MCP server with <50 tools | Direct |
| MCP server with 100+ tools | Search + Execute |
| Multiple MCP servers | Search + Execute |
| Simple, predictable tool access | Direct |
| Limited context window | Search + Execute |
| Need semantic tool matching | Search + Execute + semantic backend |
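The table above can be encoded as a small helper that suggests keyword arguments for `init()`. The function `choose_mcp_settings` and its parameters are hypothetical and not part of PyFlue. The table does not say what to do between 50 and 99 tools, so this sketch falls back to direct mode there unless another condition applies; that choice is an assumption.

```python
def choose_mcp_settings(tool_count: int, server_count: int = 1,
                        limited_context: bool = False,
                        need_semantic: bool = False) -> dict:
    """Suggest init() keyword arguments following the table's rules of thumb."""
    if need_semantic:
        # Semantic matching is only available through search_execute mode.
        return {"mcp_mode": "search_execute", "mcp_search_backend": "semantic"}
    if tool_count >= 100 or server_count > 1 or limited_context:
        return {"mcp_mode": "search_execute", "mcp_search_backend": "bm25"}
    # Small, single-server setups: expose every tool directly.
    return {"mcp_mode": "direct"}


print(choose_mcp_settings(tool_count=12))
print(choose_mcp_settings(tool_count=250))
print(choose_mcp_settings(tool_count=30, server_count=3))
print(choose_mcp_settings(tool_count=30, need_semantic=True))
```

The returned dictionary can be unpacked into the call, for example `init(mcp_servers=..., **choose_mcp_settings(tool_count=250))`.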