Server Composition Gateway Spike: FastMCP mount()
Date: 2026-02-24 | Version: v0.18.0 Phase 2 | Status: Complete
Summary
Evaluated FastMCP 3.x mount() for per-session MCP server aggregation — a single gateway endpoint that composes shoal-orchestrator with each session's configured MCP servers (memory, github, filesystem, etc.).
Recommendation: No-go for now. The composition gateway adds process overhead and latency without solving a problem Shoal actually has today. The existing pool + proxy architecture is simpler and already works. Revisit if agent clients stop supporting multi-server configs or when Shoal needs cross-server tool orchestration.
1. What mount() Does in FastMCP 3.x
FastMCP 3.x provides mount() for composing multiple MCP servers into a single unified endpoint. A parent server mounts child servers, exposing all their tools, resources, and prompts through one connection.
API:
    def mount(
        self,
        server: FastMCP,
        namespace: str | None = None,  # prefix for tool names
        tool_names: dict[str, str] | None = None,  # custom name overrides
    ) -> None: ...
Namespacing: When namespace="memory" is provided, tool create becomes memory_create. An underscore separator is auto-inserted — never include a trailing underscore in the namespace string (known gotcha, GitHub #1308).
Dynamic link: Unlike import_server() (one-time static copy), mount() creates a live connection — tools added to the child after mounting are immediately accessible through the parent.
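The difference between a live mount and a static import can be pictured with a toy model. The sketch below is not FastMCP code: ToyServer and its methods are hypothetical stand-ins that only mimic the behavior described above, where a mount keeps a reference to the child and an import copies its tools once.

```python
class ToyServer:
    """Hypothetical stand-in for an MCP server; not the FastMCP class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: dict[str, str] = {}  # tools defined directly on this server
        self.mounted: list[tuple[str, "ToyServer"]] = []  # live child references

    def mount(self, child: "ToyServer", namespace: str) -> None:
        # Live link: keep a reference and resolve the child's tools on every lookup.
        self.mounted.append((namespace, child))

    def import_server(self, child: "ToyServer", namespace: str) -> None:
        # Static copy: snapshot the child's tools as they exist right now.
        for tool, description in child.tools.items():
            self.tools[f"{namespace}_{tool}"] = description

    def list_tools(self) -> list[str]:
        names = list(self.tools)
        for namespace, child in self.mounted:
            names += [f"{namespace}_{tool}" for tool in child.list_tools()]
        return sorted(names)


memory = ToyServer("memory")
memory.tools["create"] = "Create an entity"

live = ToyServer("live-gateway")
live.mount(memory, namespace="memory")

static = ToyServer("static-gateway")
static.import_server(memory, namespace="memory")

# A tool added after composition is only visible through the live mount.
memory.tools["search"] = "Search entities"
print(live.list_tools())
print(static.list_tools())
```

In this model the live gateway picks up the later tool while the static gateway keeps its original snapshot, which is the trade-off between mount() and import_server() described above.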
Remote servers via create_proxy(): Third-party MCP servers can be mounted by first wrapping them in a proxy:
    from fastmcp import FastMCP
    from fastmcp.server import create_proxy

    gateway = FastMCP("Session Gateway")

    # Mount shoal-orchestrator in-process (zero overhead).
    # shoal_mcp is the existing shoal-orchestrator FastMCP instance, defined elsewhere.
    gateway.mount(shoal_mcp, namespace="shoal")

    # Mount third-party stdio server via proxy
    config = {"mcpServers": {"default": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-memory"],
    }}}
    gateway.mount(create_proxy(config), namespace="memory")
Supported transports: stdio, HTTP (streamable-http), SSE, in-memory. No Unix domain socket transport — FastMCP cannot connect to Shoal's existing UDS pool sockets.
2. Architecture Options
Option A: Gateway-per-session
Each session gets its own gateway process that mounts the session's configured MCP servers.
    Session "feature-ui" (mcp: [memory, github])
    └─ Gateway (HTTP :8391)
       ├─ shoal-orchestrator (in-process, namespace="shoal")
       ├─ memory (stdio proxy, namespace="memory")
       └─ github (stdio proxy, namespace="github")

    Session "feature-api" (mcp: [memory])
    └─ Gateway (HTTP :8392)
       ├─ shoal-orchestrator (in-process, namespace="shoal")
       └─ memory (stdio proxy, namespace="memory")
| Aspect | Assessment |
|---|---|
| Per-session tool surface | Each agent sees only its configured servers |
| Process overhead | 1 gateway process + N proxied subprocesses per session |
| Memory | ~90 MB per gateway (Python + uvicorn + FastMCP) |
| Port allocation | Needs dynamic port management per session |
| Lifecycle | Must start/stop gateway with session create/kill |
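To make the port-allocation and lifecycle rows concrete, here is a minimal sketch of what planning one per-session gateway might involve. It uses only the standard library; free_port(), plan_gateway(), and the REGISTRY contents (including the github package name) are hypothetical illustrations, not Shoal or FastMCP APIs.

```python
import socket


def free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def plan_gateway(session: str, mcp_servers: list[str], registry: dict[str, dict]) -> dict:
    """Describe the gateway one session would need (hypothetical helper)."""
    unknown = [name for name in mcp_servers if name not in registry]
    if unknown:
        raise KeyError(f"servers not in registry: {unknown}")
    return {
        "session": session,
        "port": free_port(),
        # shoal-orchestrator is mounted in-process; everything else is proxied.
        "in_process": ["shoal"],
        "proxied": {
            name: {"mcpServers": {"default": registry[name]}} for name in mcp_servers
        },
    }


# Hypothetical registry contents, standing in for mcp-servers.toml.
REGISTRY = {
    "memory": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-memory"]},
    "github": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"]},
}

plan = plan_gateway("feature-ui", ["memory", "github"], REGISTRY)
print(plan["port"], sorted(plan["proxied"]))
```

Binding to port 0 is racy: another process can take the port between this probe and the moment the gateway actually binds it, so a real implementation would need to handle a failed bind and retry.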
Option B: Shared gateway
Single gateway mounts all registered MCP servers. All sessions connect to the same endpoint.
    Shared Gateway (HTTP :8390)
    ├─ shoal-orchestrator (in-process, namespace="shoal")
    ├─ memory (stdio proxy, namespace="memory")
    ├─ github (stdio proxy, namespace="github")
    └─ filesystem (stdio proxy, namespace="fs")
| Aspect | Assessment |
|---|---|
| Per-session tool surface | All agents see all servers (no session-level filtering) |
| Process overhead | 1 gateway + N proxied subprocesses (shared) |
| Memory | ~90 MB total |
| Port allocation | Single well-known port |
| Lifecycle | Starts once, survives session churn |
Which fits Shoal? Neither is a clear win. Option A is architecturally cleaner (session isolation) but expensive — 90 MB per gateway is steep when sessions already have their own tmux panes, worktrees, and agent processes. Option B is simpler but violates Shoal's per-session MCP configuration model (sessions declare which MCP servers they need).
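The memory trade-off is simple arithmetic, sketched here using the ~90 MB figure from the tables. The function name is hypothetical, and the estimate covers gateway processes only; proxied subprocesses are not counted.

```python
GATEWAY_MB = 90  # approximate per-gateway footprint quoted in the tables above


def gateway_memory_mb(sessions: int, per_session: bool) -> int:
    """Estimate gateway-only memory in MB for a given number of sessions."""
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    # Option A runs one gateway per session; Option B runs one gateway in total.
    gateways = sessions if per_session else 1
    return gateways * GATEWAY_MB


for n in (1, 5, 10):
    option_a = gateway_memory_mb(n, per_session=True)
    option_b = gateway_memory_mb(n, per_session=False)
    print(f"{n} sessions: Option A ~{option_a} MB, Option B ~{option_b} MB")
```

Under these assumptions Option A grows linearly with session count while Option B stays flat, which is the cost side of the isolation trade-off.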
3. Integration with Existing Architecture
Current flow (pool + proxy)
- mcp_pool.py: asyncio Unix socket listener per MCP type. Each client connection spawns a fresh MCP command and bridges I/O.
- mcp_proxy.py: stdio-to-UDS bridge binary. The agent's MCP client config points command to shoal-mcp-proxy <name>.
- shoal-orchestrator: Already a FastMCP server, runs via HTTP (port 8390) or stdio.
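The spawn-per-connection pattern in the first bullet can be sketched with the standard library. This is an illustration of the technique, not the actual mcp_pool.py: the function names are hypothetical, and cat stands in for a real MCP server command.

```python
import asyncio
import os
import tempfile


async def pump(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while chunk := await reader.read(65536):
            writer.write(chunk)
            await writer.drain()
    finally:
        writer.close()


def make_handler(command: list[str]):
    """Build a connection handler that spawns `command` once per client."""

    async def handle(client_reader, client_writer) -> None:
        proc = await asyncio.create_subprocess_exec(
            *command,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
        )
        # Bridge both directions: socket -> child stdin, child stdout -> socket.
        await asyncio.gather(
            pump(client_reader, proc.stdin),
            pump(proc.stdout, client_writer),
        )
        await proc.wait()

    return handle


async def demo() -> bytes:
    """Serve `cat` on a Unix socket and send it a single line."""
    sock_path = os.path.join(tempfile.mkdtemp(), "demo.sock")
    server = await asyncio.start_unix_server(make_handler(["cat"]), path=sock_path)
    async with server:
        reader, writer = await asyncio.open_unix_connection(sock_path)
        writer.write(b'{"jsonrpc": "2.0", "id": 1, "method": "ping"}\n')
        await writer.drain()
        writer.write_eof()  # signal end of input so the child exits
        reply = await reader.read()  # read until the bridge closes the socket
        writer.close()
        await writer.wait_closed()
    return reply


reply = asyncio.run(demo())
print(reply)
```

Because every connection gets its own child process, nothing is shared between clients. This sketch needs a platform with Unix domain sockets and a cat binary on PATH.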
What a gateway would change
The gateway would replace the proxy + pool path for mounted servers, using create_proxy() to manage subprocess spawning instead of mcp_pool.py.
Integration points:
- shoal-orchestrator: Can be mounted in-process (gateway.mount(mcp, namespace="shoal")). This is the cleanest part: zero overhead, no serialization.
- Third-party stdio servers: Must use create_proxy() with MCPConfig dicts. This duplicates what mcp_pool.py already does (spawn + bridge) but through FastMCP's proxy layer instead of raw asyncio.
- Session lifecycle: create_session_lifecycle() would need to start a gateway per session (Option A) or register with a shared gateway (Option B). Either requires new lifecycle code.
- Agent config: Agents currently get per-server entries in their MCP config. A gateway collapses these to one endpoint. This works if the agent client supports HTTP transport, which Claude Code and OpenCode do.
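The agent-config change in the last bullet can be illustrated by comparing the two shapes. This is a rough sketch: the split of shoal-mcp-proxy <name> into command and args, the "shoal-gateway" entry name, the "url" key, and the /mcp path are all assumptions chosen for illustration, and real agent clients may expect different keys.

```python
def per_server_config(servers: list[str]) -> dict:
    """Current shape: one stdio entry per MCP server, routed through the proxy binary."""
    return {
        "mcpServers": {
            name: {"command": "shoal-mcp-proxy", "args": [name]} for name in servers
        }
    }


def gateway_config(port: int) -> dict:
    """Gateway shape: a single HTTP entry (key names and path are assumptions)."""
    return {"mcpServers": {"shoal-gateway": {"url": f"http://127.0.0.1:{port}/mcp"}}}


before = per_server_config(["memory", "github"])
after = gateway_config(8391)
print(len(before["mcpServers"]), "entries ->", len(after["mcpServers"]), "entry")
```

The per-session server list no longer appears in the agent's config in the gateway shape, so the gateway itself would have to carry that information.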
What stays unchanged
- Tool configs (~/.config/shoal/tools/*.toml): unchanged
- Session templates: still declare mcp: [memory, github]
- mcp-servers.toml registry: still defines available servers
- Status detection, journals, lifecycle hooks: all orthogonal
4. Known Limitations
No UDS transport
FastMCP does not support Unix domain sockets. The gateway cannot connect to Shoal's existing pool sockets. A gateway must manage its own subprocess spawning via create_proxy(), meaning the pool and gateway are parallel systems, not layered.
No shared state in spawn-per-connection model
create_proxy() with stdio targets spawns a fresh process per client session by default. This matches Shoal's current pool behavior — MCP memory written by one agent is not readable by another. A connected Client object can share state, but FastMCP warns about "context mixing in concurrent scenarios."
Proxy latency overhead
| Operation | Local (in-process) | Proxied (stdio subprocess) |
|---|---|---|
| list_tools() | 1-2ms | 300-400ms |
| call_tool() | 1-2ms | 200-500ms |
This compounds across mounted servers. A gateway mounting 3 proxied servers adds at least 300-400ms to tool discovery, and roughly 900-1200ms if the children are queried one after another. Not catastrophic for AI agents (they call tools infrequently relative to thinking time), but measurable.
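The compounding can be estimated from the table's per-server range. This spike did not establish whether the gateway queries mounted children sequentially or concurrently, so the sketch below (with a hypothetical function name) shows both bounds.

```python
PROXIED_LIST_TOOLS_MS = (300, 400)  # per-server list_tools() range from the table


def discovery_latency_ms(proxied_servers: int, concurrent: bool = False) -> tuple[int, int]:
    """Return a (low, high) estimate in ms for list_tools() across proxied servers."""
    if proxied_servers <= 0:
        return (0, 0)
    low, high = PROXIED_LIST_TOOLS_MS
    if concurrent:
        # Bounded by the slowest server if all children are queried at once.
        return (low, high)
    # Sequential fan-out: the per-server cost adds up.
    return (low * proxied_servers, high * proxied_servers)


print(discovery_latency_ms(3))                   # sequential estimate
print(discovery_latency_ms(3, concurrent=True))  # concurrent estimate
```

In-process mounts such as shoal-orchestrator are left out because their 1-2ms cost is negligible next to the proxied figures.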
Tool namespace collisions
Without namespacing, the most recently mounted server wins on name conflicts. With namespacing, tool names become longer (memory_create vs create), consuming more of the agent's context window. Minor issue in practice.
Tool count overload
Mounting many servers exposes many tools to the LLM. Claude Code and OpenCode handle this well today, but it's a known concern in the MCP ecosystem. The fastmcp-gateway project addresses this with "progressive discovery" meta-tools.
Namespace gotcha
The auto-inserted underscore separator means namespace="f1_" produces f1__add (double underscore). Always omit trailing underscores.
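One way to guard against this is to normalize namespaces before passing them to mount(). A minimal sketch, assuming the join rule described above (namespace, underscore, tool name); both helpers are hypothetical and not part of FastMCP.

```python
def safe_namespace(namespace: str) -> str:
    """Strip trailing underscores so the auto-inserted separator is not doubled."""
    cleaned = namespace.rstrip("_")
    if not cleaned:
        raise ValueError(f"namespace {namespace!r} is empty after stripping underscores")
    return cleaned


def exposed_name(namespace: str, tool: str) -> str:
    """Mimic the documented join: namespace + "_" + tool."""
    return f"{namespace}_{tool}"


print(exposed_name("f1_", "add"))                  # the gotcha
print(exposed_name(safe_namespace("f1_"), "add"))  # after normalizing
```

A call site would then look like gateway.mount(child, namespace=safe_namespace(raw_name)), keeping the check in one place.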
5. Decision
No-go (defer to backlog)
| Factor | Assessment |
|---|---|
| Problem severity | Low — agents already connect to multiple MCP servers via per-server config entries. No user pain reported. |
| Architecture fit | Poor — duplicates pool subprocess management. No UDS support means gateway and pool are parallel systems. |
| Complexity | High — per-session gateway requires lifecycle management, port allocation, memory overhead. Shared gateway breaks session-level MCP isolation. |
| Benefit | Marginal — single endpoint per session vs N endpoints. Agents handle multiple endpoints fine today. |
| Latency | 300-400ms proxy overhead per mounted server, compounding. Acceptable but not free. |
| Dependency risk | FastMCP 3.x is new (2026-02). mount() and create_proxy() APIs are stabilizing but have open issues (#1308, #2802). |
Rationale
The composition gateway solves a problem Shoal doesn't have yet. Today's architecture — where each agent gets per-server MCP config entries pointing to the pool proxy — works correctly, is battle-tested, and adds no extra processes.
A gateway becomes valuable when:
- Cross-server orchestration is needed (e.g., a tool that reads from memory and writes to github in one operation)
- Agent clients stop supporting multi-server configs (unlikely — the trend is toward more MCP support)
- Robo supervisor needs a unified tool surface across all session MCP servers (possible in Phase 4)
Follow-up
- Move "Server Composition Gateway" from Phase 2 to backlog in ROADMAP.md
- Revisit if robo supervisor (Phase 4) needs unified cross-session MCP access
- Monitor FastMCP UDS transport support — if added, the gateway becomes cheaper to integrate
- Consider import_server() (static copy, no proxy overhead) if a lightweight variant is needed
6. References
- FastMCP Server Composition
- FastMCP Proxy Servers
- FastMCP Server API Reference
- Introducing FastMCP 3.0
- What's New in FastMCP 3.0
- GitHub #1308: prefix parameter namespace issue
- Shoal Transport Spike (v0.15.0) — prior art on FastMCP transport evaluation