Fluent MCP
A modern framework for building Model Context Protocol (MCP) servers with intelligent reasoning capabilities.
Overview
Fluent MCP is a toolkit for scaffolding and managing MCP servers with a focus on AI integration. It provides a structured approach to building servers that can perform embedded reasoning with language models, register and execute tools, and manage prompts and configurations.
The framework is designed to be extensible: LLMs can build and register their own tools, which supports the development of self-improving AI systems.
Core Architecture Pattern
Fluent MCP implements a powerful architectural pattern that divides reasoning work between two tiers of LLMs:
Two-Tier LLM Architecture
- Embedded LLM: An internal LLM that performs complex reasoning and multi-step tasks
- Consuming LLM: The external LLM (like Claude) that interacts with your MCP server
Tool Separation
- Embedded Tools: Internal tools that are ONLY available to the embedded LLM, not exposed externally
- External Tools: Tools that are exposed to consuming LLMs through the MCP protocol
Reasoning Offloading
- Complex multi-step reasoning is offloaded from the consuming LLM to the embedded LLM
- External tools can leverage embedded reasoning internally while presenting a simple interface
- This creates a "reasoning sandwich" where complex logic happens in the middle layer
Benefits
- Token Efficiency: Consuming LLMs use fewer tokens by offloading reasoning to embedded LLMs
- Cost Reduction: Smaller, specialized models can handle specific reasoning tasks at lower cost
- Complexity Hiding: Complex multi-step processes are hidden behind simple interfaces
- Separation of Concerns: Clear boundaries between what's exposed and what's internal

(Figure: Fluent MCP architecture diagram)
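In code, this separation comes down to two registry decorators, shown in full in the Quick Start below. A minimal sketch (the function names here are illustrative, not part of the framework):

from fluent_mcp.core.tool_registry import register_embedded_tool, register_external_tool

@register_embedded_tool()  # visible ONLY to the embedded LLM
def lookup(term: str) -> str:
    """Illustrative internal tool."""
    return f"facts about {term}"

@register_external_tool()  # exposed to consuming LLMs over MCP
async def answer(question: str) -> dict:
    """Illustrative external tool."""
    return {"answer": f"researched: {question}"}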
Features
- Reasoning Offloading: Offload complex reasoning from consuming LLMs to embedded LLMs for token and cost efficiency
- Tool Separation: Clear distinction between embedded tools (internal) and external tools (exposed)
- Server Scaffolding: Generate new MCP server projects with the proper structure
- LLM Integration: Seamlessly connect to language models from different providers
- Tool Registry: Register both embedded tools (used internally) and external tools (exposed to consuming LLMs)
- Embedded Reasoning: Run reasoning with LLMs and execute their tool calls
- Prompt Management: Load and manage prompts from files with support for tool definitions in frontmatter
- Error Handling: Robust error handling for LLM integration and tool execution (see the sketch after this list)
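A minimal sketch of defensive error handling around embedded reasoning. The broad except clause is a placeholder: this README does not document fluent_mcp's exception classes, so substitute the framework's specific errors if they exist:

from fluent_mcp.core.llm_client import run_embedded_reasoning

async def safe_reasoning(system_prompt: str, user_prompt: str) -> dict:
    """Wrap embedded reasoning so tool code can degrade gracefully."""
    try:
        return await run_embedded_reasoning(
            system_prompt=system_prompt,
            user_prompt=user_prompt,
        )
    except Exception as exc:  # assumption: replace with fluent_mcp-specific errors
        return {"content": "", "error": str(exc)}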
Installation
pip install fluent_mcp
For development:
# Clone the repository
git clone https://github.com/yourusername/fluent_mcp.git
cd fluent_mcp
# Install in development mode with extra dependencies
pip install -e ".[dev]"
Quick Start
Creating a New Server
Use the CLI to scaffold a new server:
fluent-mcp new my_server
Or create a server programmatically:
from fluent_mcp import scaffold_server
scaffold_server(
    output_dir=".",
    server_name="my_server",
    description="My MCP server with AI capabilities",
)
Implementing the Core Architecture Pattern
from fluent_mcp.core.tool_registry import register_embedded_tool, register_external_tool
from fluent_mcp.core.llm_client import run_embedded_reasoning
import asyncio

# 1. Define embedded tools (ONLY available to the embedded LLM)
@register_embedded_tool()
def search_database(query: str) -> list:
    """Search the database for information (only used internally)."""
    # Implementation...
    return ["result1", "result2"]

@register_embedded_tool()
def analyze_data(data: list) -> dict:
    """Analyze data and extract insights (only used internally)."""
    # Implementation...
    return {"key_insight": "finding", "confidence": 0.95}

# 2. Define an external tool that leverages embedded reasoning
@register_external_tool()
async def research_question(question: str) -> dict:
    """
    Research a question and provide a comprehensive answer.

    This external tool is exposed to consuming LLMs but internally
    uses embedded reasoning with access to embedded tools.
    """
    # Define the system prompt for embedded reasoning
    system_prompt = """
    You are a research assistant with access to internal tools:
    - search_database: Search for information
    - analyze_data: Analyze and extract insights
    Use these tools to thoroughly research the question.
    """

    # Run embedded reasoning (this is where the magic happens)
    result = await run_embedded_reasoning(
        system_prompt=system_prompt,
        user_prompt=f"Research this question: {question}",
    )

    # Return a clean, structured response to the consuming LLM
    return {
        "answer": result["content"],
        "confidence": 0.9,
        "sources": ["source1", "source2"],
    }
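Because research_question is an ordinary async function before the server exposes it, you can exercise it directly during development. A quick local smoke test (this bypasses the MCP protocol entirely and assumes a reachable embedded LLM backend, such as the ollama configuration shown below):

# Call the external tool directly for local testing.
if __name__ == "__main__":
    print(asyncio.run(research_question("What is reasoning offloading?")))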
Running a Server with the Architecture Pattern
from fluent_mcp import create_mcp_server
from my_tools import search_database, analyze_data, research_question

# Create and run the MCP server
server = create_mcp_server(
    server_name="my_server",
    # Embedded tools (ONLY available to the embedded LLM)
    embedded_tools=[search_database, analyze_data],
    # External tools (exposed to consuming LLMs)
    external_tools=[research_question],
    config={
        "provider": "ollama",
        "model": "llama2",
        "base_url": "http://localhost:11434",
        "api_key": "ollama",
    },
)

server.run()
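The config dictionary selects the provider and model backing the embedded LLM. For a hosted provider the same keys should apply; a hypothetical OpenAI-style configuration (the provider value and base URL are assumptions, not documented in this README):

# Hypothetical alternative config; key names mirror the ollama example above.
config={
    "provider": "openai",                    # assumed provider identifier
    "model": "gpt-4",
    "base_url": "https://api.openai.com/v1",
    "api_key": "<your API key>",
}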
Using Prompts with Tool Definitions
Fluent MCP supports defining which tools are available to a prompt directly in the prompt's frontmatter:
---
name: math_tools
description: A prompt that uses math-related tools
model: gpt-4
temperature: 0.3
tools:
- add_numbers
- multiply_numbers
---
You are a math assistant that can perform calculations.
Use the available tools to help solve math problems.
When using this prompt with embedded reasoning, only the specified tools will be available:
from fluent_mcp.core.llm_client import run_embedded_reasoning

# Get a prompt with tool definitions
math_prompt = server.get_prompt("math_tools")

# Run embedded reasoning with only the tools defined in the prompt
result = await run_embedded_reasoning(
    system_prompt=math_prompt["template"],
    user_prompt="What is 5 + 3?",
    prompt=math_prompt,  # Pass the prompt to use its tool definitions
)
This approach allows for more precise control over which tools are available to different prompts, improving security and reducing the chance of unintended tool usage.
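The names listed under tools: must match tools registered with the server. A minimal sketch of the two tools referenced in the frontmatter above, assuming they are registered as embedded tools with the decorator from the Quick Start:

from fluent_mcp.core.tool_registry import register_embedded_tool

@register_embedded_tool()
def add_numbers(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

@register_embedded_tool()
def multiply_numbers(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b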
Documentation
For more detailed documentation, see the docs directory:
- Getting Started
- Creating Tools
- Running a Server
- Embedded Reasoning
- Rate Limiting
- Self-Improving Systems
- Troubleshooting
- API Reference
Examples
Check out the examples directory for complete working examples:
- Tool Registry Example
- Embedded Reasoning Example
- External Tools Example
- Reasoning External Tool Example
- Prompt Loader Example
- Prompt Tools Example
Development
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
flake8
black .
isort .
License
MIT