
MCP RAG Server

by shtse8/sylphlab

The mcp-rag-server is a Model Context Protocol (MCP) server that enables Retrieval Augmented Generation (RAG) capabilities for connected LLMs. It indexes documents from your project and provides relevant context to enhance LLM responses.



What is MCP RAG Server?

The MCP RAG Server is a Model Context Protocol (MCP) server designed to provide Retrieval Augmented Generation (RAG) capabilities to large language models (LLMs). It uses Ollama for local embedding generation and ChromaDB for vector storage, indexing your project's documents so that relevant context can be retrieved to ground LLM responses.

How to use MCP RAG Server?

To use the server:

  1. Install Docker Desktop and clone the repository.

  2. Start the services with docker-compose up -d --build.

  3. Once the services are running, pull the embedding model into the Ollama container: docker exec ollama ollama pull nomic-embed-text.

  4. Configure your MCP client to connect to the server (a minimal connection sketch follows below).
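As a rough illustration of the last step, here is a minimal TypeScript sketch that connects an MCP client to the server over stdio and lists its tools. The container name and entry command passed to docker exec are assumptions for illustration, not taken from the project; check the repository's docker-compose.yml for the actual service definition.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the server over stdio. The container name ("mcp-rag-server") and
// entry command ("node dist/index.js") are hypothetical placeholders.
const transport = new StdioClientTransport({
  command: "docker",
  args: ["exec", "-i", "mcp-rag-server", "node", "dist/index.js"],
});

const client = new Client({ name: "example-client", version: "0.1.0" });
await client.connect(transport);

// Sanity check: list the tools the server exposes.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));
```

If the handshake succeeds, the tool list should include the RAG tools described in the features below.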

Key features of MCP RAG Server

  • Automatic Indexing: Scans the project directory on startup and indexes supported files.

  • Supported File Types: Supports .txt, .md, code files, .json, .jsonl, and .csv files.

  • Hierarchical Chunking: Splits Markdown files along their heading hierarchy rather than at arbitrary offsets, so retrieved chunks keep their surrounding context.

  • Vector Storage: Uses ChromaDB for persistent vector storage.

  • Local Embeddings: Leverages Ollama for local embedding generation.

  • MCP Tools: Exposes the RAG functions as standard MCP tools (indexDocuments, queryDocuments, removeDocument, removeAllDocuments, listDocuments); see the query sketch after this list.

  • Dockerized: Includes a docker-compose.yml for easy setup.
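To make the tool surface concrete, here is a hedged TypeScript sketch of invoking queryDocuments through the MCP SDK. It assumes a client already connected as in the sketch above, and the argument shape ({ query }) is a guess at the tool's input schema; inspect the schemas returned by listTools() to confirm the real parameter names.

```typescript
import type { Client } from "@modelcontextprotocol/sdk/client/index.js";

// Query the index through the server's MCP tool interface.
// `client` is a connected MCP client (see the connection sketch above).
async function askIndex(client: Client, question: string) {
  const result = await client.callTool({
    name: "queryDocuments",
    arguments: { query: question }, // hypothetical parameter name
  });
  return result.content; // retrieved chunks to hand to the LLM as context
}
```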

Use cases of MCP RAG Server

  • Enhance LLM responses with relevant context from project documents.

  • Provide a local and private RAG solution.

  • Integrate with Model Context Protocol (MCP) ecosystems.

  • Automatically index and query project files so LLM answers stay grounded in the actual codebase and documentation.

FAQ from MCP RAG Server

What is MCP?

MCP stands for Model Context Protocol, an open standard that defines how applications supply context and tools to LLMs.

What is ChromaDB?

ChromaDB is a vector database used for storing and retrieving document embeddings.

What is Ollama?

Ollama is a tool for running LLMs and embedding models locally; this server uses it to generate document embeddings (e.g., with the nomic-embed-text model).

How do I configure the server?

The server is configured via environment variables, typically set in the docker-compose.yml file.
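As an illustration of the pattern (not the project's actual settings), a server like this might read its configuration as follows; every variable name here is hypothetical, so consult the shipped docker-compose.yml for the real keys.

```typescript
// Hypothetical sketch only: none of these variable names are confirmed by
// the project. They illustrate overriding defaults via the `environment:`
// section of docker-compose.yml.
const config = {
  ollamaBaseUrl: process.env.OLLAMA_BASE_URL ?? "http://ollama:11434", // Ollama's default port
  chromaDbUrl: process.env.CHROMA_URL ?? "http://chromadb:8000",       // ChromaDB's default port
  embeddingModel: process.env.EMBEDDING_MODEL ?? "nomic-embed-text",
};
```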

What file types are supported for indexing?

The server supports .txt, .md, code files, .json, .jsonl, and .csv files.