Knowledge Base MCP Server
by MCP-Mirror
This MCP server provides tools for listing and retrieving content from different knowledge bases. It leverages semantic search to find relevant information within specified knowledge domains.
<a href="https://glama.ai/mcp/servers/n0p6v0o0a4"> <img width="380" height="200" src="https://glama.ai/mcp/servers/n0p6v0o0a4/badge" alt="Knowledge Base Server MCP server" /> </a>
Setup Instructions
These instructions assume you have Node.js and npm installed on your system.
Prerequisites
- Clone the repository:
    git clone <repository_url>
    cd knowledge-base-mcp-server
- Install dependencies:
    npm install
- Configure environment variables:
  - The server requires the `HUGGINGFACE_API_KEY` environment variable to be set. This is the API key for the Hugging Face Inference API, which is used to generate embeddings for the knowledge base content. You can obtain a free API key from the Hugging Face website (https://huggingface.co/).
  - The server reads the `KNOWLEDGE_BASES_ROOT_DIR` environment variable, which specifies the directory where the knowledge base subdirectories are located. If it is not set, it defaults to `$HOME/knowledge_bases`, where `$HOME` is the current user's home directory.
  - The server supports the `FAISS_INDEX_PATH` environment variable to specify the path to the FAISS index. If not set, it defaults to `$HOME/knowledge_bases/.faiss`.
  - The server supports the `HUGGINGFACE_MODEL_NAME` environment variable to specify the Hugging Face model used for generating embeddings. If not set, it defaults to `sentence-transformers/all-MiniLM-L6-v2`.
  - You can set these environment variables in your `.bashrc` or `.zshrc` file, or directly in the MCP settings.
- Build the server:
    npm run build
- Add the server to the MCP settings:
  - Edit the `cline_mcp_settings.json` file located at `/home/jean/.vscode-server/data/User/globalStorage/saoudrizwan.claude-dev/settings/` (the home directory in this path is specific to the author's machine; adjust it for your system).
  - Add the following configuration to the `mcpServers` object:

        "knowledge-base-mcp": {
          "command": "node",
          "args": [
            "/path/to/knowledge-base-mcp-server/build/index.js"
          ],
          "disabled": false,
          "autoApprove": [],
          "env": {
            "KNOWLEDGE_BASES_ROOT_DIR": "/path/to/knowledge_bases",
            "HUGGINGFACE_API_KEY": "YOUR_HUGGINGFACE_API_KEY"
          },
          "description": "Retrieves similar chunks from the knowledge base based on a query."
        }

  - Replace `/path/to/knowledge-base-mcp-server` with the actual path to the server directory.
  - Replace `/path/to/knowledge_bases` with the actual path to the knowledge bases directory.
- Create knowledge base directories:
  - Create subdirectories within the `KNOWLEDGE_BASES_ROOT_DIR` for each knowledge base (e.g., `company`, `it_support`, `onboarding`).
  - Place text files (e.g., `.txt`, `.md`) containing the knowledge base content within these subdirectories.
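The environment variables from the configuration step, together with their documented defaults, can be sketched as a small resolver. This is only an illustration and not the server's actual code: the `ServerConfig` interface, its field names, and the `resolveConfig` function are hypothetical, and the sketch assumes the FAISS default is taken literally as `$HOME/knowledge_bases/.faiss`.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical shape for the resolved settings; the field names are illustrative.
interface ServerConfig {
  huggingfaceApiKey: string;
  knowledgeBasesRootDir: string;
  faissIndexPath: string;
  huggingfaceModelName: string;
}

// Resolve settings from an environment map, applying the documented defaults.
function resolveConfig(env: Record<string, string | undefined>): ServerConfig {
  const apiKey = env.HUGGINGFACE_API_KEY;
  if (!apiKey) {
    // The API key is the only variable without a documented default.
    throw new Error("HUGGINGFACE_API_KEY must be set");
  }
  const defaultRoot = path.join(os.homedir(), "knowledge_bases");
  return {
    huggingfaceApiKey: apiKey,
    knowledgeBasesRootDir: env.KNOWLEDGE_BASES_ROOT_DIR || defaultRoot,
    // Assumption: the default index path is $HOME/knowledge_bases/.faiss as documented.
    faissIndexPath: env.FAISS_INDEX_PATH || path.join(defaultRoot, ".faiss"),
    huggingfaceModelName:
      env.HUGGINGFACE_MODEL_NAME || "sentence-transformers/all-MiniLM-L6-v2",
  };
}
```

In a Node.js process this could be called as `resolveConfig(process.env)`; passing the environment in as a parameter keeps the function easy to exercise with test values.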
- The server recursively reads all text files (e.g., `.txt`, `.md`) within the specified knowledge base subdirectories.
- The server skips hidden files and directories (those starting with a `.`).
- For each file, the server calculates the SHA256 hash and stores it in a file with the same name in a hidden `.index` subdirectory. This hash is used to determine whether the file has been modified since the last indexing.
- The file content is split into chunks using the `MarkdownTextSplitter` from `langchain/text_splitter`.
- The content of each chunk is then added to a FAISS index, which is used for similarity search.
- The FAISS index is automatically initialized when the server starts. It checks for changes in the knowledge base files and updates the index accordingly.
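The file discovery and hash-based change detection described above can be sketched with Node.js built-ins. This is a minimal illustration under stated assumptions, not the server's implementation: the function names are hypothetical, the `.index` directory is assumed to sit next to each file, and the sketch does not filter by file extension.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

// Compute the SHA256 hex digest of a file's contents.
function sha256OfFile(filePath: string): string {
  return createHash("sha256").update(fs.readFileSync(filePath)).digest("hex");
}

// Return true when the file is new or changed since its hash was last stored,
// and record the current hash in a same-named file under a hidden .index directory.
// Assumption: .index lives in the same directory as the file being hashed.
function needsReindex(filePath: string): boolean {
  const indexDir = path.join(path.dirname(filePath), ".index");
  const hashPath = path.join(indexDir, path.basename(filePath));
  const current = sha256OfFile(filePath);
  const previous = fs.existsSync(hashPath)
    ? fs.readFileSync(hashPath, "utf8")
    : null;
  if (previous === current) return false;
  fs.mkdirSync(indexDir, { recursive: true });
  fs.writeFileSync(hashPath, current);
  return true;
}

// List files under a directory recursively, skipping hidden files and directories.
function listFiles(dir: string): string[] {
  const out: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name.startsWith(".")) continue;
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) out.push(...listFiles(full));
    else if (entry.isFile()) out.push(full);
  }
  return out;
}
```

Because `.index` starts with a dot, the stored hash files are skipped by `listFiles` on the next pass, so only files for which `needsReindex` returns `true` would need to be split and embedded again.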
Usage
The server exposes two tools:
- `list_knowledge_bases`: Lists the available knowledge bases.
- `retrieve_knowledge`: Retrieves similar chunks from the knowledge base based on a query. Optionally, if a knowledge base is specified, only that one is searched; otherwise, all available knowledge bases are considered. By default, at most 10 document chunks are returned with a score below a threshold of 2. A different threshold can optionally be provided using the `threshold` parameter.
You can use these tools through the MCP interface.
The retrieve_knowledge tool performs a semantic search using a FAISS index. The index is automatically updated when the server starts or when a file in a knowledge base is modified.
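The default limit of 10 chunks and the score threshold of 2 described for `retrieve_knowledge` can be illustrated with a small filter. This is a sketch only: the `ScoredChunk` shape and the `selectResults` function are hypothetical, and it assumes the score behaves like a distance, so lower values mean closer matches.

```typescript
// Hypothetical result shape: chunk text, its source path, and a distance-like score.
interface ScoredChunk {
  content: string;
  source: string;
  score: number;
}

// Keep chunks scoring below the threshold, best (lowest) first, capped at `limit`.
// Assumption: a lower score means a closer match.
function selectResults(
  chunks: ScoredChunk[],
  threshold = 2,
  limit = 10,
): ScoredChunk[] {
  return chunks
    .filter((c) => c.score < threshold)
    .sort((a, b) => a.score - b.score)
    .slice(0, limit);
}
```

Lowering the threshold trades recall for precision: fewer chunks pass the filter, but those that do are closer to the query.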
The output of the retrieve_knowledge tool is a markdown formatted string with the following structure:
## Semantic Search Results
**Result 1:**
[Content of the most similar chunk]
**Source:**
```json
{
"source": "[Path to the file containing the chunk]"
}
```
---
**Result 2:**
[Content of the second most similar chunk]
**Source:**
```json
{
"source": "[Path to the file containing the chunk]"
}
```
> **Disclaimer:** The provided results might not all be relevant. Please cross-check the relevance of the information.
Each result includes the content of the most similar chunk, the source file, and a similarity score.
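As a rough illustration of how the layout shown above could be produced, the following sketch assembles the markdown string from a list of results. The `SearchResult` shape and the `formatResults` function are hypothetical, the exact blank-line spacing of the real output is not known, and only the fields visible in the documented structure are rendered.

```typescript
// Hypothetical result shape; only fields visible in the documented layout are used.
interface SearchResult {
  content: string;
  source: string;
}

// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`".repeat(3);

// Render results in the documented order: heading, numbered results with a
// JSON source block each, "---" separators, and the closing disclaimer.
function formatResults(results: SearchResult[]): string {
  const sections = results.map((r, i) =>
    [
      `**Result ${i + 1}:**`,
      r.content,
      "**Source:**",
      FENCE + "json",
      JSON.stringify({ source: r.source }, null, 2),
      FENCE,
    ].join("\n"),
  );
  const disclaimer =
    "> **Disclaimer:** The provided results might not all be relevant. " +
    "Please cross-check the relevance of the information.";
  return ["## Semantic Search Results", sections.join("\n---\n"), disclaimer].join("\n");
}
```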