MCP Server for Documentation Processing

by Rybens92


This MCP server converts technical documentation into llm.txt format so it can be supplied as context to LLMs. It works with environments such as Cursor, Windsurf, Cline, and Roo Code.



What is MCP Server for Documentation Processing?

The MCP Server is a tool designed to convert technical documentation from websites into a format suitable for use as context by Large Language Models (LLMs). It automates the process of extracting, cleaning, and formatting documentation content into .txt files that can be easily ingested by LLMs in various development environments.

How to use MCP Server for Documentation Processing?

1. Clone the repository.
2. Configure the mcp_settings.json file with the correct path to the server.py script.
3. Install the required dependencies using uv pip install -r requirements.txt or pip install -r requirements.txt.
4. Run the server in development mode using fastmcp dev src/server.py, or install it in Claude Desktop using fastmcp install src/server.py.
5. Configure the server in Roo Code or Cline's mcp_settings.json.
6. Use the <use_mcp_tool> tag with the server name (docs-to-llm), tool name (process_documentation), and arguments (URL, library name, save path).
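The configuration step can be illustrated with a small script that prints a settings fragment. This is only a sketch: the "mcpServers", "command", and "args" keys follow the layout commonly used by MCP clients and are an assumption here, as is launching the script with python. The path is a placeholder, so compare the output against the project's own README before copying it into mcp_settings.json.

```python
import json


def build_server_entry(server_script: str) -> dict:
    """Return a settings fragment that launches the given server script.

    The key names below are assumed from the common MCP client settings
    layout; they are not confirmed by this page.
    """
    return {
        "mcpServers": {
            "docs-to-llm": {
                "command": "python",
                "args": [server_script],
            }
        }
    }


if __name__ == "__main__":
    # Placeholder path: replace with the real location of src/server.py.
    entry = build_server_entry("/path/to/repo/src/server.py")
    print(json.dumps(entry, indent=2))
```

The printed JSON can then be merged by hand into the client's existing settings file rather than overwriting it.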

Key features of MCP Server for Documentation Processing

Automatic detection of documentation navigation sections
Conversion of relative to absolute URLs
Removal of unnecessary HTML elements (scripts, styles, menus)
Progress reporting during processing
Detailed error logging
Smart scoring system to find relevant documentation links
Fallback mechanisms when automatic detection fails
Sanitized filenames based on library names
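
Two of the features above, converting relative URLs to absolute ones and dropping unnecessary HTML elements, can be sketched with the Python standard library alone. This is not the server's actual implementation: the class name, the function name, and the set of skipped tags are hypothetical choices made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed set of elements whose text should not reach the output.
SKIPPED_TAGS = {"script", "style", "nav"}


class DocTextExtractor(HTMLParser):
    """Collect visible text and absolute links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        if tag == "a":
            # Links are kept even inside menus, since they may point to
            # further documentation pages; only menu text is dropped.
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract(html, base_url):
    """Return (visible_text, absolute_links) for one HTML page."""
    parser = DocTextExtractor(base_url)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.text_parts), parser.links
```

Resolving each href against the page URL with urljoin handles both site-relative links (/menu) and page-relative links (guide/intro.html).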

Use cases of MCP Server for Documentation Processing

Providing context to LLMs for code completion and documentation generation
Creating searchable documentation databases for internal use
Generating training data for LLMs specialized in technical domains
Automating the process of updating documentation for software libraries

FAQ from MCP Server for Documentation Processing

What is the purpose of the llm_.txt files?

These files contain the processed documentation content, formatted for use as context by LLMs. The _short.txt file contains only titles and links, while the _full.txt file contains the complete documentation content.
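
The difference between the two files can be sketched as follows. Only the _short.txt and _full.txt suffixes come from the description above; the function name, the exact filename prefix, and the layout inside each file are assumptions made for illustration.

```python
from pathlib import Path


def write_outputs(pages, library, save_dir):
    """Write a short index file and a full content file.

    pages is a list of (title, url, content) tuples. The filename prefix
    and the line layout are hypothetical, not the server's real format.
    """
    out = Path(save_dir)
    out.mkdir(parents=True, exist_ok=True)
    short_path = out / f"{library}_short.txt"
    full_path = out / f"{library}_full.txt"

    # Short file: one "title: link" line per page.
    short_lines = [f"{title}: {url}" for title, url, _ in pages]
    # Full file: title, link, then the complete page content.
    full_blocks = [f"{title}\n{url}\n\n{content}" for title, url, content in pages]

    short_path.write_text("\n".join(short_lines) + "\n", encoding="utf-8")
    full_path.write_text("\n\n".join(full_blocks) + "\n", encoding="utf-8")
    return short_path, full_path
```

The short file is the cheaper one to load when an LLM only needs to know which pages exist; the full file trades context size for completeness.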

What environments are supported by the MCP server?

The server is designed to work with environments like Cursor, Windsurf, Cline, and Roo Code.

How do I specify the URL of the documentation to process?

You specify the URL in the <arguments> section of the <use_mcp_tool> tag, using the url parameter.
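
As a rough illustration, the helper below renders such a request as text. The server name, tool name, and url parameter come from this page; the inner <server_name> and <tool_name> tags follow the format used by Cline and Roo Code as far as I know, and the library_name and save_path keys are hypothetical names for the "library name" and "save path" arguments, so check them against the tool's actual schema.

```python
import json


def render_tool_call(url, library_name, save_path):
    """Render a <use_mcp_tool> request for the process_documentation tool.

    "library_name" and "save_path" are assumed argument names.
    """
    arguments = {
        "url": url,
        "library_name": library_name,
        "save_path": save_path,
    }
    return (
        "<use_mcp_tool>\n"
        "<server_name>docs-to-llm</server_name>\n"
        "<tool_name>process_documentation</tool_name>\n"
        "<arguments>\n"
        f"{json.dumps(arguments, indent=2)}\n"
        "</arguments>\n"
        "</use_mcp_tool>"
    )


if __name__ == "__main__":
    print(render_tool_call("https://example.com/docs", "example-lib", "./docs-output"))
```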

What happens if the automatic detection of documentation sections fails?

If automatic detection fails, the server switches to fallback mechanisms so that the documentation is still processed.
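
One way such a fallback could work, combined with a link scoring step, is sketched below. The keyword list, the weights, the threshold, and both function names are invented for illustration; the server's real scoring rules are not described on this page.

```python
from urllib.parse import urlparse

# Hypothetical keywords that suggest a link points at documentation.
DOC_KEYWORDS = ("docs", "guide", "api", "reference", "tutorial")


def score_link(url, text):
    """Score a link: 2 points per keyword in the path, 1 per keyword in the label."""
    path = urlparse(url).path.lower()
    label = text.lower()
    score = 0
    for keyword in DOC_KEYWORDS:
        if keyword in path:
            score += 2
        if keyword in label:
            score += 1
    return score


def select_doc_links(nav_links, page_links, min_score=2):
    """Prefer links from a detected navigation section.

    Both arguments are lists of (url, link_text) pairs. When no navigation
    section was detected (nav_links is empty), fall back to every link on
    the page, ranked by score and filtered by min_score.
    """
    if nav_links:
        return [url for url, _ in nav_links]
    ranked = sorted(page_links, key=lambda link: score_link(*link), reverse=True)
    return [url for url, text in ranked if score_link(url, text) >= min_score]
```

The fallback path is deliberately stricter than the navigation path: links found in a detected navigation section are trusted as-is, while arbitrary page links must earn a minimum score.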

Can I customize the output filenames?

The output filenames are based on the library name you provide in the arguments. The server also sanitizes the filenames to ensure they are valid.
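
Filename sanitization of this kind might look like the sketch below. The allowed character set, the fallback name, and the function name are assumptions, not the server's documented behavior.

```python
import re


def sanitize_filename(library_name, default="library"):
    """Turn a library name into a safe filename stem.

    Runs of characters outside letters, digits, dot, underscore, and
    hyphen become a single underscore; leading and trailing dots and
    underscores are then removed.
    """
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", library_name.strip())
    cleaned = cleaned.strip("._")
    # Fall back to a fixed stem if nothing usable is left.
    return cleaned or default
```

Stripping leading dots also keeps a name such as ../../etc/passwd from escaping the save directory, since path separators have already been replaced.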