docs-mcp-server
by arabold
The docs-mcp-server is a Model Context Protocol (MCP) server designed to scrape, process, index, and search documentation for various software libraries and packages. It fetches content from specified URLs, splits it into meaningful chunks, generates vector embeddings, and stores the data in an SQLite database.
What is docs-mcp-server?
This project provides a Model Context Protocol (MCP) server that scrapes, processes, indexes, and searches documentation for software libraries and packages. It fetches content, splits it semantically, generates vector embeddings, and stores the data in an SQLite database for efficient hybrid search.
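The splitting step can be pictured with a small sketch. This is not the server's actual splitter, which the description above does not detail; it assumes the fetched content is Markdown and treats headings as the semantic boundaries. The function name split_by_headings and the max_chars limit are hypothetical.

```python
import re


def split_by_headings(markdown: str, max_chars: int = 1000) -> list[str]:
    """Split Markdown at headings, then cap each piece at max_chars.

    Hypothetical helper for illustration; not part of docs-mcp-server.
    """
    # Zero-width split before any line that starts with 1-6 '#' and a space.
    sections = re.split(r"(?m)^(?=#{1,6} )", markdown)
    chunks: list[str] = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        # Sections longer than max_chars are cut into fixed-width windows.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks


if __name__ == "__main__":
    doc = "# Install\nRun the installer.\n## Configure\nSet the API key.\n"
    for chunk in split_by_headings(doc):
        print(repr(chunk))
```

Each chunk would then be passed to the embedding model and stored alongside its text. A real splitter would likely also avoid cutting inside code blocks and tables, which this sketch does not attempt.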
How to use docs-mcp-server?
The server can be run with Docker or npx. In your MCP client settings, configure the command and arguments for the chosen method and supply the required environment variables, such as the OpenAI API key. The MCP tools then let you start scraping jobs, check job status, list and cancel jobs, search documentation, list indexed libraries, find the appropriate version of a library, remove indexed documents, and fetch single URLs.
Key features of docs-mcp-server
Versatile Scraping: Fetch documentation from diverse sources.
Intelligent Processing: Automatically split content and generate embeddings.
Optimized Storage: Leverage SQLite with sqlite-vec and FTS5.
Powerful Hybrid Search: Combine vector similarity and full-text search.
Asynchronous Job Handling: Manage scraping and indexing tasks efficiently.
Simple Deployment: Get up and running quickly using Docker or npx.
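To illustrate the hybrid search feature listed above, here is a minimal sketch of one common way to merge a vector-similarity ranking with a full-text ranking: reciprocal rank fusion (RRF). The listing does not say how docs-mcp-server combines the two result sets, so RRF is an assumption chosen for illustration, and the function name and the constant k = 60 are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank is
    1-based); scores are summed across lists. Illustrative only; not taken
    from docs-mcp-server.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]  # e.g. nearest neighbours by embedding
    text_hits = ["b", "d", "a"]    # e.g. full-text matches
    print(reciprocal_rank_fusion([vector_hits, text_hits]))
```

Rank-based fusion avoids having to normalize cosine distances against full-text relevance scores, which live on different scales.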
Use cases of docs-mcp-server
Providing documentation search for AI assistants.
Indexing and searching documentation for internal software libraries.
Creating a searchable archive of documentation for different versions of a library.
Fetching and converting single URLs to Markdown for use in other applications.
FAQ from docs-mcp-server
What embedding models are supported?
The server supports OpenAI, Google Gemini, Azure OpenAI, AWS Bedrock, and Ollama embedding models. You need to configure the appropriate environment variables for each provider.
How do I configure the embedding model?
Use the DOCS_MCP_EMBEDDING_MODEL environment variable to specify the provider and model name. You also need to set the required API keys or credentials for the chosen provider.
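As a rough sketch of how a single variable can carry both the provider and the model name, the snippet below parses a value of the form provider:model. That format, the fallback to an openai provider when no prefix is given, and the function name parse_embedding_model are assumptions made for illustration; check the project's own documentation for the exact syntax.

```python
import os


def parse_embedding_model(spec: str, default_provider: str = "openai") -> tuple[str, str]:
    """Split an assumed 'provider:model' string into (provider, model).

    The format and the default provider are assumptions for this sketch.
    """
    # Split on the first colon only, so model names containing ':' survive.
    provider, sep, model = spec.partition(":")
    if not sep:
        # No provider prefix: treat the whole value as the model name.
        return default_provider, spec
    return provider, model


if __name__ == "__main__":
    # The fallback model name here is only an example value.
    spec = os.environ.get("DOCS_MCP_EMBEDDING_MODEL", "text-embedding-3-small")
    print(parse_embedding_model(spec))
```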
How do I run the server?
You can run the server using Docker (recommended) or npx. Docker provides a straightforward deployment, while npx is suitable for local file access.
How do I persist the indexed documentation?
When using Docker, mount a Docker named volume or a host directory to the /data directory inside the container. This ensures that the database persists even if the container is stopped or removed.
How do I use the CLI?
Use the docs-cli command via Docker or npx, depending on how you are running the server. The CLI provides commands for scraping, searching, and managing the documentation index.