MCP Server Readability Parser
by MCP-Mirror
A Python implementation of the MCP server that extracts and transforms webpage content into clean, LLM-optimized Markdown. It removes ads, navigation, and other non-essential content for better LLM processing.
What is MCP Server Readability Parser?
This is a Python-based MCP server that uses the Readability algorithm to extract the main content from a webpage, cleans it, and converts it to Markdown. The output is optimized for use with Large Language Models (LLMs).
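To make the extract, clean, and convert pipeline concrete, here is a minimal sketch that uses only the Python standard library. It is not the Readability algorithm and not this server's code: it is a simplified tag-based filter that drops a fixed set of noisy elements and emits headings, paragraphs, and list items as Markdown. The names NOISE_TAGS, SimpleMarkdownExtractor, and html_to_markdown are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser

# Assumption: tags whose whole subtree counts as noise in this simplified sketch.
NOISE_TAGS = {"nav", "footer", "aside", "script", "style", "form"}
# Block-level tags and the Markdown prefix each one is rendered with.
BLOCK_PREFIX = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}


class SimpleMarkdownExtractor(HTMLParser):
    """Drops noisy subtrees and collects headings, paragraphs and list items."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0  # greater than 0 while inside a noise subtree
        self.current = None   # tag of the block being collected, if any
        self.buffer = []      # text fragments of the current block
        self.blocks = []      # finished Markdown blocks

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1
        elif self.noise_depth == 0 and tag in BLOCK_PREFIX:
            self.current = tag
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS:
            self.noise_depth = max(0, self.noise_depth - 1)
        elif tag == self.current:
            # Collapse runs of whitespace before emitting the block.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.blocks.append(BLOCK_PREFIX[tag] + text)
            self.current = None

    def handle_data(self, data):
        if self.noise_depth == 0 and self.current is not None:
            self.buffer.append(data)


def html_to_markdown(html: str) -> str:
    """Return a Markdown rendering of the non-noise blocks in the HTML."""
    parser = SimpleMarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)
```

A real Readability implementation scores candidate nodes by text density and link ratio instead of relying on a fixed tag list, so it copes with pages that do not use semantic tags. The sketch only shows the overall shape of the transformation.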
How to use MCP Server Readability Parser?
1. Clone the repository.
2. Create and activate a virtual environment.
3. Install dependencies: pip install -r requirements.txt
4. Start the server: fastmcp run server.py
5. Send a POST request to the /tools/extract_content endpoint with a JSON payload containing the URL to parse.
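The final step can be sketched as a small Python client using only the standard library. The endpoint path comes from the instructions above; the base address http://localhost:8000, the {"url": ...} payload shape, and the helper names build_extract_request and extract are assumptions for illustration, so adjust them to match how your server is actually exposed.

```python
import json
import urllib.request


def build_extract_request(base_url: str, page_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the extract_content tool."""
    # Assumption: the payload is a JSON object with a single "url" key.
    payload = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/tools/extract_content",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract(base_url: str, page_url: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_extract_request(base_url, page_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (hypothetical address; requires a running server):
# print(extract("http://localhost:8000", "https://example.com/article"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without a running server.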
Key features of MCP Server Readability Parser
Removes ads, navigation, footers and other non-essential content
Converts clean HTML into well-formatted Markdown
Handles errors gracefully
Optimized for LLM processing
Use cases of MCP Server Readability Parser
Preparing web content for LLM training
Summarizing articles for research
Creating clean content for documentation
Automating content extraction from websites
FAQ from MCP Server Readability Parser
Why use this instead of just fetching the HTML?
This server extracts only relevant content using the Readability algorithm, eliminates noise like ads, popups, and navigation menus, reduces token usage, and provides consistent Markdown formatting.
What is the main tool provided by the server?
The main tool is extract_content, which fetches and transforms webpage content into clean Markdown.
What arguments does the extract_content tool accept?
It accepts a url argument, which is a string representing the website URL to parse.
What does the extract_content tool return?
It returns a JSON object with a content field, which contains the Markdown content extracted from the webpage.
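As an illustration of consuming that return value, the sketch below parses a response body and pulls out the Markdown. It assumes only what is stated above, namely a JSON object with a string content field; the function name parse_extract_response and the choice to raise ValueError on malformed input are hypothetical.

```python
import json


def parse_extract_response(body: str) -> str:
    """Return the Markdown from a response body, or raise ValueError."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    # Reject anything that is not an object with a string "content" field.
    if not isinstance(data, dict) or not isinstance(data.get("content"), str):
        raise ValueError("response has no string 'content' field")
    return data["content"]
```

Validating the shape up front keeps a failed extraction from being passed to an LLM as if it were page content.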
How do I configure this server with MCP?
Add the provided JSON configuration to your MCP settings file, specifying the command and arguments to run the server.
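The configuration itself is not reproduced here, so the following is only a sketch of what such an entry might look like, generated from Python so it can be printed and pasted into a settings file. It assumes the mcpServers layout that many MCP clients use and reuses the fastmcp run server.py command from the usage steps; the server name readability-parser, the placeholder path, and the helper build_mcp_config are hypothetical.

```python
import json


def build_mcp_config(server_path: str) -> dict:
    """Build a settings entry that launches the server with fastmcp."""
    return {
        "mcpServers": {
            # Hypothetical server name; pick any label your client accepts.
            "readability-parser": {
                "command": "fastmcp",
                "args": ["run", server_path],
            }
        }
    }


# Placeholder path; replace it with the location of your cloned server.py.
print(json.dumps(build_mcp_config("/path/to/server.py"), indent=2))
```

Check your client's documentation for the exact settings file location and key names, since these vary between MCP clients.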