MCP Server Readability Parser

by MCP-Mirror

A Python implementation of the MCP server that extracts and transforms webpage content into clean, LLM-optimized Markdown. It removes ads, navigation, and other non-essential content for better LLM processing.

What is MCP Server Readability Parser?

This is a Python-based MCP server that uses the Readability algorithm to extract the main content from a webpage, clean it, and convert it to Markdown. The output is optimized for use with Large Language Models (LLMs).

How to use MCP Server Readability Parser?

  1. Clone the repository.

  2. Create and activate a virtual environment.

  3. Install dependencies using pip install -r requirements.txt.

  4. Start the server using fastmcp run server.py.

  5. Send a POST request to the /tools/extract_content endpoint with a JSON payload containing the URL to parse.
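
The last step can be sketched in Python using only the standard library. The base address http://localhost:8000 and the helper name build_extract_request are assumptions made for illustration; the endpoint path and the url field come from the description above. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request

# Assumed address of the locally running server; adjust host and port to match.
BASE_URL = "http://localhost:8000"


def build_extract_request(page_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the extract_content tool."""
    # The JSON payload carries the page to parse under the "url" key.
    payload = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/tools/extract_content",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_extract_request("https://example.com/article")

# To actually send it while the server is running:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the method, URL, and body before any network traffic happens.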

Key features of MCP Server Readability Parser

  • Removes ads, navigation, footers and other non-essential content

  • Converts clean HTML into well-formatted Markdown

  • Handles errors gracefully

  • Optimized for LLM processing

Use cases of MCP Server Readability Parser

  • Preparing web content for LLM training

  • Summarizing articles for research

  • Creating clean content for documentation

  • Automating content extraction from websites

FAQ from MCP Server Readability Parser

Why use this instead of just fetching the HTML?

This server extracts only relevant content using the Readability algorithm, eliminates noise like ads, popups, and navigation menus, reduces token usage, and provides consistent Markdown formatting.
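
To illustrate why stripping noise reduces the amount of text an LLM has to read, here is a toy sketch. It is not the Readability algorithm and not this server's code: it simply drops text inside a hand-picked set of tags (NOISE_TAGS, an assumption) using Python's standard html.parser module. The sample HTML is made up.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as noise in this toy example (an assumption,
# far cruder than what the Readability algorithm does).
NOISE_TAGS = {"nav", "footer", "aside", "script", "style", "header"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside any NOISE_TAGS element."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0  # greater than 0 while inside a noise element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.noise_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Return the non-noise text blocks, separated by blank lines."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.chunks)


SAMPLE_HTML = """
<html><body>
<nav><a href="/">Home</a> <a href="/about">About</a></nav>
<article><h1>Title</h1><p>The actual article text.</p></article>
<footer>Copyright notice</footer>
</body></html>
"""

main_text = extract_main_text(SAMPLE_HTML)
print(len(SAMPLE_HTML), "characters in,", len(main_text), "characters out")
```

Even on this tiny page the navigation links and footer disappear from the result, which is the effect the server aims for on real pages with far more clutter.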

What is the main tool provided by the server?

The main tool is extract_content, which fetches and transforms webpage content into clean Markdown.

What arguments does the extract_content tool accept?

It accepts a url argument, which is a string representing the website URL to parse.

What does the extract_content tool return?

It returns a JSON object with a content field, which contains the Markdown content extracted from the webpage.
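
A client might read the result along these lines. This is a minimal sketch that assumes the response body is exactly the JSON object described above; the helper name parse_extract_response and the sample body are made up for illustration.

```python
import json


def parse_extract_response(body: str) -> str:
    """Return the Markdown string from an extract_content response body."""
    data = json.loads(body)
    # Fail loudly if the expected field is missing rather than returning None.
    if not isinstance(data, dict) or "content" not in data:
        raise ValueError("response has no 'content' field")
    return data["content"]


# Made-up response body for illustration only.
sample_body = '{"content": "# Title\\n\\nThe actual article text."}'
markdown = parse_extract_response(sample_body)
print(markdown)
```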

How do I configure this server with MCP?

Add the provided JSON configuration to your MCP settings file, specifying the command and arguments to run the server.
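
The configuration itself is not reproduced on this page, so the following is only a sketch of what such an entry might look like. It assumes the mcpServers layout used by several MCP clients, a hypothetical server name readability-parser, and the fastmcp run server.py command from the usage steps. Check the repository for the actual configuration; the path to server.py may need to be absolute.

```python
import json

# Hypothetical server key; pick any name your MCP client should display.
SERVER_NAME = "readability-parser"

# Assumed layout: a top-level "mcpServers" object mapping names to launch commands.
config = {
    "mcpServers": {
        SERVER_NAME: {
            "command": "fastmcp",
            "args": ["run", "server.py"],
        }
    }
}

# Print the JSON fragment to paste into the MCP settings file.
config_json = json.dumps(config, indent=2)
print(config_json)
```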