MCP Web Extractor
by iemong
MCP Web Extractor is a Model Context Protocol (MCP) server that extracts web content using Readability.js. It fetches web pages and extracts the main content, making it ideal for saving clean, readable versions of articles.
What is MCP Web Extractor?
A Model Context Protocol (MCP) server that extracts web content from URLs using Readability.js, providing clean, readable versions of articles.
How to use MCP Web Extractor?
1. Clone the repository.
2. Install dependencies using npm install.
3. Build the project using npm run build.
4. Start the server using npm start.
You can then use the client example or integrate it with Obsidian.
Key features of MCP Web Extractor
- Extracts readable content from any URL 
- Removes ads, sidebars, and other distractions 
- Returns clean text along with metadata (title, excerpt, etc.) 
- Easy integration with Obsidian via MCP 
Use cases of MCP Web Extractor
- Saving clean, readable versions of articles 
- Creating Obsidian notes from web content 
- Building an Obsidian plugin for web content extraction 
- Extracting content for research and analysis 
FAQ from MCP Web Extractor
What is Readability.js?
Readability.js is a library that extracts the main content from a web page, removing clutter like ads and navigation.
What is MCP?
MCP stands for Model Context Protocol. It's used here to facilitate communication between the server and other applications like Obsidian.
How do I integrate this with Obsidian?
The obsidian-integration.ts file provides an example of how to integrate this MCP server with Obsidian. You can use it as a starting point for creating an Obsidian plugin.
What data does the server return?
The server returns the title, content, textContent, excerpt, and siteName of the extracted web page.
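As a rough illustration of how those fields might be consumed, the sketch below models the result as a TypeScript type and renders it as a Markdown note. Only the five field names come from the answer above. The field types, the ExtractedArticle name, and the toMarkdownNote helper are assumptions made for this example and are not part of the project. In Readability.js itself, content is an HTML string and textContent is plain text; the sketch assumes the server passes them through unchanged.

```typescript
// Field names are taken from the FAQ answer; the types are assumptions.
interface ExtractedArticle {
  title: string;
  content: string;     // assumed: cleaned HTML of the main content
  textContent: string; // assumed: plain-text version of the content
  excerpt: string;
  siteName: string;
}

// Hypothetical helper (not from the project): render an extracted
// article as a Markdown note with a small front-matter header.
function toMarkdownNote(article: ExtractedArticle, sourceUrl: string): string {
  const lines: string[] = [
    "---",
    `title: ${JSON.stringify(article.title)}`,
    `site: ${JSON.stringify(article.siteName)}`,
    `source: ${JSON.stringify(sourceUrl)}`,
    "---",
    "",
    `# ${article.title}`,
    "",
  ];
  // Include the excerpt as a block quote only when one is present.
  if (article.excerpt.trim() !== "") {
    lines.push(`> ${article.excerpt.trim()}`, "");
  }
  lines.push(article.textContent.trim());
  return lines.join("\n");
}
```

The note body uses textContent rather than content so that no HTML-to-Markdown conversion is needed; a real plugin might prefer to convert content to keep links and headings.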
Where does the server run by default?
The server starts on http://localhost:3000 with the MCP endpoint at http://localhost:3000/mcp.
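For orientation, here is a minimal sketch of the kind of message a client could send to that endpoint. MCP messages are JSON-RPC 2.0, and tools are invoked with the tools/call method. The tool name "extract" and its "url" argument are hypothetical placeholders, since the listing does not name the server's tools. The sketch only builds and prints the request body; a real client would normally use an MCP client library, which also handles the initialization handshake before any tool call.

```typescript
// Default endpoint from the FAQ answer above.
const MCP_ENDPOINT = "http://localhost:3000/mcp";

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Build an MCP "tools/call" request. The default tool name and the "url"
// argument are hypothetical; check the server's tool list for the real ones.
function buildExtractRequest(
  id: number,
  pageUrl: string,
  toolName: string = "extract"
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: { url: pageUrl } },
  };
}

// Show the body that would be POSTed to the endpoint.
const request = buildExtractRequest(1, "https://example.com/article");
console.log(`POST ${MCP_ENDPOINT}`);
console.log(JSON.stringify(request, null, 2));
```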
