Webscraper MCP
by saishridhar
Webscraper MCP is a server for Claude Desktop that allows Claude to scrape text from websites, YouTube transcripts, and PDFs. Given a link, Claude can extract the content and use it.
What is Webscraper MCP?
Webscraper MCP is a server designed to provide Claude, the AI assistant, with the ability to scrape content from various online sources. It supports scraping text from general websites, extracting transcripts from YouTube videos, and converting PDFs to markdown text.
How to use Webscraper MCP?
The server provides three main tools: get_pdf, get_webpage_content, and get_youtube_transcript. To use these tools, provide the appropriate URL as an argument (input_url for PDFs, url for webpages and YouTube videos). The server will then return the extracted text content.
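In normal use, Claude Desktop issues these tool calls itself. To make the argument names concrete, here is a minimal sketch of what such a request could look like on the wire, assuming the standard MCP `tools/call` JSON-RPC shape. The helper `build_tool_call` and the example link are hypothetical and not part of the server; only the tool names and the `input_url` / `url` argument names come from the description above.

```python
import json

# Argument names as described above: get_pdf takes "input_url",
# the other two tools take "url".
URL_ARGUMENT = {
    "get_pdf": "input_url",
    "get_webpage_content": "url",
    "get_youtube_transcript": "url",
}


def build_tool_call(tool_name: str, link: str, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request (hypothetical helper, for illustration)."""
    if tool_name not in URL_ARGUMENT:
        raise ValueError(f"unknown tool: {tool_name}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool_name,
            # Pick the argument name that matches the chosen tool.
            "arguments": {URL_ARGUMENT[tool_name]: link},
        },
    }


if __name__ == "__main__":
    # Placeholder link, used only to show the request structure.
    request = build_tool_call("get_pdf", "https://example.com/paper.pdf")
    print(json.dumps(request, indent=2))
```

Keeping the tool-to-argument mapping in one table makes the one irregular case (get_pdf uses input_url rather than url) hard to get wrong.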
Key features of Webscraper MCP
Scrapes text from websites
Extracts transcripts from YouTube videos
Converts PDFs to markdown text
Integrates with Claude Desktop
Provides tools for specific content types
Use cases of Webscraper MCP
Answering user questions based on webpage content
Summarizing YouTube videos
Extracting information from PDF documents
Providing Claude with access to online information
Automating data extraction from websites
FAQ from Webscraper MCP
What types of links are supported?
The server supports links to general webpages, YouTube videos, and PDF files.
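Because each link type maps to a different tool, a client-side helper could choose the tool from the link alone. The sketch below is only an illustration: `pick_tool`, the list of YouTube hostnames, and the `.pdf` suffix check are assumptions made here, not the server's own routing logic, and a PDF served from a URL without a `.pdf` suffix would fall through to the webpage tool.

```python
from urllib.parse import urlparse

# Assumed set of hostnames treated as YouTube links (illustrative, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def pick_tool(link: str) -> str:
    """Guess which of the three tools fits a link (hypothetical heuristic)."""
    parsed = urlparse(link)
    host = (parsed.hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        return "get_youtube_transcript"
    # Treat any path ending in .pdf (case-insensitive) as a PDF document.
    if parsed.path.lower().endswith(".pdf"):
        return "get_pdf"
    # Everything else is handled as a general webpage.
    return "get_webpage_content"


if __name__ == "__main__":
    for link in (
        "https://www.youtube.com/watch?v=abc",
        "https://example.com/docs/report.pdf",
        "https://example.com/blog/post",
    ):
        print(link, "->", pick_tool(link))
```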
How does the server handle different webpage structures?
The get_webpage_content tool extracts the main text content from a webpage, attempting to filter out irrelevant elements.
Is there a limit to the size of PDFs that can be converted?
The server may have limitations on the size of PDFs it can handle. Large PDFs may take longer to process or may result in errors.
How accurate is the YouTube transcript extraction?
The accuracy of the transcript depends on the quality of the automatically generated captions on YouTube. Some transcripts may contain errors.
Can I use this server with other AI models besides Claude?
While designed for Claude, the server's functionality could potentially be adapted for use with other AI models that require web scraping capabilities.