Webscraper MCP

by saishridhar

Webscraper MCP is a server for Claude Desktop that lets Claude scrape text from websites, YouTube transcripts, and PDFs. Given a link, Claude can extract the content and use it.

What is Webscraper MCP?

Webscraper MCP is a server that lets Claude, the AI assistant, scrape content from online sources. It scrapes text from general websites, extracts transcripts from YouTube videos, and converts PDFs to Markdown text.

How to use Webscraper MCP?

The server provides three main tools: get_pdf, get_webpage_content, and get_youtube_transcript. To use these tools, provide the appropriate URL as an argument (input_url for PDFs, url for webpages and YouTube videos). The server will then return the extracted text content.
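As a rough illustration of the argument names above, the sketch below builds the JSON-RPC 2.0 `tools/call` request that an MCP client would send for each tool. The helper name build_tool_call and the example.com link are hypothetical and not part of the server; in normal use Claude Desktop constructs these requests itself.

```python
import json
from itertools import count

# Argument name each tool expects, as described in the text above.
TOOL_ARGUMENT = {
    "get_pdf": "input_url",
    "get_webpage_content": "url",
    "get_youtube_transcript": "url",
}

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def build_tool_call(tool_name: str, link: str) -> dict:
    """Build an MCP `tools/call` request (hypothetical helper, not server code)."""
    if tool_name not in TOOL_ARGUMENT:
        raise ValueError(f"unknown tool: {tool_name}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": tool_name,
            "arguments": {TOOL_ARGUMENT[tool_name]: link},
        },
    }


if __name__ == "__main__":
    # Placeholder link; any PDF URL would go here.
    request = build_tool_call("get_pdf", "https://example.com/paper.pdf")
    print(json.dumps(request, indent=2))
```

Keeping the argument names in one mapping makes the difference between get_pdf (input_url) and the other two tools (url) explicit in a single place.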

Key features of Webscraper MCP

  • Scrapes text from websites

  • Extracts transcripts from YouTube videos

  • Converts PDFs to markdown text

  • Integrates with Claude Desktop

  • Provides a dedicated tool for each content type (get_webpage_content, get_youtube_transcript, get_pdf)

Use cases of Webscraper MCP

  • Answering user questions based on webpage content

  • Summarizing YouTube videos

  • Extracting information from PDF documents

  • Providing Claude with access to online information

  • Automating data extraction from websites

FAQ from Webscraper MCP

What types of links are supported?

The server supports links to general webpages, YouTube videos, and PDF files.
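A minimal sketch of how a link could be matched to one of the three tools, assuming simple heuristics: a YouTube hostname selects get_youtube_transcript, a path ending in .pdf selects get_pdf, and everything else goes to get_webpage_content. The function pick_tool and the host list are illustrative assumptions; the server itself does not route links, and in practice Claude chooses the tool.

```python
from urllib.parse import urlparse

# Assumed set of YouTube hostnames; extend as needed.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def pick_tool(link: str):
    """Return (tool_name, arguments) for a link, using hypothetical heuristics."""
    parsed = urlparse(link)
    host = (parsed.hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        return "get_youtube_transcript", {"url": link}
    # Only the path is checked, so a query string does not hide the extension.
    if parsed.path.lower().endswith(".pdf"):
        return "get_pdf", {"input_url": link}
    return "get_webpage_content", {"url": link}


if __name__ == "__main__":
    for example in (
        "https://www.youtube.com/watch?v=abc",
        "https://example.com/report.pdf",
        "https://example.com/blog/post",
    ):
        print(example, "->", pick_tool(example)[0])
```

A PDF served from a URL without a .pdf extension would fall through to get_webpage_content under this heuristic, so it should be read as a starting point rather than a complete classifier.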

How does the server handle different webpage structures?

The get_webpage_content tool extracts the main text content from a webpage, attempting to filter out irrelevant elements.

Is there a limit to the size of PDFs that can be converted?

The server may limit the size of PDFs it can handle. Large PDFs may take longer to process or may fail with an error.

How accurate is the YouTube transcript extraction?

The accuracy of the transcript depends on the quality of the automatically generated captions on YouTube. Some transcripts may contain errors.

Can I use this server with other AI models besides Claude?

Although designed for Claude, the server could be adapted for other AI models that need web scraping capabilities.