PodCrawlerMCP

by infinitimeless

PodCrawlerMCP is an MCP (Model Context Protocol) server for podcast discovery through web crawling. It lets AI assistants find podcast episodes on specific topics by crawling the web for RSS feeds.

Features

  • 🕸️ Crawls podcast directories to discover RSS feeds
  • 🎙️ Parses RSS feeds to extract episode data
  • 🔍 Filters episodes by topic or domain
  • 🔌 Exposes functionality through MCP tools
  • 🤖 Integrates with AI assistants such as Claude
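
To make the parse-and-filter steps above concrete, here is a minimal sketch that uses only the Python standard library. It is not the project's actual implementation: the function names `parse_episodes` and `filter_by_topic`, the sample feed, and the simple substring match are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for this illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Science Show</title>
    <item>
      <title>Black Holes Explained</title>
      <description>A tour of event horizons.</description>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Gardening Basics</title>
      <description>Soil, seeds, and sunlight.</description>
      <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="456"/>
    </item>
  </channel>
</rss>
"""


def parse_episodes(feed_xml):
    """Extract basic episode data from an RSS 2.0 document (hypothetical helper)."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return []
    podcast = channel.findtext("title", default="")
    episodes = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        episodes.append({
            "podcast": podcast,
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            # Podcast audio is normally attached through <enclosure url="...">.
            "audio_url": enclosure.get("url") if enclosure is not None else None,
        })
    return episodes


def filter_by_topic(episodes, topic, max_results=10):
    """Keep episodes whose title or description mentions the topic (case-insensitive)."""
    needle = topic.lower()
    matches = [
        ep for ep in episodes
        if needle in ep["title"].lower() or needle in ep["description"].lower()
    ]
    return matches[:max_results]


if __name__ == "__main__":
    for episode in filter_by_topic(parse_episodes(SAMPLE_FEED), "black holes"):
        print(episode["podcast"], "|", episode["title"], "|", episode["audio_url"])
```

A real crawler would fetch feeds over HTTP and would likely need more tolerant parsing and matching than this; the sketch only shows the shape of the data flow from feed to filtered episode list.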

Installation

pip install podcrawler-mcp

Or with Poetry:

poetry add podcrawler-mcp

Quick Start

Run the server directly:

python -m podcrawler.server

Or in your Python code:

from podcrawler import PodCrawlerServer

server = PodCrawlerServer()
server.run()

Integrating with Claude Desktop

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "podcrawler": {
      "command": "python",
      "args": ["-m", "podcrawler.server"]
    }
  }
}
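
If you already have other servers configured, you may prefer to merge this entry programmatically instead of editing the file by hand. The following is a hedged sketch: the helper name `add_podcrawler_server` is hypothetical, and the path to the configuration file is left to the caller because its location depends on your platform and Claude Desktop installation.

```python
import json
from pathlib import Path

# The server entry shown in the configuration above.
PODCRAWLER_ENTRY = {"command": "python", "args": ["-m", "podcrawler.server"]}


def add_podcrawler_server(config_path):
    """Add the podcrawler entry to a config file, keeping any existing servers.

    Hypothetical helper: config_path must point at your Claude Desktop
    configuration file, which this sketch does not try to locate.
    """
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    # Create the "mcpServers" section if it is missing, then set our entry.
    config.setdefault("mcpServers", {})["podcrawler"] = dict(PODCRAWLER_ENTRY)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Back up the file before running a helper like this, and restart Claude Desktop afterwards so the new server is picked up.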

Available Tools

discover_podcasts

Discovers podcasts on a specific topic.

Parameters:

  • topic (string): The topic to search for (e.g., "technology", "history")
  • max_results (integer, optional): Maximum number of results to return (default: 10)

Example Usage:

What are some science podcasts about black holes?
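
Behind a prompt like this, an MCP client invokes the tool with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `discover_podcasts` using the parameters documented above. The helper name `build_tool_call` is hypothetical, and mapping the prompt to the topic "black holes" is an assumption about what an assistant might choose; the shape of the server's response is not covered here.

```python
import json


def build_tool_call(topic, max_results=10, request_id=1):
    """Build a JSON-RPC 2.0 request that calls discover_podcasts (illustrative only)."""
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("topic must be a non-empty string")
    if not isinstance(max_results, int) or max_results < 1:
        raise ValueError("max_results must be a positive integer")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "discover_podcasts",
            "arguments": {"topic": topic, "max_results": max_results},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_tool_call("black holes", max_results=5), indent=2))
```

In normal use the assistant's MCP client constructs and sends this request for you; seeing it spelled out is mainly useful when debugging or testing the server directly.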

Project Structure

podcrawler-mcp/
├── podcrawler/                # Main package
│   ├── __init__.py            # Package initialization
│   ├── server.py              # MCP server implementation
│   ├── tools/                 # MCP tools
│   │   ├── __init__.py
│   │   └── discovery.py       # Podcast discovery tool
│   ├── crawler/               # Web crawling components
│   │   ├── __init__.py
│   │   ├── spider.py          # Web crawler implementation
│   │   └── parser.py          # RSS feed parser
│   └── utils/                 # Utility functions
│       ├── __init__.py
│       ├── filtering.py       # Topic filtering utilities
│       └── formatting.py      # Output formatting utilities
├── tests/                     # Tests
│   ├── __init__.py
│   └── test_server.py         # Server tests
├── examples/                  # Usage examples
│   └── basic_discovery.py     # Basic discovery example
├── pyproject.toml             # Project configuration
├── README.md                  # Project documentation
├── LICENSE                    # MIT License
└── CONTRIBUTING.md            # Contribution guidelines

Development

  1. Clone the repository

    git clone https://github.com/infinitimeless/podcrawler-mcp.git
    cd podcrawler-mcp
    
  2. Install dependencies using Poetry

    poetry install
    
  3. Run tests

    poetry run pytest
    

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for details.

License

This project is licensed under the MIT License - see the LICENSE file for details.