YouTube Transcript API
by minhleathvn
A Python service that fetches YouTube video transcripts and, when none are available, transcribes the audio instead. It is exposed as both a REST API (Flask) and an MCP server.
Features
- Fetch YouTube video transcripts in multiple languages (English and Vietnamese)
- Auto-detect and use available transcripts
- Fallback to audio transcription using Whisper when transcripts are unavailable
- Support for both REST API and MCP server interfaces
- Automatic language detection
- Temporary file cleanup
- Progress reporting for long-running operations
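The transcript-then-Whisper fallback listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `get_transcript_with_fallback` and the two injected callables (`fetch_captions`, `transcribe_audio`) are hypothetical stand-ins for whatever `apps/utils.py` really does.

```python
from typing import Callable, Optional


def get_transcript_with_fallback(
    video_id: str,
    fetch_captions: Callable[[str], Optional[str]],
    transcribe_audio: Callable[[str], str],
) -> dict:
    """Return caption text if available, otherwise transcribe the audio.

    Both callables are hypothetical: `fetch_captions` stands in for a
    caption lookup, `transcribe_audio` for a Whisper-based transcription.
    """
    try:
        text = fetch_captions(video_id)
    except Exception:
        # Treat any caption lookup failure as "no transcript available".
        text = None

    if text:
        return {"video_id": video_id, "source": "captions", "text": text}

    # Fallback path: in the real service this step would presumably run Whisper.
    return {
        "video_id": video_id,
        "source": "audio",
        "text": transcribe_audio(video_id),
    }
```

Passing the two steps in as callables keeps the sketch independent of any particular library version and makes the fallback order easy to test in isolation.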
Installation
pip install -r requirements.txt
Usage
REST API (Flask)
Start the Flask server:
python apps/flask_server.py
Available endpoints:
- GET /transcript?video_id=<video_id>&language=<lang> - Get video transcript
- GET /video/info?video_id=<video_id> - Get video information
- GET /health - Health check endpoint
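As an illustration of calling the transcript endpoint from a client, the sketch below builds the request URL with the standard library. It assumes the server listens on `http://localhost:5000` (Flask's default port) and that responses are JSON; neither is stated in this README, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Flask's default host and port; adjust to your deployment.
BASE_URL = "http://localhost:5000"


def build_transcript_url(video_id: str, language: str = "en",
                         base_url: str = BASE_URL) -> str:
    """Build the GET /transcript URL with properly encoded query parameters."""
    query = urlencode({"video_id": video_id, "language": language})
    return f"{base_url}/transcript?{query}"


def fetch_transcript(video_id: str, language: str = "en") -> dict:
    """Call the endpoint and parse the body, assuming it is JSON."""
    url = build_transcript_url(video_id, language)
    with urlopen(url, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `fetch_transcript("<video_id>", "vi")` would request the Vietnamese transcript. A generous timeout is used because the audio fallback may take a while.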
MCP Server
Start the MCP server:
python apps/mcp_server.py
Available tools:
- get_transcript(video_id, language) - Get video transcript
- extract_transcript(video_id, language) - Extract transcript from audio
- search_youtube_video(query) - Search for YouTube videos
Language Support
- English (en)
- Vietnamese (vi)
- Auto-detection for other languages
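One plausible way to combine explicit language requests with the supported languages above is a simple preference order. The sketch below is an assumption about the selection logic, not the project's implementation; `choose_language` and `PREFERRED` are hypothetical names.

```python
from typing import Iterable, Optional

# Assumption: the two explicitly supported languages are tried first.
PREFERRED = ("en", "vi")


def choose_language(requested: Optional[str],
                    available: Iterable[str]) -> Optional[str]:
    """Pick a transcript language code from those available for a video."""
    available = list(available)

    # 1. Honor the caller's request when that transcript exists.
    if requested and requested in available:
        return requested

    # 2. Otherwise fall back to the supported languages, in order.
    for code in PREFERRED:
        if code in available:
            return code

    # 3. Otherwise use whatever exists; None signals "no transcript at all".
    return available[0] if available else None
```

A `None` result would be the natural point to hand off to audio transcription and language auto-detection.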
Dependencies
- youtube-transcript-api
- pytube
- whisper
- torch
- langdetect
- flask (for REST API)
- mcp (for MCP server)
Development
The project structure:
apps/
├── __init__.py
├── flask_server.py # REST API implementation
├── mcp_server.py # MCP server implementation
└── utils.py # Shared utilities
License
MIT License