MLX Whisper MCP Server
by kachiO
A simple Model Context Protocol (MCP) server that provides audio transcription capabilities using MLX Whisper on Apple Silicon Macs. It allows transcribing audio files, base64-encoded data, and YouTube videos.
Features
- Transcribe audio files directly from disk
- Transcribe audio from base64-encoded data
- Download and transcribe YouTube videos
- Uses the high-quality mlx-community/whisper-large-v3-turbo model
- Self-contained script with automatic dependency management via uv run
- Rich console output for easy debugging
- Saves transcription text files alongside audio files
Requirements
- Python 3.12 or higher
- Apple Silicon Mac (M-series)
- uv installed (pip install uv or curl -sS https://astral.sh/uv/install.sh | bash)
Quick Start
Run directly with uv run:
uv run mlx_whisper_mcp.py
That's it! The script will automatically install its own dependencies and start the MCP server.
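For context on how a single script can install its own dependencies: uv run reads inline script metadata (PEP 723) from a comment block at the top of the file. The sketch below shows what such a header could look like. The dependency list is an assumption inferred from this README (MLX Whisper, the MCP Python SDK, Rich) and is not copied from mlx_whisper_mcp.py; python_ok is a hypothetical helper added only for illustration.

```python
# Assumption: the dependency names below are inferred from the README,
# not taken from the actual mlx_whisper_mcp.py header.
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "mlx-whisper",
#     "mcp",
#     "rich",
# ]
# ///
import sys

MIN_PYTHON = (3, 12)  # matches the "Python 3.12 or higher" requirement


def python_ok(version=None) -> bool:
    """Hypothetical helper: True if the interpreter meets the minimum version."""
    current = tuple(version or sys.version_info)[:2]
    return current >= MIN_PYTHON


if __name__ == "__main__":
    print("Python OK" if python_ok() else "Python 3.12 or higher is required")
```

When a file with a header like this is started with uv run, uv creates an isolated environment containing the listed packages before executing the script.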
Using with Claude Desktop
- Edit your Claude Desktop configuration file:
# On macOS:
code ~/Library/Application\ Support/Claude/claude_desktop_config.json
# On Windows:
code %APPDATA%\Claude\claude_desktop_config.json
- Add the MLX Whisper MCP server configuration:
{
  "mcpServers": {
    "mlx-whisper": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/mlx_whisper_mcp/",
        "run",
        "mlx_whisper_mcp.py"
      ]
    }
  }
}
- Restart Claude Desktop
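If you prefer not to edit the JSON by hand, the steps above can be scripted. This is a minimal sketch, not part of the project: add_mlx_whisper_server and update_config_file are hypothetical helper names, and the code assumes the config file, if present, already contains valid JSON.

```python
import json
from pathlib import Path


def add_mlx_whisper_server(config: dict, script_dir: str) -> dict:
    """Return a copy of `config` with the mlx-whisper entry added or replaced."""
    updated = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    servers = updated.setdefault("mcpServers", {})
    servers["mlx-whisper"] = {
        "command": "uv",
        "args": ["--directory", script_dir, "run", "mlx_whisper_mcp.py"],
    }
    return updated


def update_config_file(config_path: Path, script_dir: str) -> None:
    """Merge the entry into the config file, keeping other servers intact."""
    existing = json.loads(config_path.read_text()) if config_path.exists() else {}
    merged = add_mlx_whisper_server(existing, script_dir)
    config_path.write_text(json.dumps(merged, indent=2))
```

On macOS this could be called as update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "/absolute/path/to/mlx_whisper_mcp/"), followed by a restart of Claude Desktop.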
Available Tools
The server provides the following tools:
1. transcribe_file
Transcribes an audio file from a path on disk.
Parameters:
- file_path: Path to the audio file
- language: (Optional) Language code to force a specific language
- task: "transcribe" or "translate" (translates to English)
2. transcribe_audio
Transcribes audio from base64-encoded data.
Parameters:
- audio_data: Base64-encoded audio data
- language: (Optional) Language code to force a specific language
- file_format: Audio file format (wav, mp3, etc.)
- task: "transcribe" or "translate" (translates to English)
3. download_youtube
Downloads a YouTube video.
Parameters:
- url: YouTube video URL
- keep_file: If True, keeps the downloaded file (default: True)
4. transcribe_youtube
Downloads and transcribes a YouTube video.
Parameters:
- url: YouTube video URL
- language: (Optional) Language code to force a specific language
- task: "transcribe" or "translate" (translates to English)
- keep_file: If True, keeps the downloaded file (default: True)
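Under the hood, an MCP client invokes any of these tools with a JSON-RPC 2.0 tools/call request. The following sketch builds such a message by hand to show its shape; tool_call_request is a hypothetical helper, and in practice a client such as Claude Desktop or the MCP Python SDK constructs this message for you.

```python
import json


def tool_call_request(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Example: ask the server to translate a local recording to English.
    print(tool_call_request(
        1,
        "transcribe_file",
        {"file_path": "/Users/username/Desktop/recording.mp3", "task": "translate"},
    ))
```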
Example Prompts for Claude Desktop
- "Transcribe the audio file at /Users/username/Desktop/recording.mp3"
- "Translate this Spanish audio recording to English" (when uploading an audio file)
- "What is being said in this recording?" (when uploading an audio file)
- "Download and transcribe this YouTube video: https://www.youtube.com/watch?v=dQw4w9WgXcQ"
- "Download this YouTube video: https://www.youtube.com/watch?v=dQw4w9WgXcQ"
How It Works
This server uses the MCP Python SDK to expose MLX Whisper's transcription capabilities to clients like Claude. When a transcription is requested:
- The audio data is received as a file path, base64-encoded data, or a YouTube URL
- For YouTube URLs, the video is downloaded to ~/.mlx-whisper-mcp/downloads
- For base64 data, a temporary file is created
- MLX Whisper is used to perform the transcription
- The transcription text is saved to a .txt file alongside the audio file
- The transcription text is returned to the client
- Temporary files are cleaned up (unless keep_file=True)
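The base64 path through these steps can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: transcribe_base64 is a hypothetical name, the transcriber is passed in as a callable so the example runs without MLX, and only the temporary audio file is removed during cleanup (what the real server does with the .txt file in this case is not stated here).

```python
import base64
import tempfile
from pathlib import Path
from typing import Callable


def transcribe_base64(audio_data: str, file_format: str,
                      transcribe_fn: Callable[[str], str],
                      keep_file: bool = False) -> tuple[str, Path]:
    """Decode audio to a temp file, transcribe it, and save the text next to it."""
    # Write the decoded bytes to a temporary file so a path-based
    # transcriber can read them.
    with tempfile.NamedTemporaryFile(suffix=f".{file_format}", delete=False) as tmp:
        tmp.write(base64.b64decode(audio_data))
        audio_path = Path(tmp.name)
    transcript_path = audio_path.with_suffix(".txt")
    try:
        text = transcribe_fn(str(audio_path))
        # Save the transcription alongside the audio file.
        transcript_path.write_text(text, encoding="utf-8")
        return text, transcript_path
    finally:
        if not keep_file:
            # Assumption: cleanup removes only the temporary audio file.
            audio_path.unlink(missing_ok=True)
```

On an Apple Silicon Mac with mlx-whisper installed, transcribe_fn could be something like lambda path: mlx_whisper.transcribe(path, path_or_hf_repo="mlx-community/whisper-large-v3-turbo")["text"].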
Troubleshooting
- Import Error: If you see an error about MLX Whisper not being found, make sure you're running on an Apple Silicon Mac
- File Not Found: Make sure you're using absolute paths when referencing audio files
- Memory Issues: Very long audio files may cause memory pressure with the large model
- YouTube Download Errors: Some videos may be restricted or require authentication
- JSON Errors: If you see "not valid JSON" errors in logs, make sure server logging output is properly directed to stderr
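On the last point: with the stdio transport, stdout carries the JSON-RPC messages, so any log line printed there shows up to the client as invalid JSON. A minimal sketch of routing logs to stderr with the standard library is shown below; the logger name and configure_stderr_logging are hypothetical and not taken from the script.

```python
import logging
import sys


def configure_stderr_logging(name: str = "mlx_whisper_mcp",
                             level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes only to stderr, never to stdout."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.handlers.clear()      # drop handlers that may point at stdout
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False     # keep records away from root handlers
    return logger


if __name__ == "__main__":
    log = configure_stderr_logging()
    log.info("server starting")  # goes to stderr, leaving stdout clean
```

If the script uses Rich for console output, creating the console with Console(stderr=True) has the same effect for Rich's own printing.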
License
Apache License 2.0. See LICENSE for details.
