Kokoro Text to Speech (TTS) MCP Server

by mberg

Tags: AI/Speech, TTS, Text to Speech, MP3, S3, Kokoro-TTS, Audio Generation

The Kokoro Text to Speech MCP server generates .mp3 files from text, with the option to upload them to S3. It leverages the Kokoro-TTS model from Hugging Face.

What is Kokoro Text to Speech (TTS) MCP Server?

The Kokoro TTS MCP Server is a service that converts text into speech using the Kokoro-TTS model. It provides an interface to generate MP3 audio files and optionally upload them to an S3 bucket for storage and retrieval.

How to use Kokoro Text to Speech (TTS) MCP Server?

1. Clone the repository.
2. Download the Kokoro ONNX weights and store them in the cloned repository directory.
3. Configure the server using MCP configs, including AWS credentials if S3 upload is desired.
4. Install ffmpeg for .wav to .mp3 conversion.
5. Run the server using uv run mcp-tts.py.
6. Use the mcp_client.py script to send TTS requests to the server, customizing voice, speed, and S3 upload options as needed.
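
Before starting the server, it can help to confirm that the weights and ffmpeg are in place. The following is a minimal sketch, not part of the project: the function name check_prerequisites is hypothetical, and the weights file names are placeholders because the actual names are listed in the README rather than here.

```python
import shutil
from pathlib import Path


def check_prerequisites(repo_dir, weight_files, which=shutil.which):
    """Return a list of problems; an empty list means the setup looks ready."""
    problems = []
    # ffmpeg must be on PATH for the .wav to .mp3 conversion.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # The weights are expected inside the cloned repository directory.
    for name in weight_files:
        if not (Path(repo_dir) / name).is_file():
            problems.append(f"missing weights file: {name}")
    return problems


if __name__ == "__main__":
    # "model.onnx" is a placeholder; use the file names the README lists.
    for problem in check_prerequisites(".", ["model.onnx"]):
        print(problem)
```

The which parameter exists only so the lookup can be swapped out when testing; in normal use the default shutil.which is enough.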

Key features of Kokoro Text to Speech (TTS) MCP Server

Text-to-speech conversion using Kokoro-TTS
MP3 audio file generation
Optional S3 upload
Configurable voice and speed
Local MP3 file storage and management
Automatic cleanup of local MP3 files
Command-line client for easy TTS requests

Use cases of Kokoro Text to Speech (TTS) MCP Server

Generating audio content for applications
Creating voiceovers for videos
Providing accessibility features for websites
Automating audio production workflows
Building interactive voice response (IVR) systems

FAQ from Kokoro Text to Speech (TTS) MCP Server

How do I configure the server?

You configure the server using MCP configs, setting environment variables for AWS credentials, S3 bucket details, voice, speed, and other options. Refer to the README for a complete list of supported environment variables.
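
As an illustration, an MCP config entry for this server could be generated as shown here. This is a sketch under assumptions: it presumes an MCP client that reads the common mcpServers layout, the server name kokoro-tts and the repository path are placeholders, and the voice value is only an example. The environment variable names are the ones mentioned in this FAQ.

```python
import json


def build_mcp_config(repo_dir, env):
    """Build an MCP client config entry that launches the server with uv."""
    return {
        "mcpServers": {
            "kokoro-tts": {  # hypothetical server name
                "command": "uv",
                # Equivalent to running "uv run mcp-tts.py" inside repo_dir.
                "args": ["--directory", repo_dir, "run", "mcp-tts.py"],
                "env": dict(env),
            }
        }
    }


if __name__ == "__main__":
    config = build_mcp_config(
        "/path/to/repo",  # placeholder path to the cloned repository
        {"TTS_VOICE": "af_heart", "TTS_SPEED": "1.0", "S3_ENABLED": "false"},
    )
    print(json.dumps(config, indent=2))
```

The printed JSON can then be merged into the client's config file; the README remains the authority on which variables the server actually reads.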

How do I upload the generated mp3 file to S3?

To enable S3 uploads, set S3_ENABLED to true and configure the necessary AWS credentials (access key ID, secret access key, region, bucket name, and folder path) in your .env file or MCP configs.
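
A quick way to catch an incomplete S3 setup is to check the settings before the first request. The sketch below is illustrative only: S3_ENABLED comes from the answer above, the three AWS_* names follow the usual AWS SDK conventions, and S3_BUCKET_NAME and S3_FOLDER are hypothetical names that may differ from what the server expects.

```python
import os

# Only S3_ENABLED is confirmed by the text above; the rest are assumptions.
REQUIRED_S3_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "S3_BUCKET_NAME",  # hypothetical name
    "S3_FOLDER",       # hypothetical name
)


def s3_enabled(env):
    """Treat any casing of 'true' as enabled; everything else is disabled."""
    return env.get("S3_ENABLED", "false").strip().lower() == "true"


def missing_s3_settings(env):
    """Return the required variables that are unset or empty."""
    if not s3_enabled(env):
        return []  # nothing is required when uploads are off
    return [name for name in REQUIRED_S3_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_s3_settings(os.environ)
    if missing:
        print("S3 upload is enabled but these are not set:", ", ".join(missing))
```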

How do I specify the voice and speed for the TTS?

You can specify the voice and speed using the TTS_VOICE and TTS_SPEED environment variables in your .env file or MCP configs. You can also override these settings using command-line options in the mcp_client.py script.
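
The precedence described above (a per-request override wins, otherwise the environment variable applies) can be sketched as follows. The function name and the fallback defaults are assumptions for illustration; the server's real defaults are documented in the README.

```python
DEFAULT_VOICE = "af_heart"  # assumed fallback, not confirmed by the source
DEFAULT_SPEED = "1.0"       # assumed fallback, not confirmed by the source


def resolve_tts_settings(env, voice=None, speed=None):
    """Pick voice and speed: explicit overrides first, then TTS_* variables."""
    if voice is None:
        voice = env.get("TTS_VOICE", DEFAULT_VOICE)
    if speed is None:
        speed = float(env.get("TTS_SPEED", DEFAULT_SPEED))
    if speed <= 0:
        raise ValueError("speed must be positive")
    return voice, speed


if __name__ == "__main__":
    env = {"TTS_VOICE": "af_heart", "TTS_SPEED": "1.2"}
    # The override replaces only the speed; the voice still comes from env.
    print(resolve_tts_settings(env, speed=0.9))  # ('af_heart', 0.9)
```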

How do I clean up old MP3 files?

You can configure automatic cleanup of local MP3 files by setting the MP3_RETENTION_DAYS environment variable to the number of days you want to keep the files. You can also set DELETE_LOCAL_AFTER_S3_UPLOAD to true to delete local files immediately after successful S3 upload.
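
For intuition, retention-based cleanup amounts to deleting .mp3 files whose modification time is older than the cutoff. This is a minimal sketch of that idea, not the server's actual implementation; the function names are hypothetical, and only MP3_RETENTION_DAYS and DELETE_LOCAL_AFTER_S3_UPLOAD come from the text above.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def cleanup_old_mp3s(directory, retention_days, now=None):
    """Delete .mp3 files older than retention_days; return the deleted names."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * SECONDS_PER_DAY
    deleted = []
    for path in Path(directory).glob("*.mp3"):
        # Compare the last modification time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)


def should_delete_after_upload(env, upload_succeeded):
    """Delete the local copy only after a successful upload, if configured."""
    flag = env.get("DELETE_LOCAL_AFTER_S3_UPLOAD", "false").strip().lower()
    return upload_succeeded and flag == "true"
```

A call such as cleanup_old_mp3s("mp3", int(os.environ["MP3_RETENTION_DAYS"])) would apply the configured retention, assuming the files live in a directory named mp3.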

What dependencies do I need to install?

You need to install ffmpeg, which handles the .wav to .mp3 conversion, and download the Kokoro ONNX weights into the cloned repository directory.
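
To show why ffmpeg is required, the .wav to .mp3 step can be sketched as a subprocess call. This is an assumption about how such a conversion is typically done, not the server's actual code; the function names are hypothetical, and the chosen encoder options assume an ffmpeg build that includes libmp3lame.

```python
import subprocess


def build_ffmpeg_command(wav_path, mp3_path):
    """Build the ffmpeg argument list for a .wav to .mp3 conversion."""
    return [
        "ffmpeg",
        "-y",                       # overwrite the output file if it exists
        "-i", str(wav_path),        # input file
        "-codec:a", "libmp3lame",   # MP3 encoder
        "-qscale:a", "2",           # variable bitrate quality level
        str(mp3_path),
    ]


def convert_wav_to_mp3(wav_path, mp3_path):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(
        build_ffmpeg_command(wav_path, mp3_path),
        check=True,
        capture_output=True,
    )
    return mp3_path
```

If ffmpeg is not installed, subprocess.run raises FileNotFoundError, which is the failure this dependency note is meant to prevent.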