Voice Recorder MCP Server

by DefiBax

An MCP server designed for recording audio from a microphone and transcribing it using OpenAI's Whisper model. It can be used as a standalone server or integrated as a custom extension with Goose AI.

An MCP server for recording audio and transcribing it using OpenAI's Whisper model. It is designed to work either as a Goose custom extension or as a standalone MCP server.

Features

  • Record audio from the default microphone
  • Transcribe recordings using Whisper
  • Integrates with Goose AI agent as a custom extension
  • Includes prompts for common recording scenarios

Installation

# Install from source
git clone https://github.com/DefiBax/voice-recorder-mcp.git
cd voice-recorder-mcp
pip install -e .

Usage

As a Standalone MCP Server

# Run with default settings (base.en model)
voice-recorder-mcp

# Use a specific Whisper model
voice-recorder-mcp --model medium.en

# Set the audio sample rate in Hz (default: 16000)
voice-recorder-mcp --sample-rate 44100
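
When the server is started from a script instead of a shell, the same options can be assembled programmatically. The following is a minimal sketch, not part of the project: the helper name build_command is hypothetical, and it assumes the --model and --sample-rate flags can be combined in one invocation, which the examples above do not confirm.

```python
import shlex
from typing import List, Optional


def build_command(model: Optional[str] = None,
                  sample_rate: Optional[int] = None,
                  executable: str = "voice-recorder-mcp") -> List[str]:
    """Assemble an argument list using only the flags documented above.

    Hypothetical helper: combining both flags in one call is an assumption.
    """
    cmd = [executable]
    if model is not None:
        cmd += ["--model", model]
    if sample_rate is not None:
        cmd += ["--sample-rate", str(sample_rate)]
    return cmd


if __name__ == "__main__":
    # Print the command instead of launching it, so the sketch has no side effects.
    print(shlex.join(build_command(model="medium.en", sample_rate=44100)))
```

The returned list can be passed directly to subprocess.Popen, which avoids shell quoting issues.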

Testing with MCP Inspector

The MCP Inspector provides an interactive interface to test your server:

# Install the MCP Inspector globally (optional: npx fetches it on demand)
npm install -g @modelcontextprotocol/inspector

# Run your server with the inspector
npx @modelcontextprotocol/inspector voice-recorder-mcp

With Goose AI Agent

  1. Open Goose and go to Settings > Extensions > Add > Command Line Extension

  2. Set the name to voice-recorder

  3. In the Command field, enter the full path to the voice-recorder-mcp executable:

    /full/path/to/voice-recorder-mcp
    

    Or for a specific model:

    /full/path/to/voice-recorder-mcp --model medium.en
    

    To find the path, run:

    which voice-recorder-mcp
    
  4. No environment variables are needed for basic functionality

  5. Start a conversation with Goose and introduce the recorder with: "I want you to take action from transcriptions returned by voice-recorder. For example, if I dictate a calculation like 1+1, please return the result."
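
The path lookup in step 3 can also be done from Python, which is handy when the server is installed in a virtual environment. This is a minimal sketch under stated assumptions: goose_command is a hypothetical helper, and the lookup only succeeds when it runs in the same environment where voice-recorder-mcp was installed.

```python
import shutil
from typing import Callable, Optional


def goose_command(model: Optional[str] = None,
                  executable: str = "voice-recorder-mcp",
                  lookup: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return a string suitable for the Goose Command field.

    Hypothetical helper: `lookup` defaults to shutil.which, the Python
    counterpart of the `which` shell command used in step 3.
    """
    path = lookup(executable)
    if path is None:
        raise FileNotFoundError(
            f"{executable} was not found on PATH; check that the package "
            "is installed in the active environment"
        )
    return path if model is None else f"{path} --model {model}"


if __name__ == "__main__":
    try:
        print(goose_command(model="medium.en"))
    except FileNotFoundError as err:
        print(err)
```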

Available Tools

  • start_recording: Start recording audio from the default microphone
  • stop_and_transcribe: Stop recording and transcribe the audio to text
  • record_and_transcribe: Record audio for a specified duration and transcribe it

Whisper Models

The server supports the following Whisper model sizes:

| Model | Speed | Accuracy | Memory Usage | Use Case |
|-------|-------|----------|--------------|----------|
| tiny.en | Fastest | Lowest | Minimal | Testing, quick transcriptions |
| base.en | Fast | Good | Low | Everyday use (default) |
| small.en | Medium | Better | Moderate | Good balance |
| medium.en | Slow | High | High | Important recordings |
| large | Slowest | Highest | Very High | Critical transcriptions |

The .en suffix marks English-only models, which tend to be more accurate on English audio than the multilingual model of the same size, particularly at the tiny and base sizes. The large model is multilingual and has no English-only variant.

Requirements

  • Python 3.12+
  • An audio input device (microphone)

Configuration

You can configure the server using environment variables:

# Set Whisper model
export WHISPER_MODEL=small.en

# Set audio sample rate (Hz)
export SAMPLE_RATE=44100

# Set maximum recording duration (seconds)
export MAX_DURATION=120

# Then run the server
voice-recorder-mcp
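
To make the mapping from environment variables to settings concrete, here is a minimal sketch of how a server could read them. It is not the project's actual code: RecorderConfig and load_config are hypothetical names, the base.en and 16000 defaults come from this document, and the MAX_DURATION default of 60 seconds is a placeholder because the document does not state one.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RecorderConfig:
    whisper_model: str
    sample_rate: int   # Hz
    max_duration: int  # seconds


def _positive_int(env: Mapping[str, str], name: str, default: int) -> int:
    """Read an integer variable, falling back to a default when unset."""
    value = int(env.get(name, default))
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value


def load_config(env: Mapping[str, str] = os.environ) -> RecorderConfig:
    """Build a config from the three variables documented above."""
    return RecorderConfig(
        whisper_model=env.get("WHISPER_MODEL", "base.en"),
        sample_rate=_positive_int(env, "SAMPLE_RATE", 16000),
        # 60 is an assumed placeholder; the real default is not documented here.
        max_duration=_positive_int(env, "MAX_DURATION", 60),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing the environment as a mapping keeps the function easy to exercise without touching the real process environment.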

Troubleshooting

Common Issues

  • No audio being recorded: Check your microphone permissions and settings
  • Model download errors: Ensure you have a stable internet connection for the initial model download
  • Goose cannot start the extension: Make sure the Command field contains the full path to the voice-recorder-mcp executable
  • Audio quality issues: Try adjusting the sample rate (default: 16000 Hz)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.