MCP Image Recognition Server
by mario-andreschak
This is an MCP server that provides image recognition capabilities by leveraging Anthropic and OpenAI vision APIs. It supports multiple image formats and offers configurable provider options.
What is MCP Image Recognition Server?
The MCP Image Recognition Server is a tool that allows you to analyze images and generate detailed descriptions using either Anthropic's Claude Vision or OpenAI's GPT-4 Vision. It can be integrated into MCP systems to provide image understanding capabilities.
How to use MCP Image Recognition Server?
First, clone the repository and configure the .env file with your API keys and desired settings. Then build the project and start the server with either python -m image_recognition_server.server or run.bat server. Once the server is running, use the describe_image tool for Base64-encoded images or the describe_image_from_file tool for image files to get descriptions.
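Since describe_image takes Base64-encoded input, the image bytes have to be encoded before the call. The following is a minimal sketch using only the Python standard library. The helper name encode_image is hypothetical and not part of the server, and the argument names the tool actually expects are not stated here, so check the tool schema before wiring this in.

```python
import base64
import mimetypes
from pathlib import Path


def encode_image(path: str) -> tuple[str, str]:
    """Return (base64_text, mime_type) for an image file.

    Hypothetical helper for illustration; not part of the server.
    """
    data = Path(path).read_bytes()
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime_type, _ = mimetypes.guess_type(path)
    encoded = base64.b64encode(data).decode("ascii")
    # Fall back to a generic type when the extension is not recognized.
    return encoded, mime_type or "application/octet-stream"
```

The returned string can be passed as the image payload; for images that already exist on disk, describe_image_from_file avoids the encoding step entirely.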
Key features of MCP Image Recognition Server
Image description using Anthropic Claude Vision or OpenAI GPT-4 Vision
Support for multiple image formats (JPEG, PNG, GIF, WebP)
Configurable primary and fallback providers
Base64 and file-based image input support
Optional text extraction using Tesseract OCR
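The primary and fallback provider feature can be pictured as a simple try-then-retry pattern. The sketch below is an assumption about the general shape of such logic, not the server's actual implementation. The function name describe_with_fallback and the callable-based provider interface are hypothetical.

```python
from typing import Callable, Optional

# A provider is modeled here as any callable that takes Base64 image
# text and returns a description (hypothetical interface).
Provider = Callable[[str], str]


def describe_with_fallback(
    image_b64: str,
    primary: Provider,
    fallback: Optional[Provider] = None,
) -> str:
    """Try the primary provider; if it raises, try the fallback if set."""
    try:
        return primary(image_b64)
    except Exception:
        if fallback is None:
            # No fallback configured: surface the original error.
            raise
        return fallback(image_b64)
```

In this shape, a failure of the primary provider (for example a rate limit or network error) is only visible to the caller when no fallback is configured or the fallback fails as well.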
Use cases of MCP Image Recognition Server
Automated image tagging and categorization
Content moderation and safety analysis
Image-based search and retrieval
Accessibility improvements through image descriptions
FAQ from MCP Image Recognition Server
What API keys do I need?
You need an Anthropic API key or an OpenAI API key, depending on which provider you want to use. If you want to route requests through OpenRouter, you also need an OpenRouter API key.
How do I enable OCR?
Set ENABLE_OCR to true in your .env file and ensure that Tesseract OCR is installed on your system.
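Both conditions for OCR (the ENABLE_OCR flag and a Tesseract installation) can be checked before starting the server. This is a small sketch using the standard library. The function names are hypothetical, and treating only the case-insensitive string "true" as enabled is an assumption about how the flag is parsed.

```python
import os
import shutil
from typing import Mapping, Optional


def ocr_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Interpret ENABLE_OCR as a boolean (assumed: 'true', any case)."""
    env = os.environ if env is None else env
    return env.get("ENABLE_OCR", "false").strip().lower() == "true"


def tesseract_available() -> bool:
    """Check whether a `tesseract` executable is found on PATH."""
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    if ocr_enabled() and not tesseract_available():
        print("ENABLE_OCR is true, but tesseract was not found on PATH")
```

Note that this reads the process environment, so values that exist only in the .env file need to be loaded first.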
What is OpenRouter?
OpenRouter allows you to access various models using the OpenAI API format. To use it, set OPENAI_API_KEY and OPENAI_BASE_URL to the OpenRouter values, and set VISION_PROVIDER to openai.
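The three settings involved in an OpenRouter setup can be generated as .env lines. The sketch below assumes OpenRouter's OpenAI-compatible endpoint is https://openrouter.ai/api/v1, which should be verified against OpenRouter's documentation. The helper names are hypothetical and the key shown is a placeholder.

```python
def openrouter_settings(api_key: str) -> dict[str, str]:
    """Build the settings described above for an OpenRouter setup."""
    return {
        "OPENAI_API_KEY": api_key,  # your OpenRouter key goes here
        # Assumed base URL; confirm against OpenRouter's documentation.
        "OPENAI_BASE_URL": "https://openrouter.ai/api/v1",
        "VISION_PROVIDER": "openai",
    }


def to_dotenv(settings: dict[str, str]) -> str:
    """Render settings as KEY=value lines suitable for a .env file."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print(to_dotenv(openrouter_settings("your-openrouter-key")))
```

The provider stays openai because OpenRouter is reached through the OpenAI API format; only the key and base URL change.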
What are the default models?
The default models are claude-3.5-sonnet-beta for Anthropic and gpt-4o-mini for OpenAI.
How do I run the tests?
Use the run.bat test command to run all tests, or run.bat test <suite> to run a specific test suite (e.g., run.bat test server).