omniparser-autogui-mcp
by NON906
This is an MCP server that analyzes the screen with OmniParser and automatically operates the GUI. It is confirmed to work on Windows.
What is omniparser-autogui-mcp?
This is an MCP (Model Context Protocol) server that leverages OmniParser to analyze the screen and automate GUI operations.
How to use omniparser-autogui-mcp?
1. Clone the repository recursively and navigate to the directory.
2. Run uv sync to install dependencies.
3. Set the OCR_LANG environment variable.
4. Run download_models.py to download the necessary models.
5. Add the server configuration to your claude_desktop_config.json file, adjusting the path to the cloned repository.
6. Configure additional environment variables as needed for specific use cases.
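As a rough illustration of the configuration step, the sketch below builds an entry in the standard mcpServers layout that Claude Desktop reads from claude_desktop_config.json. The server key omniparser_autogui_mcp, the entry-point name omniparser-autogui-mcp passed to uv run, the OCR_LANG value en, and the clone path are all assumptions made for this example; check the repository's README for the exact values.

```python
import json


def build_server_entry(repo_path, ocr_lang="en", extra_env=None):
    """Return one MCP server entry in the Claude Desktop config layout."""
    env = {"OCR_LANG": ocr_lang}  # "en" is an assumed example value
    env.update(extra_env or {})
    return {
        "command": "uv",
        # `uv --directory <path> run <command>` runs the command inside the
        # project located at <path>. The entry-point name is an assumption.
        "args": ["--directory", repo_path, "run", "omniparser-autogui-mcp"],
        "env": env,
    }


def build_config(repo_path, ocr_lang="en", extra_env=None):
    """Wrap the server entry under the top-level "mcpServers" key."""
    entry = build_server_entry(repo_path, ocr_lang, extra_env)
    # The key "omniparser_autogui_mcp" is a hypothetical label for this server.
    return {"mcpServers": {"omniparser_autogui_mcp": entry}}


if __name__ == "__main__":
    # Hypothetical clone location; replace it with the real path.
    config = build_config(r"C:\path\to\omniparser-autogui-mcp")
    print(json.dumps(config, indent=2))
```

The printed JSON can be merged by hand into an existing claude_desktop_config.json; generating it with json.dumps avoids mistakes when escaping backslashes in Windows paths.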
Key features of omniparser-autogui-mcp
Screen analysis using OmniParser
Automated GUI operation
MCP server implementation
Configurable target window
Support for remote OmniParser server
SSE communication option
Use cases of omniparser-autogui-mcp
Automating browser tasks (e.g., searching)
Interacting with desktop applications
Creating automated workflows based on screen content
Integrating with other MCP-compatible clients
FAQ from omniparser-autogui-mcp
What is OmniParser?
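OmniParser is a screen-parsing tool from Microsoft that detects interactable elements (such as buttons, icons, and text) in a screenshot. This server uses it to analyze the screen before operating the GUI.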
What is MCP?
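MCP stands for Model Context Protocol. It is an open protocol that lets AI applications (clients such as Claude Desktop) connect to external tools and data sources exposed by servers like this one.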
How do I specify a target window?
Set the TARGET_WINDOW_NAME environment variable to the name of the window you want to operate on.
How do I use a remote OmniParser server?
Set the OMNI_PARSER_SERVER environment variable to the address and port of the remote server.
What if it doesn't work with other clients like LibreChat?
Set the OMNI_PARSER_BACKEND_LOAD environment variable to 1.
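The optional environment variables named in the answers above (TARGET_WINDOW_NAME, OMNI_PARSER_SERVER, OMNI_PARSER_BACKEND_LOAD) can be collected into the env mapping of a server entry. The following is a minimal sketch, not the project's own code: the helper name optional_env, the window title, and the server address are hypothetical, and how the server matches the window name (exact or partial) is not stated here.

```python
def optional_env(target_window=None, omniparser_server=None, backend_load=False):
    """Build a dict of the optional environment variables that were requested."""
    env = {}
    if target_window:
        # Name of the window the server should operate on.
        env["TARGET_WINDOW_NAME"] = target_window
    if omniparser_server:
        # Address and port of a remote OmniParser server, e.g. "host:port".
        env["OMNI_PARSER_SERVER"] = omniparser_server
    if backend_load:
        # Suggested when the server does not work with clients such as LibreChat.
        env["OMNI_PARSER_BACKEND_LOAD"] = "1"
    return env


if __name__ == "__main__":
    # Both values below are hypothetical placeholders.
    print(optional_env(target_window="Untitled - Notepad",
                       omniparser_server="127.0.0.1:8000"))
```

Environment variable values are always strings, which is why the flag is stored as "1" rather than the integer 1.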