
mcp-server-macos-use

by mediar-ai

mcp-server-macos-use is a Model Context Protocol (MCP) server, written in Swift, that controls macOS applications through the accessibility APIs. Clients interact with applications by sending it MCP commands over standard input/output.



What is mcp-server-macos-use?

This MCP server sits between an MCP client and the macOS applications it controls. It translates the client's tool calls into accessibility API actions, so applications can be driven programmatically.

How to use mcp-server-macos-use?

Build the server with swift build, then configure your MCP client (e.g., Claude Desktop) to launch the server executable. The server reads MCP commands from stdin and writes responses to stdout. To use a tool, call the CallTool MCP method (tools/call on the wire) with the tool name and its parameters.
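As a rough illustration of the client configuration step, the sketch below generates an entry for Claude Desktop's claude_desktop_config.json, which lists stdio servers under the mcpServers key. The executable path is a placeholder and assumes the debug binary that swift build usually writes under .build/debug; the helper name claude_desktop_entry is hypothetical.

```python
import json


def claude_desktop_entry(executable: str, server_name: str = "mcp-server-macos-use") -> dict:
    """Build an mcpServers entry that launches the server as a stdio subprocess."""
    return {
        "mcpServers": {
            server_name: {
                "command": executable,  # absolute path to the built binary
                "args": [],             # assumption: no extra arguments are needed
            }
        }
    }


# Placeholder path: adjust to wherever the repository was cloned and built.
executable = "/path/to/mcp-server-macos-use/.build/debug/mcp-server-macos-use"
print(json.dumps(claude_desktop_entry(executable), indent=2))
```

Merge the printed object into the existing configuration file rather than overwriting it, so other configured servers are kept.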

Key features of mcp-server-macos-use

  • Control macOS applications via accessibility APIs

  • Provides tools for opening applications, clicking, typing, pressing keys, and refreshing the accessibility traversal

  • Communicates via stdin/stdout

  • Configurable with MCP clients like Claude Desktop
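To make the stdin/stdout communication concrete: in the MCP stdio transport, each JSON-RPC message is written as a single line terminated by a newline. The sketch below shows that framing only. The helper names encode_message and decode_messages are hypothetical, and a real session would start with the MCP initialize handshake before any other request.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the line stays intact.
    line = json.dumps(message, separators=(",", ":"))
    return (line + "\n").encode("utf-8")


def decode_messages(buffer: bytes) -> list:
    """Split a byte buffer into the JSON-RPC messages it contains."""
    lines = buffer.decode("utf-8").split("\n")
    return [json.loads(line) for line in lines if line.strip()]


# A standard MCP request asking the server which tools it offers.
list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
wire = encode_message(list_tools)
print(wire)
```

A client would write these bytes to the server process's stdin and read response lines from its stdout.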

Use cases of mcp-server-macos-use

  • Automating tasks in macOS applications

  • Integrating macOS applications with AI models

  • Creating accessibility tools

  • Testing macOS applications

FAQ about mcp-server-macos-use

What is the purpose of the macos-use_open_application_and_traverse tool?

This tool opens or activates a specified application and then traverses its accessibility tree, providing information about the application's UI elements.

How do I specify the application to interact with?

It depends on the tool. Tools that act on an application that is already running take its process ID (PID). Tools that open an application take its name, bundle ID, or file path.
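The two ways of addressing an application can be sketched as tools/call requests. The tool names come from this page, but the argument keys identifier and pid are assumptions and should be checked against the schemas the server reports via tools/list. The PID value and the call_tool helper are purely illustrative.

```python
import json


def call_tool(request_id: int, name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Addressing by name, bundle ID, or path ("identifier" is an assumed key).
open_request = call_tool(
    1,
    "macos-use_open_application_and_traverse",
    {"identifier": "com.apple.TextEdit"},
)

# Addressing a running application by PID ("pid" is an assumed key).
refresh_request = call_tool(2, "macos-use_refresh_traversal", {"pid": 12345})

print(json.dumps(open_request, indent=2))
print(json.dumps(refresh_request, indent=2))
```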

What are modifier flags in the macos-use_press_key_and_traverse tool?

Modifier flags are keys like 'Shift', 'Control', 'Option', and 'Command' that can be held down while pressing another key.
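As an example, a Command+S shortcut might be requested as follows. This is a sketch under assumptions: the argument keys pid, keyName, and modifierFlags, and the exact spelling the server accepts for each modifier, are not confirmed by this page. The press_key_request helper is hypothetical.

```python
# Modifier names as listed in the FAQ answer above.
ALLOWED_MODIFIERS = ("Shift", "Control", "Option", "Command")


def press_key_request(request_id: int, pid: int, key_name: str, modifiers=()) -> dict:
    """Build a tools/call request that presses a key with optional modifiers held."""
    unknown = [m for m in modifiers if m not in ALLOWED_MODIFIERS]
    if unknown:
        raise ValueError(f"unsupported modifier flags: {unknown}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "macos-use_press_key_and_traverse",
            # Assumed argument keys; verify against the server's tool schema.
            "arguments": {
                "pid": pid,
                "keyName": key_name,
                "modifierFlags": list(modifiers),
            },
        },
    }


# Command+S in the application with the (illustrative) PID 12345.
save_request = press_key_request(3, 12345, "s", ["Command"])
print(save_request)
```

Validating modifier names on the client side is a design choice made here to fail early; the server may apply its own rules.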

What is the macos-use_refresh_traversal tool used for?

It performs an accessibility tree traversal for the specified application without performing any action. It's useful for getting the current UI state.

Where can I get help or request customizations?

You can reach out to [email protected] or on Discord at m13v_. You can also open an issue on the GitHub repository.