Computer Use AI SDK

by m13v

The Computer Use AI SDK is an open-source alternative to OpenAI's Operator and Claude's computer use that lets you build agents that control your computer natively on macOS. It works with the underlying desktop-rendered elements rather than screenshots, which makes interaction faster and more reliable than with pixel-based vision models.

What is Computer Use AI SDK?

The Computer Use AI SDK is a set of tools and libraries that enable AI agents to interact with and control a computer, specifically macOS, without relying on virtual machines or pixel-based vision. It provides computational primitives for AI to perform tasks on your computer.

How to use Computer Use AI SDK?

To use the SDK, clone the repository, install Rust and Node.js, and start the backend server. Then pick either the CLI interface or the web app interface: set your Anthropic API key, install that interface's dependencies, and run it. The README provides detailed instructions for each option.

Key features of Computer Use AI SDK

  • Native macOS support

  • Desktop-rendered element interaction (no pixel-based vision)

  • Tools for launching apps, reading content, clicking, entering text, and pressing keys

  • Simple Hello World Template for getting started

  • CLI and Web app interfaces
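
The tool categories listed above can be pictured as a small set of typed actions. The sketch below is illustrative only: the `ToolAction` type, its field names, and `describeAction` are hypothetical and not taken from the SDK, so check the README for the real tool names and parameters.

```typescript
// Hypothetical model of the five tool categories listed above.
// None of these names come from the SDK itself.
type ToolAction =
  | { kind: "launchApp"; appName: string }
  | { kind: "readContent" }
  | { kind: "click"; elementLabel: string }
  | { kind: "enterText"; elementLabel: string; text: string }
  | { kind: "pressKey"; key: string };

// Turn an action into a readable line, e.g. to log a plan
// before an agent executes it.
function describeAction(action: ToolAction): string {
  switch (action.kind) {
    case "launchApp":
      return `launch ${action.appName}`;
    case "readContent":
      return "read content";
    case "click":
      return `click "${action.elementLabel}"`;
    case "enterText":
      return `type "${action.text}" into "${action.elementLabel}"`;
    case "pressKey":
      return `press ${action.key}`;
  }
}

// A made-up plan that uses four of the five action kinds.
const plan: ToolAction[] = [
  { kind: "launchApp", appName: "Notes" },
  { kind: "click", elementLabel: "New Note" },
  { kind: "enterText", elementLabel: "Note body", text: "hello" },
  { kind: "pressKey", key: "Return" },
];

const planLines: string[] = plan.map(describeAction);
```

Modelling the actions as a discriminated union lets the compiler flag any action kind that the `switch` forgets to handle.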

Use cases of Computer Use AI SDK

  • Build custom workflows of agents to perform various actions

  • Build custom UI to make it easy for users to automate their computer work

  • Save workflows and run them on a schedule with cron

  • Combine with other MCP servers, e.g. to fill out a Google Sheet based on the history of people you talk to throughout the day
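
One way to make a workflow saveable, as the list above suggests, is to keep it as plain data that can be written to a file and validated when it is loaded again. This is a minimal sketch under assumptions: the `Workflow` and `WorkflowStep` shapes and the `saveWorkflow` / `loadWorkflow` helpers are hypothetical, and the SDK does not prescribe this format.

```typescript
// Hypothetical saved-workflow format; not defined by the SDK.
interface WorkflowStep {
  tool: string;  // illustrative tool name, e.g. "click"
  input: string; // illustrative argument for that tool
}

interface Workflow {
  name: string;
  steps: WorkflowStep[];
}

// Serialize a workflow to JSON text that could be written to disk.
function saveWorkflow(workflow: Workflow): string {
  return JSON.stringify(workflow, null, 2);
}

// Type guard: true only if the value has string `tool` and `input` fields.
function isStep(value: unknown): value is WorkflowStep {
  if (typeof value !== "object" || value === null) return false;
  const step = value as { tool?: unknown; input?: unknown };
  return typeof step.tool === "string" && typeof step.input === "string";
}

// Parse JSON text and reject anything that is not a well-formed workflow.
function loadWorkflow(json: string): Workflow {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("workflow must be a JSON object");
  }
  const candidate = data as { name?: unknown; steps?: unknown };
  if (typeof candidate.name !== "string") {
    throw new Error("workflow needs a string name");
  }
  if (!Array.isArray(candidate.steps) || !candidate.steps.every(isStep)) {
    throw new Error("workflow steps must each have a string tool and input");
  }
  return { name: candidate.name, steps: candidate.steps };
}

// Made-up example workflow, round-tripped through JSON.
const dailyReport: Workflow = {
  name: "daily-report",
  steps: [
    { tool: "launchApp", input: "Notes" },
    { tool: "enterText", input: "Report for today" },
  ],
};

const restored: Workflow = loadWorkflow(saveWorkflow(dailyReport));
```

A scheduled job (for example a cron entry) could then start a small runner script that loads the saved file and hands each step to the agent. That runner is left out here because it depends on the SDK's actual interface.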

FAQ from Computer Use AI SDK

What is the primary advantage of this SDK over pixel-based vision models?

It relies on underlying desktop-rendered elements, making it much faster and far more reliable.
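
To illustrate the idea, the sketch below assumes the desktop UI is available as a tree of elements with roles and labels, so a target can be found by an exact lookup rather than by interpreting a screenshot. The `UiElement` shape, the sample tree, and `findElement` are hypothetical and do not reflect the SDK's actual data model.

```typescript
// Hypothetical element tree; the SDK's real representation may differ.
interface UiElement {
  role: string;
  label: string;
  children: UiElement[];
}

// Depth-first search for the first element matching both role and label.
function findElement(
  root: UiElement,
  role: string,
  label: string
): UiElement | undefined {
  if (root.role === role && root.label === label) return root;
  for (const child of root.children) {
    const hit = findElement(child, role, label);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

// Made-up window with a text field and a button nested in a toolbar.
const sampleWindow: UiElement = {
  role: "window",
  label: "Compose",
  children: [
    { role: "textField", label: "Message", children: [] },
    {
      role: "toolbar",
      label: "Actions",
      children: [{ role: "button", label: "Send", children: [] }],
    },
  ],
};

const sendButton = findElement(sampleWindow, "button", "Send");
const missing = findElement(sampleWindow, "button", "Cancel");
```

Because the lookup matches on structured properties, it either finds the element or reports that it is absent. A pixel-based approach has to infer both from image content.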

What operating system is natively supported?

macOS

What are some of the tools provided by the MCP Server?

Launching apps, reading content, clicking, entering text, and pressing keys.

How can I contribute to the project?

Request features and endpoints through GitHub issues.

Where can I find the Hello World Template?

It is available within the SDK repository. Follow the 'Get Started' instructions in the README.