
Steel MCP Server

by steel-dev

The Steel MCP Server enables LLMs like Claude to navigate the web using Puppeteer-based tools and Steel. It provides tools for standard web actions like clicking, scrolling, typing, and taking screenshots, based on the Web Voyager framework.



What is Steel MCP Server?

The Steel MCP Server is a Model Context Protocol server that allows Large Language Models (LLMs) like Claude to interact with and navigate the web. It leverages Puppeteer for browser automation and Steel for session management, providing a suite of tools for performing various web-based tasks.

How to use Steel MCP Server?

To use the Steel MCP Server, you need to configure Claude Desktop with the server details, including the command to run the server and environment variables for either local or cloud mode. You can then interact with the server through Claude, asking it to perform tasks like searching, navigating, clicking, and filling out forms. Detailed instructions are provided in the Quick Start section of the README.
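As a rough illustration of the configuration step, the sketch below builds an `mcpServers` entry of the kind Claude Desktop reads from its config file and prints it as JSON. The `node` command, the server path, the `steel-puppeteer` entry name, and the placeholder API key are assumptions made for the example; the README's Quick Start section has the actual values.

```python
import json


def build_server_entry(server_path: str, local: bool, api_key: str = "") -> dict:
    """Build one `mcpServers` entry: the command to run plus its environment."""
    env = {"STEEL_LOCAL": "true" if local else "false"}
    if not local:
        # Cloud mode needs an API key (see the FAQ on cloud mode).
        env["STEEL_API_KEY"] = api_key
    return {
        "command": "node",      # assumption: the server is started with Node.js
        "args": [server_path],  # assumption: path to the built server entry point
        "env": env,
    }


# Hypothetical entry name, path, and key; replace with real values.
config = {
    "mcpServers": {
        "steel-puppeteer": build_server_entry(
            "/path/to/steel-mcp-server/dist/index.js",
            local=False,
            api_key="YOUR_STEEL_API_KEY",
        )
    }
}

print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the Claude Desktop configuration file; switching `local=True` drops the API key and sets STEEL_LOCAL to "true".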

Key features of Steel MCP Server

  • Browser automation with Puppeteer

  • Steel integration for browser session management

  • Visual element identification through numbered labels

  • Screenshot capabilities

  • Basic web interaction (navigation, clicking, form filling)

  • Lazy-loading support through scrolling

  • Local and remote Steel instance support

Use cases of Steel MCP Server

  • Search for a recipe and save the ingredients list

  • Track a package delivery status

  • Find and compare prices for a specific product

  • Fill out an online job application

FAQ from Steel MCP Server

How do I run the server in cloud mode?

Set STEEL_LOCAL to "false" and provide a valid STEEL_API_KEY in the environment variables.

How do I run the server locally?

Set STEEL_LOCAL to "true" and ensure your local Steel service is running. Optionally, set STEEL_BASE_URL to point to your local Steel server.
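The two modes described above can be checked before launching the server. The following is a minimal sketch, not the server's own logic: the `resolve_mode` helper is hypothetical, and treating an unset STEEL_LOCAL as cloud mode is an assumption made for the example.

```python
def resolve_mode(env: dict) -> dict:
    """Decide between local and cloud mode from environment-style settings."""
    # Assumption: anything other than "true" (case-insensitive) means cloud mode.
    local = env.get("STEEL_LOCAL", "").strip().lower() == "true"

    if local:
        settings = {"mode": "local"}
        # STEEL_BASE_URL is optional in local mode.
        if env.get("STEEL_BASE_URL"):
            settings["base_url"] = env["STEEL_BASE_URL"]
        return settings

    # Cloud mode requires a valid API key.
    if not env.get("STEEL_API_KEY"):
        raise ValueError("cloud mode requires STEEL_API_KEY")
    return {"mode": "cloud", "api_key": env["STEEL_API_KEY"]}
```

For example, passing `dict(os.environ)` to `resolve_mode` surfaces a missing API key as an error up front, rather than as a failure during the first browser action.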

What is the purpose of the numbered labels on the web page?

The numbered labels identify interactive elements (buttons, links, inputs) for click and type operations.

How do I capture a screenshot?

Use the 'save_unmarked_screenshot' tool and specify a resource name. The screenshot will be stored as a resource accessible via an MCP resource URI.
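Claude normally issues this call itself, but for reference, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for the screenshot tool. The argument key `resourceName` is an assumption; check the tool's input schema for the real parameter name.

```python
import json


def build_screenshot_request(request_id: int, resource_name: str) -> str:
    """Serialize a JSON-RPC `tools/call` request for the screenshot tool."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "save_unmarked_screenshot",
            # Assumption: the tool takes the resource name under this key.
            "arguments": {"resourceName": resource_name},
        },
    }
    return json.dumps(request)


print(build_screenshot_request(1, "homepage"))
```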

Why is Claude slowing down after several browser actions?

After roughly 15 to 20 browser actions, Claude starts to slow down as its context window fills with images. Expect some latency, especially when the Claude Desktop client lags behind.