
MCP Databricks

by leminkhoa

MCP Databricks connects AI assistants to Databricks workspaces through the Model Context Protocol (MCP). Its tools let assistants manage compute resources, execute SQL queries, and manipulate workspace objects.


What is MCP Databricks?

MCP Databricks is a server that integrates Databricks with AI assistants using the Model Context Protocol. Its toolkit covers clusters, libraries, command execution, SQL warehouses, and workspace objects, letting AI assistants inspect and control Databricks resources.

How to use MCP Databricks?

To use MCP Databricks, clone the repository, set environment variables with your Databricks credentials, and install the server either via Docker or locally with uv. Then configure your MCP client (e.g., Cursor) to launch and connect to the server.
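As a concrete sketch, an MCP client configuration over stdio might look like the JSON below. The server name, the `uv run` launch command, and the entry-point name `mcp-databricks` are assumptions for illustration, not confirmed values; consult the repository's README for the exact command and arguments.

```json
{
  "mcpServers": {
    "databricks": {
      "command": "uv",
      "args": ["run", "mcp-databricks"],
      "env": {
        "DATABRICKS_HOST": "https://<your-workspace>.cloud.databricks.com",
        "DATABRICKS_TOKEN": "<your-personal-access-token>"
      }
    }
  }
}
```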

Key features of MCP Databricks

  • Cluster Management (list, create, delete, start, get)

  • Library Management (install libraries)

  • Command Execution (Python, Scala, SQL)

  • SQL Warehouse Management (list, create)

  • Workspace Object Management (delete, get status, import, create directory)

Use cases of MCP Databricks

  • Automating Databricks cluster management via AI assistants

  • Executing SQL queries and analyzing results through natural language commands

  • Managing Databricks workspace objects using AI-powered interfaces

  • Integrating Databricks workflows with AI-driven automation pipelines

FAQ from MCP Databricks

What is the Model Context Protocol (MCP)?

MCP is a protocol that enables communication between AI assistants and external tools and services, allowing assistants to interact with and control those services programmatically.

What are the prerequisites for using MCP Databricks?

You need Python 3.11 or higher, a Databricks workspace, a Databricks Personal Access Token (PAT), and the required Python packages installed.

How do I configure the server?

Create a .env file in the project root with your Databricks host, token, server host, port, debug mode, transport method, and log level.
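A minimal sketch of such a file is shown below. The variable names are assumed conventions, not confirmed keys; check the project's own example file or documentation for the exact names.

```
# Assumed variable names -- verify against the project's documentation
DATABRICKS_HOST=https://<your-workspace>.cloud.databricks.com
DATABRICKS_TOKEN=<your-personal-access-token>
SERVER_HOST=127.0.0.1
SERVER_PORT=8000
DEBUG=false
TRANSPORT=stdio
LOG_LEVEL=INFO
```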

What transport methods are supported?

The server primarily uses the stdio transport for compatibility with Claude Desktop and other MCP clients.

How do I install libraries on a cluster?

The install_libraries tool allows you to install libraries (JAR, WHL, PyPI, Maven, CRAN, etc.) on a running cluster.
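For reference, the underlying Databricks Libraries API (`POST /api/2.0/libraries/install`) takes payloads shaped like the sketch below. The MCP tool's own argument schema may differ, so treat this as an illustration of the library specification format rather than the tool's documented interface.

```json
{
  "cluster_id": "<cluster-id>",
  "libraries": [
    { "pypi": { "package": "scikit-learn==1.4.2" } },
    { "maven": { "coordinates": "com.databricks:spark-xml_2.12:0.16.0" } },
    { "jar": "dbfs:/FileStore/jars/custom-lib.jar" }
  ]
}
```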