parquet_mcp_server
by DeepSpringAI
An MCP server for manipulating and analyzing Parquet files, designed to work with Claude Desktop. It provides text embedding generation, Parquet file analysis, and conversion of Parquet files to DuckDB and PostgreSQL.
What is parquet_mcp_server?
parquet_mcp_server is an MCP (Model Context Protocol) server that provides tools for manipulating and analyzing Parquet files. It integrates with Claude Desktop and can generate text embeddings, report Parquet file metadata, convert Parquet files to DuckDB databases, convert Parquet files to PostgreSQL tables with pgvector support, and process Markdown files into structured chunks.
How to use parquet_mcp_server?
To use the server, install it via Smithery or by cloning the repository and setting up the environment. Configure Claude Desktop to use the server by adding it to the claude_desktop_config.json file. Then, use the available tools by sending appropriate prompts to the agent, specifying the required parameters for each tool.
Key features of parquet_mcp_server
Text Embedding Generation using Ollama models
Parquet File Analysis (schema, row count, file size)
DuckDB Integration for efficient querying
PostgreSQL Integration with pgvector support for vector similarity search
Markdown Processing to chunk text with metadata
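As an illustration of the Markdown processing feature, here is a minimal sketch of heading-based chunking with metadata. It is not the server's actual implementation: the function name chunk_markdown and the metadata keys (source, heading, level, text) are assumptions chosen for this example, and the server may split text differently.

```python
import re

# Matches ATX headings such as "# Title" or "### Details".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text, source="document.md"):
    """Split Markdown into one chunk per heading section.

    Hypothetical helper: each chunk records the file it came from, the
    heading it sits under, and the heading level. Text before the first
    heading gets heading=None and level=0. Lines starting with '#' inside
    fenced code blocks are not special-cased in this sketch.
    """
    chunks = []
    heading, level, body = None, 0, []

    def flush():
        # Emit the section collected so far, skipping empty sections.
        section = "\n".join(body).strip()
        if section:
            chunks.append(
                {"source": source, "heading": heading, "level": level, "text": section}
            )

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, level, body = match.group(2).strip(), len(match.group(1)), []
        else:
            body.append(line)
    flush()
    return chunks
```

Each returned dictionary could then be written as one row of a Parquet file, with the text field serving as the column to embed.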
Use cases of parquet_mcp_server
Data scientists working with large Parquet datasets
Applications requiring vector embeddings for text data
Projects needing to analyze or convert Parquet files
Workflows that benefit from DuckDB's fast querying capabilities
Applications requiring vector similarity search with PostgreSQL and pgvector
FAQ from parquet_mcp_server
How do I install the Parquet MCP Server?
You can install it via Smithery using the command npx -y @smithery/cli install @DeepSpringAI/parquet_mcp_server --client claude, or by cloning the repository and following the installation instructions in the README.
What environment variables are required?
You need to create a .env file with variables such as EMBEDDING_URL, OLLAMA_URL, EMBEDDING_MODEL, POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_HOST, and POSTGRES_PORT.
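A quick way to catch a missing setting before starting the server is to check the environment up front. The sketch below assumes the .env values have already been loaded into the process environment (for example with a loader such as python-dotenv). The helper name missing_env_vars is hypothetical, and the variable list covers only the names mentioned above, so the README may require others.

```python
import os

# Variable names taken from the answer above; the README may list more.
REQUIRED_VARS = [
    "EMBEDDING_URL",
    "OLLAMA_URL",
    "EMBEDDING_MODEL",
    "POSTGRES_DB",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_HOST",
    "POSTGRES_PORT",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper: pass a dict to check arbitrary settings, or
    omit the argument to check the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing settings:", ", ".join(missing))
```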
How do I configure Claude Desktop to use this server?
Add the server configuration to your claude_desktop_config.json file, specifying the command and arguments to run the server.
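To show the shape of such an entry, the sketch below builds one in Python and prints it as JSON. Claude Desktop reads MCP servers from the mcpServers key, with a command and an args list per server. The server name, the uv command, and the placeholder path are assumptions for illustration, so take the real values from the project's README. The helper add_server is hypothetical.

```python
import json


def add_server(config, name, command, args):
    """Return a copy of a Claude Desktop config with one MCP server added.

    Hypothetical helper: existing servers and unrelated settings are
    preserved, and the input dictionary is left unmodified.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


# Placeholder values only; the real command and path come from the README.
config = add_server(
    {},
    name="parquet-mcp-server",
    command="uv",
    args=["--directory", "/path/to/parquet_mcp_server", "run", "main.py"],
)
print(json.dumps(config, indent=2))
```

The printed JSON can be merged by hand into the existing claude_desktop_config.json. Restart Claude Desktop afterwards so it picks up the new server.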
What are the available tools?
The server provides tools for generating embeddings from a text column in a Parquet file, getting Parquet file information, converting Parquet files to DuckDB, converting Parquet files to PostgreSQL, and processing Markdown files.
What do I do if embeddings are not generated?
Check that the Ollama server is running and accessible, the specified model is available, and the text column exists in your input Parquet file.
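The first two checks can be scripted. The sketch below queries Ollama's /api/tags endpoint, which lists locally available models, and looks for the configured model. The function names are hypothetical, and it assumes OLLAMA_URL holds the server's base URL (for example http://localhost:11434) and EMBEDDING_MODEL holds an Ollama model name. The server may interpret these variables differently.

```python
import json
import urllib.error
import urllib.request


def model_is_available(tags, model):
    """Return True if `model` appears in an Ollama /api/tags response.

    A model name without a tag is treated as ':latest', which is how
    Ollama resolves untagged names.
    """
    wanted = model if ":" in model else f"{model}:latest"
    return any(entry.get("name") == wanted for entry in tags.get("models", []))


def fetch_tags(ollama_url, timeout=5.0):
    """Fetch the model list from a running Ollama server."""
    url = f"{ollama_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def check_ollama(ollama_url, model):
    """Describe what is wrong, or return 'ok' if the model is available."""
    try:
        tags = fetch_tags(ollama_url)
    except (urllib.error.URLError, OSError) as error:
        return f"Ollama server not reachable at {ollama_url}: {error}"
    if not model_is_available(tags, model):
        return f"Model {model!r} is not available; try `ollama pull {model}`"
    return "ok"
```

Calling check_ollama with the values from your .env file narrows the problem to connectivity or a missing model. If it returns "ok", the remaining suspect is the text column name in the input Parquet file.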