Model-Context-Protocol Servers
by adi2355
This repository contains a collection of specialized Model Context Protocol (MCP) servers designed for various use cases. These servers provide functionalities ranging from data scraping and code analysis to AI model access and JSON manipulation.
Last updated: N/A
Model-Context-Protocol Servers
A collection of specialized MCP (Model Context Protocol) servers for different use cases.
1. Leafly Cannabis Strain Data Scraper
This MCP server implements a specialized scraper for collecting structured cannabis strain data from Leafly.com, following a standardized schema and methodology.
Installation
- Ensure you have Node.js 18+ installed
- Clone the repository
- Install dependencies:
cd firecrawl-mcp-server npm install
- Set up environment variables:
# Copy the example .env file cp .env.example .env # Edit the .env file and add your Firecrawl API key # You can obtain an API key from https://mendable.ai/firecrawl nano .env # or use any text editor
- Build the project:
npm run build
API Key Requirement
Important: This scraper requires a valid Firecrawl API key to function. If you try to run the scraper without a valid API key, you will receive a 401 Unauthorized error.
You can set the API key in one of two ways:
-
Environment Variable:
export FIRECRAWL_API_KEY=your_api_key_here
-
In a .env file:
FIRECRAWL_API_KEY=your_api_key_here
Features
- Scrapes 66 standardized data points for each cannabis strain from Leafly.com
- Follows a consistent methodology for data extraction and normalization
- Handles cases where data is missing or inconsistent
- Exports data in CSV or JSON format
- Built-in fallback mechanisms for strains that aren't directly accessible
Implementation Approaches
Regex-Based Extraction
The repository includes a regex-based scraper that extracts data using pattern matching:
// Extract cannabinoids using regex
function extractCannabinoids(content: string, strainData: StrainData): void {
// THC extraction
const thcMatch = content.match(/THC\s+(\d+(?:\.\d+)?)-?(\d+(?:\.\d+)?)?\s*%/i);
if (thcMatch) {
// Take higher end of range per methodology
const thcValue = thcMatch[2] ? parseFloat(thcMatch[2]) : parseFloat(thcMatch[1]);
strainData["cannabinoids.THC"] = thcValue / 100;
}
// Similar patterns for other cannabinoids
}
LLM-Powered Extraction
The repository also includes an advanced LLM-powered extraction method that uses structured schemas and AI to extract information more accurately:
// Using the extract tool for LLM-powered extraction
const strainSchema = {
type: 'object',
properties: {
"strain_name": { type: 'string' },
"aliases": { type: 'string' },
"strain_classification": { type: 'string' },
"thc_percentage": { type: 'number' },
"cbd_percentage": { type: 'number' },
"cbg_percentage": { type: 'number' },
"terpenes": {
type: 'object',
properties: {
"myrcene": { type: 'string' },
"caryophyllene": { type: 'string' }
// Other terpenes...
}
},
// Other properties...
}
};
// Extract data using LLM
const extractedData = await client.extract([strainUrl], {
schema: strainSchema,
systemPrompt: "Extract precise cannabis strain data. Use exact numbers when available.",
prompt: `Extract all available data for the cannabis strain "${strain}" according to the schema.`
});
Benefits of LLM extraction:
- Better handling of unstructured text and variations in formatting
- More resilient to website changes
- Can infer missing values based on context
- Extracts relationships between data points
Data Structure
The scraper collects the following categories of data for each strain:
- Basic Information: Strain name
- Terpenes: myrcene, pinene, caryophyllene, limonene, linalool, terpinolene, ocimene, humulene, other
- Cannabinoids: THC, CBD, CBG, CBN, other
- Medical Effects: Stress, Anxiety, Depression, Pain, Insomnia, Lack of Appetite, Nausea, other
- User Effects: Happy, Euphoric, Creative, Relaxed, Uplifted, Energetic, Focused, Sleepy, Hungry, Talkative, Tingly, Giggly, DryMouth, DryEyes, Dizzy, Paranoid, Anxious, other
- Onset and Duration: onset_minutes, duration_hours
- Interactions: Sedatives, Anti-anxiety (benzodiazepines), Antidepressants (SSRIs), Opioid analgesics, Anticonvulsants, Anticoagulants, other
- Flavors: Berry, Sweet, Earthy, Pungent, Pine, Vanilla, Minty, Skunky, Citrus, Spicy, Herbal, Diesel, Tropical, Fruity, Grape, other
Usage
As a Firecrawl MCP Tool
Once integrated with the Firecrawl MCP server, the tool can be called with the following parameters:
{
"name": "firecrawl_leafly_strain",
"arguments": {
"strains": ["Blue Dream", "OG Kush", "Sour Diesel"],
"exportFormat": "csv" // or "json"
}
}
Using the Extract Tool Directly
For more advanced extraction with the LLM-powered approach:
{
"name": "firecrawl_extract",
"arguments": {
"urls": ["https://www.leafly.com/strains/blue-dream"],
"schema": {
"type": "object",
"properties": {
"strain_name": { "type": "string" },
"thc_percentage": { "type": "number" },
"cbd_percentage": { "type": "number" },
"effects": { "type": "string" },
"flavors": { "type": "string" },
"medical": { "type": "string" },
"terpenes": { "type": "object" }
}
},
"prompt": "Extract comprehensive cannabis strain data from this Leafly page."
}
}
Using the CLI Script
You can also use the included CLI script to run the scraper directly:
# Set the Firecrawl API key (required)
export FIRECRAWL_API_KEY=your_api_key_here
# Using npm scripts
npm run scrape-leafly -- output.csv "Blue Dream,OG Kush,Sour Diesel"
# Or running directly
node dist/leafly-scraper-cli.js output.csv "Blue Dream,OG Kush,Sour Diesel"
Methodology
The scraper follows a rigorous methodology for extracting and normalizing data:
- Lab Data Priority: Lab-tested cannabinoid and terpene data is prioritized when available
- Consistent Normalization: When exact values aren't available, standardized normalization is applied:
- For terpenes: dominant = 0.008, second = 0.005, third = 0.003, others = 0.001
- For effects and flavors: Values are normalized to a 0.0-1.0 scale
- Default Values: Standard defaults are applied for commonly missing fields
Troubleshooting
TypeScript Errors
If you encounter TypeScript compilation errors:
- Ensure you have all dependencies installed:
npm install
- Make sure TypeScript is installed:
npm install -g typescript
- TypeScript module errors can typically be fixed by installing the @types packages:
npm install --save-dev @types/node
2. Python Codebase MCP Server
This MCP server provides code analysis capabilities and file system access for codebase navigation.
Installation
- Ensure you have Python 3.7+ installed
- Install dependencies:
pip install mcp-python-sdk watchdog
Features
- File system navigation and file reading
- Code search functionality
- Project structure analysis
- Real-time file change monitoring
- Function and component discovery
- Dependency analysis
Usage
Start the server:
python mcp_server.py
The server provides tools for code analysis:
- search_function: Find function definitions in code files
- search_code: Search for text across all code files
- get_project_structure: Generate a tree-like structure of the project
- analyze_dependencies: Analyze project dependencies
- find_components: Discover React/React Native components
Resources
- /file/list/{directory}: List files in a directory
- /file/read/{filepath}: Read file contents
- /file/info/{filepath}: Get file metadata
- /file/changes/{directory}: Get recently modified files
3. DeepSeek R1 extended MCP Server
This MCP server provides access to DeepSeek AI models for text generation, summarization, and document processing.
Installation
- Ensure you have Node.js 14+ installed
- Install dependencies:
npm install @modelcontextprotocol/sdk openai dotenv
- Set up environment variables:
# Create a .env file echo "DEEPSEEK_API_KEY=your_api_key_here" > .env
Features
- Text generation using DeepSeek R1 model
- Text summarization
- Streaming text generation
- Multi-model support
- Document processing (summarize, extract entities, analyze sentiment)
- File operations for saving outputs
Usage
Start the server:
node deepseek_mcp.js
The server provides the following tools:
- deepseek_r1: Generate text using DeepSeek R1 model
- deepseek_summarize: Summarize text
- deepseek_stream: Stream text generation
- deepseek_multi: Generate text using different DeepSeek models
- deepseek_document: Process documents (summarize, extract entities, analyze sentiment)
Resources
- /model/info: Get information about supported models
- /server/status: Check server status
- /file/save/{filename}: Save content to a file
- /file/list: List saved files
- /file/read/{filename}: Read saved file contents
4. JSON Manager MCP Server
This MCP server provides advanced JSON querying and manipulation capabilities.
Installation
- Ensure you have Node.js 14+ installed
- Install dependencies:
npm install @modelcontextprotocol/sdk node-fetch jsonpath
Features
- Query JSON data using JSONPath
- Advanced filtering
- String operations
- Numeric operations
- Date operations
- Array transformations
- Complex data comparisons
- Result caching
- Save and manage query results
Usage
Start the server:
node json_mcp.js
The server provides the following tools:
- query: Query JSON data using JSONPath expressions
- filter: Filter JSON data based on conditions
- save_query: Save query results to a file
- compare_json: Compare two JSON datasets
Resources
- /saved_queries/list: List saved queries
- /saved_queries/get/{filename}: Retrieve a saved query
- /cache/status: Check cache status
- /cache/clear: Clear the cache
License
MIT License