
Web Search MCP Server

by joao-santillo

This MCP server provides tools for web search and vector database functionality using LangChain and ChromaDB. It allows you to search documentation and web pages, and store/retrieve documents with vector embeddings.


What is Web Search MCP Server?

This server is a Web Search MCP (Model Context Protocol) server that combines web search capabilities with a ChromaDB vector database. It enables users to search the documentation of popular libraries, extract content from web pages, and store and retrieve documents with vector embeddings for semantic similarity search.

How to use Web Search MCP Server?

To use this server, first install the dependencies with pip install -e . (or uv pip install -e . if you use uv). Then create a .env file with the required API keys and ChromaDB configuration. Finally, run the server with python main.py. The available tools cover web search and vector database operations; see the 'Available Tools' section of the README for details.
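As a quick reference, the setup steps above look roughly like the following. This is a sketch of the workflow described in the README, not a verbatim script; the exact .env variable names depend on the project's configuration.

```shell
# Install dependencies (pick one, depending on your tooling):
pip install -e .
# or:
uv pip install -e .

# Create a .env file with your API keys and ChromaDB settings,
# then start the server:
python main.py
```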

Key features of Web Search MCP Server

  • Web search for documentation of LangChain, LlamaIndex, and OpenAI

  • Extraction of content from web pages

  • Storage and retrieval of documents with vector embeddings using ChromaDB

  • Semantic similarity search

  • Filtering documents based on metadata

  • Batch operations for efficient document processing
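To illustrate how the store/retrieve features above fit together, here is a minimal in-memory sketch of a vector store with cosine-similarity search and metadata filtering. It uses a toy hash-based embedding purely for self-containment; the real server delegates embedding to a model such as all-MiniLM-L6-v2 and storage to ChromaDB, and the function names here (add_document, query) are illustrative, not the server's actual tool names.

```python
import math

def embed(text, dim=8):
    # Toy deterministic-within-a-run "embedding": hash character trigrams
    # into a fixed-size vector, then L2-normalize. A real deployment would
    # use a sentence-transformers model instead.
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Vectors are already normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

store = []

def add_document(text, metadata):
    store.append({"text": text, "embedding": embed(text), "metadata": metadata})

def query(text, where=None, top_k=2):
    # Filter on metadata first (as ChromaDB's `where` clause does),
    # then rank the survivors by similarity to the query embedding.
    q = embed(text)
    candidates = [
        d for d in store
        if where is None
        or all(d["metadata"].get(k) == v for k, v in where.items())
    ]
    return sorted(candidates, key=lambda d: cosine(q, d["embedding"]),
                  reverse=True)[:top_k]

add_document("LangChain chains compose LLM calls", {"source": "langchain"})
add_document("ChromaDB stores vector embeddings", {"source": "chromadb"})
add_document("OpenAI provides language model APIs", {"source": "openai"})

hits = query("vector embeddings database", where={"source": "chromadb"})
print(hits[0]["metadata"]["source"])  # only chromadb-tagged docs pass the filter
```

The metadata filter narrows the candidate set before similarity ranking, which is the same two-stage pattern the server exposes for filtered semantic search.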

Use cases of Web Search MCP Server

  • Searching documentation for specific information about libraries

  • Building a knowledge base from web content

  • Implementing a semantic search engine

  • Creating a chatbot that can answer questions based on a vector database

FAQ from Web Search MCP Server

What is ChromaDB?

ChromaDB is a vector database that allows you to store and retrieve documents with vector embeddings.

What is LangChain?

LangChain is a framework for developing applications powered by language models.

How do I set up the server?

Install dependencies, create a .env file with API keys and configurations, and run the main.py file.

What embedding model is used?

The default embedding model is sentence-transformers/all-MiniLM-L6-v2, but this can be configured in the .env file.
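A .env override for the embedding model might look like the following. The variable name is an assumption for illustration; check the project's README or .env.example for the exact key.

```shell
# .env — hypothetical variable name, verify against the project's docs
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
```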

Can I use this server for commercial purposes?

Please refer to the licenses of the underlying libraries (LangChain, ChromaDB, etc.) and the Serper API for their respective terms of use.