OpenAlex MCP Server
by DavidEncrypted
This project provides a Model Context Protocol (MCP) server that allows AI agents and other MCP clients to interact with the OpenAlex database, specifically focusing on scholarly works. It uses the pyalex Python library to communicate with the OpenAlex API and the MCP Python SDK (fastmcp) to expose functionality as MCP tools.
Examples
For inspiration, here are some example queries you can ask an AI agent that has this MCP server enabled:
- Find the 5 most cited papers that cite the foundational paper introducing the convolutional neural network
- I am looking for papers related to self-supervised visual odometry. Find papers on this topic, then read the title and abstract of each paper and give it a relevance score from 1 to 10.
- Find the most influential papers on transformer models published since 2020 and summarize how the architecture has evolved
- Find papers co-authored by researchers at MIT and Stanford
My favorite AI agent for this is Cline: https://github.com/cline/cline
Features
This server exposes the following MCP tools for interacting with OpenAlex works:
- search_works
  - Description: Searches for OpenAlex works based on keywords and filters. Supports boolean operators in the search query. Returns selected fields and uses cursor pagination.
  - Parameters:
    - search_query (string, required): Search term(s).
    - filters (object, optional): Key-value pairs for filtering (e.g., {"publication_year": 2023, "is_oa": true}). See the OpenAlex filter documentation for possible keys.
    - search_field (string, optional, default: "default"): Field to search within. One of 'title', 'abstract', 'fulltext', 'title_and_abstract', or 'default' (searches title, abstract, and fulltext).
    - select_fields (array of strings, optional): List of root-level fields to return (e.g., ["id", "doi", "title", "abstract"]). Includes the plaintext abstract if requested.
    - sort (object, optional): Field to sort by and direction (e.g., {"cited_by_count": "desc"}).
    - per_page (integer, optional, default: 25): Results per page (max 200).
    - cursor (string, optional): Pagination cursor.
  - Returns: Object with results (list of work objects) and meta (pagination info including next_cursor).
- get_work_details
  - Description: Retrieves detailed information for a specific OpenAlex work by its ID (OpenAlex ID URL, DOI URL, PMID URL, MAG ID).
  - Parameters:
    - work_id (string, required): Identifier for the work.
    - select_fields (array of strings, optional): List of root-level fields to return. Includes the plaintext abstract if requested.
  - Returns: Object representing the work with selected fields, or an error object.
- get_referenced_works
  - Description: Retrieves the list of OpenAlex IDs cited by a specific OpenAlex work (outgoing citations). Note: This tool currently returns only the list of IDs. Use get_work_details for more info on each reference.
  - Parameters:
    - work_id (string, required): OpenAlex ID of the citing work.
  - Returns: Object with referenced_work_ids (list of strings), or an error object.
- get_citing_works
  - Description: Retrieves the list of works that cite a specific OpenAlex work (incoming citations). Uses cursor pagination.
  - Parameters:
    - work_id (string, required): OpenAlex ID of the cited work.
    - select_fields (array of strings, optional): List of root-level fields for each citing work.
    - per_page (integer, optional, default: 25): Results per page (max 200).
    - cursor (string, optional): Pagination cursor.
  - Returns: Object with results (list of citing work objects) and meta (pagination info), or an error object.
- get_work_ngrams
  - Description: Retrieves the N-grams (word proximity information) for a specific OpenAlex work's full text, if available.
  - Parameters:
    - work_id (string, required): OpenAlex ID of the work.
  - Returns: Object representing the N-grams, or an error object (e.g., if N-grams are not found).
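Both search_works and get_citing_works page through results with a cursor. The loop below is a minimal sketch of how a client might follow next_cursor until the results run out. The collect_all_pages helper, the call_tool callable, and the in-memory fake_tool are hypothetical stand-ins for whatever your MCP client provides; only the names results, meta, next_cursor, per_page, and cursor come from the tool descriptions above. It also assumes that next_cursor is empty or missing on the last page, which this document does not confirm.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical signature: (tool_name, arguments) -> tool response as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def collect_all_pages(call_tool: ToolCaller, tool_name: str,
                      arguments: Dict[str, Any],
                      max_pages: int = 10) -> Iterator[Dict[str, Any]]:
    """Yield work objects page by page, following meta.next_cursor."""
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard cap so a client bug cannot loop forever
        page_args = dict(arguments)
        if cursor is not None:
            page_args["cursor"] = cursor  # omitted on the first call
        response = call_tool(tool_name, page_args)
        yield from response.get("results", [])
        cursor = (response.get("meta") or {}).get("next_cursor")
        if not cursor:  # assumption: no cursor means no further pages
            break


def fake_tool(tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in that serves three works in pages of per_page."""
    works = [{"id": "W1"}, {"id": "W2"}, {"id": "W3"}]
    start = int(arguments.get("cursor") or 0)
    end = start + arguments.get("per_page", 25)
    next_cursor = str(end) if end < len(works) else None
    return {"results": works[start:end], "meta": {"next_cursor": next_cursor}}


works = list(collect_all_pages(fake_tool, "get_citing_works",
                               {"work_id": "W2741809807", "per_page": 2}))
print([w["id"] for w in works])
```

The max_pages cap is a design choice rather than a server requirement: highly cited works can have many thousands of citing works, so an agent usually wants a bounded number of pages.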
Note
OpenAlex generally does not store full text for copyright reasons. As a result, OpenAlex search does not cover the full text, only the title and abstract.
Setup and Installation
Installation
- Clone the repository (optional):
    git clone <repository-url>
    cd openalex-mcp-server
- Install dependencies:
    uv venv
    source .venv/bin/activate
    uv pip install .
  This installs the server package along with its dependencies (mcp[cli], pyalex).
- Run server:
    uv --directory /YOUR/INSTALL/DIRECTORY/openalex-mcp-server run server.py
Configuration
- OpenAlex Polite Pool: To use the faster, more reliable OpenAlex polite pool, set the OPENALEX_EMAIL environment variable to your email address before running the server or when configuring it in your MCP client:
    export OPENALEX_EMAIL="your.email@example.com"
  If this variable is not set, the server will use the anonymous pool, which has stricter rate limits.
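The fallback described above can be sketched in a few lines of Python. This is an illustration only, not the server's actual code: the polite_pool_email helper is hypothetical, and the comment about pyalex.config.email reflects how pyalex is commonly configured rather than anything shown in this document.

```python
import os
from typing import Mapping, Optional


def polite_pool_email(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured email, or None to fall back to the anonymous pool."""
    email = env.get("OPENALEX_EMAIL", "").strip()
    return email or None


email = polite_pool_email()
if email:
    # Assumption: with pyalex installed, this is where a server would set
    # pyalex.config.email = email before making any requests.
    print("OPENALEX_EMAIL is set; using the polite pool")
else:
    print("OPENALEX_EMAIL is not set; using the anonymous pool")
```

Treating a blank value the same as a missing one avoids sending an empty email to the API when the variable is exported but left unset in a client configuration.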
MCP Integration
To use this server with an MCP client (like the Claude VS Code Extension or Claude Desktop), you need to add its configuration to the client's settings file.
Example Configuration (cline_mcp_settings.json or similar):
{
  "mcpServers": {
    "... other servers ...": {},
    "openalex": {
      "autoApprove": [],
      "disabled": false,
      "timeout": 60,
      "command": "uv",
      "args": [
        "--directory", "/YOUR/INSTALL/DIRECTORY/openalex-mcp-server", "run", "server.py"
      ],
      "env": {
        "OPENALEX_EMAIL": "your.email@example.com"
      },
      "transportType": "stdio"
    }
  }
}
- Replace "your.email@example.com" with your actual email address.
- Replace /YOUR/INSTALL/DIRECTORY/openalex-mcp-server with the path where you installed the server.
- Restart your MCP client (e.g., reload the VS Code window) after adding the configuration.
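If you prefer to generate the entry rather than edit the settings file by hand, the following is a minimal sketch that builds the "openalex" entry and merges it into an existing settings dictionary. The function names openalex_server_entry and add_openalex_server, and the "some-other-server" entry, are hypothetical; the keys and values mirror the example configuration above. Reading and writing the actual settings file is left out because its location depends on your MCP client.

```python
import json
from typing import Any, Dict


def openalex_server_entry(install_dir: str, email: str) -> Dict[str, Any]:
    """Build the server entry with the same keys as the example configuration."""
    return {
        "autoApprove": [],
        "disabled": False,
        "timeout": 60,
        "command": "uv",
        "args": ["--directory", install_dir, "run", "server.py"],
        "env": {"OPENALEX_EMAIL": email},
        "transportType": "stdio",
    }


def add_openalex_server(settings: Dict[str, Any], install_dir: str,
                        email: str) -> Dict[str, Any]:
    """Return a copy of settings with the openalex server added or replaced."""
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))  # keep other servers untouched
    servers["openalex"] = openalex_server_entry(install_dir, email)
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing settings with one unrelated server.
existing = {"mcpServers": {"some-other-server": {"command": "node"}}}
updated = add_openalex_server(
    existing,
    "/YOUR/INSTALL/DIRECTORY/openalex-mcp-server",
    "you@example.com",
)
print(json.dumps(updated, indent=2))
```

Serializing with json.dumps guarantees the result is valid JSON, which is easy to get wrong when editing nested braces manually.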
Usage
Once the server is configured and running (either via the MCP client integration or manually for development), you can interact with it using the use_mcp_tool command within your AI agent.
Example Tool Calls:
- Search for papers about "machine learning" published in 2023:
    <use_mcp_tool>
    <server_name>openalex</server_name>
    <tool_name>search_works</tool_name>
    <arguments>
    {
      "search_query": "machine learning",
      "filters": { "publication_year": 2023 },
      "select_fields": ["id", "doi", "title", "publication_year", "cited_by_count"]
    }
    </arguments>
    </use_mcp_tool>
- Get details for a specific work:
    <use_mcp_tool>
    <server_name>openalex</server_name>
    <tool_name>get_work_details</tool_name>
    <arguments>
    {
      "work_id": "W2741809807",
      "select_fields": ["title", "authorships", "abstract", "open_access"]
    }
    </arguments>
    </use_mcp_tool>
- Get references for a work:
    <use_mcp_tool>
    <server_name>openalex</server_name>
    <tool_name>get_referenced_works</tool_name>
    <arguments>
    {
      "work_id": "W2741809807"
    }
    </arguments>
    </use_mcp_tool>
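Each call above wraps a JSON arguments object in use_mcp_tool tags. For testing prompts or writing documentation, a block like that can be rendered from a Python dictionary. The render_tool_call helper below is a hypothetical sketch, not part of this server or of any MCP client. It checks per_page against the documented maximum of 200; the lower bound of 1 is an assumption.

```python
import json
from typing import Any, Dict


def render_tool_call(tool_name: str, arguments: Dict[str, Any],
                     server_name: str = "openalex") -> str:
    """Render a use_mcp_tool block in the same shape as the examples above."""
    per_page = arguments.get("per_page")
    # Documented maximum is 200; the minimum of 1 is an assumption.
    if per_page is not None and not 1 <= per_page <= 200:
        raise ValueError("per_page must be between 1 and 200")
    return "\n".join([
        "<use_mcp_tool>",
        f"<server_name>{server_name}</server_name>",
        f"<tool_name>{tool_name}</tool_name>",
        "<arguments>",
        json.dumps(arguments, indent=2),
        "</arguments>",
        "</use_mcp_tool>",
    ])


# The five most cited 2023 works matching "machine learning".
call = render_tool_call("search_works", {
    "search_query": "machine learning",
    "filters": {"publication_year": 2023},
    "sort": {"cited_by_count": "desc"},
    "per_page": 5,
})
print(call)
```

In normal use the AI agent emits these blocks itself, so a helper like this is mainly useful for checking that an argument object is well-formed before pasting it into a prompt.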
Development
To run the server locally for development and testing:
- Ensure dependencies are installed (see Installation).
- Set the OPENALEX_EMAIL environment variable.
- Run the server:
    mcp dev server.py