Introducing the ORION-DBs MCP Server
Navigating Open Research Information Resources on BigQuery with LLMs
This post presents orion-mcp, an MCP server that lets LLMs like Claude explore schemas, draft SQL, estimate costs, and run queries. It offers a practical entry point for users less familiar with open research information resources and BigQuery.
About
Working with ORION-DBs can be tricky: providers use different schemas and pre-processing routines, which makes cross-database comparisons hard and gives query writing against complex schemas a steep learning curve. To give a sense of scale: ORION-DBs currently spans six BigQuery projects from different providers, 52 datasets with various tables, and over 14 000 GB of data.
orion-mcp is a Model Context Protocol server that addresses this by connecting ORION-DBs to AI apps like Claude Desktop. Once installed, you can ask questions about the available data sources or have the LLM draft and run SQL queries against them.
The tool builds on the comprehensive schema documentation that ORION-DBs providers maintain, making that work directly useful when chatting with an LLM. It is at an early stage, so feedback is welcome!
What it does
Explore schemas
These tools work without a BigQuery account. They use schemas that are pre-fetched from the ORION-DBs website when orion-mcp starts:
orion_list_datasets: list all available ORION-DBs datasets
orion_list_tables: list tables in a specific dataset
orion_get_db_schema: inspect the full schema of a table
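To make the tool names above more concrete, here is a minimal sketch of the JSON-RPC message an MCP client sends when it invokes a tool. The `tools/call` method and the `name`/`arguments` fields come from the Model Context Protocol; the helper `tool_call_request` and the argument name `dataset` are assumptions for illustration, since the post does not document the parameters orion-mcp expects.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC requests need an id so responses can be matched to requests.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: Optional[dict] = None) -> str:
    """Build an MCP `tools/call` request as a JSON string (hypothetical helper)."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # "dataset" is an assumed argument name, not taken from the orion-mcp docs.
    print(tool_call_request("orion_list_datasets"))
    print(tool_call_request("orion_list_tables", {"dataset": "openalex"}))
```

In practice Claude Desktop builds and sends these messages for you; the sketch only shows what a call to one of the schema tools might look like on the wire.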
Query BigQuery
Once you know what you want to query, the LLM writes and executes SQL. To avoid surprise costs, a dry-run cost estimate is always shown before any query runs. SELECT * queries are blocked to prevent unnecessary large scans.
orion_estimate_query_cost: estimate bytes scanned and cost before running
orion_run_bq_query: execute the confirmed query
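The two safeguards described above, blocking SELECT * and showing a cost estimate first, can be sketched in a few lines. This is not the orion-mcp implementation (which is written in R), only an illustration under stated assumptions: the function names are hypothetical, the SELECT * check is a simple pattern match, and the price of 6.25 USD per TiB is an assumed on-demand rate that should be checked against current BigQuery pricing.

```python
import re

BYTES_PER_TIB = 2**40
# Assumed on-demand price per TiB scanned; verify against current BigQuery pricing.
ASSUMED_USD_PER_TIB = 6.25

# Matches `SELECT *`, `SELECT DISTINCT *`, and `SELECT alias.*`, but not `COUNT(*)`.
_SELECT_STAR = re.compile(r"\bselect\s+(?:distinct\s+)?(?:\w+\.)?\*", re.IGNORECASE)


def uses_select_star(sql: str) -> bool:
    """Heuristic: True if a star directly follows a SELECT keyword."""
    return _SELECT_STAR.search(sql) is not None


def estimate_cost_usd(bytes_processed: int,
                      usd_per_tib: float = ASSUMED_USD_PER_TIB) -> float:
    """Convert the bytes reported by a dry run into an approximate cost."""
    return bytes_processed / BYTES_PER_TIB * usd_per_tib


def guard_query(sql: str, bytes_processed: int) -> str:
    """Reject SELECT * and return a human-readable estimate otherwise."""
    if uses_select_star(sql):
        raise ValueError("SELECT * is blocked; name the columns you need.")
    cost = estimate_cost_usd(bytes_processed)
    return f"Would scan {bytes_processed / 1e9:.2f} GB, about ${cost:.4f}"


if __name__ == "__main__":
    # bytes_processed would normally come from a BigQuery dry run.
    print(guard_query("SELECT id FROM works", 5_000_000_000))
```

The byte count is passed in rather than fetched, so the sketch runs without credentials. With the Python BigQuery client, a dry run (`QueryJobConfig(dry_run=True)`) reports this number as `total_bytes_processed` without executing the query.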
Use case
This screencast demonstrates a typical session.
First, I ask whether OpenAlex is available and which version is the most recent. Then, I ask Claude to compare the version provided by MultiObs with the version provided by SUB Göttingen. Having gained this overview, I ask Claude to retrieve the number of diamond open access articles from first authors from Germany between 2021 and 2025. Throughout, Claude provides me with the estimated query costs and presents the SQL for the queries.
You may wish to tell the LLM explicitly how to present the results, for example as a plain table. Often, a dynamic chart is unnecessary.
Installation
Full instructions are in the GitHub repo README.
In summary, the server runs in a Docker container connected to Claude Desktop via its MCP config file. Authentication uses Google’s Application Default Credentials, so your local gcloud credentials are used directly and no service account keys are needed. BigQuery’s free tier covers the first 1 TB of query data processed per month.
Requirements: Docker and the Google Cloud CLI (gcloud). The server is implemented in R using {mcptools} and {ellmer}.
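As a rough illustration of the setup summarized above, the following sketch generates a Claude Desktop MCP config entry that starts a server through Docker and mounts the local Application Default Credentials read-only. The `mcpServers` layout is the one Claude Desktop reads from `claude_desktop_config.json`. The image name, the server label, and the in-container credentials path are assumptions, so the README remains the authoritative source for the actual values.

```python
import json
from pathlib import Path


def claude_desktop_config(image: str, adc_path: Path) -> dict:
    """Build a config entry that runs an MCP server in Docker (illustrative only)."""
    container_adc = "/gcp/adc.json"  # assumed location inside the container
    return {
        "mcpServers": {
            "orion-mcp": {  # label shown in Claude Desktop; assumed here
                "command": "docker",
                "args": [
                    "run", "-i", "--rm",
                    # Mount the local gcloud credentials read-only.
                    "-v", f"{adc_path}:{container_adc}:ro",
                    # Point Google client libraries at the mounted file.
                    "-e", f"GOOGLE_APPLICATION_CREDENTIALS={container_adc}",
                    image,
                ],
            }
        }
    }


if __name__ == "__main__":
    # Default ADC location on Linux and macOS after
    # `gcloud auth application-default login`.
    adc = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
    # "example/orion-mcp:latest" is a placeholder image name, not the real one.
    print(json.dumps(claude_desktop_config("example/orion-mcp:latest", adc), indent=2))
```

The `-i` flag keeps standard input open, which matters because Claude Desktop talks to locally launched MCP servers over stdin and stdout.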
Responsible use
LLMs make mistakes. Always verify that queries return the results you intended before using them in any analysis. If you plan to use this in a publication, check the outlet’s policy on AI-assisted work and document your process accordingly. Please acknowledge resources used.
Citation
@online{jahn2026,
author = {Jahn, Najko},
title = {Introducing the {ORION-DBs} {MCP} {Server}},
date = {2026-04-07},
url = {https://orion-dbs.community/blog/posts/orion-mcp-welcome/},
langid = {en},
abstract = {ORION-DBs provides access to multiple open research
information resources, but their heterogeneous schemas make
exploration and querying difficult. This post presents `orion-mcp`,
an MCP server that lets LLMs like Claude explore schemas, draft SQL,
estimate costs, and run queries, providing a practical entry point
for users less familiar with open research information resources and
BigQuery.}
}