Alternatives for what?

Public infrastructure alternatives to Google Big Query - an overview of features

news
google big query
alternatives
public infrastructure
Authors

Bianca Kramer

Cameron Neylon

Published

July 6, 2026

Abstract
There are emerging alternatives to Google Big Query, both in the cloud and for local computing, as well as organisations that are exploring these alternatives or are interested in doing so. With ORION-DBs, we hope to spur on these developments and their application for opening up use of scholarly data sets. As a first contribution, in this blog post we discuss some features, or lack thereof, of Google Big Query infrastructure, what they mean for how data are made available, and how public infrastructure could strive to emulate or improve on these features.

There are pragmatic reasons why, as both data providers and users, we independently arrived at Google Big Query as a useful tool for sharing open data sets. Google solves a bunch of the hard problems, including authentication without the need for institutional affiliation, systems provisioning and a highly performant database system.

At the same time, Google Big Query is certainly not open scholarly infrastructure, nor is it fully equitable and accessible in all parts of the world, and Google is not an organisation many of us feel able to trust. Thus, reliance on Google is not a desirable long-term solution.

There are emerging alternatives, both in the cloud and for local computing, as well as organisations that are exploring these alternatives or are interested in doing so. With ORION-DBs, we hope to spur on these developments and their application for opening up use of scholarly data sets.

As a first contribution, in this blog post we discuss some features, or lack thereof, of Google Big Query infrastructure, what they mean for how data are made available, and how public infrastructure could strive to emulate or improve on these features. Considering these features separately can help to ensure that conscious and justified choices are made about which to prioritise when thinking about local server or cloud deployment.

All this is not to uncritically defend our current choice of Google Big Query, but to explain what we, at least, see as some of its distinct characteristics that are relevant to both data providers and users. We recommend considering all elements in this list (and potentially others) when scoping the development of public infrastructure for sharing and facilitating the use of public scholarly metadata sources.