Open Scholarly Data @ SUB Göttingen

The Scholarly Communication Analytics team at the State and University Library in Göttingen maintains a publicly accessible Open Scholarly Data Warehouse, which is based on Google BigQuery.

The warehouse features monthly Crossref snapshots, as well as data from various other sources, including OpenAlex, Semantic Scholar and Unpaywall, and provides access to bibliometric data from the German Competence Network for Bibliometrics.

Google BigQuery is provided as part of the OCRE 2024 Framework, with support from the GWDG.
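
Because the warehouse is hosted on Google BigQuery, its tables are addressed as `project.dataset.table`. The following is a minimal sketch of how such a reference and a simple row-count query could be built in Python. The dataset names are taken from this page; the project ID `example-project` and the table name `snapshot` are placeholders assumed for illustration and are not confirmed by this page. The `run_query` helper additionally assumes that the `google-cloud-bigquery` package is installed and that Google Cloud credentials are configured.

```python
# Dataset names as listed on this page.
DATASETS = [
    "cr_history", "cr_instant", "hoaddata", "oa2020", "openalex",
    "openalex_walden", "openbib", "resources", "semantic_scholar",
    "upw_history", "upw_instant",
]


def table_ref(project: str, dataset: str, table: str) -> str:
    """Return a backtick-quoted BigQuery reference `project.dataset.table`."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    return f"`{project}.{dataset}.{table}`"


def count_rows_sql(project: str, dataset: str, table: str) -> str:
    """Build a query that counts the rows of one table."""
    return f"SELECT COUNT(*) AS n FROM {table_ref(project, dataset, table)}"


def run_query(sql: str, billing_project: str) -> list:
    """Run the query and return the rows as dictionaries.

    Assumption: google-cloud-bigquery is installed and credentials are set up.
    The import is local so the helpers above work without the package.
    """
    from google.cloud import bigquery

    client = bigquery.Client(project=billing_project)
    return [dict(row.items()) for row in client.query(sql).result()]


# "example-project" and "snapshot" are hypothetical placeholders.
sql = count_rows_sql("example-project", "cr_instant", "snapshot")
print(sql)
```

To run the query, pass the generated string to `run_query` together with the ID of a Google Cloud project that is billed for the query, after replacing the placeholders with the actual project and table names.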

More info: https://subugoe.github.io/scholcomm_analytics/

Contact: Najko Jahn, Nick Haupka

cr_history

Description
Historical Crossref snapshots. Only publications of type ‘journal-article’ are included.
Created: Oct 29, 2021 07:20 | Location: US

cr_instant

Description
This dataset contains the most recent Crossref Snapshot.
Created: Oct 29, 2021 07:37 | Location: US

hoaddata

Description
Datasets used to compile hoaddata, an R package containing data about hybrid open access publishing: https://subugoe.github.io/hoaddata/
Created: May 12, 2023 11:06 | Location: US

oa2020

Description

Estimating global publishing output by leading commercial publishers using open metadata.

Work carried out for the OA2020 working group on financial flows and future cost scenarios: https://oa2020.org/working-groups/
Created: Dec 11, 2025 10:48 | Location: US

openalex

Description
This dataset contains the most recent OpenAlex Snapshot (before Walden).
Created: Jan 10, 2022 14:46 | Location: US

openalex_walden

Description
This dataset contains the most recent OpenAlex-Walden Snapshot.
Created: Dec 03, 2025 10:07 | Location: US

openbib

Description

This dataset contains the most recent OPENBIB snapshot.

For more information, see: https://zenodo.org/records/18429476
Created: Mar 28, 2025 14:22 | Location: US

resources

Created: Nov 02, 2021 07:53 | Location: US

semantic_scholar

Description
This dataset contains a snapshot from Semantic Scholar.
Created: Jun 10, 2024 13:46 | Location: US

upw_history

Description
Historical Unpaywall snapshots. Only records from 2008 onwards are included.
Created: Oct 29, 2021 07:51 | Location: US

upw_instant

Description
This dataset contains the most recent Unpaywall Snapshot. Only records from 2008 onwards are included.
Created: Oct 29, 2021 07:52 | Location: US
  • The content on this website is licensed under CC0.