CWTS Leiden Datasets
The data science team at the Centre for Science and Technology Studies (CWTS) at Leiden University provides the CWTS Leiden Ranking Open Edition, time-specific versions of OpenAlex, and other resources.
GitHub Repo: https://github.com/CWTSLeiden/CWTS-OpenAlex-databases
Contact: Nees Jan van Eck
leiden_ranking_open_edition_2023
This dataset contains the data used to create the Open Edition of the CWTS Leiden Ranking 2023. The dataset includes (1) data about the universities included in the Leiden Ranking Open Edition 2023 and the links between these universities and their affiliated organizations, (2) data about the publications included in the Leiden Ranking Open Edition 2023 and the links between these publications and universities and main fields, (3) indicators at the level of publications, and (4) indicators at the level of universities and main fields.
The Leiden Ranking Open Edition 2023 is based on the OpenAlex snapshot released on November 21, 2023. The snapshot data is not included in this dataset.
The source code for creating this dataset is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-Leiden-Ranking-Open-Edition
See the following blog post for more information about the Leiden Ranking Open Edition 2023: https://doi.org/10.59350/89wpz-hpz32
leiden_ranking_open_edition_2024
This dataset contains the data used to create the Open Edition of the CWTS Leiden Ranking 2024. The dataset includes (1) data about the universities included in the Leiden Ranking Open Edition 2024 and the links between these universities and their affiliated organizations, (2) data about the publications included in the Leiden Ranking Open Edition 2024 and the links between these publications and universities and main fields, (3) indicators at the level of publications, and (4) indicators at the level of universities and main fields.
The Leiden Ranking Open Edition 2024 is based on the OpenAlex snapshot released on August 30, 2024. The snapshot data is not included in this dataset.
The source code for creating this dataset is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-Leiden-Ranking-Open-Edition
See the following blog post for more information about the Leiden Ranking Open Edition 2024: https://doi.org/10.59350/r512t-r8h93
leiden_ranking_open_edition_2025
This dataset contains the data used to create the CWTS Leiden Ranking Open Edition 2025. The dataset includes (1) data about the universities included in the Leiden Ranking Open Edition 2025 and the links between these universities and their affiliated organizations, (2) data about the publications included in the Leiden Ranking Open Edition 2025 and the links between these publications and universities and main fields, (3) indicators at the level of publications, and (4) indicators at the level of universities and main fields.
The Leiden Ranking Open Edition 2025 is based on the OpenAlex snapshot from August 2025. The snapshot data is not included in this dataset.
The source code for creating this dataset is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-Leiden-Ranking-Open-Edition
See the following blog post for more information about the Leiden Ranking Open Edition 2025: https://doi.org/10.59350/jvjy2-4ww95
openalex_2023nov
This dataset contains OpenAlex data in a relational format. It is organized into multiple interrelated tables representing key OpenAlex entities such as works, authors, institutions, and sources, along with their relationships.
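To illustrate what a relational layout of this kind looks like in practice, the following minimal sketch builds a toy in-memory SQLite database and joins entity tables through a link table. The table and column names (work, author, work_author, and so on) are hypothetical and chosen only for illustration; they are not the actual names used in this dataset, which should be looked up in the GitHub repository.

```python
import sqlite3

# Hypothetical, heavily simplified schema. The table and column names are
# illustrative assumptions, not the actual names used in the dataset.
SCHEMA = """
CREATE TABLE work (work_id INTEGER PRIMARY KEY, title TEXT, pub_year INTEGER);
CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE work_author (
    work_id INTEGER REFERENCES work(work_id),
    author_id INTEGER REFERENCES author(author_id),
    author_seq INTEGER,
    PRIMARY KEY (work_id, author_id)
);
"""


def build_toy_database() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO work VALUES (?, ?, ?)",
        [(1, "Toy paper A", 2020), (2, "Toy paper B", 2021)],
    )
    conn.executemany(
        "INSERT INTO author VALUES (?, ?)",
        [(10, "Author X"), (11, "Author Y")],
    )
    # The link table records which author contributed to which work.
    conn.executemany(
        "INSERT INTO work_author VALUES (?, ?, ?)",
        [(1, 10, 1), (1, 11, 2), (2, 10, 1)],
    )
    conn.commit()
    return conn


def works_per_author(conn: sqlite3.Connection) -> list:
    """Count works per author by joining the entity and link tables."""
    query = """
        SELECT a.name, COUNT(*) AS n_works
        FROM author AS a
        JOIN work_author AS wa ON wa.author_id = a.author_id
        GROUP BY a.author_id, a.name
        ORDER BY a.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, n_works in works_per_author(build_toy_database()):
        print(name, n_works)
```

The same join pattern would apply to other entity pairs, such as works and institutions or works and sources, once the real table names are substituted.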
The dataset is based on the OpenAlex snapshot released on November 21, 2023.
The source code used to construct this dataset is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-OpenAlex-databases
openalex_2023nov_classification
This dataset contains an algorithmic classification of research publications based on data from OpenAlex. The classification is based on the OpenAlex snapshot released on November 21, 2023.
See the following Zenodo record for more information about the classification: https://doi.org/10.5281/zenodo.10560276
openalex_2023nov_core
This dataset contains data on core sources and core publications identified in the OpenAlex database (based on the OpenAlex snapshot released on November 21, 2023).
The source code used to identify core sources and core publications in OpenAlex is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-OpenAlex-databases/tree/2023nov
See the following report for more information about the identification of core sources and core publications in OpenAlex: https://doi.org/10.5281/zenodo.10949622
openalex_2024aug
This dataset contains OpenAlex data in a relational format. It is organized into multiple interrelated tables representing key OpenAlex entities such as works, authors, institutions, and sources, along with their relationships.
The dataset is based on the OpenAlex snapshot released on August 30, 2024.
The source code used to construct this dataset is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-OpenAlex-databases
openalex_2024aug_core
This dataset contains data on core sources and core publications identified in the OpenAlex database (based on the OpenAlex snapshot released on August 30, 2024).
The source code used to identify core sources and core publications in OpenAlex is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-OpenAlex-databases/tree/2024aug
See the following report for more information about the identification of core sources and core publications in OpenAlex: https://doi.org/10.5281/zenodo.13879947
openalex_2025aug
This dataset contains OpenAlex data in a relational format. It is organized into multiple interrelated tables representing key OpenAlex entities such as works, authors, institutions, and sources, along with their relationships.
The dataset is based on the OpenAlex snapshot from August 2025.
The source code used to construct this dataset is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-OpenAlex-databases
openalex_2025aug_core
This dataset contains data on core sources and core publications identified in the OpenAlex database (based on the OpenAlex snapshot from August 2025).
The source code used to identify core sources and core publications in OpenAlex is available in the following GitHub repository: https://github.com/CWTSLeiden/CWTS-OpenAlex-databases/tree/2025aug
See the following report for more information about the identification of core sources and core publications in OpenAlex: https://doi.org/10.5281/zenodo.17200830