The OPENBIB project, maintained by the Kompetenznetzwerk Bibliometrie,
provides access to curated OpenAlex data with a focus on the German research landscape.
Curated data is provided for the following entities:
- Authors 👩‍🎓
- Publishers 📚
- Funding information 📄
- Document types 🗂️
- Address information 🏛️
- Transformative Agreements 📑️
Annual snapshots from the OPENBIB project are openly available to users of the Kompetenznetzwerk Bibliometrie, through the Open Scholarly Data Warehouse of the SUB Göttingen, and on Zenodo.
The current release is based on the August 2024 snapshot of OpenAlex, limited to works with publication years 2014 to 2024.
The following figure compares the assignment of publications to German institutions in OpenAlex and in OPENBIB. While OpenAlex combines rule-based and machine learning algorithms to match address affiliations in documents with institutions, OPENBIB applies a pattern matching approach. The figure only displays institutions that are present in both OpenAlex and OPENBIB and can be assigned a unique Research Organisation Registry (ROR) ID.
Fig.1: Publications assigned to German institutions in OpenAlex and OPENBIB based on ROR matching. Only publications published between 2014 and 2024 are considered.
The following figure compares the classification of articles and reviews in OpenAlex and in OPENBIB. OpenAlex counts more articles and reviews than OPENBIB because OpenAlex also labels case reports, abstracts, book reviews and editorials as articles and reviews.
Fig.2: Classification of articles and reviews in journals for German institutions in OpenAlex and in OPENBIB. Only publications published between 2014 and 2024 are considered.
The following figure compares the number of publications with funding information of the German Research Foundation in OpenAlex and in OPENBIB. Only publications funded by the German Open-Access-Publikationskosten program are considered.
Fig.3: Publications containing funding information of the German Research Foundation per German institution in OpenAlex and in OPENBIB. Only publications published between 2020 and 2024 are considered.
- If you are a user of the Kompetenznetzwerk Bibliometrie, you can access the data snapshot via the KB data infrastructure hosted by FIZ Karlsruhe.
- For big scholarly data analysis in a Google Cloud environment, you can use the Open Scholarly Data Warehouse maintained by the SUB Göttingen.
- Alternatively, you can download the snapshot from Zenodo: https://zenodo.org.
A list of all entities and fields included in the OPENBIB snapshot can be found here.
- A Jupyter notebook containing code examples for working with the OPENBIB snapshot in the KB data infrastructure can be found here.
- A Jupyter notebook containing code examples for working with the OPENBIB snapshot in the Open Scholarly Data Warehouse of the SUB Göttingen can be found here.
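Before loading a downloaded snapshot, it can help to check which tables and columns it actually contains. The sketch below assumes the snapshot is a ZIP archive of comma-separated, UTF-8 encoded CSV files; that layout and the helper name `csv_headers` are assumptions, not something documented here.

```python
import csv
import io
import zipfile


def csv_headers(archive_path):
    """Map each CSV file in a ZIP archive to its header row (a list of column names)."""
    headers = {}
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV file
            with archive.open(name) as raw:
                # Read only the first row so large tables are not loaded into memory.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                headers[name] = next(csv.reader(text), [])
    return headers
```

Calling `csv_headers` with the path to the archive returns a dictionary that can be compared against the list of entities and fields.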
To export a complete OPENBIB snapshot from the KB database, use the following code.
```python
from scripts.export_files import OpenBibDataRelease

# Replace the placeholder connection values with those of your KB database.
openbib_snapshot = OpenBibDataRelease(
    export_directory='openbib_export',
    export_file_name='kbopenbib_release',
    host='host',
    database='database',
    port='port',
    user='user',
    password='password'
)

openbib_snapshot.make_archive(export_format='csv')
```
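To keep credentials out of the script, the connection values could be read from environment variables instead. This is a minimal sketch: the variable names (`KB_HOST` and so on) and the helper `connection_kwargs` are hypothetical; only the keyword names are taken from the export example.

```python
import os

# Keyword names from the export example, mapped to
# hypothetical environment variable names.
ENV_VARS = {
    "host": "KB_HOST",
    "database": "KB_DATABASE",
    "port": "KB_PORT",
    "user": "KB_USER",
    "password": "KB_PASSWORD",
}


def connection_kwargs(environ=None):
    """Collect the connection settings from the environment as a dict of strings."""
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_VARS.values() if name not in environ]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {key: environ[name] for key, name in ENV_VARS.items()}
```

The result can then be unpacked into the constructor, for example `OpenBibDataRelease(export_directory='openbib_export', export_file_name='kbopenbib_release', **connection_kwargs())`.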
If you see mistakes, want to suggest changes or submit feature requests, please create an issue.
Data is made available under the CC0 license.