This network-graph-based literature review tool uses the open-source version of Neo4j together with Jupyter Notebooks written in Python to import academic literature metadata from a variety of sources, including OpenAlex, arXiv, Web of Science and more. Using a simple data model, literature metadata can be quickly imported, aggregated and normalized for analysis.
Jupyter Notebooks using Python and Neo4j's Cypher query language are available to import data from:
- [OpenAlex](https://openalex.org/)
- [Semantic Scholar](https://www.semanticscholar.org/)
- [arXiv](https://arxiv.org/)
- [Web of Science](https://www.webofscience.com/wos/)
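As a rough illustration of what such an import involves, the sketch below fetches one record from the public OpenAlex API and merges it into Neo4j. It is not the code from the project's notebooks: the node labels (`Work`, `Author`), relationship types (`AUTHORED`, `REFERENCES`), property names, and helper function names are all assumptions made for the example.

```python
import json
import urllib.request

# Hypothetical schema: labels, relationship types, and property names
# are assumptions for illustration, not the project's actual data model.
MERGE_WORK = """
MERGE (w:Work {id: $id})
SET w.title = $title, w.year = $year
FOREACH (a IN $authors |
  MERGE (p:Author {id: a.id})
  SET p.name = a.name
  MERGE (p)-[:AUTHORED]->(w))
FOREACH (ref_id IN $references |
  MERGE (r:Work {id: ref_id})
  MERGE (w)-[:REFERENCES]->(r))
"""


def fetch_openalex_work(work_id):
    """Download one work record (JSON) from the public OpenAlex API."""
    url = f"https://api.openalex.org/works/{work_id}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def work_to_params(work):
    """Flatten an OpenAlex work record into parameters for MERGE_WORK."""
    authors = [
        {"id": a["author"]["id"], "name": a["author"]["display_name"]}
        for a in work.get("authorships", [])
    ]
    return {
        "id": work["id"],
        "title": work.get("display_name"),
        "year": work.get("publication_year"),
        "authors": authors,
        "references": work.get("referenced_works", []),
    }


def import_work(driver, work_id):
    """Fetch a work and merge it; `driver` is a neo4j.Driver (5.x Python driver)."""
    params = work_to_params(fetch_openalex_work(work_id))
    driver.execute_query(MERGE_WORK, parameters_=params)
```

Because every statement uses `MERGE` rather than `CREATE`, running the import twice for the same work should not duplicate nodes or relationships, which is what makes aggregating records from several sources practical.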
Machine learning with vector embeddings generated by OpenAI is also available in a simple Jupyter Notebook that leverages Neo4j's Vector Search Index capabilities.
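A minimal sketch of how a vector index could be created and queried from Python is shown here. The index name `work_embeddings`, the `Work` label, the `embedding` property, and the OpenAI model name are assumptions for illustration; the project's notebook may use different names and a different model. The two `db.index.vector.*` procedures are the ones introduced with Neo4j 5.11.

```python
# Hypothetical index name, label, and property; adjust to your own schema.
CREATE_INDEX = """
CALL db.index.vector.createNodeIndex(
  'work_embeddings', 'Work', 'embedding', $dimensions, 'cosine')
"""

QUERY_INDEX = """
CALL db.index.vector.queryNodes('work_embeddings', $k, $embedding)
YIELD node, score
RETURN node.id AS id, node.title AS title, score
"""


def embed(text, model="text-embedding-3-small"):
    """Return an OpenAI embedding for `text` (model choice is an assumption)."""
    # Imported lazily so the rest of the sketch works without the package.
    from openai import OpenAI  # needs the `openai` package and OPENAI_API_KEY

    client = OpenAI()
    return client.embeddings.create(model=model, input=text).data[0].embedding


def create_index(driver, dimensions=1536):
    """Create the index once; the call fails if the index already exists."""
    driver.execute_query(CREATE_INDEX, dimensions=dimensions)


def similar_works(driver, embedding, k=5):
    """Return (id, title, score) for the k nodes nearest to `embedding`."""
    records, _, _ = driver.execute_query(QUERY_INDEX, k=k, embedding=embedding)
    return [(r["id"], r["title"], r["score"]) for r in records]
```

With a connected driver, a search might look like `similar_works(driver, embed("theory building with graphs"))`. The index dimension has to match the length of the stored vectors, so it should be set to whatever the chosen embedding model returns.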
This tool is described in the paper "A Network-Graph Based IT Artifact Aiding the Theory Building Process" published by the 2022 Hawaii International Conference on System Sciences (HICSS).
Modeling academic literature data as a network graph helps answer questions involving:
- Understanding relationships between entities (such as Works and Authors, Authors and Institutions)
- Self-referencing to the same type of entity (such as Works referencing Works)
- Exploring relationships of varying or unknown depth (such as References of References)
- Discovering different paths (such as Author connections through Institution or co-authored Works)
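The four kinds of questions above map fairly directly onto Cypher patterns. The sketch below collects one example query for each, held as Python strings so they can be run from a notebook. The labels and relationship types (`Work`, `Author`, `Institution`, `AUTHORED`, `AFFILIATED_WITH`, `REFERENCES`) and the six-hop limit are assumptions, not necessarily the project's schema.

```python
# Hypothetical labels and relationship types, chosen for illustration only.
QUERIES = {
    # Relationships between entities: works written by one author.
    "works_by_author": (
        "MATCH (a:Author {id: $author_id})-[:AUTHORED]->(w:Work) "
        "RETURN w.title AS title"
    ),
    # Self-referencing entities: works cited by a given work.
    "direct_references": (
        "MATCH (w:Work {id: $work_id})-[:REFERENCES]->(ref:Work) "
        "RETURN ref.id AS id"
    ),
    # Paths: shortest connection between two authors through
    # co-authored works or shared institutions (at most 6 hops).
    "author_connection": (
        "MATCH p = shortestPath((a:Author {id: $from_id})"
        "-[:AUTHORED|AFFILIATED_WITH*..6]-(b:Author {id: $to_id})) "
        "RETURN p"
    ),
}


def references_up_to(depth):
    """Build a query for references of references, up to `depth` hops.

    Cypher does not accept a parameter as a variable-length bound, so the
    validated integer is written into the query text instead.
    """
    if not isinstance(depth, int) or depth < 1:
        raise ValueError("depth must be a positive integer")
    return (
        f"MATCH (w:Work {{id: $work_id}})-[:REFERENCES*1..{depth}]->(ref:Work) "
        "RETURN DISTINCT ref.id AS id"
    )
```

For example, `references_up_to(2)` returns a query that finds both the direct references of a work and the references of those references in a single pattern, which would take a self-join per level in a relational database.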
The Jupyter Notebooks are written to work with Neo4j version 5.x and higher with the APOC Library installed. To use Neo4j's Vector Search Index capabilities, Neo4j version 5.11 or higher is needed.
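One way to confirm these requirements before running the notebooks is sketched below. The helper names are hypothetical; the Cypher calls used are `dbms.components()` and `apoc.version()`, and the second one raises an error if APOC is not installed.

```python
import re


def parse_version(version):
    """Turn a string such as '5.11.0' into a comparable (major, minor) tuple."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_notebooks(version):
    """The notebooks target Neo4j 5.x and higher."""
    return parse_version(version) >= (5, 0)


def supports_vector_index(version):
    """The Vector Search Index needs Neo4j 5.11 or higher."""
    return parse_version(version) >= (5, 11)


def check_server(driver):
    """Report server capabilities; `driver` is a neo4j.Driver (5.x Python driver)."""
    records, _, _ = driver.execute_query(
        "CALL dbms.components() YIELD versions RETURN versions[0] AS version"
    )
    neo4j_version = records[0]["version"]
    # Fails with a client error when the APOC library is missing.
    records, _, _ = driver.execute_query("RETURN apoc.version() AS version")
    return {
        "neo4j": neo4j_version,
        "apoc": records[0]["version"],
        "notebooks_ok": supports_notebooks(neo4j_version),
        "vector_index_ok": supports_vector_index(neo4j_version),
    }
```

Comparing integer tuples rather than raw strings matters here: as strings, `"5.9"` sorts after `"5.11"`, which would wrongly report vector index support.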
- To import and export files, an `apoc.conf` file containing the two lines below is needed in your Neo4j configuration directory.
  - `apoc.import.file.enabled=true`
  - `apoc.export.file.enabled=true`
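If imports or exports fail, it can help to confirm that the two settings above are actually present in the file. The following is a small sketch of such a check; the function names are hypothetical and the location of `apoc.conf` depends on your installation.

```python
# The two settings required for file import and export.
REQUIRED = {
    "apoc.import.file.enabled": "true",
    "apoc.export.file.enabled": "true",
}


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_apoc_settings(text):
    """Return the required settings that are absent or not set to true."""
    settings = parse_conf(text)
    return [
        key
        for key, expected in REQUIRED.items()
        if settings.get(key, "").lower() != expected
    ]
```

The contents of the file can be passed in with `pathlib.Path(conf_path).read_text()`, where `conf_path` points at your own `apoc.conf`. An empty list means both settings are in place. Neo4j reads its configuration at startup, so the server needs a restart after the file is changed.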