PublicationGraph

This repository holds a RESTful API implementation that queries a graph database (Neo4j) and exposes the results of various queries related to computer science publications. All data used by this project were extracted from the DBLP (Online Computer Science Bibliography) website.

The PublicationGraph project consists of three essential parts:

1. DBLP dataset parser
2. Neo4j (graph database) setup
3. Rest API

DBLP Parser

The DBLP Parser performs the ETL (extract, transform, load) process on the dataset of computer science publications provided by DBLP. More specifically, DBLP provides a large XML dataset containing record types such as article, proceedings, incollection, etc.

Extract phase

In our use case, we focus only on the following three categories of publication:

Article: An article published in a journal.
Inproceedings: An article that appeared in the proceedings of a conference.
Incollection: A part of a book that has its own title.
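
As an illustration of the extract phase, the following is a minimal sketch (not the project's actual parser) that streams through an XML document and keeps only the three categories listed above. The function name, the dictionary keys and the inline sample are assumptions made for this example. The real DBLP dump relies on character entities declared in its DTD, so parsing it may need additional handling that is not shown here.

```python
import io
import xml.etree.ElementTree as ET

# The three publication categories kept in the extract phase.
WANTED_TAGS = {"article", "inproceedings", "incollection"}


def extract_publications(source):
    """Yield one dict per wanted record while streaming through the XML."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in WANTED_TAGS:
            yield {
                "category": elem.tag,
                "key": elem.get("key"),
                "title": elem.findtext("title"),
                "year": elem.findtext("year"),
                "authors": [a.text for a in elem.findall("author")],
            }
            elem.clear()  # drop the record's children to keep memory usage low


# Tiny hypothetical sample standing in for the large DBLP file.
SAMPLE = b"""<dblp>
<article key="journals/x/A1"><author>Ann Lee</author><title>Graphs</title><year>2019</year></article>
<book key="books/x/B1"><title>Skipped</title></book>
<inproceedings key="conf/x/C1"><author>Bo Kim</author><author>Cy Roe</author><title>Paths</title><year>2020</year></inproceedings>
</dblp>"""

records = list(extract_publications(io.BytesIO(SAMPLE)))
for record in records:
    print(record["category"], record["key"], record["authors"])
```

Streaming with iterparse avoids loading the whole dataset into memory, which matters for a file as large as the DBLP export.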

Transform phase

Based on the graph schema we chose (see the labeled property graph model in the Neo4j Graph Schema section below), we collected and transformed the data of each category into the form required by the load phase.
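
One transformation implied by the schema is deriving the order property (first, middle, last) of the PUBLISHED relationship from an author's position in a publication's author list. The sketch below shows one possible way to do this. The function names and row keys are hypothetical, and labeling a sole author as "first" is an assumption, since the source does not say how that case is handled.

```python
def author_order(position, total):
    """Map a 0-based author position to 'first', 'middle' or 'last'."""
    if position == 0:
        return "first"  # assumption: a sole author is labeled 'first'
    if position == total - 1:
        return "last"
    return "middle"


def to_published_rows(record):
    """Turn one publication record into author-[PUBLISHED]->publication rows."""
    authors = record["authors"]
    return [
        {
            "author": name,
            "publication_id": record["key"],
            "order": author_order(index, len(authors)),
        }
        for index, name in enumerate(authors)
    ]


# Hypothetical record in the shape used for this example.
record = {"key": "conf/x/C1", "authors": ["Ann Lee", "Bo Kim", "Cy Roe"]}
rows = to_published_rows(record)
for row in rows:
    print(row)
```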

Load phase

The transformed data are used to create the final CSV files. These files contain the information needed by the PublicationGraph project: authors, publications, relationships between authors and publications (author-[PUBLISHED]->publication), journals, conferences, and relationships between publications and the conferences or journals that issued them (publication-[ISSUED]->(journal, conference)).
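
As a rough sketch of this step, relationship rows could be written to CSV with Python's standard csv module. The column names and the file name mentioned in the comment are assumptions for illustration; the actual files produced by the parser may be laid out differently.

```python
import csv
import io


def write_csv(handle, fieldnames, rows):
    """Write a header line followed by one CSV line per row dict."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical author-[PUBLISHED]->publication rows.
published_rows = [
    {"author": "Ann Lee", "publication_id": "conf/x/C1", "order": "first"},
    {"author": "Bo Kim", "publication_id": "conf/x/C1", "order": "last"},
]

# An in-memory buffer stands in for open("published.csv", "w", newline="").
buffer = io.StringIO()
write_csv(buffer, ["author", "publication_id", "order"], published_rows)
csv_text = buffer.getvalue()
print(csv_text)
```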

Finally, copy the CSV files into the docker-compose-setup/neo4j/import directory so that they are ready for insertion into Neo4j.

More information about DBLP Parser can be found here.

Neo4j setup

Neo4j Graph Schema

The schema of Neo4j consists of the following nodes and relationships:

Nodes
1. Author: Properties: name
2. Publication: Properties: id, title, year, pages
3. Journal: Properties: name
4. Conference: Properties: name
Relationships
1. PUBLISHED: Author-[PUBLISHED]->Publication, Properties: order (values: first, middle, last)
2. ISSUED: Publication-[ISSUED]->Conference or Journal
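
To illustrate how this schema can be traversed, here is a hedged sketch of a Cypher query wrapped in a Python helper. It is not one of the project's own queries. The helper accepts any object exposing run(query, **params), which matches the session object of the official Neo4j Python driver; whether the project uses that driver is an assumption. A stub session with canned data is included so the example runs without a database.

```python
# Cypher pattern following the schema: Author -> Publication -> Journal.
TOP_AUTHORS_IN_JOURNAL = """
MATCH (a:Author)-[:PUBLISHED]->(p:Publication)-[:ISSUED]->(j:Journal {name: $journal})
RETURN a.name AS author, count(p) AS publications
ORDER BY publications DESC
LIMIT $limit
"""


def top_authors_in_journal(session, journal, limit=10):
    """Run the query on any object exposing run(query, **params)."""
    result = session.run(TOP_AUTHORS_IN_JOURNAL, journal=journal, limit=limit)
    return [(record["author"], record["publications"]) for record in result]


class StubSession:
    """Stand-in for a driver session so the sketch runs without a database."""

    def __init__(self):
        self.calls = []

    def run(self, query, **params):
        self.calls.append((query, params))
        return [{"author": "Ann Lee", "publications": 3}]  # canned sample data


session = StubSession()
rows = top_authors_in_journal(session, "Hypothetical Journal", limit=5)
print(rows)
```

Passing the journal name and limit as query parameters, rather than formatting them into the query text, keeps the query safe from injection.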

Data insertion

After starting the containers with the docker-compose up command (see the Run Project section), you can import the data into Neo4j by executing the import script as follows.

Execution of script:

1. Open a browser.
2. Go to http://localhost:7474/browser/
3. Log in to Neo4j DB.
4. Run the script.
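
The contents of the import script are not reproduced here. As a hedged illustration of how CSV files placed in the Neo4j import directory are commonly loaded, the sketch below builds a Cypher LOAD CSV statement for a node type. The helper name and the file name authors.csv are hypothetical, and the project's script may work differently.

```python
def load_nodes_statement(label, csv_file, properties):
    """Build a LOAD CSV statement that MERGEs one node per CSV row."""
    prop_map = ", ".join(f"{name}: row.{name}" for name in properties)
    return (
        f"LOAD CSV WITH HEADERS FROM 'file:///{csv_file}' AS row\n"
        f"MERGE (:{label} {{{prop_map}}})"
    )


# Example: Author nodes with a single 'name' property, as in the schema above.
statement = load_nodes_statement("Author", "authors.csv", ["name"])
print(statement)
```

With Neo4j's default configuration, file:/// URLs in LOAD CSV are resolved relative to the import directory, which is why the CSV files are copied there. LOAD CSV reads every field as a string, so numeric properties need an explicit conversion in the statement.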

Rest API

A Rest API was created on top of Neo4j, exposing the results of various queries over the nodes and relationships of the graph schema. For the needs of the project, 18 Cypher queries were implemented; they demonstrate the powerful features and efficiency of a graph database like Neo4j on heavy queries.

All curl requests for the endpoints of the Rest API are collected and can be found here.

The Rest API was implemented with the Flask framework.
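
For readers unfamiliar with Flask, the following minimal sketch shows the general shape such an endpoint could take. The route /publications, the author query parameter and the helper find_publications_by_author are all hypothetical and are not taken from the project. The helper returns canned data where a real implementation would run a Cypher query.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def find_publications_by_author(name):
    """Placeholder for a Cypher lookup; returns canned data in this sketch."""
    sample = {"Ann Lee": [{"id": "conf/x/C1", "title": "Paths", "year": 2020}]}
    return sample.get(name, [])


@app.route("/publications")
def publications():
    author = request.args.get("author")
    if not author:
        # Reject requests that do not name an author.
        return jsonify({"error": "query parameter 'author' is required"}), 400
    return jsonify(
        {"author": author, "publications": find_publications_by_author(author)}
    )


# Exercise the endpoint in-process with Flask's built-in test client.
client = app.test_client()
response = client.get("/publications", query_string={"author": "Ann Lee"})
print(response.status_code, response.get_json())
```

Keeping the database lookup in its own function separates HTTP handling from query logic, so each can be tested on its own.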

Run Project

To run the project, you have to install the following prerequisite services in your local environment.

Install docker and docker-compose services

1.  https://docs.docker.com/get-docker/
2.  https://docs.docker.com/compose/install/

After installing the docker and docker-compose services, you are ready to run the PublicationGraph project using the following commands:

# Go to docker-compose-setup directory
$ cd docker-compose-setup/

# Build Rest API image
$ docker-compose build

# Start Neo4j and Rest API containers
$ docker-compose up

# Stop Neo4j and Rest API containers
$ docker-compose stop

# Destroy Neo4j and Rest API containers
$ docker-compose down
