This project is part of the final assignment for a NoSQL databases class.
For this assignment, the DBLP Computer Science Bibliography (https://dblp.uni-trier.de/db/) database is used. The data is downloaded in XML format from https://dblp.uni-trier.de/xml/ and must then be processed, cleaned, and converted into JSON format.
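The XML-to-JSON step could look roughly like the sketch below. It is only an illustration under stated assumptions: the sample records and their keys are made up, the output field names (`_id`, `type`, `title`, `authors`, `year`) are one possible choice rather than a required schema, and only `article` and `inproceedings` records are kept. The real dump is large and relies on `dblp.dtd` for character entities, so a streaming parser with entity handling would likely be needed there.

```python
import json
import xml.etree.ElementTree as ET

# Assumption: only these two record types are needed for the analysis.
RECORD_TAGS = {"article", "inproceedings"}

# Hypothetical sample in the general shape of DBLP records (keys are made up).
SAMPLE_XML = """<dblp>
<article key="journals/example/Doe17">
<author>Jane Doe</author>
<author>John Roe</author>
<title>An <i>Example</i> Title.</title>
<year>2017</year>
</article>
<inproceedings key="conf/example/Doe18">
<author>Jane Doe</author>
<title>Another Example.</title>
<year>2018</year>
</inproceedings>
<www key="homepages/x/JaneDoe"><author>Jane Doe</author><title>Home Page</title></www>
</dblp>"""


def record_to_dict(elem):
    """Convert one record element into a JSON-ready dictionary."""
    title_elem = elem.find("title")
    year = elem.findtext("year")
    return {
        "_id": elem.get("key"),  # assumption: the record key is used as document id
        "type": elem.tag,
        # itertext() keeps text nested in inline markup such as <i>...</i>
        "title": "".join(title_elem.itertext()) if title_elem is not None else None,
        "authors": ["".join(a.itertext()) for a in elem.findall("author")],
        "year": int(year) if year else None,
    }


def convert(xml_text):
    """Return the kept records of an XML string as a list of dictionaries."""
    root = ET.fromstring(xml_text)
    return [record_to_dict(e) for e in root if e.tag in RECORD_TAGS]


if __name__ == "__main__":
    print(json.dumps(convert(SAMPLE_XML), indent=2))
```

Storing `authors` as an array and `year` as an integer keeps later per-author grouping and year arithmetic simple.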
Afterwards, the data must be stored in a MongoDB database in the appropriate form.
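One way to load the converted data is to write it as newline-delimited JSON and import it with `mongoimport`. The sketch below assumes documents with hypothetical fields (`_id`, `type`, `title`, `authors`, `year`); the file name, database name, and collection name are placeholders, not part of the assignment.

```python
import json
import os
import tempfile


def write_json_lines(documents, path):
    """Write one JSON document per line, a format mongoimport reads by default."""
    with open(path, "w", encoding="utf-8") as handle:
        for doc in documents:
            handle.write(json.dumps(doc, ensure_ascii=False) + "\n")
    return len(documents)


# Hypothetical documents; real ones would come from the conversion step.
documents = [
    {"_id": "journals/example/Doe17", "type": "article",
     "title": "An Example Title.", "authors": ["Jane Doe", "John Roe"], "year": 2017},
    {"_id": "conf/example/Doe18", "type": "inproceedings",
     "title": "Another Example.", "authors": ["Jane Doe"], "year": 2018},
]

out_path = os.path.join(tempfile.gettempdir(), "publications.jsonl")
written = write_json_lines(documents, out_path)

# Then, from a shell (database and collection names are assumptions):
#   mongoimport --db dblp --collection publications --file publications.jsonl
```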
Finally, the data must be analyzed (through MongoDB queries) to answer the following questions:
1. List all publications of an author.
2. Number of publications of an author.
3. Number of journal articles in 2017.
4. Number of occasional authors (authors with fewer than 5 total publications).
5. Number of journal articles (article) and conference papers (inproceedings) of the authors with the most total publications.
6. Average number of authors for the publications in the database.
7. List of co-authors for an author.
8. Time between first and last publications of the 5 authors that have been active for the longest period.
9. Number of authors that have been active for less than 5 years.
10. Percentage of journal articles among all publications.
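A few of these questions can be sketched as MongoDB queries written as plain Python structures. This is a hedged illustration, not the assignment's solution: it assumes each document has a `type` string, an `authors` array, and an integer `year`, and these field names are assumptions. With pymongo, the filter could be passed to `collection.count_documents(...)` and each pipeline to `collection.aggregate(...)`.

```python
# Question 3: filter for journal articles published in 2017.
articles_2017_filter = {"type": "article", "year": 2017}

# Question 4: authors with fewer than 5 publications in total.
occasional_authors_pipeline = [
    {"$unwind": "$authors"},                                # one document per (publication, author)
    {"$group": {"_id": "$authors", "total": {"$sum": 1}}},  # publications per author
    {"$match": {"total": {"$lt": 5}}},
    {"$count": "occasional_authors"},
]

# Question 6: average number of authors per publication.
# Assumes every document has an "authors" array (possibly empty).
average_authors_pipeline = [
    {"$project": {"n_authors": {"$size": "$authors"}}},
    {"$group": {"_id": None, "avg_authors": {"$avg": "$n_authors"}}},
]

# Question 8: the 5 authors with the longest span between first and last publication.
longest_active_pipeline = [
    {"$unwind": "$authors"},
    {"$group": {"_id": "$authors",
                "first": {"$min": "$year"},
                "last": {"$max": "$year"}}},
    {"$project": {"first": 1, "last": 1,
                  "span": {"$subtract": ["$last", "$first"]}}},
    {"$sort": {"span": -1}},
    {"$limit": 5},
]
```

Unwinding the `authors` array first is what makes per-author grouping possible; on the full dataset, the `$group` stages may need `allowDiskUse=True` in `aggregate`, depending on server memory limits.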