This is a database with olympic games data from 1896 to 2016, structured for postgresql with all the archives related to building and management the database.
Note: Despite this description and some info on the database is in english, the commentaries on the code are on spanish, and also if you want to see the proyect description (this file) on spanish go here.
First, we made the E-R diagram in order to get a first sketch on how to make the database and the possibles analysis of it.
Entity-Relationship diagram of the database
Then, with that, it was possible to start thinking on how the tables of the database will work, with all the constraints around it and searching the Third-Normal Form (3NF) to get a good database structure, preventing issues with it in future. So, we made the Relational diagram in pgModeler, and this is what we got:
Relational diagram of the database in 3NF
With that we had a complete DDL guide for the database, so the next step was writing (with help of pgModeler) the SQL code to build the DDL of our database, and this is what it ended up like.
It consists of 7 main tables:
Event:
It has all the info about the Olympic Events between 1986 and 2016.City:
The cities where the Olympic Events are hosted.Sport:
It has all about the sports played on a Event.Category:
All the categories in a Sport.Athlete:
The info about all the athletes that have participated on a Olympic Event.Region:
The Regions that participate on Olympic events.Medal:
The basic 3 types of medals to be won.
Some Examples can be seen here
And there are also another secondary and relational tables.
Now with all the DDL done, we started to load all data on the database, and making the SQL code to do it using COPY for all the data in the csv archives. The information load to the database was a transparent process. No changes were made to the original files and no problems occurred during any of the phases.
And then, having the database with the resective data on it, we made the python module to make the connection to the database and get the data from it to analyse it, you can se the module here.
Finally, we need to do some analysis using the data on tha database, using the python packages Pandas
and Dash
.
With the DataBase-Python connection ready to work, we can start building the web app where we are going to include the graphs and their respective analysis.
In this way, we created the Python-webapp folder that'll contain all the files about the app, and started by creating the Connection an SQLquerys files where we'll setup the connection to the database and create the querys we want to call them later, respectively. Using that, we created the Graphs file where we make all the graphs using dash around the data obtained from the querys in order to import them later on the web app.
Note: here you have to install the dash-bootstrap-components additional to the
Dash
main components, use:
pip install dash-bootstrap-components
Finally, we made the "main" file where all the structure of the web app rest, here we used the Dash
components like dash-core-components
, dash-html-components
, dash-bootstrap-components
and also Plotly
and Pandas
for making it possible. Here we imported the graphs created by the previous mentioned file to put them into cards with tabs to analyse them. The file has a little description for some things that need explanation.
Maybe you are wondering, what is the style.css
file that is on the assets folder? Well, it is the file that contains the classes and styles for the appearence of the web app... What? Why it is not imported on the main file? could you ask. That is because, it is a Dash property, it loads all the css and js files in the assets folder (if exists) of the web app, so we don't need to import the file in our main.py
.
On the web app there is a summarized version of this file describing the project, and other thing to take into account, is that we use base64
library to encode local images and using them into the web app because Dash
don't allow this, but we still let that part up to the user, is possible also to use them leaving in the "src" attribute of the <img>
tag as a url to the images on this repo (as we did! leaving the other part hidden as comments).
Hope you find usefull this "little" description of our database proyect and how it works!
- None :)