This repo is a proposed solution to the first step of the "Data Engineering Challenge" on the Titanic Kaggle competition.
First, copy the .env.sample file to .env and fill in the values.
cp .env.sample .env
Then download your service account key from Google Cloud Platform and save it as credentials/service-account.json.
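Before moving on, it can help to confirm that both files are in place. The following is a minimal sketch, not part of this repo: the helper names parse_env and check_setup are hypothetical, and the script assumes that .env uses plain KEY=VALUE lines and that it is run from the project root.

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_setup(root="."):
    """Return a list of problems; an empty list means the setup looks complete."""
    root = Path(root)
    problems = []

    env_file = root / ".env"
    if not env_file.is_file():
        problems.append("missing .env (copy it from .env.sample)")
    else:
        for key, value in parse_env(env_file.read_text()).items():
            if not value:
                problems.append(f"{key} has no value in .env")

    # Path taken from the instructions above.
    key_file = root / "credentials" / "service-account.json"
    if not key_file.is_file():
        problems.append("missing credentials/service-account.json")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print(problem)
```

Running the script prints one line per problem found and nothing when both files are present and every variable in .env has a value.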
Once this is done, and if you are using pyenv-virtualenv, you can run the following command to set up the project:
make init_env
To run the pipeline you can use the following command:
make train
Launch the API locally with:
make run_api
You can test it by following the link displayed in the terminal. Don't hesitate to go to the /docs endpoint to see the documentation of the API.
You can also test the API with the make test_api rule in another terminal.
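If you prefer a quick scripted check over opening the browser, the sketch below requests the /docs endpoint using only the Python standard library. It is an illustration, not what make test_api does: the helper names are hypothetical, and the address http://127.0.0.1:8000 is an assumption that should be replaced with the link shown in your terminal.

```python
import urllib.error
import urllib.request


def docs_url(base_url):
    """Build the /docs URL from the base URL printed when the API starts."""
    return base_url.rstrip("/") + "/docs"


def is_reachable(url, timeout=5.0):
    """Return True if a GET request on url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    # Hypothetical address: use the link displayed by make run_api instead.
    url = docs_url("http://127.0.0.1:8000")
    print(url, "reachable" if is_reachable(url) else "not reachable")
```

Run it in a second terminal while the API is up; a "not reachable" result usually means the API is not running or the address differs from the assumed one.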
Finally, launch the Streamlit app with:
make streamlit