- Create a Python 3 virtual environment and install the required Python modules by running:
  pip3 install -r requirements.txt
- Download the training data and the embeddings from Kaggle: https://www.kaggle.com/c/quora-insincere-questions-classification/data. For this exercise, only train.csv (used as both training and testing data) and GoogleNews-vectors-negative300.bin (for text embeddings) were used. Alternatively, the embeddings can be downloaded from Google Drive: https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit
- Place train.csv in the same directory as the .ipynb files. Make sure GoogleNews-vectors-negative300.bin sits inside a GoogleNews-vectors-negative300 directory, which should itself be in the same directory as train.csv.
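
Before launching the notebooks, it can help to confirm that the files ended up where the steps above expect them. The following is a minimal sketch using only the standard library; the helper name `missing_files` and the constant `EXPECTED_FILES` are hypothetical and not part of this repository, and the script assumes it is run from the directory that holds the .ipynb files.

```python
from pathlib import Path

# Expected layout, relative to the directory that holds the .ipynb files.
# (Hypothetical constant, written from the setup steps above.)
EXPECTED_FILES = [
    Path("train.csv"),
    Path("GoogleNews-vectors-negative300") / "GoogleNews-vectors-negative300.bin",
]


def missing_files(base_dir="."):
    """Return the expected data files that are not present under base_dir."""
    base = Path(base_dir)
    return [p.as_posix() for p in EXPECTED_FILES if not (base / p).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All data files are in place.")
```

If the script reports a missing file, re-check the directory nesting described in the last step before opening the notebooks.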
Awesome! Now we can spin up Jupyter Notebook!