Deep Draw is a project from the Le Wagon data science school in Paris, batch #1002 (Sept.-Dec. 2022). The objective is to develop, train and apply neural network models on the Quick, Draw! dataset published by Google Creative Lab. 100 categories of sketches were selected and used to train a CNN-based model and an RNN-based model to categorize drawings.
Thanks to our supervisor Laure de Grave and our lead teacher Vincent Moreau for their help and investment in this project.
Thanks to Google Creative Lab for the quickdraw-dataset from the googlecreativelab repository.
- Initialize our GitHub repository for deepdraw
- Download, load and prepare the Quick Draw dataset for the CNN model
- Initialize and run the CNN model
- Create an API with FastAPI and a Streamlit front end, which will be our user interface
- Track our work with MLflow
- Create a Docker container and push it to production on GCP
- Go further: do the same with sequential data and an RNN model
Our working environment is organized in the following directory tree:
```
.
├── Dockerfile                  # Docker image definition
├── Makefile                    # Task manager
├── README.md
├── accueil_deep_draw.png
├── build
│   └── lib
│       └── deep_draw
│           └── fast_api.py
├── deep_draw                   # Main project directory
│   ├── __init__.py
│   ├── dl_logic                # Deep-learning classification directory
│   │   ├── __init__.py
│   │   ├── categories.yaml     # List of our chosen categories
│   │   ├── cnn.py              # CNN model
│   │   ├── data.py             # Loading, cleaning, encoding data
│   │   ├── params.py           # Main variables
│   │   ├── preprocessor.py     # Data preprocessing
│   │   ├── registry.py         # Model management
│   │   ├── rnn.py              # RNN model
│   │   ├── test_categories.yaml
│   │   ├── tfrecords.py        # Encode bitmap data --> TFRecords objects
│   │   └── utils.py
│   ├── fast_api.py             # API initialization
│   └── interface
│       ├── Deep_Draw.py
│       ├── __init__.py
│       ├── accueil_deep_draw.png
│       ├── app.py
│       ├── main.py
│       ├── pages
│       │   ├── Probabilities_π.py
│       │   └── Submit_π.py
│       └── utils.py
├── deep_draw.egg-info
├── notebooks                   # Notebook storage
├── packages.txt
├── raw_data                    # Data storage
│   ├── dataset.py
│   ├── ndjson_simplified
│   └── npy
├── requirements.txt            # All the dependencies we need to run the package
├── requirements_prod.txt
└── setup.py                    # Package installer
```

For our CNN model, we use the .npy data from the Quick Draw dataset, which gives us images in bitmap format. One category (cats, for example) contains 100,000 different drawings.
The real challenge is to load the data and run the model for at least 100 categories, i.e. 10,000,000 drawings!
That is why we convert the data into TensorFlow objects. With them, we can stream the data in batches of 32 drawings, which makes training easier and faster, and avoids running out of RAM.
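The batching idea can be sketched with `tf.data` as follows. This is a minimal illustration, assuming the bitmaps have already been loaded as NumPy arrays; the array contents and the label value are stand-ins, not the real pipeline in `tfrecords.py`.

```python
# Minimal sketch of streaming bitmap data in batches with tf.data
import numpy as np
import tensorflow as tf

# Stand-in for e.g. np.load("raw_data/npy/cat.npy") reshaped to (n, 28, 28, 1)
X = np.random.rand(100, 28, 28, 1).astype("float32")
y = np.zeros(100, dtype="int32")  # placeholder label for one category

# Stream the data in packs of 32 drawings instead of holding
# millions of samples in RAM at once
dataset = (tf.data.Dataset.from_tensor_slices((X, y))
           .shuffle(buffer_size=100)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))

for images, labels in dataset.take(1):
    print(images.shape)  # (32, 28, 28, 1)
```

The same pattern scales to TFRecord files: only one batch at a time ever needs to be materialized in memory.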
A conventional CNN model is initialized using the initialize_cnn method.
Three Conv2D layers, each followed by a MaxPooling2D layer, come before the Flatten and Dense layers.
The output layer uses the softmax activation function to predict 100 probabilities.
The model is compiled using compile_cnn with an Adam optimizer and a sparse categorical cross-entropy loss; accuracy is the monitored metric.
```python
# Initialize a CNN model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

num_classes = 100  # one output per sketch category

model = Sequential()
model.add(Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
# model.add(Dropout(0.4))
model.add(Dense(num_classes, activation='softmax'))

# Compile
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
```

The final accuracy is around 80%, which is sufficient for categorizing sketches.
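As a quick sanity check, the architecture above can be trained for one epoch on random data. The synthetic arrays below are stand-ins for the real QuickDraw bitmaps, and the epoch count is illustrative only.

```python
# Smoke-test the CNN architecture on random data (not the real training run)
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

num_classes = 100
model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(num_classes, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Synthetic stand-ins for the real 28x28 bitmaps and integer labels
X = np.random.rand(64, 28, 28, 1).astype("float32")
y = np.random.randint(0, num_classes, size=64)
history = model.fit(X, y, batch_size=32, epochs=1, verbose=0)
```

In the real project, `model.fit` receives the batched TFRecords dataset instead of in-memory arrays.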
Here is a 3D visualization of the CNN model.
Here are the final confusion matrix and the final classification report.
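Such a confusion matrix and classification report can be produced with scikit-learn. The labels below are dummy stand-ins for the model's true and predicted categories, used only to show the API.

```python
# Sketch of building a confusion matrix and classification report
from sklearn.metrics import confusion_matrix, classification_report

# Dummy stand-ins for true labels and model predictions over 3 categories
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)                                   # 3x3 matrix of counts
print(classification_report(y_true, y_pred))  # per-class precision/recall/f1
```

With the real model, `y_pred` comes from `np.argmax(model.predict(X_test), axis=1)` over the 100 categories.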
The activation maps show how neurons specialize within the first Conv2D layer.
Three examples from three categories are represented below.
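Activation maps like these can be extracted with a Keras sub-model that stops at the first Conv2D layer. This is a minimal sketch on a random image; in the project the layer would be taken from the trained CNN instead of being built fresh.

```python
# Sketch of extracting first-layer activation maps with a Keras sub-model
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, Input

# Rebuild just the first Conv2D layer (untrained, for illustration)
inputs = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu')(inputs)
conv_model = Model(inputs, x)

img = np.random.rand(1, 28, 28, 1).astype("float32")
activations = conv_model.predict(img, verbose=0)  # one 26x26 map per filter
print(activations.shape)  # (1, 26, 26, 16)
```

Each of the 16 channels can then be plotted as a grayscale image to see what that filter responds to.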
The RNN model is initialized using the initialize_rnn_tfrecords method.
A Masking layer followed by two LSTM layers comes before the Dense layers. The output layer uses the softmax activation function to predict 100 probabilities.
The RNN model is compiled in the same way as the CNN model.
```python
# Initialize an RNN model
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

num_classes = 100  # one output per sketch category

model = Sequential()
model.add(layers.Masking(mask_value=1000, input_shape=(1920, 3)))
model.add(layers.LSTM(units=20, activation='tanh', return_sequences=True))
model.add(layers.LSTM(units=20, activation='tanh', return_sequences=False))
model.add(Dense(50, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
```

The final accuracy for the RNN model is around 75%, which is sufficient for categorizing sketches.
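The Masking layer implies that the variable-length stroke sequences are padded to a fixed (1920, 3) shape, with 1000 marking padded timesteps. Here is a minimal sketch of that padding step; the zero-filled stroke arrays are stand-ins for real (dx, dy, pen state) sequences.

```python
# Sketch of padding variable-length stroke sequences for the Masking layer
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Two drawings of different lengths, each timestep a (dx, dy, pen_state) triple
strokes = [np.zeros((50, 3)), np.zeros((120, 3))]

# Pad every sequence to 1920 timesteps; 1000 is the mask value the RNN ignores
X = pad_sequences(strokes, maxlen=1920, dtype="float32",
                  padding="post", value=1000)
print(X.shape)  # (2, 1920, 3)
```

The Masking layer then tells the LSTMs to skip every timestep whose values equal 1000, so padding does not influence the prediction.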
Here are the final confusion matrix and the final classification report.