Add converter script to keras by walidabn · Pull Request #567 · epfml/disco

walidabn · 2023-02-13T18:55:34Z

Fixes issue #480 in the following sense:

This PR is not meant to be merged into develop; it essentially brings the DeepBreath model to JS. In more detail:

The converted DeepBreath model in JS is in /tfjs_model
The TensorFlow saved_model is in /saved_model
Three Python scripts were written. The first, convertDB.py, converts the model from PyTorch to ONNX. The second, onnx2TensorFlow.py, rewrites the ONNX model to TensorFlow/TensorFlow.js; compared with the original PyTorch model, however, some layers had to be dropped due to incompatibilities between PyTorch and ONNX. The third, torchToTFModel.py, is an accurate rewrite of the PyTorch model in TensorFlow/TF.js. Torch is not used in it: the script defines a TF/Keras model and then saves it.

A few notes: when converting from PyTorch to TensorFlow/TF.js, unless the model is trivial, we should not rely on automatic conversion tools like pytorch2keras or onnx2keras, as these libraries are out of date and don't support many layers and essential functionalities. Redeveloping the model manually in TensorFlow is the best way to go for this kind of task.

martinjaggi · 2023-02-13T20:31:39Z

in the description/readme could you clarify the 3rd way in some more detail? i guess you mean writing a model in python TF/keras and then saving it. this is in torchToTFModel.py, and torch is not used in it, right?

can you link to our documentation on how to take python TF/keras models and convert them to TFjs?

how about the preprocessing pipeline in this case? i guess it remains in python but how did you abstract it?

were you able to test whether the resulting JS model does the same as the pytorch python one? maybe with and without preprocessing?

morganridel · 2023-02-16T14:50:20Z

Did the 70MB+ "output" files need to be pushed to Git? @s314cy worked a lot to clean the Git history to get a light repo

s314cy · 2023-02-20T09:23:50Z

> Did the 70MB+ "output" files need to be pushed to Git? @s314cy worked a lot to clean the Git history to get a light repo

yes, such heavy static files should not be pushed to the repo, even if not merged in dev/prod, since the repo will keep track of the files as long as they exist in some commit in some remote branch (so it won't be enough to remove them and commit)

a good solution would be for the branch to only contain the source code that generated the static model files that were pushed; the latter can be gitignored (note that logs and other situational files should not be pushed either, thus they can be gitignored as well)

technically, this means adding all the log, onnx and bin files to the gitignore, running git rm on them, committing, squashing the git rm commit with the commit that added the files, and finally force-pushing
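As a complement to the steps above, here is a minimal Python sketch of a check that lists the files that should be gitignored before committing. The patterns come from the file types named in this comment; the size threshold, the function names and the idea of running this as a manual check are assumptions made for illustration. It only inspects the working tree and does not rewrite any Git history.

```python
import fnmatch
import os
from typing import List, Sequence

# File types named in the comment above (logs, ONNX exports, weight shards).
GENERATED_PATTERNS = ("*.log", "*.onnx", "*.bin")
# Hypothetical threshold; the thread only mentions "70MB+" files.
SIZE_LIMIT_BYTES = 50 * 1024 * 1024


def is_generated(path: str, patterns: Sequence[str] = GENERATED_PATTERNS) -> bool:
    """Return True if the file name matches one of the generated-file patterns."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def find_offenders(root: str, size_limit: int = SIZE_LIMIT_BYTES) -> List[str]:
    """List files under `root` that match a pattern or exceed `size_limit` bytes."""
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if is_generated(full_path) or os.path.getsize(full_path) > size_limit:
                offenders.append(os.path.relpath(full_path, root))
    return sorted(offenders)
```

Calling `find_offenders(".")` from the repository root returns the relative paths to add to the gitignore; files that are already committed still need the git rm, squash and force-push steps described above.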

walidabn · 2023-02-27T11:51:05Z

For Martin's first point, it's exactly that. I edited the message to clarify that point.
I added and clarified the documentation for model conversion in FAQ.md and adapted the one in TASK.md. I think it's a bit clearer now.
Adding some details on the preprocessing pipeline:
The raw audio file is converted into a text file of numeric values using the Python preprocessing pipeline abstracted in preprocess.py. To do so, we used the preprocessing pipeline that can be found [here](https://github.com/epfl-iglobalhealth/DeepBreath-App-Giorgio/blob/main/models/train_models_for_mobile/train_model_disease_classification.py).

Running the script yields a .txt file containing a number of floating-point values that is a multiple of 32. To use the TensorFlow.js model on the data, read the values from the text file and load them into a float tensor of shape [32, -1].
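The loading step can be sketched in plain Python to make the [32, -1] shape concrete. This is only an illustration under two assumptions that the thread does not confirm: that the values in the .txt file are whitespace-separated, and that they are laid out in row-major order. The function names are hypothetical.

```python
from typing import List

ROWS = 32  # first dimension of the tensor described above


def parse_values(text: str) -> List[float]:
    """Parse whitespace-separated floats (assumed file format)."""
    return [float(token) for token in text.split()]


def reshape_rows(values: List[float], rows: int = ROWS) -> List[List[float]]:
    """Reshape a flat list into `rows` rows, inferring the column count (the -1)."""
    if not values or len(values) % rows != 0:
        raise ValueError(
            f"expected a non-empty multiple of {rows} values, got {len(values)}"
        )
    cols = len(values) // rows
    # Row-major layout (assumption): the first `cols` values form row 0, and so on.
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


# Synthetic stand-in for the content of the preprocessing output file.
sample = " ".join(str(float(i)) for i in range(64))
matrix = reshape_rows(parse_values(sample))
print(len(matrix), len(matrix[0]))  # 32 2
```

On the JS side, the same flat array and the inferred shape would be handed to the tensor constructor; the length check is worth keeping there too, since a file whose value count is not a multiple of 32 cannot be reshaped.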

Comparing performance: as weights cannot be transferred from PyTorch to TensorFlow.js, the performance of the two models cannot be assessed and compared.

s314cy · 2023-03-16T12:33:37Z

@walidabn should I merge this once the .py files are removed? or would you prefer to add the deepbreath task in this PR as well?

walidabn force-pushed the 480-DeepBreathConversion branch from 491040e to e6cb7df Compare February 27, 2023 11:43

s314cy assigned walidabn Mar 7, 2023

s314cy force-pushed the 480-DeepBreathConversion branch from 1826b55 to 516539d Compare March 16, 2023 13:02

walidabn added 3 commits March 16, 2023 14:03

Modify doc + remove tfjs_model content

83327d5

Modified FAQ and Task markdowns to add clarification on model convers…

88d5ca9

…ions

Remove Model conversion from FAQ, add it to TASK, and add clarificati…

cae03a4

…on on custom preprocessing in JS vs Python in TASK

s314cy force-pushed the 480-DeepBreathConversion branch from 516539d to cae03a4 Compare March 16, 2023 13:04

s314cy approved these changes Mar 23, 2023

s314cy added the documentation Improvements or additions to documentation label Mar 23, 2023

martinjaggi and others added 2 commits March 23, 2023 15:32

Update TASK.md

d62bad0

Update TASK.md

8430df6

s314cy merged commit 6be90bb into develop Mar 23, 2023

s314cy deleted the 480-DeepBreathConversion branch March 23, 2023 14:46

s314cy merged 5 commits into develop from 480-DeepBreathConversion