Vocal Tract Segmentation with a U-net based framework from MRI images with superimposed Gaussian noise
Project from the course Neuroengineering @ Politecnico di Milano
- Mattia Cazzolla (@MattiaCazzolla) mattia.cazzolla@mail.polimi.it
- Alix de Langlais (@Adelanglais) alixanne.delanglais@mail.polimi.it
- Paolo Marzolo (@pollomarzo) paolo.marzolo@mail.polimi.it
- Olmo Notarianni (michelangeloolmo.nogara@mail.polimi.it)
- Sara Rescalli (sara.rescalli@mail.polimi.it)
Final grade: 32/30
The dataset provided was generated using the frames of Dynamic Supine MRI (dsMRI) videos recorded for different patients under specific speech protocols.
All images had additive Gaussian noise superimposed.
The dataset contained a total of 820 images from 4 patients (respectively 280, 240, 150, 150).
The preprocessing pipeline aims at:
- Removing the Gaussian noise with a Total Variation Denoising technique (link)
- Enhancing the high-frequency components
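
The two preprocessing steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the Total Variation step uses Chambolle's projection algorithm (one common TV solver), and the high-frequency enhancement is assumed to be a simple unsharp mask with a 3x3 box blur, since the README does not say which method was used. The function names and the parameters `weight`, `n_iter`, `tau` and `amount` are hypothetical.

```python
import numpy as np


def _grad(u):
    """Forward differences, with zero gradient at the far border."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy


def _div(px, py):
    """Discrete divergence, the negative adjoint of _grad."""
    d = np.zeros_like(px)
    d[0, :] += px[0, :]
    d[1:-1, :] += px[1:-1, :] - px[:-2, :]
    d[-1, :] += -px[-2, :]
    d[:, 0] += py[:, 0]
    d[:, 1:-1] += py[:, 1:-1] - py[:, :-2]
    d[:, -1] += -py[:, -2]
    return d


def tv_denoise(image, weight=0.1, n_iter=100, tau=0.125):
    """Total Variation denoising of a 2D image (Chambolle's projection method).

    A larger `weight` removes more noise but also flattens more detail.
    """
    f = np.asarray(image, dtype=np.float64)
    px = np.zeros_like(f)
    py = np.zeros_like(f)
    for _ in range(n_iter):
        gx, gy = _grad(_div(px, py) - f / weight)
        norm = np.sqrt(gx ** 2 + gy ** 2)
        px = (px + tau * gx) / (1.0 + tau * norm)
        py = (py + tau * gy) / (1.0 + tau * norm)
    return f - weight * _div(px, py)


def box_blur3(image):
    """3x3 mean filter with edge replication."""
    image = np.asarray(image, dtype=np.float64)
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    acc = np.zeros_like(image)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0


def enhance_high_freq(image, amount=1.0):
    """Unsharp mask: add back the detail removed by a blur (assumed method)."""
    image = np.asarray(image, dtype=np.float64)
    return image + amount * (image - box_blur3(image))


def preprocess(image, weight=0.1, amount=1.0):
    """Denoise first, then sharpen, in the order listed above."""
    return enhance_high_freq(tv_denoise(image, weight=weight), amount=amount)


# Synthetic example: a step edge corrupted by additive Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[:, 32:] = 1.0
noisy = clean + rng.normal(0.0, 0.1, clean.shape)
denoised = tv_denoise(noisy)
processed = preprocess(noisy)
```

Denoising before sharpening matters in this sketch: sharpening the noisy image first would amplify the noise that the TV step is meant to remove.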
The U-net architecture implemented is a variation of the IMU-NET described in this paper.
The images were split into different datasets as follows:
- Patient 1 and 2 $\rightarrow$ Training Set
- Patient 3 $\rightarrow$ Validation Set
- Patient 4 $\rightarrow$ Test Set
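
A patient-wise split like the one above keeps every frame of a given patient in a single set, so no patient seen during training shows up in validation or test. A minimal sketch follows, assuming each sample is tagged with its patient number and that the image counts given earlier are listed in patient order; the file names and the helper `split_by_patient` are hypothetical.

```python
# Patient number -> split, as listed above.
SPLIT_BY_PATIENT = {1: "train", 2: "train", 3: "val", 4: "test"}


def split_by_patient(samples):
    """Group (patient_id, item) pairs into train/val/test by patient."""
    splits = {"train": [], "val": [], "test": []}
    for patient_id, item in samples:
        splits[SPLIT_BY_PATIENT[patient_id]].append(item)
    return splits


# Hypothetical file names; the counts assume the per-patient order 1 to 4.
images_per_patient = {1: 280, 2: 240, 3: 150, 4: 150}
samples = [
    (pid, f"patient{pid}_frame{i:03d}.png")
    for pid, n in images_per_patient.items()
    for i in range(n)
]
splits = split_by_patient(samples)
sizes = {name: len(items) for name, items in splits.items()}
```

Under those assumptions the split holds 520 training, 150 validation and 150 test images.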
The results on the test set are reported in the following table:
Class | DICE (mean) |
---|---|
Background | 0.991 |
Upper Lip | 0.901 |
Lower Lip | 0.898 |
Hard Palate | 0.819 |
Soft Palate | 0.797 |
Tongue | 0.931 |
Head | 0.968 |
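
For reference, the DICE score of a class compares the predicted mask $P$ with the ground-truth mask $T$ as $2|P \cap T| / (|P| + |T|)$. The sketch below shows one way per-class scores and their mean over images could be computed from integer label maps. It is not the project's evaluation code: the label indices are assumed to follow the order of the table rows, and `dice_per_class` and `mean_dice` are hypothetical helpers.

```python
import numpy as np

# Assumed label order, taken from the rows of the table above.
CLASS_NAMES = ["Background", "Upper Lip", "Lower Lip", "Hard Palate",
               "Soft Palate", "Tongue", "Head"]


def dice_per_class(pred, target, num_classes):
    """Dice score of each class for one pair of integer label maps.

    A class absent from both maps gets NaN so it can be skipped when averaging.
    """
    scores = np.full(num_classes, np.nan)
    for c in range(num_classes):
        p = pred == c
        t = target == c
        denom = p.sum() + t.sum()
        if denom > 0:
            scores[c] = 2.0 * np.logical_and(p, t).sum() / denom
    return scores


def mean_dice(pairs, num_classes):
    """Average the per-class Dice over (pred, target) pairs, ignoring NaNs."""
    per_image = np.stack([dice_per_class(p, t, num_classes) for p, t in pairs])
    return np.nanmean(per_image, axis=0)


# Tiny two-class example: one pixel of class 0 is mislabelled as class 1.
target = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
scores = dice_per_class(pred, target, num_classes=2)
```

In the toy example class 0 scores 2/3 and class 1 scores 0.8, because the single wrong pixel weighs more on the smaller predicted region.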
The progress in learning can be observed in the segmentation produced at each epoch of training.
The project required us to produce a 3-minute video explaining our approach.
Voiced by: @Adelanglais, @pollomarzo
Animated by: @Adelanglais, @MattiaCazzolla
This project is licensed under the MIT License.