mel-sound-upsampling

Code for my final project, "GAN Mel Spectrogram Upscaling for Audio Super-Resolution", for CS395T (Deep Learning Seminar) with Philipp Krähenbühl.

Contains a modified version of BasicSR/ESRGAN and an IPython notebook (resample.ipynb) to generate data.

Usage:

Extract the VCTK dataset, adjust the file path in the notebook, and generate the data.
Copy the data to BasicSR/datasets/multispeaker.
Modify BasicSR/options/train/ESRGAN/train_ESRGAN_MEL.yml and BasicSR/options/test/ESRGAN/test_ESRGAN_MEL.yml.
Train using python basicsr/train.py -opt options/train/ESRGAN/train_ESRGAN_MEL.yml
During training, you can listen to validation outputs (follow the code in the notebook).
Batch-generate upscaled audio files by putting 11025Hz files into BasicSR/inference_data and running python basicsr/inference.py -opt options/test/ESRGAN/test_ESRGAN_MEL.yml
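
The last step expects 11025Hz inputs, so it can help to check sample rates before copying files into BasicSR/inference_data. The following is a minimal sketch, not part of this repository: it assumes the inputs are PCM WAV files readable by Python's standard wave module, and the helper names (wav_sample_rate, stage_inference_inputs) are hypothetical.

```python
import shutil
import wave
from pathlib import Path

# Input sample rate named in the usage steps above.
EXPECTED_RATE = 11025


def wav_sample_rate(path):
    """Return the sample rate (Hz) of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate()


def stage_inference_inputs(src_dir, dst_dir, expected_rate=EXPECTED_RATE):
    """Copy WAV files with the expected sample rate from src_dir to dst_dir.

    Assumption: inputs are *.wav files. Files with any other sample rate
    are left in place and reported as skipped.
    Returns (copied, skipped) as lists of file names in sorted order.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied, skipped = [], []
    for path in sorted(src_dir.glob("*.wav")):
        if wav_sample_rate(path) == expected_rate:
            shutil.copy2(path, dst_dir / path.name)
            copied.append(path.name)
        else:
            skipped.append(path.name)
    return copied, skipped
```

For example, stage_inference_inputs("my_clips", "BasicSR/inference_data") would copy only the 11025Hz clips from a (hypothetical) my_clips folder and return the names of any files it skipped.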

The samples directory contains audio samples randomly chosen from the test data. "_low_sr" files are the 11025Hz inputs to the model, "_label" files are the 44100Hz labels and "_pred" files are the network's predictions.
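
To compare an input, its label and the prediction side by side, the sample files can be grouped by their suffix. This is a small sketch under assumptions: the file extension (.wav) and the example file names are hypothetical, and only the three suffixes named above are taken from this README.

```python
from collections import defaultdict
from pathlib import Path

# Role suffixes described above.
SUFFIXES = ("_low_sr", "_label", "_pred")


def group_samples(names):
    """Group file names by clip id, keyed by role (low_sr, label, pred).

    Files whose stem does not end in one of the known suffixes are ignored.
    """
    groups = defaultdict(dict)
    for name in names:
        stem = Path(name).stem  # file name without its extension
        for suffix in SUFFIXES:
            if stem.endswith(suffix):
                clip_id = stem[: -len(suffix)]
                groups[clip_id][suffix.lstrip("_")] = name
                break
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical usage: list whatever is in the samples directory.
    files = [p.name for p in Path("samples").glob("*")]
    for clip_id, roles in sorted(group_samples(files).items()):
        print(clip_id, roles)
```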
