Major maintenance update #961

Merged
CorentinJ merged 16 commits into master from dev
Dec 28, 2021

CorentinJ
Owner

Hello everyone. As you may know, this repo hasn't been actively maintained in the last 1-2 years - aside from all bluefish has done. Thank you for that bluefish, it was very helpful. As for myself, I am still working full-time and I became a father recently. I won't give you my life story but it's clear that maintaining this repository is not a priority for me.

This project was nothing more than a master's thesis. It still is today. It's far from SOTA. In a field that progresses so rapidly, it is already outdated. If you were to build from this repo to start a serious project, code-wise it's okay-ish. I actually found myself pleasantly surprised while making this update: I thought it would be a lot harder than it was. Anyway, the strength of this repo lies in its accessibility. It's got a cool little GUI slapped on top of it that lets you play around even if you don't know much about ML or programming.

Of course this accessibility drops when just getting the toolbox to run is a can of worms. This update attempts to remediate this.

Changes

Environment

  • I've updated everything using a fresh Python 3.7 env as reference. All packages were updated to their latest version (where possible) and pinned in requirements.txt. Aside from torch, installing from requirements.txt will get you the entire env ready
    • webrtcvad installs nicely on Windows these days, so that's a thorn out of my foot
      • As a result I've removed all --no_trim arguments

All models

  • Pretrained models are now downloaded automatically! You can still download them manually if you wish
  • The directory structure for saved models has changed. Where before you had <model_type>/saved_models/pretrained/pretrained.pt, you now have saved_models/<run_id>/<model_type>.pt. You may store different model types in the same run_id if you wish.
  • Anywhere multiprocessing was involved I have made fixes. Windows now uses multiprocessing again. To this end, I made all matplotlib imports local, because it is a source of problems.
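
To make the layout change concrete, here is a minimal sketch of how an old-style path maps to the new one. The helper names (`old_model_path`, `new_model_path`) and the run_id `my_run` are made up for illustration and are not functions or values from the repo; only the two path patterns come from the description above.

```python
from pathlib import Path


def old_model_path(model_type: str) -> Path:
    """Old layout: <model_type>/saved_models/pretrained/pretrained.pt"""
    return Path(model_type) / "saved_models" / "pretrained" / "pretrained.pt"


def new_model_path(run_id: str, model_type: str,
                   root: Path = Path("saved_models")) -> Path:
    """New layout: saved_models/<run_id>/<model_type>.pt"""
    return root / run_id / f"{model_type}.pt"


if __name__ == "__main__":
    # "my_run" is a hypothetical run_id; several model types can share it.
    for model_type in ("encoder", "synthesizer", "vocoder"):
        old = old_model_path(model_type).as_posix()
        new = new_model_path("my_run", model_type).as_posix()
        print(f"{old} -> {new}")
```

The point of the new layout is that a single run_id directory can hold one file per model type, so a full set of models travels together.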

Encoder

  • Preprocessing is now in a process pool instead of an (almost useless) thread pool.
  • Preprocessing now supports more audio extensions (but still expects the same datasets)
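
For readers unfamiliar with the pattern, here is a rough sketch of a process pool combined with a function-local matplotlib import. It is not the repo's code: `preprocess_one`, `preprocess_all` and `plot_results` are hypothetical stand-ins, and the "work" is just arithmetic in place of real audio processing.

```python
from multiprocessing import Pool


def preprocess_one(n_samples: int) -> int:
    """Hypothetical stand-in for CPU-bound preprocessing of one utterance."""
    return sum(i * i for i in range(n_samples))


def plot_results(values, out_path):
    """Plotting helper with function-local matplotlib imports.

    Because the import is not at module level, worker processes that
    never call this function never load matplotlib.
    """
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, no display needed
    import matplotlib.pyplot as plt

    plt.plot(values)
    plt.savefig(out_path)
    plt.close()


def preprocess_all(sizes, n_processes=2):
    # Each item is handled in a separate process, so pure-Python CPU-bound
    # work is not serialized by the GIL as it would be with threads.
    with Pool(n_processes) as pool:
        return pool.map(preprocess_one, sizes)


if __name__ == "__main__":
    # The guard matters on Windows, where child processes are spawned
    # and re-import this module.
    print(preprocess_all([10, 100, 1000]))
```

The thread pool was "almost useless" for this kind of work because CPython threads share one interpreter lock. Separate processes avoid that, at the cost of needing picklable, module-level worker functions.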

Misc

  • The "no mp3 support" option was removed. I'll revisit this issue if needed, but I wasn't happy with how it was handled.

@CorentinJ CorentinJ merged commit 370e970 into master Dec 28, 2021
@CorentinJ CorentinJ deleted the dev branch December 28, 2021 12:18
@ireneb612

Thank you for your time and effort!

@DoubleF3lix

Congratulations on being a father! Glad to see this.

@raccoonML

> If you were to build from this repo to start a serious project, code-wise it's okay-ish.

I really like how you perform synthesizer audio preprocessing in this repo. The code gets reused in a lot of my personal projects. I find the train.txt and individual .npy files are much easier to work with than pickle datasets.

@Tomcattwo
Contributor

Corentin,
Thank you so much for developing and sharing this valuable tool! Congratulations on the birth of your child, and wishing you a lifetime of enjoyment with your now-expanded family!

Do not underestimate the effect that your master's thesis and the toolbox have had upon the world... there are likely many more people than you could have ever imagined who are using it for numerous new and creative things, all over the world. Blessings and thanks.
Regards,
Tomcattwo

@CodingRox82

CodingRox82 commented Aug 24, 2022

@CorentinJ

How do you think this repo will hold up in attempting to do the stuff below nowadays? Do you think this repo is a good fit or are there better options out there? I don't need to do text to speech and I don't need a GUI. All I need to do is be able to run the voice to voice code on demand.

- I will have a number of different audio file samples of different voices reading a variety of sentences. These will be the training models.
- In real time, an input audio file of a person reading a sentence will be sent over the internet and this repo will receive it, convert it into an output file of one of the training models, then send it back over the internet.
- The words in the output audio file should be able to be understood as clearly as they are in the input audio file.

@fran478

fran478 commented Apr 25, 2023

Hello. Would an example path for saved_models/<run_id>/<model_type>.pt be saved_models/encoder/encoder.pt? Thanks for any help.

@valvesss

Time to re-open?


8 participants