hotword detection for a new language #24

nfaraji2002 · 2023-02-26T11:32:04Z

Hi

Is it possible to train this model for a language with a different alphabet than English, such as Persian?

Thanks,

aman-17 · 2023-02-26T20:17:02Z

Yes, you can do that. Please go through the training file.

nfaraji2002 · 2023-02-27T09:28:03Z

Thanks.
I ran training.ipynb, but I got an error:

No file or directory found at /content/drive/MyDrive/Siamese/modelCheckpoints_old/model-8-01-0.96.h5

I think I need some pre-trained models, but I could not find them in your GitHub repository. Could you upload them there so that everyone can access them?
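A minimal sketch of how a notebook could guard against this error, assuming the checkpoint is optional and training can start from scratch when it is missing. The helper `existing_checkpoint` is a hypothetical name, not something from the repository, and the model-loading lines are left as comments because the notebook's actual loading code is not shown here.

```python
import os
from typing import Optional


def existing_checkpoint(path: str) -> Optional[str]:
    """Return the path if a checkpoint file exists there, otherwise None.

    Hypothetical helper for illustration; not part of the repository.
    """
    return path if os.path.isfile(path) else None


# Path copied from the error message above.
CKPT_PATH = "/content/drive/MyDrive/Siamese/modelCheckpoints_old/model-8-01-0.96.h5"

ckpt = existing_checkpoint(CKPT_PATH)
if ckpt is None:
    # No pre-trained weights available: fall back to training from scratch.
    print("No checkpoint found; training from scratch.")
    # model = build_model()  # assumption: whatever the notebook uses to build the model
else:
    print(f"Resuming from {ckpt}")
    # model = tf.keras.models.load_model(ckpt)  # Keras loader for .h5 files
```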

I have another question:
I found that there are lots of English single-word audio files in the directory "dataset_format_fixed". Do I need a new single-word audio dataset to train for a new language? Or can I take the model trained on your English dataset and customize it for my hotwords, which use a completely different alphabet, such as these Arabic letters:
آ ب ث د ر ز م س ش ح ض
Thanks in advance

aman-17 · 2023-03-03T21:07:47Z

For your first question: training again with Arabic words will give better performance than using the pre-trained English model, since the audio window frame will be different (I am guessing this, since Arabic words are longer than 1 sec).

Do I require a new single-word audio dataset to train for a new language? Yes, if you want to get high accuracy. Our model gives the best accuracy on words that are shorter than 1.5 sec.
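As a rough illustration of the 1.5 sec guideline, a new-language dataset could be screened by clip length before training. This is only a sketch, assuming the clips are uncompressed WAV files readable by Python's standard `wave` module; the function names and the `MAX_SECONDS` constant are made up for this example and are not part of the repository.

```python
import wave
from typing import Iterable, List

MAX_SECONDS = 1.5  # threshold taken from the reply above


def wav_duration(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()


def keep_short_clips(paths: Iterable[str], max_seconds: float = MAX_SECONDS) -> List[str]:
    """Keep only the clips strictly shorter than max_seconds."""
    return [p for p in paths if wav_duration(p) < max_seconds]
```

Clips that fail the check could be trimmed or re-recorded instead of being dropped; which option is better depends on the dataset.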

TheSeriousProgrammer · 2023-03-04T01:58:18Z

As @aman-17 pointed out, it can be better to train the model from scratch, as there is very little to no similarity in pronunciation between Arabic and English.

Secondly, a more polished version of the code with PyTorch and ResNet is currently in the works. We will share it soon, so stay tuned!

TheSeriousProgrammer · 2023-04-14T11:45:31Z

The new model is out. Can you test it with Arabic and let us know? The newer model has only been trained on English words, but its performance is way better than the old one.

Soon we will share the training code of the newer model as well

aman-17 closed this as completed Mar 17, 2023

aman-17 reopened this Mar 17, 2023

aman-17 added the enhancement and wake_word_generation labels Sep 9, 2024

