
The recognition ability of the Audio model #69

Open
b3326023 opened this issue Dec 10, 2019 · 0 comments
Labels
question Further information is requested

Comments

@b3326023

Excuse me, I have found that the pretrained audio model used in this project is Speech Commands, which was trained on over 105,000 WAVE audio files of people saying thirty different words.

So this base model is able to recognize many different words well, and its learned low-level features should be associated with speech only.

My question is: why does a transfer learning model trained on very different audio samples, such as table claps, water sounds, whistles, and other non-speech sounds, also perform very well?
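For context, the setup I am describing looks roughly like this. It is a minimal sketch, assuming the @tensorflow-models/speech-commands package and a browser with microphone access; the labels 'clap' and 'water' and the function trainOnNonSpeechSounds are just placeholders, not code from this project. The point is that the transfer recognizer keeps the base model's feature extractor and only trains a small new head on top of it.

```ts
// A minimal sketch, assuming the @tensorflow-models/speech-commands API.
// Labels such as 'clap' and 'water' are hypothetical placeholders.
import * as speechCommands from '@tensorflow-models/speech-commands';

async function trainOnNonSpeechSounds() {
  // Load the base model pretrained on the Speech Commands dataset.
  const base = speechCommands.create('BROWSER_FFT');
  await base.ensureModelLoaded();

  // createTransfer() reuses the base model's spectrogram feature extractor
  // and trains only a small new classification head on its embeddings.
  const transfer = base.createTransfer('non-speech-sounds');

  // Collect a few microphone examples per (hypothetical) class.
  for (let i = 0; i < 8; i++) {
    await transfer.collectExample('clap');
    await transfer.collectExample('water');
    await transfer.collectExample('_background_noise_');
  }

  // Train the new head on the collected examples.
  await transfer.train({ epochs: 25 });

  // Classify incoming audio with the transferred model.
  await transfer.listen(
    async result => {
      const labels = transfer.wordLabels();
      const scores = result.scores as Float32Array;
      let best = 0;
      for (let j = 1; j < scores.length; j++) {
        if (scores[j] > scores[best]) best = j;
      }
      console.log('Detected:', labels[best]);
    },
    { probabilityThreshold: 0.75 }
  );
}

trainOnNonSpeechSounds();
```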

@irealva irealva added the question Further information is requested label Dec 13, 2019