npz data files creation #3
Comments
Hi! These files are produced by asvtorch and saved in .npz format for convenience: that way I don't have to go through the loaders inside asvtorch every time, and my dataframe is ready to be used.
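For readers unfamiliar with the format, here is a minimal sketch of how arrays can be packed into a single .npz file and read back with NumPy. It is only an illustration, not necessarily how the files in this repository were built: the array names `features` and `labels`, the helper functions, and the dummy data are all assumptions.

```python
import os
import tempfile

import numpy as np


def save_dataset(path, features, labels):
    """Pack feature and label arrays into one compressed .npz archive."""
    # The keyword names become the keys used to look the arrays up later.
    np.savez_compressed(path, features=features, labels=labels)


def load_dataset(path):
    """Read both arrays back from the archive."""
    # np.load returns a lazy NpzFile; indexing it reads the array into memory.
    with np.load(path) as data:
        return data["features"], data["labels"]


# Dummy data standing in for the features that asvtorch would produce.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 20)).astype(np.float32)
labels = np.array([0, 1, 1, 0])

path = os.path.join(tempfile.mkdtemp(), "test_data.npz")
save_dataset(path, features, labels)
loaded_features, loaded_labels = load_dataset(path)
```

Once saved this way, the arrays can be reloaded in any environment that has NumPy, without asvtorch or Kaldi installed.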
Hey, so if I'm only interested in generating the dataset used in the paper (including the audio files) without the asvtorch environment, what would be the easiest way to do so?
If you only want the raw recordings, you should get in touch with the
original VoxCeleb <https://www.robots.ox.ac.uk/~vgg/data/voxceleb/> team to
get all the recordings, as some YouTube files are no longer available
due to copyright issues, account deletion, etc.
However, if you also want the processed recordings (MFCC etc.), it is
best to use asvtorch, as it wraps all the Kaldi operations.
On Thu, 23 Jun 2022 at 14:12, oriohayon ***@***.***> wrote:
Hey,
Thanks for the answer!
So if I'm only interested in generating the dataset (including the audio
files) used in the paper without the asvtorch environment what will be the
easiest way to do so?
Thanks again for your answer. Is it possible to run asvtorch in Colab notebooks? I'm having trouble in Colab with many things, such as Conda, and I can't even get to the point where I can run anything.
Oh ok, now I understand the context! It is quite tricky, as asvtorch uses
Kaldi under the hood for the spectrograms, MFCCs, etc. In theory you can
compute them with librosa; however, based on what I have read in the past,
the MFCCs from librosa will quite likely differ from the Kaldi ones,
although I have no direct experience with this.
My suggestion is to run asvtorch on a local machine, then upload the output
to your Drive, so that your data is ready to be used from Colab.
This matters all the more because the asvtorch pipeline takes several hours
to run, so you would risk a sudden interruption on Colab.
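On the Colab side, the workflow suggested above could look roughly like the sketch below. The helper names and the example path are hypothetical; `google.colab.drive.mount` is only available inside Colab, so the import is guarded and the function does nothing elsewhere.

```python
import numpy as np


def mount_drive_if_colab(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; do nothing elsewhere."""
    try:
        from google.colab import drive  # importable only inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    with np.load(path) as data:
        return {
            name: (data[name].shape, str(data[name].dtype))
            for name in data.files
        }


# Example usage in a notebook (the path below is a placeholder):
# mount_drive_if_colab()
# print(summarize_npz("/content/drive/MyDrive/npz_files/test_data.npz"))
```

Inspecting the array names and shapes first is a quick way to check that the upload is complete before starting a long training run.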
On Fri, 24 Jun 2022 at 12:57, oriohayon ***@***.***> wrote:
Thanks again for your answer.
My problem with the asvtorch environment is that it requires too many
dependencies and other installations (such as the Kaldi tools), and I don't
have a standalone environment for that, since I am working with Colab.
Is it possible to run asvtorch in Colab notebooks? I'm having trouble
in Colab with many things, such as Conda, and I can't even get to the point
where I can run anything.
Hi,
Thanks for the nice repository. I have a question regarding the train/test phase in some of the notebooks:
You load .npz data files (e.g., '/media/hdd1/khaled/npz_files/final_version/test_data.npz') that make up the train and test datasets, but I can't find them in the repository, and I don't understand how you created them.
Any help in finding out how you packed the data files would be great, thanks!