support ivector training in pytorch model #3969

Merged (4 commits) on Mar 3, 2020

Conversation

@fanlu fanlu commented Mar 2, 2020

Updated with the latest results:

| Metric | TDNN-F (PyTorch, Adam, fanlu's previous result) | TDNN-F (PyTorch, Adam, haowen's previous result with 4 GPUs) | This pull request (based on haowen's PR #3966) |
|---|---|---|---|
| dev_cer | 6.16 | 6.29 | 6.18 |
| dev_wer | 14.01 | 14.10 | 13.96 |
| test_cer | 7.31 | 7.57 | 7.59 |
| test_wer | 15.97 | 15.80 | 15.86 |

Contributor

danpovey commented Mar 3, 2020 via email

Contributor

danpovey commented Mar 3, 2020 via email

Author

fanlu commented Mar 3, 2020

@csukuangfj I have fixed the code following your suggestion; please have a look.
@csukuangfj @qindazhu As Dan advised, please test this PR at your convenience. Thanks.

Contributor

qindazhu commented Mar 3, 2020

OK, I'll run it after it's merged.

Contributor

qindazhu commented Mar 3, 2020

I'll take a look at that pr and start to do this.

Guys, I just want to mention something: I think it would be better if we shifted (not necessarily right now) from exposing the Kaldi egs as a Dataset to exposing them as a DataLoader. That way we could use the existing command-line tools for things like shuffling and time-shifting, and it would be much more efficient for I/O. The idea is that the DataLoader would, on every epoch, create a suitable command line and read from it as a pipe. For a distributed DataLoader, probably the easiest way to do it would be to make sure there is an appropriately split scp file and give it the appropriate one.

We could use the scripts in #3765 to generate the scp files.

I want to merge this soon; one option is to merge into pybind11 first to test it.
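As a rough illustration of the per-epoch pipe idea above, here is a minimal sketch in plain Python. It is not code from this PR: `PipeEpochLoader`, `make_command`, and `demo_command` are hypothetical names, and the demo command is a stand-in for whatever Kaldi pipeline would actually be used. In a distributed setup, `make_command` would presumably also select the scp split assigned to the current worker.

```python
import subprocess
import sys


class PipeEpochLoader:
    """Iterable that launches a fresh command on each epoch and streams its stdout.

    Hypothetical sketch: `make_command` is an assumed callback that returns
    the argv list for a given epoch number.
    """

    def __init__(self, make_command):
        self.make_command = make_command
        self.epoch = 0

    def __iter__(self):
        argv = self.make_command(self.epoch)
        self.epoch += 1
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
        try:
            # Read the child's output line by line as it is produced.
            for line in proc.stdout:
                yield line.rstrip("\n")
        finally:
            # Runs on normal exhaustion and on early termination alike.
            proc.stdout.close()
            returncode = proc.wait()
        if returncode != 0:
            raise RuntimeError(f"command {argv!r} exited with status {returncode}")


def demo_command(epoch):
    # Stand-in for a real pipeline: prints three fake example ids.
    code = "import sys; [print(f'epoch{sys.argv[1]}-utt{i}') for i in range(3)]"
    return [sys.executable, "-c", code, str(epoch)]
```

Iterating over `PipeEpochLoader(demo_command)` twice runs the command twice, once per epoch, so any shuffling done by the command would be redone each time.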

On Tue, Mar 3, 2020 at 10:11 AM fanlu wrote:

> @fanlu commented on this pull request.
>
> In egs/aishell/s10/chain/feat_dataset.py (#3969 (comment)):
>
> >  with open(feats_scp, 'r') as f:
> >      for line in f:
> >          split = line.split()
> >          assert len(split) == 2
> > -        items.append(split)
> > -
> > -        self.items = items
> > +        uttid, rxfilename =split
>
> OK
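For context on the quoted diff, the following is a minimal sketch of reading `<uttid> <rxfilename>` pairs from scp-style lines. It mirrors the two-field assertion in the diff; the function name `read_scp` and the choice to return a dict are assumptions, since the diff does not show what the PR does with the pair afterwards.

```python
def read_scp(lines):
    """Map utterance id -> rxfilename from '<uttid> <rxfilename>' lines.

    Hypothetical helper: the dict return type is an assumption, not the PR's code.
    """
    table = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Same check as in the quoted diff: exactly two whitespace-separated fields.
        assert len(fields) == 2, f"malformed scp line: {line!r}"
        uttid, rxfilename = fields
        table[uttid] = rxfilename
    return table
```

With a real file this would be called as `read_scp(open(feats_scp))`, where `feats_scp` is the path variable used in the diff.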

Contributor

danpovey commented Mar 3, 2020

OK, merging.

@danpovey danpovey merged commit 63c732b into kaldi-asr:pybind11 Mar 3, 2020
Contributor

danpovey commented Mar 3, 2020 via email
