[WIP] This branch contains some acoustic modeling improvements... still cleaning up. #2114

Merged
merged 117 commits into from
Feb 17, 2018

Conversation

danpovey
Contributor

No description provided.

danpovey and others added 30 commits December 1, 2017 18:54
…ect memory-norm). Other small changes to MemoryNormComponent, will rework most of this. Adding script changes for memory-norm.
…eorthonormalize-with-memonorm

Conflicts:
	src/nnet3/nnet-parse.h
@davidavdav
Contributor

Hello,

I've run this script, or at least the nnet configuration, in a Librispeech setup, and the results are:

exp/chain/tdnn7m23t_sp/decode_dev_clean_fglarge/wer_12_0.5:%WER 3.40 [ 1851 / 54402, 229 ins, 155 del, 1467 sub ]
exp/chain/tdnn7m23t_sp/decode_dev_clean_tglarge/wer_12_1.0:%WER 3.51 [ 1910 / 54402, 202 ins, 199 del, 1509 sub ]
exp/chain/tdnn7m23t_sp/decode_dev_clean_tgmed/wer_12_0.0:%WER 4.28 [ 2327 / 54402, 265 ins, 206 del, 1856 sub ]
exp/chain/tdnn7m23t_sp/decode_dev_clean_tgsmall/wer_12_0.0:%WER 4.84 [ 2635 / 54402, 277 ins, 263 del, 2095 sub ]
exp/chain/tdnn7m23t_sp/decode_dev_other_fglarge/wer_14_0.5:%WER 8.77 [ 4468 / 50948, 484 ins, 500 del, 3484 sub ]
exp/chain/tdnn7m23t_sp/decode_dev_other_tglarge/wer_14_0.0:%WER 9.24 [ 4709 / 50948, 598 ins, 435 del, 3676 sub ]
exp/chain/tdnn7m23t_sp/decode_dev_other_tgmed/wer_14_0.0:%WER 11.31 [ 5760 / 50948, 604 ins, 678 del, 4478 sub ]
exp/chain/tdnn7m23t_sp/decode_dev_other_tgsmall/wer_14_0.0:%WER 12.45 [ 6342 / 50948, 596 ins, 829 del, 4917 sub ]
exp/chain/tdnn7m23t_sp/decode_test_clean_fglarge/wer_11_0.5:%WER 3.87 [ 2036 / 52576, 317 ins, 157 del, 1562 sub ]
exp/chain/tdnn7m23t_sp/decode_test_clean_tglarge/wer_10_0.5:%WER 4.04 [ 2126 / 52576, 336 ins, 164 del, 1626 sub ]
exp/chain/tdnn7m23t_sp/decode_test_clean_tgmed/wer_12_0.0:%WER 4.90 [ 2575 / 52576, 345 ins, 229 del, 2001 sub ]
exp/chain/tdnn7m23t_sp/decode_test_clean_tgsmall/wer_12_0.0:%WER 5.35 [ 2811 / 52576, 340 ins, 269 del, 2202 sub ]
exp/chain/tdnn7m23t_sp/decode_test_other_fglarge/wer_13_0.5:%WER 8.97 [ 4694 / 52343, 564 ins, 481 del, 3649 sub ]
exp/chain/tdnn7m23t_sp/decode_test_other_tglarge/wer_13_0.5:%WER 9.42 [ 4931 / 52343, 581 ins, 542 del, 3808 sub ]
exp/chain/tdnn7m23t_sp/decode_test_other_tgmed/wer_13_0.0:%WER 11.56 [ 6053 / 52343, 690 ins, 659 del, 4704 sub ]
exp/chain/tdnn7m23t_sp/decode_test_other_tgsmall/wer_13_0.0:%WER 12.64 [ 6615 / 52343, 678 ins, 791 del, 5146 sub ]

This is consistently better, on every matching test set and language model, than our earlier tdnn_lstm setup (adapted from mini-librispeech), for which we recorded:

exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_clean_fglarge/wer_10_0.5:%WER 3.64 [ 1980 / 54402, 232 ins, 170 del, 1578 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_clean_tglarge/wer_11_0.0:%WER 3.74 [ 2037 / 54402, 241 ins, 166 del, 1630 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_clean_tgmed/wer_10_0.0:%WER 4.70 [ 2558 / 54402, 266 ins, 236 del, 2056 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_clean_tgsmall/wer_10_0.0:%WER 5.11 [ 2780 / 54402, 266 ins, 274 del, 2240 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_other_fglarge/wer_12_0.0:%WER 10.01 [ 5100 / 50948, 634 ins, 474 del, 3992 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_other_tglarge/wer_13_0.0:%WER 10.56 [ 5381 / 50948, 626 ins, 584 del, 4171 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_other_tgmed/wer_13_0.0:%WER 12.56 [ 6401 / 50948, 652 ins, 818 del, 4931 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_dev_other_tgsmall/wer_12_0.0:%WER 13.57 [ 6916 / 50948, 669 ins, 877 del, 5370 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_looped_dev_clean_tgsmall/wer_11_0.0:%WER 5.08 [ 2766 / 54402, 264 ins, 281 del, 2221 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_looped_dev_other_tgsmall/wer_14_0.0:%WER 13.92 [ 7092 / 50948, 634 ins, 1009 del, 5449 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_looped_test_clean_tgsmall/wer_12_0.0:%WER 5.55 [ 2919 / 52576, 326 ins, 290 del, 2303 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_looped_test_other_tgsmall/wer_14_0.0:%WER 14.03 [ 7344 / 52343, 635 ins, 1068 del, 5641 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_clean_fglarge/wer_10_0.5:%WER 4.00 [ 2105 / 52576, 286 ins, 176 del, 1643 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_clean_tglarge/wer_10_0.5:%WER 4.13 [ 2174 / 52576, 286 ins, 196 del, 1692 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_clean_tgmed/wer_11_0.5:%WER 5.06 [ 2661 / 52576, 272 ins, 288 del, 2101 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_clean_tgsmall/wer_10_0.0:%WER 5.53 [ 2907 / 52576, 333 ins, 272 del, 2302 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_other_fglarge/wer_12_0.5:%WER 10.22 [ 5349 / 52343, 557 ins, 654 del, 4138 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_other_tglarge/wer_11_0.5:%WER 10.78 [ 5641 / 52343, 580 ins, 640 del, 4421 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_other_tgmed/wer_12_0.0:%WER 12.81 [ 6707 / 52343, 652 ins, 829 del, 5226 sub ]
exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_other_tgsmall/wer_12_0.0:%WER 13.85 [ 7247 / 52343, 650 ins, 988 del, 5609 sub ]
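
For anyone who wants to compare the two result sets programmatically, here is a minimal Python sketch. It assumes the `%WER` line format exactly as pasted above (total errors over reference words, followed by insertion, deletion and substitution counts). The helper names `parse_wer_line` and `relative_reduction` are hypothetical and are not part of Kaldi; the two sample lines are the test_other fglarge results copied from the lists above.

```python
import re

# Pattern for the "%WER" summary lines in the format pasted above.
# Assumption: the path contains no whitespace and the counts are integers.
WER_LINE = re.compile(
    r"^(?P<path>\S+):%WER\s+(?P<wer>[\d.]+)\s+"
    r"\[\s*(?P<errors>\d+)\s*/\s*(?P<words>\d+),\s*"
    r"(?P<ins>\d+) ins,\s*(?P<dels>\d+) del,\s*(?P<sub>\d+) sub\s*\]"
)


def parse_wer_line(line):
    """Parse one %WER line into a dict and recompute the WER from the counts."""
    m = WER_LINE.match(line.strip())
    if m is None:
        raise ValueError("not a %WER line: " + line)
    row = {
        "path": m.group("path"),
        "reported_wer": float(m.group("wer")),
        "errors": int(m.group("errors")),
        "words": int(m.group("words")),
        "ins": int(m.group("ins")),
        "del": int(m.group("dels")),
        "sub": int(m.group("sub")),
    }
    # WER (%) = 100 * (ins + del + sub) / number of reference words
    row["recomputed_wer"] = (
        100.0 * (row["ins"] + row["del"] + row["sub"]) / row["words"]
    )
    return row


def relative_reduction(baseline, new):
    """Relative WER reduction (%) of `new` with respect to `baseline`."""
    return 100.0 * (baseline - new) / baseline


if __name__ == "__main__":
    tdnn = parse_wer_line(
        "exp/chain/tdnn7m23t_sp/decode_test_other_fglarge/wer_13_0.5:"
        "%WER 8.97 [ 4694 / 52343, 564 ins, 481 del, 3649 sub ]"
    )
    lstm = parse_wer_line(
        "exp/chain_cleaned/tdnn_lstm1a_sp/decode_test_other_fglarge/wer_12_0.5:"
        "%WER 10.22 [ 5349 / 52343, 557 ins, 654 del, 4138 sub ]"
    )
    gain = relative_reduction(lstm["recomputed_wer"], tdnn["recomputed_wer"])
    print("relative WER reduction on test_other fglarge: %.1f%%" % gain)
```

Recomputing the WER from the raw counts, rather than trusting the rounded percentage, keeps the relative comparison from being affected by two-decimal rounding.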

6 participants