WIP [not-for-merge]: run aishell with latest recipe in Kaldi #3868

Draft
wants to merge 2 commits into
base: pybind11


qindazhu
Contributor

Run aishell with the latest Kaldi recipe, copied from tedlium/s5_r3/:

  • run_kaldi.sh: the main script, covering all steps before chain model training; it uses mfcc features instead of mfcc_pitch
  • run_tdnn_1d.sh: chain model with ivector
  • run_tdnn_1c.sh: chain model without ivector.

Result

  • chain model without ivector
==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_cer <==
%WER 6.65 [ 6964 / 104765, 155 ins, 247 del, 6562 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/cer_12_0.5

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_wer <==
%WER 15.18 [ 9783 / 64428, 900 ins, 1398 del, 7485 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/wer_12_0.5

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_cer <==
%WER 5.71 [ 11724 / 205341, 245 ins, 346 del, 11133 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/cer_11_0.0

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_wer <==
%WER 13.49 [ 17226 / 127698, 1606 ins, 2402 del, 13218 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/wer_11_0.0
  • chain model with ivector
==> exp/chain_cleaned_1d/tdnn1d_sp/decode_test/scoring_kaldi/best_cer <==
%WER 6.46 [ 6768 / 104765, 155 ins, 250 del, 6363 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_test/cer_12_1.0

==> exp/chain_cleaned_1d/tdnn1d_sp/decode_test/scoring_kaldi/best_wer <==
%WER 14.91 [ 9604 / 64428, 1035 ins, 1241 del, 7328 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_test/wer_13_0.0

==> exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/scoring_kaldi/best_cer <==
%WER 5.51 [ 11310 / 205341, 254 ins, 359 del, 10697 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/cer_11_0.5

==> exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/scoring_kaldi/best_wer <==
%WER 13.19 [ 16843 / 127698, 1533 ins, 2413 del, 12897 sub ] exp/chain_cleaned_1d/tdnn1d_sp/decode_dev/wer_12_0.0
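Each scoring line above reports the error rate as (ins + del + sub) / reference tokens. As a sanity check, here is a minimal sketch that parses one of those lines and recomputes the rate. The regular expression and the `parse_score_line` helper are illustrative assumptions and not part of the Kaldi scoring scripts.

```python
import re

# Pattern for a Kaldi scoring summary line such as the ones quoted above.
SCORE_RE = re.compile(
    r"%WER\s+(?P<rate>[\d.]+)\s+\[\s*(?P<errors>\d+)\s*/\s*(?P<total>\d+),\s*"
    r"(?P<ins>\d+)\s+ins,\s*(?P<dels>\d+)\s+del,\s*(?P<sub>\d+)\s+sub\s*\]"
)


def parse_score_line(line):
    """Return the numeric fields of one %WER line (hypothetical helper)."""
    match = SCORE_RE.search(line)
    if match is None:
        raise ValueError(f"not a scoring line: {line!r}")
    fields = {k: int(v) for k, v in match.groupdict().items() if k != "rate"}
    fields["rate"] = float(match.group("rate"))
    return fields


line = ("%WER 6.65 [ 6964 / 104765, 155 ins, 247 del, 6562 sub ] "
        "exp/chain_cleaned_1c/tdnn1c_sp/decode_test/cer_12_0.5")
score = parse_score_line(line)

# The error count is the sum of insertions, deletions and substitutions.
assert score["ins"] + score["dels"] + score["sub"] == score["errors"]

# The reported rate is errors / reference tokens, expressed as a percentage.
recomputed = 100.0 * score["errors"] / score["total"]
print(f"reported {score['rate']:.2f}%, recomputed {recomputed:.2f}%")
```

The same check applies to the best_wer lines; only the token unit changes (words instead of characters).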

TODO

  • Try different network configs and training parameters in @csukuangfj's pytorch training recipe for comparison.

@danpovey
Contributor

can you remind me how this compares with the currently-checked-in results?

@qindazhu
Contributor Author

qindazhu commented Jan 22, 2020

Copied from @csukuangfj 's commit https://github.com/mobvoi/kaldi/blob/e8a28b5c96d1f2bc428ebbfa0cc20c51cbccd77b/egs/aishell/s10/RESULTS

pytorch: Results for kaldi pybind LF-MMI training with PyTorch

## head exp/chain/decode_res/*/scoring_kaldi/best_* > RESULTS
#
==> exp/chain/decode_res/dev/scoring_kaldi/best_cer <==
%WER 8.22 [ 16888 / 205341, 774 ins, 1007 del, 15107 sub ] exp/chain/decode_res/dev/cer_10_1.0

==> exp/chain/decode_res/dev/scoring_kaldi/best_wer <==
%WER 16.66 [ 21278 / 127698, 1690 ins, 3543 del, 16045 sub ] exp/chain/decode_res/dev/wer_11_0.5

==> exp/chain/decode_res/test/scoring_kaldi/best_cer <==
%WER 9.98 [ 10454 / 104765, 693 ins, 802 del, 8959 sub ] exp/chain/decode_res/test/cer_11_1.0

==> exp/chain/decode_res/test/scoring_kaldi/best_wer <==
%WER 18.89 [ 12170 / 64428, 1112 ins, 1950 del, 9108 sub ] exp/chain/decode_res/test/wer_12_0.5

tdnn_1b: Results for kaldi nnet3 LF-MMI training https://github.com/mobvoi/kaldi/blob/44ae951ea9c6f509dda24c60d29e5dddb482e3e1/egs/aishell/s10/local/run_tdnn_1b.sh#L100

#
==> exp/chain_nnet3/tdnn_1b/decode_dev/scoring_kaldi/best_cer <==
%WER 7.06 [ 14494 / 205341, 466 ins, 726 del, 13302 sub ] exp/chain_nnet3/tdnn_1b/decode_dev/cer_10_0.5

==> exp/chain_nnet3/tdnn_1b/decode_dev/scoring_kaldi/best_wer <==
%WER 15.11 [ 19296 / 127698, 1800 ins, 2778 del, 14718 sub ] exp/chain_nnet3/tdnn_1b/decode_dev/wer_11_0.0

==> exp/chain_nnet3/tdnn_1b/decode_test/scoring_kaldi/best_cer <==
%WER 8.63 [ 9041 / 104765, 367 ins, 668 del, 8006 sub ] exp/chain_nnet3/tdnn_1b/decode_test/cer_11_1.0

==> exp/chain_nnet3/tdnn_1b/decode_test/scoring_kaldi/best_wer <==
%WER 17.40 [ 11210 / 64428, 1059 ins, 1654 del, 8497 sub ] exp/chain_nnet3/tdnn_1b/decode_test/wer_11_0.5

           pytorch  tdnn_1b  tdnn_1c  tdnn_1d
dev_cer       8.22     7.06     5.71     5.51
dev_wer      16.66    15.11    13.49    13.19
test_cer      9.98     8.63     6.65     6.46
test_wer     18.89    17.40    15.18    14.91
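
To put the gap in the summary table in relative terms, the following sketch computes how far tdnn_1d sits below the pytorch column for each metric. The numbers are copied from the table; the `relative_reduction` helper is a hypothetical name introduced only for this illustration.

```python
# Error rates (%) copied from the summary table above.
results = {
    "dev_cer":  {"pytorch": 8.22,  "tdnn_1b": 7.06,  "tdnn_1c": 5.71,  "tdnn_1d": 5.51},
    "dev_wer":  {"pytorch": 16.66, "tdnn_1b": 15.11, "tdnn_1c": 13.49, "tdnn_1d": 13.19},
    "test_cer": {"pytorch": 9.98,  "tdnn_1b": 8.63,  "tdnn_1c": 6.65,  "tdnn_1d": 6.46},
    "test_wer": {"pytorch": 18.89, "tdnn_1b": 17.40, "tdnn_1c": 15.18, "tdnn_1d": 14.91},
}


def relative_reduction(baseline, improved):
    """Relative error-rate reduction of `improved` versus `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


for metric, row in results.items():
    gap = relative_reduction(row["pytorch"], row["tdnn_1d"])
    print(f"{metric}: tdnn_1d is {gap:.1f}% (relative) below pytorch")
```

Note that the systems differ in more than the training toolkit (features, network config, cleanup), so the relative gap describes the recipes as a whole.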

@danpovey
Contributor

OK, so we have some way to go, but it's all straightforward in principle. I am trying to relax on this vacation so I can get to work hard when I come back...

@csukuangfj
Contributor

How long did the training part of run_tdnn_1c.sh take?

It took me 6 hours and 37 minutes to reach Iter: 39/78 Epoch: 2.03/6.0 (33.8% complete).
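
For a rough idea of the total, the progress figure can be extrapolated linearly. This is only a sketch under the assumption that the remaining progress costs the same wall-clock time per percent as the part already done, which need not hold since the number of jobs changes during training.

```python
from datetime import timedelta

# Figures quoted above: 6 h 37 min elapsed at "33.8% complete".
elapsed = timedelta(hours=6, minutes=37)
fraction_done = 0.338

# Assumption: wall-clock time scales linearly with the reported completion.
estimated_total = timedelta(seconds=elapsed.total_seconds() / fraction_done)
remaining = estimated_total - elapsed

print(f"estimated total: {estimated_total}")
print(f"estimated remaining: {remaining}")
```

Under that assumption the whole training run would land somewhere between 19 and 20 hours.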

@fanlu

fanlu commented Jan 29, 2020

It took about 4 hours.

2020-01-28 19:11:02,058 [steps/nnet3/chain/train.py:428 - train - INFO ] Copying the properties from exp/chain_cleaned_1c/tdnn1c_sp/egs to exp/chain_cleaned_1c/tdnn1c_sp 
2020-01-28 19:11:02,222 [steps/nnet3/chain/train.py:442 - train - INFO ] Computing the preconditioning matrix for input features                                          
2020-01-28 19:11:57,945 [steps/nnet3/chain/train.py:451 - train - INFO ] Preparing the initial acoustic model.                                                            
2020-01-28 19:12:14,562 [steps/nnet3/chain/train.py:485 - train - INFO ] Training will run for 6.0 epochs = 79 iterations                                                 
2020-01-28 19:12:14,562 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 0/78   Jobs: 3   Epoch: 0.00/6.0 (0.0% complete)   lr: 0.000750                            
2020-01-28 19:15:25,749 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 1/78   Jobs: 3   Epoch: 0.03/6.0 (0.5% complete)   lr: 0.000741
2020-01-28 23:02:12,888 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 76/78   Jobs: 12   Epoch: 5.58/6.0 (92.9% complete)   lr: 0.000353
2020-01-28 23:05:12,423 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 77/78   Jobs: 12   Epoch: 5.70/6.0 (94.9% complete)   lr: 0.000337
2020-01-28 23:08:08,939 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 78/78   Jobs: 12   Epoch: 5.82/6.0 (97.0% complete)   lr: 0.000300
2020-01-28 23:11:32,723 [steps/nnet3/chain/train.py:585 - train - INFO ] Doing final combination to produce final.mdl
2020-01-28 23:11:32,724 [steps/libs/nnet3/train/chain_objf/acoustic_model.py:571 - combine_models - INFO ] Combining {60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79} models.
2020-01-28 23:12:26,242 [steps/nnet3/chain/train.py:614 - train - INFO ] Cleaning up the experiment directory exp/chain_cleaned_1c/tdnn1c_sp
exp/chain_cleaned_1c/tdnn1c_sp: num-iters=79 nj=3..12 num-params=9.3M dim=40->3448 combine=-0.030->-0.030 (over 1) xent:train/valid[51,78]=(-0.682,-0.513/-0.693,-0.540) logprob:train/valid[51,78]=(-0.045,-0.030/-0.051,-0.039)
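
The four-hour figure can be read off the log timestamps. Below is a minimal sketch that parses the leading timestamp of two lines from the log above and takes the difference; the `log_time` helper and the choice of start and end lines are assumptions made for this illustration.

```python
from datetime import datetime

# train.py log lines start with "YYYY-MM-DD HH:MM:SS,mmm" (23 characters).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def log_time(line):
    """Parse the leading timestamp of a train.py log line (hypothetical helper)."""
    return datetime.strptime(line[:23], LOG_TIME_FORMAT)


# Start of iteration 0 and start of the final combination, copied from the log.
start_line = ("2020-01-28 19:12:14,562 [steps/nnet3/chain/train.py:529 - train - INFO ] "
              "Iter: 0/78   Jobs: 3   Epoch: 0.00/6.0 (0.0% complete)   lr: 0.000750")
end_line = ("2020-01-28 23:11:32,723 [steps/nnet3/chain/train.py:585 - train - INFO ] "
            "Doing final combination to produce final.mdl")

duration = log_time(end_line) - log_time(start_line)
num_iters = 79  # "Training will run for 6.0 epochs = 79 iterations"

print(f"iterations 0..78 took {duration}")
print(f"average per iteration: {duration.total_seconds() / num_iters:.0f} s")
```

This measures only the iteration loop; egs preparation, the final combination and cleanup are outside the two timestamps.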

@csukuangfj
Contributor

It took me more than 19 hours for the nnet3 training part, and it gives me results similar to Haowen's:

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_cer <==
%WER 6.66 [ 6975 / 104765, 150 ins, 228 del, 6597 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/cer_11_0.5

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_test/scoring_kaldi/best_wer <==
%WER 15.14 [ 9755 / 64428, 1019 ins, 1255 del, 7481 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_test/wer_13_0.0

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_cer <==
%WER 5.69 [ 11691 / 205341, 253 ins, 345 del, 11093 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/cer_11_0.0

==> exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/scoring_kaldi/best_wer <==
%WER 13.45 [ 17179 / 127698, 1584 ins, 2408 del, 13187 sub ] exp/chain_cleaned_1c/tdnn1c_sp/decode_dev/wer_11_0.0

@qindazhu I think you mixed dev and test in your table.

Part of the training log is as follows:

2020-01-29 14:00:34,599 [steps/nnet3/chain/train.py:428 - train - INFO ] Copying the properties from exp/chain_cleaned_1c/tdnn1c_sp/egs to exp/chain_cleaned_1c/tdnn1c_sp
2020-01-29 14:00:34,600 [steps/nnet3/chain/train.py:485 - train - INFO ] Training will run for 6.0 epochs = 79 iterations
2020-01-29 14:00:34,600 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 0/78   Jobs: 3   Epoch: 0.00/6.0 (0.0% complete)   lr: 0.000750
2020-01-29 14:07:15,371 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 1/78   Jobs: 3   Epoch: 0.03/6.0 (0.5% complete)   lr: 0.000741
2020-01-29 14:13:03,763 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 2/78   Jobs: 3   Epoch: 0.06/6.0 (1.0% complete)   lr: 0.000733
2020-01-29 14:18:51,418 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 3/78   Jobs: 3   Epoch: 0.09/6.0 (1.5% complete)   lr: 0.000724

2020-01-30 08:25:31,335 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 76/78   Jobs: 12   Epoch: 5.58/6.0 (92.9% complete)   lr: 0.000353
2020-01-30 08:49:30,420 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 77/78   Jobs: 12   Epoch: 5.70/6.0 (94.9% complete)   lr: 0.000337
2020-01-30 09:13:33,554 [steps/nnet3/chain/train.py:529 - train - INFO ] Iter: 78/78   Jobs: 12   Epoch: 5.82/6.0 (97.0% complete)   lr: 0.000300
2020-01-30 09:37:31,684 [steps/nnet3/chain/train.py:585 - train - INFO ] Doing final combination to produce final.mdl

@qindazhu
Contributor Author

@csukuangfj Yes, I mixed up the Kaldi results in the table. I have updated the table, thanks!

@qindazhu qindazhu changed the title WIP: run aishell with latest recipe in Kaldi WIP [not-for-merge]: run aishell with latest recipe in Kaldi Feb 17, 2020
@stale

stale bot commented Jun 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale Stale bot on the loose label Jun 19, 2020
@kkm000 kkm000 added the stale-exclude Stale bot ignore this issue label Jul 15, 2020
@stale stale bot removed the stale Stale bot on the loose label Jul 15, 2020