(WIP) Adding DIHARD 2018 recipe. #2822

HuangZiliAndy · 2018-11-06T16:40:50Z

The script is not finished yet; David will review it.

david-ryan-snyder · 2018-11-06T16:52:46Z

egs/dihard_2018/v2/run.sh

+# Copyright   2017   Johns Hopkins University (Author: Daniel Garcia-Romero)
+#             2017   Johns Hopkins University (Author: Daniel Povey)
+#        2017-2018   David Snyder
+#             2018   Ewald Enzinger


Be sure to credit yourself here.

You can remove the file local/prepare_for_eer.py; it's not used in this recipe.

david-ryan-snyder · 2018-11-06T16:55:17Z

egs/dihard_2018/v2/conf/queue.conf

+option max_jobs_run=* -tc $0
+default gpu=0
+option gpu=0 -l 'hostname=b1[234568]*|c*,gpu=0' -q g.q
+option gpu=* -l 'hostname=b1[234568]*|c*,gpu=$0' -q g.q


This is a site-specific file and should be deleted.

John J Morgan · 2018-11-06

I know this file is explained in the documentation and in queue.pl, but it would have helped me to have had an example file under conf. Could this be made generic and part of the recipe?

david-ryan-snyder · 2018-11-06T16:56:02Z

egs/dihard_2018/v2/local/make_dihard_dev_David.py

+segments_fi.close()
+utt2spk_fi.close()
+wavscp_fi.close()
+rttm_fi.close()


This file was probably added by accident. Please check for other unused files that shouldn't be in the PR.

david-ryan-snyder · 2018-11-06T17:08:23Z

Also, please create a README.txt in egs/dihard_2018 and another in egs/dihard_2018/v2.

The first one should explain the DIHARD data and provide a high level overview of what recipes there are (right now, just v2). You may want to link to https://coml.lscp.ens.fr/dihard/index.html in this README.txt.

In the second README.txt, you can explain the recipe in more detail (e.g., clustering embeddings, PLDA scoring, AHC, how we determine the AHC stopping threshold, etc.). You may want to refer to the paper http://www.danielpovey.com/files/2018_interspeech_dihard.pdf, which this recipe is based on. There were two tracks in the DIHARD 2018 competition: one used oracle SAD (track1) and the other required that SAD be performed from scratch (track2). You might mention that this recipe demonstrates a solution for track1 of that eval.
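For readers unfamiliar with AHC and its stopping threshold, the idea can be illustrated with a minimal Python sketch. This is not the Kaldi implementation: the function name `ahc`, the average-linkage rule, and the toy score matrix are assumptions made for illustration only. Each segment starts as its own cluster, the most similar pair of clusters is merged repeatedly, and merging stops once the best remaining pair scores below the threshold.

```python
def ahc(scores, threshold):
    """Average-linkage agglomerative clustering on a similarity matrix.

    scores[i][j] is a symmetric pairwise similarity (for example a PLDA
    score; higher means "more likely the same speaker"). Merging stops
    when the best remaining pair of clusters scores below `threshold`.
    Returns one cluster label per segment.
    """
    clusters = [[i] for i in range(len(scores))]

    def linkage(a, b):
        # Average similarity over all cross-cluster pairs of segments.
        total = sum(scores[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best_score, best_pair = None, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if best_score is None or s > best_score:
                    best_score, best_pair = s, (x, y)
        if best_score < threshold:
            break  # stopping criterion: nothing left that is similar enough
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(scores)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy, hand-made scores: segments 0 and 1 are similar, as are 2 and 3.
toy_scores = [
    [0, 5, -3, -4],
    [5, 0, -2, -3],
    [-3, -2, 0, 6],
    [-4, -3, 6, 0],
]
labels = ahc(toy_scores, threshold=0.0)
```

A lower threshold allows more merges and so yields fewer speakers; a higher one yields more. That is why the threshold has to be chosen with some care.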

david-ryan-snyder · 2018-11-06T17:10:19Z

Also adding @leibny to this, for her info.

danpovey · 2018-11-06T22:59:39Z

Sorry, I don't want to have copies of this everywhere because then people will end up modifying it to suit their cluster.


danpovey · 2018-11-06T23:00:05Z

... I mean, and checking in the changes, causing incompatibilities.


HuangZiliAndy · 2018-11-07T02:53:34Z


Hi, Dan! David has told me about this problem; it is there because this is a very rough draft. I will pay attention to that and make sure there are no personal settings in the final version.

david-ryan-snyder · 2018-11-07T16:45:44Z

egs/dihard_2018/v2/local/make_dihard_2018_dev.sh

+#
+# This script, called by ../run.sh, creates the MUSAN
+# data directory. The required dataset is freely available at
+#   http://www.openslr.org/17/


Could you rewrite this comment so it makes sense for this file? (e.g., remove anything about MUSAN)

david-ryan-snyder · 2018-11-07T16:48:52Z

egs/dihard_2018/v2/local/make_dihard_2018_eval.sh

+#
+# This script, called by ../run.sh, creates the MUSAN
+# data directory. The required dataset is freely available at
+#   http://www.openslr.org/17/


This comment should be rewritten to remove references to MUSAN.

david-ryan-snyder · 2018-11-07T16:49:41Z

egs/dihard_2018/v2/local/make_dihard_2018_eval.py

@@ -0,0 +1,53 @@
+#!/usr/bin/env python3


Could you write a short comment for this? I suggest clarifying that this script is only ever called by make_dihard_2018_eval.sh.

david-ryan-snyder · 2018-11-07T16:50:40Z

egs/dihard_2018/v2/local/make_dihard_2018_dev.py

@@ -0,0 +1,53 @@
+#!/usr/bin/env python3


Could you add a brief comment, and explain the relationship between this script and the *.sh file?

david-ryan-snyder · 2018-11-07T16:51:28Z

egs/dihard_2018/v2/run.sh

+dihard_2018_dev=/export/corpora/LDC/LDC2018E31
+dihard_2018_eval=/export/corpora/LDC/LDC2018E32v1.1
+
+stage=14


Stage should be set to 0

david-ryan-snyder · 2018-11-07T16:53:01Z

egs/dihard_2018/v2/run.sh

+      data/train_cmn data/train_cmn_segmented
+fi
+
+# In this section, we augment the VoxCeleb2 data with reverberation,


Could you fix this comment so that it says "training data" instead of "VoxCeleb2"? The training data includes portions of both VoxCeleb1 and VoxCeleb2.

david-ryan-snyder · 2018-11-07T16:56:12Z

egs/dihard_2018/v2/run.sh

+    --min-segment 0.5 $nnet_dir \
+    data/dihard_2018_eval_cmn $nnet_dir/xvectors_dihard_2018_eval
+
+  # Reduce the amount of training data for the PLDA,


Comment appears to be incomplete.

david-ryan-snyder · 2018-11-07T17:08:32Z

egs/dihard_2018/v2/run.sh

+
+  mkdir -p $nnet_dir/results
+  # Compute the DER on the dihard_2018_eval. 250ms of a speaker transition is tolerated
+  # and overlapping segments are ignored. 


If you want to include these results, can you clarify in the comment that this is different from how the DIHARD challenge was originally evaluated?

david-ryan-snyder · 2018-11-07T17:13:57Z

egs/dihard_2018/v2/run.sh

+  echo "Using the oracle number of speakers, DER: $der%"
+fi
+
+# Score the overlapping part and do not tolerate errors within 250ms of a speaker transition


Could you clarify that this is how the RTTMs were scored during the DIHARD challenge? This would be helpful info for anyone who wants to compare results.
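The difference between the two scoring conditions discussed in these comments can be made concrete with a small Python sketch. It is an illustration only, not the scoring tool's code: the function name `scored_frames`, the 10 ms frame step, and the toy reference segments are assumptions. It marks which frames of a recording would count toward DER when a collar around reference boundaries is left unscored and overlapping speech is skipped, versus the strict condition where everything is scored.

```python
def scored_frames(ref_segments, duration, collar=0.0, skip_overlap=False,
                  step=0.01):
    """Return a per-frame list of booleans: True where a frame is scored.

    ref_segments: list of (start_sec, end_sec, speaker) reference turns.
    collar:       seconds left unscored on each side of every reference
                  segment boundary (0.0 means no collar).
    skip_overlap: if True, frames with more than one reference speaker
                  are left unscored.
    """
    n = int(round(duration / step))
    active = [0] * n       # number of reference speakers in each frame
    scored = [True] * n
    half = int(round(collar / step))
    for start, end, _speaker in ref_segments:
        first = int(round(start / step))
        last = int(round(end / step))
        for i in range(first, min(last, n)):
            active[i] += 1
        # The collar removes +/- `collar` seconds around each boundary.
        for boundary in (first, last):
            for i in range(max(0, boundary - half), min(n, boundary + half)):
                scored[i] = False
    if skip_overlap:
        for i in range(n):
            if active[i] > 1:
                scored[i] = False
    return scored


# Two speakers overlapping between 1.5 s and 2.0 s in a 4 s recording.
ref = [(0.0, 2.0, "A"), (1.5, 4.0, "B")]
strict = sum(scored_frames(ref, 4.0))
lenient = sum(scored_frames(ref, 4.0, collar=0.25, skip_overlap=True))
```

With these toy segments the lenient condition scores noticeably fewer frames than the strict one, which is why DER numbers from the two conditions are not directly comparable.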


HuangZiliAndy · 2018-11-08T16:14:41Z

Hi! I think I have fixed all the problems you mentioned. In the current version, I added a channel parameter to diarization/make_rttm.py (to ensure the behavior doesn't change for other scripts, I set 0 as the default). To make the results comparable with the paper, I kept only the strict version of DER (no unscored collars, and overlapping speech included). The result is also given in the script.

HuangZiliAndy · 2018-11-10T14:42:34Z

Hi David and Paola! The i-vector baseline for DIHARD is now included. I followed the VoxCeleb baseline, so I didn't add data augmentation for PLDA. One thing I am not sure about is the local/extract_ivectors.sh file. I don't think it is very elegant. Please tell me if you find any problems. Thanks!

david-ryan-snyder · 2018-11-10T19:36:15Z

Thanks Zili! There's just a few more things and I think it should be ready to go:

1. Instead of creating multiple copies of each data prep script for each recipe, v1, v2, etc, it would be better if there is a single set of scripts, and you create symbolic links. E.g., v2/local/make_dihard_2018_dev.sh can be a symbolic link to v1/local/make_dihard_2018_dev.sh. It's a bit cleaner this way.
2. What are the differences between the script diarization/extract_ivectors.sh and the one in local/extract_ivectors.sh? Try to use the existing one in diarization/ if you can. If there's some functionality it lacks, there's a good chance you can add it to the existing script, rather than creating a new one.
3. Could you document the --channel option that is passed around a little better? It would help if you made it clear that this only affects the format of the RTTM file. You could even call the option you pass into cluster.sh --rttm-channel. The usage message could describe what it does more clearly, e.g., "the value passed into the RTTM channel field. Only affects the format of the RTTM file." Without a little more clarity, users might think it does something with the wav file channels.
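To illustrate the point about the channel option, here is a minimal Python sketch of writing one RTTM `SPEAKER` line. It is not the code in diarization/make_rttm.py: the function name `rttm_line` and the parameter name `rttm_channel` are hypothetical, and the layout assumes the usual ten-field RTTM `SPEAKER` record. The channel value is simply written into the third field and has no effect on the audio or on any other field.

```python
def rttm_line(file_id, onset, duration, speaker, rttm_channel=0):
    """Format one RTTM SPEAKER record.

    rttm_channel is only the value written to the channel field of the
    RTTM line; it does not select or touch any channel of the wav file.
    """
    return "SPEAKER {} {} {:.3f} {:.3f} <NA> <NA> {} <NA> <NA>".format(
        file_id, rttm_channel, onset, duration, speaker)


default_line = rttm_line("rec1", 0.5, 1.25, "spk1")
other_line = rttm_line("rec1", 0.5, 1.25, "spk1", rttm_channel=1)
```

The two lines differ only in the third field, which is the behavior a usage message along the lines suggested above would describe.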

HuangZiliAndy · 2018-11-12T16:04:53Z

Hi, David! I agree with your first and third comments and I have fixed them. For the second one, the situation is as follows. For callhome v1, we didn't run apply-cmvn on the features, so that's not a problem. For callhome v2, we run apply-cmvn on the features before segmentation and dump them. For dihard v1, we follow callhome v2, and we also run add-deltas before apply-cmvn.

diarization/extract_ivectors.sh is very similar to local/extract_ivectors.sh. The only difference is that diarization/extract_ivectors.sh directly copies the delta options from the ivector extractor, while local/extract_ivectors.sh has the delta options passed in. Because we do some feature preprocessing beforehand, the delta options may not stay the same. I can replace diarization/extract_ivectors.sh with local/extract_ivectors.sh, but that will affect the callhome recipe (you would have to pass a delta-order argument).

Should I pass the delta options? I'd like to know what you think.

david-ryan-snyder · 2018-11-12T17:14:47Z

egs/dihard_2018/v1/local/nnet3/xvector/prepare_feats.sh

@@ -0,0 +1,86 @@
+#!/bin/bash


You don't need v1/local/nnet3, since you already have this directory in v2/local/nnet3. You should delete it.

david-ryan-snyder · 2018-11-12T18:04:49Z

@HuangZiliAndy,

It's usually better to add the functionality you want to an existing script, rather than creating a new script, especially if the new one differs by only one line! At the same time, we want to make sure that changes to old scripts don't break things in the existing recipes. Fortunately, I think we can satisfy both of those requirements.

I suggest adding an option called apply_deltas to diarization/extract_ivectors.sh. This should be initialized to true. Please add a comment explaining that when this option is true, we use the options provided in $srcdir/delta_opts. You might want to say that in diarization recipes, we sometimes need to write features to disk that already have various post-processing applied. As a result, deltas may have already been applied to the input features.

Have an if statement that checks if apply_deltas is true. If it is, delta_opts is taken from $srcdir/delta_opts, otherwise it is assigned the value of --delta-order=0.

    if $apply_deltas; then
      delta_opts=$(cat $srcdir/delta_opts 2>/dev/null)
    else
      delta_opts="--delta-order=0"
    fi

HuangZiliAndy · 2018-11-12T20:38:34Z

Hi, David! Thanks for your advice. I have fixed these problems.

david-ryan-snyder

@HuangZiliAndy, Looks pretty good. Good work! There's a few more minor cosmetic things.

One last thing: please take a look at the python scripts you added for preparing the DIHARD data, and try to make sure they don't go over 80 columns.

After this, it should be ready to go.
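One quick way to check the 80-column request is a small helper like the following Python sketch. The function name `long_lines` is made up for illustration and is not part of the recipe. It reports the line number and length of every line longer than the limit.

```python
def long_lines(text, limit=80):
    """Return (line_number, length) for each line longer than `limit`."""
    return [(number, len(line))
            for number, line in enumerate(text.splitlines(), 1)
            if len(line) > limit]


# Example on an in-memory string; for a real script, read the file's
# contents first, e.g. open(path).read().
sample = "short line\n" + "x" * 81 + "\n" + "y" * 80
report = long_lines(sample)
```

An empty report means every line fits within the limit.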

david-ryan-snyder · 2018-11-13T00:34:10Z

egs/dihard_2018/v1/run.sh

+if [ $stage -le 0 ]; then
+  local/make_voxceleb2.pl $voxceleb2_root dev data/voxceleb2_train
+  local/make_voxceleb2.pl $voxceleb2_root test data/voxceleb2_test
+  # This script reates data/voxceleb1_test and data/voxceleb1_train.


reates -> creates

david-ryan-snyder · 2018-11-13T00:34:58Z

egs/dihard_2018/v1/local/make_dihard_2018_eval.sh

@@ -0,0 +1,22 @@
+#!/bin/bash
+# Copyright 2015   David Snyder


david-ryan-snyder · 2018-11-13T00:35:06Z

egs/dihard_2018/v1/local/make_dihard_2018_dev.sh

@@ -0,0 +1,22 @@
+#!/bin/bash
+# Copyright 2015   David Snyder


david-ryan-snyder · 2018-11-13T00:38:33Z

egs/dihard_2018/v2/run.sh

+if [ $stage -le 0 ]; then
+  local/make_voxceleb2.pl $voxceleb2_root dev data/voxceleb2_train
+  local/make_voxceleb2.pl $voxceleb2_root test data/voxceleb2_test
+  # This script reates data/voxceleb1_test and data/voxceleb1_train.


reates -> creates

Thanks David! I have fixed them.

david-ryan-snyder · 2018-11-13T20:03:35Z

@HuangZiliAndy, good work!

@danpovey, @leibny, I think this is ready to merge.

@cbtpkzm

* [build] Allow configure script to handle package-based OpenBLAS (kaldi-asr#2618) * [egs] updating local/make_voxceleb1.pl so that it works with newer versions of VoxCeleb1 (kaldi-asr#2684) * [egs,scripts] Remove unused --nj option from some scripts (kaldi-asr#2679) * [egs] Fix to tedlium v3 run.sh (rnnlm rescoring) (kaldi-asr#2686) * [scripts,egs] Tamil OCR with training data from yomdle and testing data from slam (kaldi-asr#2621) note: this data may not be publicly available at the moment. we'll work on that. * [egs] mini_librispeech: allow relative pathnames in download_and_untar.sh (kaldi-asr#2689) * [egs] Updating SITW recipe to account for changes to VoxCeleb1 (kaldi-asr#2690) * [src] Fix nnet1 proj-lstm bug where gradient clipping not used; thx:@cbtpkzm (kaldi-asr#2696) * [egs] Update aishell2 recipe to allow online decoding (no pitch for ivector) (kaldi-asr#2698) * [src] Make cublas and cusparse use per-thread streams. (kaldi-asr#2692) This will reduce synchronization overhead when we actually use multiple cuda devices in one process go down drastically, since we no longer synchronize on the legacy default stream. 
More details here: https://docs.nvidia.com/cuda/cuda-runtime-api/stream-sync-behavior.html * [src] improve handling of low-rank covariance in ivector-compute-lda (kaldi-asr#2693) * [egs] Changes to IAM handwriting-recognition recipe, including BPE encoding (kaldi-asr#2658) * [scripts] Make sure pitch is not included in i-vector feats, in online decoding preparation (kaldi-asr#2699) * [src] fix help message in post-to-smat (kaldi-asr#2703) * [scripts] Fix to steps/cleanup/debug_lexicon.sh (kaldi-asr#2704) * [egs] Cosmetic and file-mode fixes in HKUST recipe (kaldi-asr#2708) * [scripts] nnet1: remove the log-print of args in 'make_nnet_proto.py', thx:mythilisharan@gmail.com (kaldi-asr#2706) * [egs] update README in AISHELL-2 (kaldi-asr#2710) * [src] Make constructor of CuDevice private (kaldi-asr#2711) * [egs] fix sorting issue in aishell v1 (kaldi-asr#2705) * [egs] Add soft links for CNN+TDNN scripts (kaldi-asr#2715) * [build] Add missing packages in extras/check_dependencies.sh (kaldi-asr#2719) * [egs] madcat arabic: clean scripts, tuning, use 6-gram LM (kaldi-asr#2718) * [egs] Update WSJ run.sh: comment out outdated things, add run_tdnn.sh. (kaldi-asr#2723) * [scripts,src] Fix potential issue in scripts; minor fixes. (kaldi-asr#2724) The use of split() in latin-1 encoding (which might be used for other ASCII-compatible encoded data like utf-8) is not right because character 160 (expressed here in decimal) is a NBSP in latin-8 encoding and is also in the range UTF-8 uses for encoding. The same goes for strip(). Thanks @ChunChiehChang for finding the issue. 
* [egs] add example script for RNNLM lattice rescoring for WSJ recipe (kaldi-asr#2727) * [egs] add rnnlm example on tedlium+lm1b; add rnnlm rescoring results (kaldi-asr#2248) * [scripts] Small fix to utils/data/convert_data_dir_to_whole.sh (RE backups) (kaldi-asr#2735) * [src] fix memory bug in kaldi::~LatticeFasterDecoderTpl(), (kaldi-asr#2737) - found it when running 'latgen-faster-mapped-parallel', - core-dumps from the line: decoder/lattice-faster-decoder.cc:52 -- the line is doing 'delete &(FST*)', i.e. deleting the pointer to FST, instead of deleting the FST itslef, -- bug was probably introduced by refactoring commit d0c68a6 from 2018-09-01, -- after the change the code runs fine... (the unit tests for src/decoder are missing) * [egs] Remove per-utt option from nnet3/align scripts (kaldi-asr#2717) * [egs] Small Librispeech example fix, thanks: Yasasa Tennakoon. (kaldi-asr#2738) * [egs] Aishell2 recipe: turn off jieba's new word discovery in word segmentation (kaldi-asr#2740) * [egs] Add missing file local/join_suffix.py in TEDLIUM s5_r3; thx:anand@sayint.ai (kaldi-asr#2741) * [egs,scripts] Add Tunisian Arabic (MSA) recipe; cosmetic fixes to pbs.pl (kaldi-asr#2725) * [scripts] Fix missing import in utils/langs/grammar/augment_words_txt.py (kaldi-asr#2742) * [scripts] Fix build_const_arpa_lm.sh w.r.t. 
where <s> appears inside words (kaldi-asr#2745) * [scripts] Slight improvements to decode_score_fusion.sh usability (kaldi-asr#2746) * [build] update configure to support cuda 10 (kaldi-asr#2747) * [scripts] Fix bug in utils/data/resample_data_dir.sh (kaldi-asr#2749) * [scripts] Fix bug in cleanup after steps/cleanup/clean_and_segment_data*.sh (kaldi-asr#2750) * [egs] several updates of the tunisian_msa recipe (kaldi-asr#2752) * [egs] Small fix to Tunisian MSA TDNN script (RE train_stage) (kaldi-asr#2757) * [src,scripts] Batched nnet3 computation (kaldi-asr#2726) This PR adds the underlying utilities for much faster nnet3 inference on GPU, and a command-line binary (and script support) for nnet3 decoding and posterior computation. TBD: a binary for x-vector computation. This PR also contains unrelated decoder speedups (skipping range checks for transition ids... this may cause segfaults when graphs are mismatched). * [build] Add python3 compatibility to install scripts (kaldi-asr#2748) * [scripts] tfrnnlm: Modify TensorFlow flag format for compatibility with recent versions (kaldi-asr#2760) * [egs] fix old style perl regex in egs/chime1/s5/local/chime1_prepare_data.sh (kaldi-asr#2762) * [scripts] Fix bug in steps/cleanup/debug_lexicon.sh (kaldi-asr#2763) * [egs] Add example for Yomdle Farsi OCR (kaldi-asr#2702) * [scripts] debug_lexicon.sh: Fix bug introduced in kaldi-asr#2763. (kaldi-asr#2764) * [egs] add missing online cmvn config in aishell2 (kaldi-asr#2767) * [egs] Add CNN-TDNN-F script for Librispeech (kaldi-asr#2744) * [src] Some minor cleanup/fixes regarding CUDA memory allocation; other small fixes. 
(kaldi-asr#2768) * [scripts] Update reverberate_data_dir.py so that it works with python3 (kaldi-asr#2771) * [egs] Chime5: fix total number of words for WER calculation (kaldi-asr#2772) * [egs] RNNLMs on Tedlium w/ Google 1Bword: Increase epochs, update results (kaldi-asr#2775) * [scripts,egs] Added phonetisaurus-based g2p scripts (kaldi-asr#2730) Phonetisaurus is much faster to train then sequitur. * [egs] madcat arabic: clean scripts, tuning, rescoring, text localization (kaldi-asr#2716) * [scripts] Enhancements & minor bugfix to segmentation postprocessing (kaldi-asr#2776) * [src] Update gmm-decode-simple to accept ConstFst (kaldi-asr#2787) * [scripts] Update documentation of train_raw_dnn.py (kaldi-asr#2785) * [src] nnet3: extend what descriptors can be parsed. (kaldi-asr#2780) * [src] Small fix to 'fstrand' (make sure args are parsed) (kaldi-asr#2777) * [src,scripts] Minor, mostly cosmetic updates (kaldi-asr#2788) * [src,scripts] Add script to compare alignment directories. (kaldi-asr#2765) * [scripts] Small fixes to script usage messages, etc. (kaldi-asr#2789) * [egs] Update ami_download.sh after changes on Edinburgh website. (kaldi-asr#2769) * [scripts] Update compare_alignments.sh to allow different lang dirs. (kaldi-asr#2792) * [scripts] Change make_rttm.py so output is in determinstic order (kaldi-asr#2794) * [egs] Fixes to yomdle_zh RE encoding direction, etc. 
(kaldi-asr#2791) * [src] Add support for context independent phones in gmm-init-biphone (for e2e) (kaldi-asr#2779) * [egs] Simplifying multi-condition version of AMI recipe (kaldi-asr#2800) * [build] Fix openblas build for aarch64 (kaldi-asr#2806) * [build] Make CUDA_ARCH configurable at configure-script level (kaldi-asr#2807) * [src] Print maximum memory stats in CUDA allocator (kaldi-asr#2799) * [src,scripts] Various minor code cleanups (kaldi-asr#2809) * [scripts] Fix handling of UTF-8 in filenames, in wer_per_spk_details.pl (kaldi-asr#2811) * [egs] Update AMI chain recipes (kaldi-asr#2817) * [egs] Improvements to multi_en tdnn-opgru/lstm recipes (kaldi-asr#2824) * [scripts] Fix initial prob of silence when lexicon has silprobs. Thx:@agurianov (kaldi-asr#2823) * [scripts,src] Fix to multitask nnet3 training (kaldi-asr#2818); cosmetic code change. (kaldi-asr#2827) * [scripts] Create shared versions of get_ctm_conf.sh, add get_ctm_conf_fast.sh (kaldi-asr#2828) * [src] Use cuda streams in matrix library (kaldi-asr#2821) * [egs] Add online-decoding recipe to aishell1 (kaldi-asr#2829) * [egs] Add DIHARD 2018 diarization recipe. (kaldi-asr#2822) * [egs] add nnet3 online result for aishell1 (kaldi-asr#2836) * [scripts] RNNLM scripts: don't die when features.txt is not present (kaldi-asr#2837) * [src] Optimize cuda allocator for multi-threaded case (kaldi-asr#2820) * [build] Add cub library for cuda projects (kaldi-asr#2819) not needed now but will be in future. * [src] Make Cuda allocator statistics visible to program (kaldi-asr#2835) * [src] Fix bug affecting scale in GeneralDropoutComponent (non-continuous case) (kaldi-asr#2815) * [build] FIX kaldi-asr#2842: properly check $use_cuda against false. (kaldi-asr#2843) * [doc] Add note about OOVs to data-prep. 
(kaldi-asr#2844) * [scripts] Allow segmentation with nnet3 chain models (kaldi-asr#2845) * [build] Remove -lcuda from cuda makefiles which breaks operation when no driver present (kaldi-asr#2851) * [scripts] Fix error in analyze_lats.sh for long lattices (replace awk with perl) (kaldi-asr#2854) * [egs] add rnnlm recipe for librispeech (kaldi-asr#2830) * [build] change configure version from 9 to 10 (kaldi-asr#2853) (kaldi-asr#2855) * [src] fixed compilation errors when built with --DOUBLE_PRECISION=1 (kaldi-asr#2856) * [build] Clarify instructions if cub is not found (kaldi-asr#2858) * [egs] Limit MFCC feature extraction job number in Dihard recipe (kaldi-asr#2865) * [egs] Added Bentham handwriting recognition recipe (kaldi-asr#2846) * [src] Share roots of different tones of phones aishell (kaldi-asr#2859) * [egs] Fix path to sequitur in commonvoice egs (kaldi-asr#2868) * [egs] Update reverb recipe (kaldi-asr#2753) * [scripts] Fix error while analyzing lattice (parsing bugs) (kaldi-asr#2873) * [src] Fix memory leak in OnlineCacheFeature; thanks @Worldexe (kaldi-asr#2872) * [egs] TIMIT: fix mac compatibility of sed command (kaldi-asr#2874) * [egs] mini_librispeech: fixing some bugs and limiting repeated downloads (kaldi-asr#2861) * [src,scripts,egs] Speedups to GRU-based networks (special components) (kaldi-asr#2712) * [src] Fix infinite recursion with -DDOUBLE_PRECISION=1. Thx: @hwiorn (kaldi-asr#2875) (kaldi-asr#2876) * Revert "[src] Fix infinite recursion with -DDOUBLE_PRECISION=1. Thx: @hwiorn (kaldi-asr#2875) (kaldi-asr#2876)" (kaldi-asr#2877) This reverts commit 84435ff. * Revert "Revert "[src] Fix infinite recursion with -DDOUBLE_PRECISION=1. Thx: @hwiorn (kaldi-asr#2875) (kaldi-asr#2876)" (kaldi-asr#2877)" (kaldi-asr#2878) This reverts commit b196b7f. * Revert "[src] Fix memory leak in OnlineCacheFeature; thanks @Worldexe" (kaldi-asr#2882) the fix was buggy. apologies. * [src] Remove unused code that caused Windows compile failure. 
Thx:@btiplitz (kaldi-asr#2881) * [src] Really fix memory leak in online decoding; thx:@Worldexe (kaldi-asr#2883) * [src] Fix Windows cuda build failure (use C++11 standard include) (kaldi-asr#2880) * [src] Add #include that caused build failure on Windows (kaldi-asr#2886) * [scripts] Fix max duration check in sad_to_segments.py (kaldi-asr#2889) * [scripts] Fix speech duration calculation in sad_to_segments.py (kaldi-asr#2891) * [src] Fix Windows build problem (timer.h) (kaldi-asr#2888) * [egs] add HUB4 spanish tdnn-f and cnn-tdnn script (kaldi-asr#2895) * [egs] Fix Aishell2 dict prepare bug; should not affect results (kaldi-asr#2890) * [egs] Self-contained example for KWS for mini_librispeech (kaldi-asr#2887) * [egs,scripts] Fix bugs in Dihard 2018 (kaldi-asr#2897) * [scripts] Check last character of files to match with newline (kaldi-asr#2898) * [egs] Update Librispeech RNNLM results; use correct training data (kaldi-asr#2900) * [scripts] RNNLM: old iteration model cleanup; save space (kaldi-asr#2885) * [scripts] Make prepare_lang.sh cleanup beforehand (prevents certain failures) (kaldi-asr#2906) * [scripts] Expose dim-range-node at xconfig level (kaldi-asr#2903) * [scripts] Fix bug related to multi-task in train_raw_rnn.py (kaldi-asr#2907) [scripts] Fix bug related to multi-task in train_raw_rnn.py. Thx:tessfu2001@gmail.com * [scripts] Cosmetic fix/clarification to utils/prepare_lang.sh (kaldi-asr#2912) * [scripts,egs] Added a new lexicon learning (adaptation) recipe for tedlium, in accordance with the IS17 paper. 
(kaldi-asr#2774) * [egs] TDNN+LSTM example scripts, with RNNLM, for Librispeech (kaldi-asr#2857) * [src] cosmetic fix in nnet1 code (kaldi-asr#2921) * [src] Fix incorrect invocation of mutex in nnet-batch-compute code (kaldi-asr#2932) * [egs,minor] Fix typo in comment in voxceleb script (kaldi-asr#2926) * [src,egs] Mostly cosmetic changes; add some missing includes (kaldi-asr#2936) * [egs] Fix path of rescoring binaries used in tfrnnlm scripts (kaldi-asr#2941) * [src] Fix bug in nnet3-latgen-faster-batch for determinize=false (kaldi-asr#2945) thx: Maxim Korenevsky. * [egs] Add example for rimes handwriting database; Madcat arabic script cleanup (kaldi-asr#2935) * [egs] Add scripts for yomdle korean (kaldi-asr#2942) * [build] Refactor/cleanup build system, easier build on ubuntu 18.04. (kaldi-asr#2947) note: if this breaks someone's build we'll have to debug it then. * [scripts,egs] Changes for Python 2/3 compatibility (kaldi-asr#2925) * [egs] Add more modern DNN recipe for fisher_callhome_spanish (kaldi-asr#2951) * [scripts] switch from bc to perl to reduce dependencies (diarization scripts) (kaldi-asr#2956) * [scripts] Further fix for Python 2/3 compatibility (kaldi-asr#2957) * [egs] Remove no-longer-existing option in tedlium_r3 recipe (kaldi-asr#2959) * [build] Handle dependencies for .cu files in addition to .cc files (kaldi-asr#2944) * [src] remove duplicate test mode option from class GeneralDropoutComponent (kaldi-asr#2960) * [egs] Fix minor bugs in WSJ's flat-start/e2e recipe (kaldi-asr#2968) * [egs] Fix to BSD compatibility of TIMIT data prep (kaldi-asr#2966) * [scripts] Fix RNNLM training script problem (chunk_length was ignored) (kaldi-asr#2969) * [src] Fix bug in lattice-1best.cc RE removing insertion penalty (kaldi-asr#2970) * [src] Compute a separate avg (start, end) interval for each sausage word (kaldi-asr#2972) * [build] Move nvcc verbose flag to proper location (kaldi-asr#2962) * [egs] Fix mini_librispeech download_lm.sh crash; 
thx:chris.keith.johnson@gmail.com (kaldi-asr#2974) * [egs] minor fixes related to python2 vs python3 differences (kaldi-asr#2977) * [src] Small fix in test code, avoid spurious failure (kaldi-asr#2978) * [egs] Fix CSJ data-prep; minor path fix for USB version of data (kaldi-asr#2979) * [egs] Add paper ref to README.txt in reverb example (kaldi-asr#2982) * [egs] Minor fixes to sitw recipe (fix problem introdueced in kaldi-asr#2925) (kaldi-asr#2985) * [scripts] Fix bug introduced in kaldi-asr#2957, RE integer division (kaldi-asr#2986) * [egs] Update WSJ flat-start chain recipes to use TDNN-F not TDNN+LSTM (kaldi-asr#2988) * [scripts] Fix typo introduced in kaldi-asr#2925 (kaldi-asr#2989) * [build] Modify Makefile and travis script to fix Travis failures (kaldi-asr#2987) * [src] Simplification and efficiency improvement in ivector-plda-scoring-dense (kaldi-asr#2991) * [egs] Update madcat Arabic and Chinese egs, IAM (kaldi-asr#2964) * [src] Fix overflow bug in convolution code (kaldi-asr#2992) * [src] Fix nan issue in ctm times introduced in kaldi-asr#2972, thx: @vesis84 (kaldi-asr#2993) * [src] Fix 'sausage-time' issue which occurs with disabled MBR decoding. 
(kaldi-asr#2996) * [egs] Add scripts for yomdle Russian (OCR task) (kaldi-asr#2953) * [egs] Simplify lexicon preparation in Fisher callhome Spanish (kaldi-asr#2999) * [egs] Update GALE Arabic recipe (kaldi-asr#2934) * [egs] Remove outdated NN results from Gale Arabic recipe (kaldi-asr#3002) * [egs] Add RESULTS file for the tedlium s5_r3 (release 3) setup (kaldi-asr#3003) * [src] Fixes to grammar-fst code to handle LM-disambig symbols properly (kaldi-asr#3000) thanks: armando.muscariello@gmail.com * [src] Cosmetic change to mel computation (fix option string) (kaldi-asr#3011) * [src] Fix Visual Studio error due to alternate syntactic form of noreturn (kaldi-asr#3018) * [egs] Fix location of sequitur installation (kaldi-asr#3017) * [src] Fix w/ ifdef Visual Studio error from alternate syntactic form noreturn (kaldi-asr#3020) * [egs] Some fixes to getting data in heroico recipe (kaldi-asr#3021) * [egs] BABEL script fix: avoid make_L_align.sh generating invalid files (kaldi-asr#3022) * [src] Fix to older online decoding code in online/ (OnlineFeInput; was broken by commit cc2469e). (kaldi-asr#3025) * [script] Fix unset bash variable in make_mfcc.sh (kaldi-asr#3030) * [scripts] Extend limit_num_gpus.sh to support --num-gpus 0. (kaldi-asr#3027) * [scripts] fix bug in utils/add_lex_disambig.pl when sil-probs and pron-probs used (kaldi-asr#3033) bug would likely have resulted in determinization failure (only when not using word-position-dependent phones). 
* [egs] Fix path in Tedlium r3 rnnlm training script (kaldi-asr#3039)
* [src] Thread-safety for GrammarFst (thx:armando.muscariello@gmail.com) (kaldi-asr#3040)
* [scripts] Cosmetic fix to get_degs.sh (kaldi-asr#3045)
* [egs] Small bug fixes for IAM and UW3 recipes (kaldi-asr#3048)
* [scripts] Nnet3 segmentation: fix default params (kaldi-asr#3051)
* [scripts] Allow perturb_data_dir_speed.sh to work with utt2lang (kaldi-asr#3055)
* [scripts] Make beam in monophone training configurable (kaldi-asr#3057)
* [scripts] Allow reverberate_data_dir.py to support unicode filenames (kaldi-asr#3060)
* [scripts] Make some cleanup scripts work with python3 (kaldi-asr#3054)
* [scripts] bug fix to nnet2->3 conversion, fixes kaldi-asr#886 (kaldi-asr#3071)
* [src] Make copies occur in per-thread default stream (for GPUs) (kaldi-asr#3068)
* [src] Add GPU version of MergeTaskOutput().. relates to batch decoding (kaldi-asr#3067)
* [src] Add device options to enable tensor core math mode. (kaldi-asr#3066)
* [src] Log nnet3 computation to VLOG, not std::cout (kaldi-asr#3072)
* [src] Allow upsampling in compute-mfcc-feats, etc. (kaldi-asr#3014)
* [src] fix problem with rand_r being undefined on Android (kaldi-asr#3037)
* [egs] Update swbd1_map_words.pl, fix them_1's -> them's (kaldi-asr#3052)
* [src] Add const overload OnlineNnet2FeaturePipeline::IvectorFeature (kaldi-asr#3073)
* [src] Fix syntax error in egs/bn_music_speech/v1/local/make_musan.py (kaldi-asr#3074)
* [src] Memory optimization for online feature extraction of long recordings (kaldi-asr#3038)
* [build] fixed a bug in linux_configure_redhat_fat when use_cuda=no (kaldi-asr#3075)
* [scripts] Add missing '. ./path.sh' to get_utt2num_frames.sh (kaldi-asr#3076)
* [src,scripts,egs] Add count-based biphone tree tying for flat-start chain training (kaldi-asr#3007)
* [scripts,egs] Remove sed from various scripts (avoid compatibility problems) (kaldi-asr#2981)
* [src] Rework error logging for safety and cleanliness (kaldi-asr#3064)
* [src] Change warp-synchronous to cub::BlockReduce (safer but slower) (kaldi-asr#3080)
* [src] Fix && and || uses where & and | intended, and other weird errors (kaldi-asr#3087)
* [build] Some fixes to Makefiles (kaldi-asr#3088)
  clang is unhappy with '-rdynamic' in compile-only step, and the switch is really unnecessary. Also, the default location for MKL 64-bit libraries is intel64/. The em64t/ was explained already obsolete by an Intel rep in 2010: https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/285973
* [src] Fixed -Wreordered warnings in feat (kaldi-asr#3090)
* [egs] Replace bc with perl -e (kaldi-asr#3093)
* [scripts] Fix python3 compatibility issue in data-perturbing script (kaldi-asr#3084)
* [doc] fix some typos in doc. (kaldi-asr#3097)
* [build] Make sure expf() speed probe times sensibly (kaldi-asr#3089)
* [scripts] Make sure merge_targets.py works in python3 (kaldi-asr#3094)
* [src] ifdef to fix compilation failure on CUDA 8 and earlier (kaldi-asr#3103)
* [doc] fix typos and broken links in doc. (kaldi-asr#3102)
* [scripts] Fix frame_shift bug in egs/swbd/s5c/local/score_sclite_conf.sh (kaldi-asr#3104)
* [src] Fix wrong assertion failure in nnet3-am-compute (kaldi-asr#3106)
* [src] Cosmetic changes to natural-gradient code (kaldi-asr#3108)
* [src,scripts] Python2 compatibility fixes and code cleanup for nnet1 (kaldi-asr#3113)
* [doc] Small documentation fixes; update on Kaldi history (kaldi-asr#3031)
* [src] Various mostly-cosmetic changes (copying from another branch) (kaldi-asr#3109)
* [scripts] Simplify text encoding in RNNLM scripts (now only support utf-8) (kaldi-asr#3065)
* [egs] Add "formosa_speech" recipe (Taiwanese Mandarin ASR) (kaldi-asr#2474)
* [egs] python3 compatibility in csj example script (kaldi-asr#3123)
* [egs] python3 compatibility in example scripts (kaldi-asr#3126)
* [scripts] Bug-fix for removing deleted words (kaldi-asr#3116)
  The type of --max-deleted-words-kept-when-merging in segment_ctm_edits.py was a string, which prevented the mechanism from working altogether.
* [scripts] Add fix regarding num-jobs for segment_long_utterances*.sh (kaldi-asr#3130)
* [src] Enable allow_{upsample,downsample} with online features (kaldi-asr#3139)
* [src] Fix bad assert in fstmakecontextsyms (kaldi-asr#3142)
* [src] Fix to "Fixes to grammar-fst & LM-disambig symbols" (kaldi-asr#3000) (kaldi-asr#3143)
* [build] Make sure PaUtils exported from portaudio (kaldi-asr#3144)
* [src] cudamatrix: fixing a synchronization bug in 'normalize-per-row' (kaldi-asr#3145)
  was only apparent using large matrices
* [src] Fix typo in comment (kaldi-asr#3147)
* [src] Add binary that functions as a TCP server (kaldi-asr#2938)
* [scripts] Fix bug in comment (kaldi-asr#3152)
* [scripts] Fix bug in steps/segmentation/ali_to_targets.sh (kaldi-asr#3155)
* [scripts] Avoid holding out more data than the requested num-utts (due to utt2uniq) (kaldi-asr#3141)
* [src,scripts] Add support for two-pass agglomerative clustering. (kaldi-asr#3058)
* [src] Disable unget warning in PeekToken (and other small fix) (kaldi-asr#3163)
* [build] Add new nvidia tools to windows build (kaldi-asr#3159)
* [doc] Fix documentation errors and add more docs for tcp-server decoder (kaldi-asr#3164)

hzili1 and others added 3 commits November 5, 2018 11:51

dihard x-vector baseline

86d3a5e

correct directory name error

4b99c83

Delete LOG

770cd30

david-ryan-snyder reviewed Nov 6, 2018

david-ryan-snyder reviewed Nov 7, 2018

hzili1 added 2 commits November 7, 2018 20:34

add more comments and modified some mistakes

f3a6906

add channel parameter to the diarization/make_rttm.py, also modify the diarization/cluster.sh

8f4a64d

add i-vector baseline

d7cb747

david-ryan-snyder reviewed Nov 12, 2018

hzili1 added 2 commits November 12, 2018 15:23

delete some unused files in local

86c617d

change --channel to --rttm-channel, add --apply-deltas options

b54abb1

david-ryan-snyder reviewed Nov 13, 2018

fixed some small errors

de204df

danpovey merged commit 3ae133c into kaldi-asr:master Nov 13, 2018

chenzhehuai mentioned this pull request Jun 4, 2019

update (#32) chenzhehuai/kaldi#33

Closed
