Merge pull request #35 from julius-speech/dnn

Dnn
julius-speech · Aug 30, 2016 · a702aff
2 parents 9226725 + ab0d9b3
commit a702aff

Showing 39 changed files with 1,219 additions and 110 deletions.
diff --git a/00readme-DNN.txt b/00readme-DNN.txt
@@ -1,27 +1,64 @@
-
+f
 	Julius for DNN-based speech recognition
 
+						(revised 2016/08/30)
 						(updated 2013/09/29)
 
-A. How it works
-================
+A. Julius and DNN-HMM
+======================
+
+From 4.4, Julius can perform DNN-HMM based recognition in two ways:
+
+  1. standalone: directly compute DNN for HMM inside Julius (>= 4.4)
+
+  2. network: receive state probabilities calculated by other process
+     via socket (<= 4.3.1)
+
+Both are described below.
+
+ A.1. Standalone mode
+ =====================
+
+From version 4.4, Julius can perform DNN-HMM based recognition by
+itself.  It reads a DNN definition along with an HMM, computes the
+network on the input (spliced) feature vectors, and uses the scores
+of the output-layer nodes for each frame as the output probabilities
+of the corresponding HMM states.  All computation is done in a single
+process.
+
+Note that the current implementation is very simple and limited.  Only
+basic NN functions are implemented.  Any number of hidden layers can
+be defined, but every hidden layer must have the same number of nodes.
+No batch computation is performed: the network is computed frame by
+frame.  SIMD instructions (Intel AVX) are used to speed up the
+computation.  It has only been tested on Windows and Ubuntu on Intel PCs.
+See "libsent/src/phmm/calc_dnn.c" for the actual implementation.
+
+To run, you need
+
+ 1) an HMM AM (GMM defs are ignored, only its structure is used)
+ 2) a DNN definition that corresponds to 1)
+ 3) ".dnnconf" configuration file (text)
+
+The .dnnconf file specifies the parameters, options, and DNN
+definition files used for DNN computation.  A sample file is located
+in the top directory of the Julius archive as
+"Sample.dnnconf".
+
+The matrix/vector definitions should be given in ".npy" format
+(i.e. Python's "numpy.save" format).  Only the 32-bit float
+little-endian data type is accepted.
+
+To prepare a model for DNN-HMM, note that ordering is important.
+The output nodes of the DNN must be in the same order as the HMM
+state definition ids.  If not, Julius won't work properly.
 
-This version of Julius can perform, Julius can perform DNN-HMM based
-recognition by receiving pre-computed state probabilities.  In that mode,
-Julius does not read any feature parameter vectors and compute the state
-output probability of an HMM state in it, but just read the "output
-probabilities vectors" of the HMM states already computed in other tools,
-via socket or file.
 
-The "output probabilities input" is called "outprob vector" in Julius,
-which contains a sequence of vectors, each of them consists of
-pre-computed state probabilities a vector of state-num-of-HMM dimension.
+ A.2. Modular mode
+ =====================
 
-The most important thing to know before using this scheme is that,
-each dimension in the input outprob vector and each state in the HMM in
-Julius should corresponds.  In other words, the index of HMM states
-and outprob vector should match. The details are described in the
-following section.
+Julius can still receive state output probability vectors from
+another process.  This is the older method, used before 4.4.
 
 To run, you need 
 
@@ -85,8 +122,8 @@ perform DNN-based recognition, please re-convert from ASCII hmmdefs
 with the newest version of mkbinhmm.
 
 
-D. Making outprob vector
-==========================
+D. Making outprob vector for Modular mode
+==========================================
 
 D.1. Format of outprob vector file
 ===================================
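
Note on the standalone-mode text above: it requires the DNN matrices and vectors to be ".npy" files holding 32-bit little-endian floats.  A minimal sketch of checking that requirement before handing a file to Julius is given below.  It reads only the .npy header using the Python standard library; the function names (read_npy_header, is_float32_le, check_file) are made up for this illustration and are not part of Julius or NumPy.

```python
import ast
import struct

NPY_MAGIC = b"\x93NUMPY"


def read_npy_header(data):
    """Return the header dict of a .npy byte string (format versions 1 to 3)."""
    if data[:6] != NPY_MAGIC:
        raise ValueError("not a .npy file")
    major = data[6]
    if major == 1:
        # version 1.x stores the header length as a 2-byte little-endian int
        (hlen,) = struct.unpack("<H", data[8:10])
        start = 10
    elif major in (2, 3):
        # versions 2.x and 3.x store it as a 4-byte little-endian int
        (hlen,) = struct.unpack("<I", data[8:12])
        start = 12
    else:
        raise ValueError("unsupported .npy version %d" % major)
    encoding = "utf-8" if major == 3 else "latin1"
    # the header is a Python dict literal: descr, fortran_order, shape
    return ast.literal_eval(data[start:start + hlen].decode(encoding))


def is_float32_le(data):
    """True if the stored dtype is '<f4' (32-bit float, little endian)."""
    return read_npy_header(data)["descr"] == "<f4"


def check_file(path):
    """Hypothetical helper: apply the dtype check to a file on disk."""
    with open(path, "rb") as f:
        return is_float32_le(f.read())
```

A file saved from a float64 array would report '<f8' here and could be converted with astype('<f4') before saving.  Whether Julius places further constraints on the file (for example on array shape or memory order) is not stated in the readme and is not checked by this sketch.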

diff --git a/00readme-ja.txt b/00readme-ja.txt
@@ -4,7 +4,7 @@
 
                                 Julius
 
-						(Rev 4.4   2016/08/20)
+						(Rev 4.4   2016/08/30)
                                                 (Rev 4.3.1 2014/01/15)
                                                 (Rev 4.3   2013/12/25)
                                                 (Rev 4.2.3 2013/06/30)
@@ -41,8 +41,8 @@ Julius
 About the move to GitHub
 ========================
 
-Julius moved to GitHub as of version 4.3.1.
-From now on, publishing and sharing of the latest source code, run kits and development information, and
+Julius moved to GitHub in 2016.
+Publishing and sharing of the latest source code, run kits and development information, and
 operation of the developer forum, will take place on GitHub.
 
         Julius on GitHub
@@ -58,10 +58,15 @@ Julius
 Julius-4.4
 ===========
 
-Version 4.4 adds several updates and new features.
-The new tools "adintool-gui" and "binlm2arpa" have been added.
-Also, "mkbingram" can now output with character encoding conversion.
-In module mode, Julius no longer exits when a client disconnects and instead
+Version 4.4 builds in the DNN computation needed for DNN-HMM, so Julius can
+now perform online speech recognition with a DNN-HMM on its own.
+See 00readme-DNN.txt for details.
+
+
+Two new tools have been added: "adintool-gui", a GUI version of adintool, and
+"binlm2arpa", which converts a binary N-gram back to ARPA format.
+Also, "mkbingram" can now convert the character encoding of a binary N-gram directly.
+In module mode, Julius no longer exits when a client disconnects and instead
 waits for the next client connection.  Several bugs have also been fixed, along
 with compile errors on recent OSes.
 
@@ -81,6 +86,7 @@ Julius-4.4
 	configure		configure script
 	configure.in		
 	Sample.jconf		Sample jconf configuration file
+	Sample.dnnconf		Sample DNN configuration file
 	julius/			Julius sources
 	libjulius/		JuliusLib core engine library sources
 	libsent/		JuliusLib general-purpose library sources
@@ -114,14 +120,7 @@ jconf
 Documentation on usage, an introduction to each feature, limitations and so on
 is available there; please refer to it.
 
-    Home page: http://julius.sourceforge.jp/
-
-In addition, a "developer forum" is provided on the above home page for exchanging
-information on research and application development using Julius.
-The latest Julius CVS update information is also posted there.
-Please feel free to visit.
-
-    Julius Forum: http://julius.sourceforge.jp/forum/
+    Home page: http://julius.osdn.jp/
 
 
 License
@@ -143,7 +142,11 @@ Julius
 Contact
 ===========
 
-For questions and inquiries about Julius, please use GitHub or
+Questions and inquiries about Julius development are accepted on GitHub.
+
+        Julius on GitHub
+        https://github.com/julius-speech/julius
+
 Alternatively, please contact the e-mail address below
 (read 'at' as '@')
 

diff --git a/00readme.txt b/00readme.txt
@@ -4,7 +4,7 @@
 
                                 Julius
 
-						(Rev 4.4   2016/08/20)
+						(Rev 4.4   2016/08/30)
                                                 (Rev 4.3.1 2014/01/15)
                                                 (Rev 4.3   2013/12/25)
                                                 (Rev 4.2.3 2013/06/30)
@@ -58,13 +58,14 @@ What's new in Julius-4.4
 Julius is now hosted on GitHub:
 https://github.com/julius-speech/julius
 
-Version 4.4 includes several updates and new features.  Two new
-tools "adintool-gui" and "binlm2arpa" are added and "mkbingram" was
-updated for audio input and binary LM conversion.  Now does not exit
-on client disconnection on module mode, instead it pauses itself and
-wait for another client to come.  It also has many bug fixes and
-updates for recent OS and environments.  Some documents that may help
-users using Julius with DNN-HMM is also added.
+Version 4.4 now supports stand-alone DNN-HMM recognition (see 00readme-DNN.txt).
+Other features include:
+- New tools:
+  - adintool-gui: GUI version of adintool
+  - binlm2arpa: converts a binary N-gram back to ARPA format
+- "mkbingram" now supports direct charset conversion of a binary LM
+- Julius no longer exits when the connection is lost in module mode
+- Bug fixes
 
 See "Release.txt" for full list of updates.
 Run "configure --help=recursive" to see all configure options.
@@ -84,6 +85,7 @@ Contents of Julius-4.4
 	configure		configure script
 	configure.in		
 	Sample.jconf		Sample configuration file
+	Sample.dnnconf		Sample DNN configuration file
 	julius/			Julius sources
 	libjulius/		JuliusLib core engine library sources
 	libsent/		JuliusLib low-level library sources

diff --git a/LICENSE.txt b/LICENSE.txt
@@ -3,10 +3,10 @@
 	"Large Vocabulary Continuous Speech Recognition Engine Julius"
 			Terms of Use
 
-  Copyright (c)   1991-2013 Kawahara Lab., Kyoto University
+  Copyright (c)   1991-2016 Kawahara Lab., Kyoto University
   Copyright (c)   1997-2000 Information-technology Promotion Agency, Japan (IPA)
   Copyright (c)   2000-2005 Shikano Lab., Nara Institute of Science and Technology
-  Copyright (c)   2005-2013 Julius project team, Nagoya Institute of Technology
+  Copyright (c)   2005-2016 Julius project team, Nagoya Institute of Technology
 
 ----------------------------------------------------------------------------
 
@@ -40,10 +40,10 @@ Julius
 be displayed or attached as is, without alteration.
 
 			Notice
-  Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
+  Copyright (c) 1991-2016 Kawahara Lab., Kyoto University
   Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan (IPA)
   Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
-  Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology
+  Copyright (c) 2005-2016 Julius project team, Nagoya Institute of Technology
 
 3. When you present findings obtained by using the Software, please state
 explicitly that you used "Large Vocabulary Continuous Speech Recognition Engine Julius".
@@ -79,10 +79,10 @@ Julius
      Large Vocabulary Continuous Speech Recognition Engine Julius
 
 
- Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
+ Copyright (c) 1991-2016 Kawahara Lab., Kyoto University
  Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
  Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
- Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology
+ Copyright (c) 2005-2016 Julius project team, Nagoya Institute of Technology
 
 "Large Vocabulary Continuous Speech Recognition Engine Julius",
 including Julian, is being developed at Kawahara Lab., Kyoto
@@ -129,10 +129,10 @@ whatsoever.
 
                       Form of copyright notice:
 
- Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
+ Copyright (c) 1991-2016 Kawahara Lab., Kyoto University
  Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
  Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
- Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology
+ Copyright (c) 2005-2016 Julius project team, Nagoya Institute of Technology
 
 3. When you publish or present any results by using the Software, you
 must explicitly mention your use of "Large Vocabulary Continuous

diff --git a/README.md b/README.md
@@ -49,12 +49,12 @@ The main developer / maintainer is Akinobu Lee (ri@nitech.ac.jp).
 
 # Download Julius
 
-The latest release version is [4.3.1](https://github.com/julius-speech/julius/releases), released on January 15, 2014.
+The latest release version is [4.4](https://github.com/julius-speech/julius/releases), released on August 30, 2016.
 You can get the released package from the [Release page](https://github.com/julius-speech/julius/releases).
 
-Version 4.3.1 is a bug fix release. Several bugs has been fixed.
-See the "Release.txt" file for the full list of updates.
-Run with "-help" to see full list of options.
+Version 4.4 adds stand-alone DNN-HMM support, along with several new
+tools and bug fixes.  See the "Release.txt" file for the
+full list of updates.  Run with "-help" to see the full list of options.
 
 # Toolkit and Assets
 

diff --git a/Release-ja.txt b/Release-ja.txt
@@ -1,5 +1,6 @@
-4.4 (2016.08.20)
+4.4 (2016.08.30)
 =================
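+- Support for DNN-HMM computation
 - "adintool-gui": adintool with an audio input monitor GUI (see adintool/README-GUI.txt)
 - "binlm2arpa": converts a binary language model to ARPA format
 - "mkbingram": added the "-c" option to convert the character encoding of a language model on output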

diff --git a/Release.txt b/Release.txt
@@ -1,5 +1,6 @@
-4.4 (2016.08.20)
+4.4 (2016.08.30)
 =================
+- DNN-HMM computation support
 - "adintool-gui": adintool with input monitoring (see adintool/README-GUI.txt)
 - "binlm2arpa": convert binary LM to ARPA format
 - "mkbingram" now can convert text encoding of an LM by "-c" option

diff --git a/Sample.dnnconf b/Sample.dnnconf
@@ -0,0 +1,74 @@
+####
+#### Sample DNN Configuration for DNN-HMM Decoding (-dnnconf)
+####
+
+####
+#### Feature Extraction
+####
+
+# feature type, in HTK parameter specification format
+feature_type FBANK_D_A_Z
+
+# Julius options to configure the acoustic parameter extraction.
+#
+#   The example below indicates that:
+#     1. parameters should be loaded from an HTK config file,
+#     2. use CMN/CVN,
+#     3. load cepstral mean and variance from the specified file,
+#     4. keep the cepstral mean/variance static (no update while processing)
+#
+# The specified string is expanded inline at the point where this
+# dnnconf file is given by "-dnnconf", and passed to Julius.
+# As with other Julius options, a later option overrides an earlier
+# one.  Please check the start-up messages to confirm that feature
+# extraction is set up correctly.
+#
+feature_options -htkconf model/dnn/config.lmfb.40ch.jnas -cvn -cmnload model/dnn/norm.jnas -cmnstatic
+
+# feature vector length (including delta or accel, before splicing)
+feature_len 120
+
+# splicing length
+context_len 11
+
+####
+#### NN Definition
+####
+
+# number of input nodes (should be equal to (feature_len * context_len))
+input_nodes 1320
+
+# number of output nodes (num and order should correspond to HMM definition)
+output_nodes 2004
+
+# number of nodes in hidden layers
+hidden_nodes 2048
+
+# number of hidden layers (layers excluding input and output)
+hidden_layers 5
+
+# weights W and biases b for hidden layers, in numpy np.save() format
+#   dtype of these files should be '<f4' (32-bit float, little endian)!
+W1 model/dnn/W_l1.npy
+W2 model/dnn/W_l2.npy
+W3 model/dnn/W_l3.npy
+W4 model/dnn/W_l4.npy
+W5 model/dnn/W_l5.npy
+B1 model/dnn/bias_l1.npy
+B2 model/dnn/bias_l2.npy
+B3 model/dnn/bias_l3.npy
+B4 model/dnn/bias_l4.npy
+B5 model/dnn/bias_l5.npy
+
+# also weights and biases for output layer
+output_W model/dnn/W_output.npy
+output_B model/dnn/bias_output.npy
+
+# state prior in 'state_id(%d) prior(%e)' format
+state_prior model/dnn/prior.dnn
+
+# state prior factor
+state_prior_factor 1.0
+
+# batch size (not used)
+batch_size 64
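
Reading the sample configuration above as a network description: with feature_len 120 and context_len 11, each frame is spliced with its neighbours into 120 * 11 = 1320 inputs, passed through the hidden layers (W1..W5, B1..B5) and the output layer, and then combined with the state prior.  The sketch below illustrates that flow in plain Python.  It is not the code in calc_dnn.c.  The sigmoid hidden activation, the softmax output, repeating edge frames when splicing, the one-row-per-output-node weight layout, and subtracting state_prior_factor * log(prior) are all assumptions taken from common hybrid DNN-HMM practice, and should be checked against the actual implementation.

```python
import math


def splice(frames, t, context_len):
    """Concatenate context_len frames centred on frame t.

    Assumes an odd context_len and repeats the first/last frame at the edges.
    """
    half = context_len // 2
    out = []
    for k in range(t - half, t + half + 1):
        k = min(max(k, 0), len(frames) - 1)
        out.extend(frames[k])
    return out


def affine(W, b, x):
    """Compute W x + b, with W given as one row per output node (assumed)."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def sigmoid(v):
    """Numerically safe logistic function (assumed hidden activation)."""
    return [1.0 / (1.0 + math.exp(-a)) if a >= 0
            else math.exp(a) / (1.0 + math.exp(a)) for a in v]


def log_softmax(v):
    """Log of the softmax, computed with the usual max shift."""
    m = max(v)
    lse = m + math.log(sum(math.exp(a - m) for a in v))
    return [a - lse for a in v]


def state_scores(x, hidden, output, log_prior, prior_factor=1.0):
    """Frame-wise forward pass returning one log score per HMM state.

    hidden is a list of (W, b) pairs, output is a single (W, b) pair.
    """
    for W, b in hidden:
        x = sigmoid(affine(W, b, x))
    W, b = output
    log_post = log_softmax(affine(W, b, x))
    # assumed convention: scaled log prior is subtracted from the log posterior
    return [lp - prior_factor * pr for lp, pr in zip(log_post, log_prior)]
```

One call to state_scores handles one spliced frame, matching the frame-wise (non-batch) processing described in 00readme-DNN.txt.  The position of each returned score corresponds to the output node order, which the readme requires to match the HMM state definition ids.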
diff --git a/Sample.jconf b/Sample.jconf
@@ -306,6 +306,7 @@
 #-cmnnoupdate			# keep initial mean, disable "-cmnupdate"
 #-cmnmapweight 100.0		# weight for MAP-CMN
 #-cvn				# enable variance normalization
+#-cmnstatic			# totally static cmn/cvn
 
 ## Vocal tract length normalization (VTLN)
 #-vtln 1.0 300 4800		# enable VTLN (alpha, lowerfreq, upperfreq)
@@ -317,6 +318,9 @@
 #-ssalpha 2.0			# alpha coef. for spectral subtraction
 #-ssfloor 0.5			# spectral floor coef.
 
+## DNN-HMM definition (default disabled (= GMM-HMM))
+#-dnnconf file			# DNN configuration file
+
 ## Others
 #-htkconf configfile		# load analysis settings from HTK Config file 
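
The file passed to "-dnnconf" is plain "key value" text, as Sample.dnnconf shows, and its comments spell out a few consistency rules: input_nodes should equal feature_len * context_len, and each hidden layer needs a weight and a bias file.  A minimal sketch of checking those rules ahead of a run follows.  It is a stand-alone illustration, not Julius's own parser; parse_dnnconf and check_dnnconf are hypothetical names, and the handling of blank lines and "#" comments is inferred from the sample file only.

```python
def parse_dnnconf(text):
    """Parse 'key value' lines, skipping blank lines and '#' comment lines."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        # the value may itself contain spaces (e.g. feature_options)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def check_dnnconf(conf):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    expected = int(conf["feature_len"]) * int(conf["context_len"])
    if int(conf["input_nodes"]) != expected:
        problems.append("input_nodes is %s, expected feature_len * context_len = %d"
                        % (conf["input_nodes"], expected))
    wanted = []
    for i in range(1, int(conf["hidden_layers"]) + 1):
        wanted += ["W%d" % i, "B%d" % i]
    wanted += ["output_W", "output_B"]
    missing = [k for k in wanted if k not in conf]
    if missing:
        problems.append("missing entries: " + ", ".join(missing))
    return problems
```

With the values in Sample.dnnconf (feature_len 120, context_len 11, input_nodes 1320, hidden_layers 5 with W1..W5 and B1..B5 present), both checks pass.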
 

diff --git a/adintool/mainloop.c b/adintool/mainloop.c
@@ -349,7 +349,7 @@ vecnet_sub(SP16 *Speech, int nowlen, Recog *recog)
 #if 0
 	{
 	  int i;
-	  for (i = 0; i < vecnet_veclen; i++) {
+	  for (i = 0; i < a->conf.vecnet_veclen; i++) {
 	    printf(" %f", mfcc->tmpmfcc[i]);
 	  }
 	  printf("\n");

diff --git a/jclient-perl/jclient.pl b/jclient-perl/jclient.pl