Merge pull request #35 from julius-speech/dnn
Dnn
LeeAkinobu authored Aug 30, 2016
2 parents 9226725 + ab0d9b3 commit a702aff
Showing 39 changed files with 1,219 additions and 110 deletions.
75 changes: 56 additions & 19 deletions 00readme-DNN.txt
@@ -1,27 +1,64 @@

Julius for DNN-based speech recognition

(revised 2016/08/30)
(updated 2013/09/29)

A. How it works
================
A. Julius and DNN-HMM
======================

From 4.4, Julius can perform DNN-HMM based recognition in two ways:

1. standalone: directly compute DNN for HMM inside Julius (>= 4.4)

2. network: receive state probabilities calculated by other process
via socket (<= 4.3.1)

Both are described below.

A.1. Standalone mode
=====================

From version 4.4, Julius can perform DNN-HMM based recognition by
itself. It reads a DNN definition along with an HMM, computes the
network against the input (spliced) feature vectors, and outputs the
node scores of the output layer for each frame. These scores are used
as the output probabilities of the corresponding HMM states. All
computation is done in a single process.
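
The "spliced" input mentioned above can be illustrated with a short sketch. This is not Julius code: the function name is hypothetical, and both the centred window and the edge handling (repeating the first or last frame) are assumptions made only for illustration.

```python
def splice_frames(frames, context_len):
    """Return one spliced vector per frame by concatenating context_len
    neighbouring frames centred on it. Edges are padded by repeating the
    first/last frame (an assumption, not confirmed by the readme)."""
    if context_len % 2 == 0:
        raise ValueError("context_len is assumed to be odd")
    half = context_len // 2
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        vec = []
        for offset in range(-half, half + 1):
            idx = min(max(t + offset, 0), last)  # clamp to the valid range
            vec.extend(frames[idx])
        spliced.append(vec)
    return spliced


# Three 2-dimensional frames, spliced with a window of 3 frames.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
spliced = splice_frames(frames, 3)
print(len(spliced[0]))  # 6 = feature length (2) * window length (3)
```

With 120-dimensional features and a window of 11 frames, the same function yields 1320-dimensional vectors.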

Note that the current implementation is very simple and limited. Only
basic NN functions are implemented. Any number of hidden layers
can be defined, but all hidden layers must have the same number of
nodes. No batch computation is performed: everything is computed
frame by frame. SIMD instructions (Intel AVX) are used to speed up the
computation. It has been tested only on Windows and Ubuntu on Intel PCs.
See "libsent/src/phmm/calc_dnn.c" for the actual implementation.
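
The frame-wise computation described above can be sketched as follows. This is a minimal pure-Python illustration, not the code in calc_dnn.c: the sigmoid hidden activation, the softmax output layer, the natural-log scale and the prior subtraction are assumptions based on common DNN-HMM practice, and all function names are hypothetical.

```python
import math


def affine(weights, bias, x):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def log_softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in values))
    return [v - log_sum for v in values]


def forward_frame(hidden, output, x):
    """Process one spliced frame. hidden is a list of (W, b) pairs,
    output is a single (W, b) pair. Returns log posteriors per node."""
    for weights, bias in hidden:
        x = sigmoid(affine(weights, bias, x))
    weights, bias = output
    return log_softmax(affine(weights, bias, x))


def state_scores(log_posteriors, priors, prior_factor=1.0):
    """Subtract prior_factor * log(prior) from each log posterior
    (assumed scaling, as is usual in hybrid DNN-HMM systems)."""
    return [lp - prior_factor * math.log(p)
            for lp, p in zip(log_posteriors, priors)]


# Tiny all-zero network: 2 inputs, one hidden layer of 2 nodes, 2 outputs.
hidden = [([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])]
output = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
log_post = forward_frame(hidden, output, [1.0, -1.0])
scores = state_scores(log_post, [0.5, 0.5])
print(sum(math.exp(v) for v in log_post))  # posteriors sum to 1 (up to rounding)
```

Each frame is processed independently, which mirrors the "no batch computation" note; a real implementation would use vectorized arithmetic instead of Python lists.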

To run, you need

1) an HMM AM (GMM defs are ignored, only its structure is used)
2) a DNN definition that corresponds to 1)
3) ".dnnconf" configuration file (text)

The .dnnconf file specifies the parameters, options, DNN definition
files, and other settings related to DNN computation. A sample
file is located in the top directory of the Julius archive as
"Sample.dnnconf".

The matrix/vector definitions should be given in ".npy" format
(i.e. Python's "numpy.save" format). Only the 32-bit float, little-endian
datatype is accepted.
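
One way to check the datatype requirement before handing files to Julius is to read the .npy header. The sketch below uses only the Python standard library and follows the published NPY file layout (magic string, version bytes, header length, then a dict literal); the function names are hypothetical, and the in-memory sample header is built by hand purely for demonstration.

```python
import ast
import io
import struct

MAGIC = b"\x93NUMPY"


def read_npy_header(stream):
    """Parse the header of a .npy stream and return its dict
    (keys 'descr', 'fortran_order' and 'shape')."""
    if stream.read(6) != MAGIC:
        raise ValueError("not a .npy file")
    major, _minor = stream.read(2)
    # Format 1.0 stores the header length in 2 bytes, later versions in 4.
    if major == 1:
        (header_len,) = struct.unpack("<H", stream.read(2))
    else:
        (header_len,) = struct.unpack("<I", stream.read(4))
    return ast.literal_eval(stream.read(header_len).decode("latin1"))


def is_float32_little_endian(header):
    return header["descr"] == "<f4"


def make_header_bytes(descr, shape):
    """Build a format 1.0 header by hand, only to have test input."""
    text = repr({"descr": descr, "fortran_order": False, "shape": shape})
    pad = -(len(MAGIC) + 2 + 2 + len(text) + 1) % 64  # align to 64 bytes
    text = text + " " * pad + "\n"
    return (MAGIC + bytes([1, 0]) + struct.pack("<H", len(text))
            + text.encode("latin1"))


header = read_npy_header(io.BytesIO(make_header_bytes("<f4", (2, 3))))
print(header["descr"], header["shape"])  # <f4 (2, 3)
```

For a file on disk, open it in binary mode and pass the file object to read_npy_header. When producing the files with NumPy, converting arrays with astype('<f4') before saving gives the expected descr.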

When preparing a model for DNN-HMM, note that ordering is important.
The order of the output nodes in the DNN must match the order of the HMM
state definition ids. If it does not, Julius won't work properly.
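
If the training toolkit emits output nodes in a different order, the per-node data (for example output-layer weight rows, biases or priors) has to be permuted before export. The helper below is a hypothetical sketch: it assumes 0-based state ids and that the mapping from each output node to its state id is already known.

```python
def reorder_by_state_id(per_node_items, node_state_ids):
    """per_node_items[k] belongs to DNN output node k, which models the HMM
    state with id node_state_ids[k]. Return the items reordered so that
    position i holds the item for state id i."""
    if sorted(node_state_ids) != list(range(len(per_node_items))):
        raise ValueError("state ids must be a permutation of 0..N-1")
    ordered = [None] * len(per_node_items)
    for item, state_id in zip(per_node_items, node_state_ids):
        ordered[state_id] = item
    return ordered


# Node 0 models state 2, node 1 models state 0, node 2 models state 1.
print(reorder_by_state_id(["a", "b", "c"], [2, 0, 1]))  # ['b', 'c', 'a']
```

The same permutation must be applied to every per-node quantity, otherwise weights, biases and priors fall out of step with each other.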

This version of Julius can perform DNN-HMM based
recognition by receiving pre-computed state probabilities. In that mode,
Julius does not read feature parameter vectors or compute the
output probabilities of the HMM states itself; it just reads the "output
probability vectors" of the HMM states, already computed by other tools,
via socket or file.

This output probability input is called an "outprob vector" in Julius.
It is a sequence of vectors, each holding pre-computed state
probabilities, with one dimension per HMM state.
A.2. Modular mode
=====================

The most important thing to know before using this scheme is that
each dimension of the input outprob vector must correspond to a state of
the HMM in Julius. In other words, the indices of the HMM states
and of the outprob vector must match. The details are described in the
following section.
Julius can still receive state output probability
vectors from another process. This is the older method, used before 4.4.

To run, you need

@@ -85,8 +122,8 @@ perform DNN-based recognition, please re-convert from ASCII hmmdefs
with the newest version of mkbinhmm.


D. Making outprob vector
==========================
D. Making outprob vector for Modular mode
==========================================

D.1. Format of outprob vector file
===================================
35 changes: 19 additions & 16 deletions 00readme-ja.txt
@@ -4,7 +4,7 @@

Julius

(Rev 4.4 2016/08/20)
(Rev 4.4 2016/08/30)
(Rev 4.3.1 2014/01/15)
(Rev 4.3 2013/12/25)
(Rev 4.2.3 2013/06/30)
@@ -41,8 +41,8 @@ Julius
About the move to GitHub
========================

Julius moved to GitHub as of version 4.3.1.
From now on, publication and sharing of the latest source code, runtime kits and development information, and
Julius moved to GitHub in 2016.
Publication and sharing of the latest source code, runtime kits and development information, and
the developer forum are all handled on GitHub.

Julius on GitHub
@@ -58,10 +58,15 @@ Julius
Julius-4.4
===========

Version 4.4 adds several updates and new features.
The new tools "adintool-gui" and "binlm2arpa" have been added.
Also, "mkbingram" can now output with character-code conversion.
In module mode, instead of exiting when a client disconnects, Julius now
Version 4.4 builds in the DNN computation needed when using DNN-HMM, so that
online speech recognition with a DNN-HMM can be performed stand-alone.
See 00readme-DNN.txt for details.


As new tools, "adintool-gui", a GUI version of adintool, and
"binlm2arpa", which converts a binary N-gram back to ARPA format, have been added.
Also, "mkbingram" can now directly convert the character code of a binary N-gram.
In module mode, instead of exiting when a client disconnects, Julius now
waits for the next client connection. In addition, several bugs were fixed, and compile errors on
recent OSes were resolved.

@@ -81,6 +86,7 @@ Julius-4.4
configure configure script
configure.in
Sample.jconf jconf configuration file sample
Sample.dnnconf DNN configuration file sample
julius/ Julius sources
libjulius/ JuliusLib core engine library sources
libsent/ JuliusLib general-purpose library sources
@@ -114,14 +120,7 @@ jconf
It covers usage, an introduction to each feature, and notes on limitations, so please refer
to it.

Home page: http://julius.sourceforge.jp/

The home page above also hosts a "developer forum" for exchanging information on
research and application development
using Julius. The latest Julius CVS update information is also posted there.
Please visit it.

Julius Forum: http://julius.sourceforge.jp/forum/
Home page: http://julius.osdn.jp/


License
@@ -143,7 +142,11 @@ Julius
Contact
===========

For questions and inquiries about Julius, use GitHub or
Questions and inquiries about Julius development are accepted on GitHub.

Julius on GitHub
https://github.com/julius-speech/julius

Alternatively, please contact the e-mail address below
(read 'at' as '@')

18 changes: 10 additions & 8 deletions 00readme.txt
@@ -4,7 +4,7 @@

Julius

(Rev 4.4 2016/08/20)
(Rev 4.4 2016/08/30)
(Rev 4.3.1 2014/01/15)
(Rev 4.3 2013/12/25)
(Rev 4.2.3 2013/06/30)
@@ -58,13 +58,14 @@ What's new in Julius-4.4
Julius is now hosted on GitHub:
https://github.com/julius-speech/julius

Version 4.4 includes several updates and new features. Two new
tools, "adintool-gui" and "binlm2arpa", were added, and "mkbingram" was
updated for audio input and binary LM conversion. In module mode, Julius
no longer exits when a client disconnects; instead it pauses and
waits for another client to connect. It also has many bug fixes and
updates for recent OSes and environments. Some documents that may help
users run Julius with DNN-HMM were also added.
Version 4.4 now supports stand-alone DNN-HMM recognition (see 00readme-DNN.txt).
Other features include:
- New tools:
  - adintool-gui: GUI version of adintool
  - binlm2arpa: reverse-converts a binary N-gram to ARPA format
- "mkbingram" now supports direct charset conversion of a binary LM
- No longer exits when the connection is lost in module mode
- Bug fixes

See "Release.txt" for full list of updates.
Run "configure --help=recursive" to see all configure options.
@@ -84,6 +85,7 @@ Contents of Julius-4.4
configure configure script
configure.in
Sample.jconf Sample configuration file
Sample.dnnconf Sample DNN configuration file
julius/ Julius sources
libjulius/ JuliusLib core engine library sources
libsent/ JuliusLib low-level library sources
16 changes: 8 additions & 8 deletions LICENSE.txt
@@ -3,10 +3,10 @@
"Large Vocabulary Continuous Speech Recognition Engine Julius"
License Agreement

Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
Copyright (c) 1991-2016 Kawahara Lab., Kyoto University
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan (IPA)
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology
Copyright (c) 2005-2016 Julius project team, Nagoya Institute of Technology

----------------------------------------------------------------------------

@@ -40,10 +40,10 @@ Julius
without alteration, and must be displayed or attached as is.

Notice:
Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
Copyright (c) 1991-2016 Kawahara Lab., Kyoto University
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan (IPA)
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology
Copyright (c) 2005-2016 Julius project team, Nagoya Institute of Technology

3. When you present results obtained by using this software,
please state clearly that you used "Large Vocabulary Continuous Speech Recognition Engine Julius".
@@ -79,10 +79,10 @@ Julius
Large Vocabulary Continuous Speech Recognition Engine Julius


Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
Copyright (c) 1991-2016 Kawahara Lab., Kyoto University
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology
Copyright (c) 2005-2016 Julius project team, Nagoya Institute of Technology

"Large Vocabulary Continuous Speech Recognition Engine Julius",
including Julian, is being developed at Kawahara Lab., Kyoto
@@ -129,10 +129,10 @@ whatsoever.

Form of copyright notice:

Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
Copyright (c) 1991-2016 Kawahara Lab., Kyoto University
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology
Copyright (c) 2005-2016 Julius project team, Nagoya Institute of Technology

3. When you publish or present any results by using the Software, you
must explicitly mention your use of "Large Vocabulary Continuous
8 changes: 4 additions & 4 deletions README.md
@@ -49,12 +49,12 @@ The main developer / maintainer is Akinobu Lee (ri@nitech.ac.jp).

# Download Julius

The latest release version is [4.3.1](https://github.com/julius-speech/julius/releases), released on January 15, 2014.
The latest release version is [4.4](https://github.com/julius-speech/julius/releases), released on August 30, 2016.
You can get the released package from the [Release page](https://github.com/julius-speech/julius/releases).

Version 4.3.1 is a bug fix release. Several bugs has been fixed.
See the "Release.txt" file for the full list of updates.
Run with "-help" to see full list of options.
Version 4.4 adds stand-alone DNN-HMM support, along with several new
tools and bug fixes. See the "Release.txt" file for the
full list of updates. Run with "-help" to see the full list of options.

# Toolkit and Assets

3 changes: 2 additions & 1 deletion Release-ja.txt
@@ -1,5 +1,6 @@
4.4 (2016.08.20)
4.4 (2016.08.30)
=================
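- DNN-HMM computation support
- "adintool-gui": adintool with an audio input monitor GUI (see adintool/README-GUI.txt)
- "binlm2arpa": converts a binary language model to ARPA
- Added option "-c" to "mkbingram" to convert the character encoding of a language model on output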
3 changes: 2 additions & 1 deletion Release.txt
@@ -1,5 +1,6 @@
4.4 (2016.08.20)
4.4 (2016.08.30)
=================
- DNN-HMM computation support
- "adintool-gui": adintool with input monitoring (see adintool/README-GUI.txt)
- "binlm2arpa": convert binary LM to ARPA format
- "mkbingram" now can convert text encoding of an LM by "-c" option
74 changes: 74 additions & 0 deletions Sample.dnnconf
@@ -0,0 +1,74 @@
####
#### Sample DNN Configuration for DNN-HMM Decoding (-dnnconf)
####

####
#### Feature Extraction
####

# feature type, in HTK parameter specification format
feature_type FBANK_D_A_Z

# julius options to configure the acoustic parameter extraction.
#
# The example below indicates that:
# 1. parameters should be loaded from an HTK config file,
# 2. use CMN/CVN,
# 3. load cepstral mean and variance from the specified file,
# 4. keep the cepstral mean/variance static, without updating them while processing
#
# The specified string will be expanded inline at the point where this
# dnnconf file is specified by "-dnnconf", and passed to Julius.
# As with other options in Julius, a later option overrides an
# earlier one. Please check the start-up messages to confirm that
# feature extraction is set up correctly.
#
feature_options -htkconf model/dnn/config.lmfb.40ch.jnas -cvn -cmnload model/dnn/norm.jnas -cmnstatic

# feature vector length (including delta or accel, before splicing)
feature_len 120

# splicing length
context_len 11

####
#### NN Definition
####

# number of input nodes (should be equal to (feature_len * context_len))
input_nodes 1320

# number of output nodes (num and order should correspond to HMM definition)
output_nodes 2004

# number of nodes in hidden layers
hidden_nodes 2048

# number of hidden layers (layers excluding input and output)
hidden_layers 5

# weights W and biases b for hidden layers, in numpy np.save() format
# dtype of these files should be '<f4' (32-bit float, little endian)!
W1 model/dnn/W_l1.npy
W2 model/dnn/W_l2.npy
W3 model/dnn/W_l3.npy
W4 model/dnn/W_l4.npy
W5 model/dnn/W_l5.npy
B1 model/dnn/bias_l1.npy
B2 model/dnn/bias_l2.npy
B3 model/dnn/bias_l3.npy
B4 model/dnn/bias_l4.npy
B5 model/dnn/bias_l5.npy

# also weights and biases for output layer
output_W model/dnn/W_output.npy
output_B model/dnn/bias_output.npy

# state prior in 'state_id(%d) prior(%e)' format
state_prior model/dnn/prior.dnn

# state prior factor
state_prior_factor 1.0

# batch size (not used)
batch_size 64
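
The relations stated in the comments of this sample (input_nodes equals feature_len times context_len, one W/B pair per hidden layer) can be checked mechanically before starting Julius. The sketch below is not Julius's own parser: the function names are hypothetical, and it assumes that each setting is a whitespace-separated "key value" line and that comments occupy whole lines starting with "#".

```python
def parse_dnnconf(text):
    """Parse 'key value' lines, skipping blank lines and '#' comment lines."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def check_dnnconf(conf):
    """Return a list of human-readable consistency problems (empty if none)."""
    problems = []
    expected = int(conf["feature_len"]) * int(conf["context_len"])
    if int(conf["input_nodes"]) != expected:
        problems.append(
            "input_nodes is %s, expected %d" % (conf["input_nodes"], expected))
    for layer in range(1, int(conf["hidden_layers"]) + 1):
        for prefix in ("W", "B"):
            key = "%s%d" % (prefix, layer)
            if key not in conf:
                problems.append("missing " + key)
    for key in ("output_W", "output_B"):
        if key not in conf:
            problems.append("missing " + key)
    return problems


sample = """
feature_len 120
context_len 11
input_nodes 1320
hidden_layers 2
W1 model/dnn/W_l1.npy
W2 model/dnn/W_l2.npy
B1 model/dnn/bias_l1.npy
# B2 is deliberately left out to show a reported problem
output_W model/dnn/W_output.npy
output_B model/dnn/bias_output.npy
"""
print(check_dnnconf(parse_dnnconf(sample)))  # ['missing B2']
```

The values 120, 11 and 1320 are taken from the sample above; the two-layer setup and the missing B2 entry are invented for the demonstration.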
4 changes: 4 additions & 0 deletions Sample.jconf
@@ -306,6 +306,7 @@
#-cmnnoupdate # keep initial mean, disable "-cmnupdate"
#-cmnmapweight 100.0 # weight for MAP-CMN
#-cvn # enable variance normalization
#-cmnstatic # totally static cmn/cvn

## Vocal tract length normalization (VTLN)
#-vtln 1.0 300 4800 # enable VTLN (alpha, lowerfreq, upperfreq)
@@ -317,6 +318,9 @@
#-ssalpha 2.0 # alpha coef. for spectral subtraction
#-ssfloor 0.5 # spectral floor coef.

## DNN-HMM definition (default disabled (= GMM-HMM))
#-dnnconf file # DNN configuration file

## Others
#-htkconf configfile # load analysis settings from HTK Config file

2 changes: 1 addition & 1 deletion adintool/mainloop.c
@@ -349,7 +349,7 @@ vecnet_sub(SP16 *Speech, int nowlen, Recog *recog)
#if 0
{
int i;
for (i = 0; i < vecnet_veclen; i++) {
for (i = 0; i < a->conf.vecnet_veclen; i++) {
printf(" %f", mfcc->tmpmfcc[i]);
}
printf("\n");
Empty file modified jclient-perl/jclient.pl
100644 → 100755