
Training takes too long!! #34

Open
andy-soft opened this issue Apr 21, 2017 · 20 comments

Comments

@andy-soft

Hello, I was wondering what the training times are for the demonstrations.
I just tried the English sequence labeler, and it took 1 hour to process 10% of the corpus! (Is this normal?)
It's well known that deep learning is CPU hungry, and I have only 2 cores and 8 GB of RAM (sorry).
Do I need to change the PC, or get a CUDA-capable GPU to help with the computation?
Is there a way to stop training manually, or programmatically after reaching a certain error rate?

I am also wondering whether you have ever tried sequence labeling on highly inflectional languages (like Spanish), which have a lot of inflectional power (complexity). There, the words as whole strings are almost useless: the vocabulary explodes to >300M words, and the "examples" found in text become too sparse. Even with negative sampling you never see certain combinations, because most verbs have over 200 inflected forms, encoding tense, person, gender, number, mood, etc. So there is a need to train on higher-level features without losing the "semantic" sense. Do you think this could be possible, e.g. decomposing the words (by means of controlled independent lemmatization) into parts/chunks (prefix, root, suffix), together with modal information and semantic features of the parts? My intuition is that this might lower the training time and improve generalization with a less extensive corpus, capturing higher-level syntax rules and, along the way, generating semantic content constraints (maybe even some common sense)...

It's just a theoretical question!

@zhongkaifu
Owner

Hi @andy-soft,

For your labeling task, how many categories do you want to label? Could you please share the configuration file you are using with me? Then I will estimate whether the current performance is reasonable. Currently, RNNSharp doesn't support GPU training. It supports CPU training with SIMD instructions only, so you need a powerful CPU with a recent SIMD instruction set, such as AVX, AVX2 and so on.

I did use RNNSharp for sequence labeling tasks on inflectional languages such as English, for example POS tagging, named entity recognition and so on. Usually the number of labeling categories is no more than 50. If there are too many categories, it will definitely affect performance, and you should optimize them, for example by splitting them into a few basic units for labeling. If it's really hard to reduce their number, you could use SampledSoftmax as the output layer type. For each token, it randomly samples some categories, plus the categories that appear in the current sentence, for training, instead of using the entire category set.
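
As a rough sketch of that sampling idea (illustrative Python only, not RNNSharp's actual implementation; the function name and the sample size are assumptions):

import random

def sampled_label_set(sentence_labels, all_labels, sample_size=20):
    # Categories that actually occur in the current sentence are always kept.
    in_sentence = set(sentence_labels)
    # A small random sample of the remaining categories acts as negatives,
    # so the output layer is computed over far fewer categories than the full set.
    negatives = [label for label in all_labels if label not in in_sentence]
    sampled = random.sample(negatives, min(sample_size, len(negatives)))
    return sorted(in_sentence) + sampled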

@andy-soft
Author

andy-soft commented Apr 22, 2017 via email

@zhongkaifu
Owner

It would be really appreciated if you could contribute to RNNSharp. :)

For word2vec, you can try my version: https://github.com/zhongkaifu/Txt2Vec It has higher performance than the original word2vec and supports incremental training.

For "the problem is the many labels of each word, the variability is huge, more
than 900 different POS labels, (EAGLES 2 version)", could you please make a specified example about it ? Sorry that I don't understand about it.

@andy-soft
Author

andy-soft commented Apr 22, 2017 via email

@zhongkaifu
Owner

Hi Andrés,

Thanks for your detailed explanation. It's really helpful.
For your task, to improve performance and reduce the number of output categories, you could try sub-word level segmentation and labeling, or character level segmentation and labeling. Taking the example you mentioned above, "hiperrecontrabuenísimo", if you have a sub-word dictionary for training, you could build a training corpus like:

hiper \t S_Aug1
recontra \t S_Aug2
buen \t S_CorePart
ísimo \t S_Aug3

So the label "Aug1Aug2CorePartAug3" is split into four basic tags. Or you could try character level labeling, such as:

h \t B_Aug1
i \t M_Aug1
p \t M_Aug1
e \t M_Aug1
r \t E_Aug1

This way, the number of output categories is significantly reduced.
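
A small converter along these lines might look like the following (illustrative Python; the segment list and tag names are just the example above, and RNNSharp itself only consumes the resulting tab-separated lines):

def to_subword_rows(segments):
    # segments, e.g. [('hiper', 'Aug1'), ('recontra', 'Aug2'), ('buen', 'CorePart'), ('ísimo', 'Aug3')]
    # One tab-separated training line per sub-word unit, each carrying its own basic tag.
    return ["{}\tS_{}".format(seg, tag) for seg, tag in segments]

def to_char_rows(segment, tag):
    # Character-level variant: B/M/E mark each character's position inside the segment,
    # e.g. ('hiper', 'Aug1') -> h:B_Aug1, i:M_Aug1, p:M_Aug1, e:M_Aug1, r:E_Aug1
    rows = []
    for i, ch in enumerate(segment):
        pos = "B" if i == 0 else ("E" if i == len(segment) - 1 else "M")
        rows.append("{}\t{}_{}".format(ch, pos, tag))
    return rows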

Thanks
Zhongkai Fu

@zhongkaifu
Owner

In addition, did you try the latest RNNSharp code (checked out from the master branch)? It's much faster than the released version, since I have not updated the release package yet.

@andy-soft
Author

andy-soft commented Apr 22, 2017 via email

@andy-soft
Author

andy-soft commented May 31, 2017 via email

@zhongkaifu
Owner

According to the RNN output lines, you are still using an older RNNSharp. Please sync the latest source code (not the released demo package, since I have not updated it yet), build it, and train your model.

It's okay to send me the training example, configuration file, and the command line you ran.

@andy-soft
Author

andy-soft commented Jun 1, 2017 via email

@andy-soft
Author

andy-soft commented Jun 2, 2017 via email

@zhongkaifu
Owner

First of all, your CPU has only two cores; this is the main reason why training is slow.

Secondly, I don't know whether your CPU supports the AVX and AVX2 instructions, which SIMD uses to speed up training. You could share the first few log lines with me, and I will take a look.

Finally, you could set TFEATURE_CONTEXT=0 to reduce the number of sparse features and speed up training.
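
The corresponding line in the feature configuration file would then look roughly like this (the comment and colon-separated syntax follow the sample configuration files; check the exact format of your own config):

#Context window of template (sparse) features; 0 = current token only, no neighbors
TFEATURE_CONTEXT: 0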

@andy-soft
Author

andy-soft commented Jun 2, 2017 via email

@zhongkaifu
Owner

Hi Andrés

It would be really appreciated if you would like to contribute to the RNNSharp project. :)

I cannot see your inline image for the CPU G2020. According to the information at http://www.cpu-world.com/CPUs/Pentium_Dual-Core/Intel-Pentium%20G2020.html, it seems this CPU doesn't support the AVX and AVX2 instructions, so RNNSharp cannot emit SIMD instructions to speed things up.

@andy-soft
Author

andy-soft commented Jun 3, 2017 via email

@zhongkaifu
Owner

I'm using System.Numerics.Vectors, which is a component of .NET Core, to emit SIMD instructions (AVX and AVX2) for RNNSharp.

If that AMD CPU supports these AVX instructions, RNNSharp can leverage them as well.

@andy-soft
Author

Hi there, I just got a CPU with 16 cores and 128 GB of RAM. Ready to train hard!!

@zhongkaifu
Owner

Cool!
I recently introduced MKL into Seq2SeqSharp and got a significant improvement in performance. If you like, you could try it in RNNSharp.

@andy-soft
Author

andy-soft commented Aug 26, 2018 via email

@andy-soft
Author

andy-soft commented Oct 2, 2018

I just started training the English sequence labeling (NER) sample from your 143 MB flat text file, 2.2M words.
I got a 32-core Xeon 3500 server with 128 GB of RAM, and...
it took >24 hours to reach a mere 0.89% token error and 8.89% sequence error (about 40% of the total training time, and then I aborted it). I am scared of the unusual time needed to train those sets....
The binary model file is 1.8 GB long!!
Are those normal training times and model sizes?
Or should I go and purchase a CUDA-capable multi-core GPU and use another LSTM library?
