
Training too slow #183

Open
janenie opened this issue Nov 22, 2017 · 7 comments

@janenie

janenie commented Nov 22, 2017

Hi,
Is there any possible way to accelerate the code?
I am training on data with only a 300-word vocabulary and 30,000 training instances with a maximum length of 50, but it takes almost 1 hour to finish training one epoch.

What happened to this version of the code?

Thanks

@brightmart

brightmart commented Nov 22, 2017

I am using a 100k vocabulary and 10 million training examples; it takes 32 hours to train 127k steps, reaching around 17 BLEU for English to Chinese. Batch size is set to 64.

1. Use a batch size as big as possible, as long as the GPU can support it.
2. The hidden size is 1024 by default; you can reduce it to 800 or 512 if the GPU runs out of memory.
3. For machine translation, deeper layers take longer to train and use more GPU memory, but the performance improvement is small. You can set the number of layers to 2.

Here is the command:
CUDA_VISIBLE_DEVICES=7 nohup python -m nmt.nmt --attention=normed_bahdanau --src=en --tgt=zh \
  --train_prefix=nmt_data_chinese/train --dev_prefix=nmt_data_chinese/dev --test_prefix=nmt_data_chinese/test \
  --out_dir=nmt_attention_model_big_pte_batch64 --num_train_steps=4800000 --steps_per_stats=100 \
  --num_layers=2 --num_units=800 --dropout=0.5 --metrics=bleu --learning_rate=0.001 --optimizer=adam \
  --encoder_type=bi --batch_size=64 --attention_architecture=gnmt_v2 --src_max_len=25 \
  --subword_option=bpe --unit_type=layer_norm_lstm --vocab_prefix=nmt_data_chinese/vocabulary &
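
For a rough sense of what those numbers imply, here is a small back-of-the-envelope sketch. The figures are just the ones quoted above (10 million pairs, batch size 64, 127k steps in 32 hours), not measurements from this repository:

```python
# Back-of-the-envelope throughput estimate using the numbers quoted above
# (10M training pairs, batch_size=64, 127k steps in 32 hours).
num_examples = 10_000_000   # training sentence pairs
batch_size = 64
steps_done = 127_000
hours = 32

steps_per_epoch = num_examples / batch_size   # ~156k steps for one full epoch
epochs_done = steps_done / steps_per_epoch    # ~0.8 epochs covered in 32 hours
steps_per_hour = steps_done / hours           # ~4k steps per hour

print(f"steps per epoch: {steps_per_epoch:,.0f}")
print(f"epochs covered: {epochs_done:.2f}")
print(f"throughput: {steps_per_hour:,.0f} steps/hour")
```

For comparison, 30,000 training instances at batch size 64 would be only about 470 steps per epoch.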

@yapingzhao

I have a question: my training corpus is about 70,000 sentences. What dictionary (vocabulary) size is appropriate? Thank you.

@brightmart

brightmart commented Apr 18, 2018 via email

@yapingzhao

yapingzhao commented Apr 18, 2018

I would like to ask: is 50k equivalent to 50,000 (dictionary size)? I'm a neural network beginner (smile).
Thank you.

@brightmart

brightmart commented Apr 19, 2018 via email

@vikaskumarjha9

@brightmart What is the size of your dev set? If we have a bigger dev set, does training take longer to complete?

@brightmart

If you have a big dev set, you can choose part of it to evaluate on during training.
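
A minimal sketch of one way to do that, assuming your dev files follow the --dev_prefix naming used in the command above (dev.en / dev.zh and the sample size are placeholders; adjust them to your own data):

```python
# Minimal sketch: keep only the first n_keep parallel dev pairs so that
# evaluation during training runs on a smaller set.
# File names and n_keep are assumptions; adjust them to your own data.
n_keep = 2000

with open("dev.en", encoding="utf-8") as src, \
     open("dev.zh", encoding="utf-8") as tgt, \
     open("dev_small.en", "w", encoding="utf-8") as src_out, \
     open("dev_small.zh", "w", encoding="utf-8") as tgt_out:
    for i, (s, t) in enumerate(zip(src, tgt)):
        if i >= n_keep:
            break
        src_out.write(s)   # write source and target together so pairs stay aligned
        tgt_out.write(t)
```

You could then point --dev_prefix at the smaller files instead of the full dev set.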
