No Inference results OUTPUT With the transformer_moe Model #435
Description
I’m using T2T to translate English to Chinese with TensorFlow 1.4 and Tensor2Tensor 1.2.9.
I'm trying to use the transformer_moe model to train a character-level En2Zh translation model.
Training runs fine, but the decoder outputs nothing.
Is the transformer_moe model able to output Chinese yet, or are there other settings I need?
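I haven't overridden any individual hyperparameters beyond the set itself; if extra settings are required, I assume they could be passed through the --hparams flag (a sketch only; batch_size and learning_rate are standard T2T hparams, but the values below are illustrative, not ones I have verified):

```bash
# Hypothetical override: pass individual hparams on top of the set.
# The values here are examples only, not recommended settings.
t2t-trainer \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --hparams="batch_size=2048,learning_rate=0.05" \
  --output_dir=$TRAIN_DIR
```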
The training and decoding script snippets are as follows:
Training
```bash
PROBLEM=translate_en_zh_nmt_char
MODEL=transformer_moe
HPARAMS=transformer_moe_8k

t2t-trainer \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --train_steps=30000 \
  --worker_gpu_memory_fraction=0.98 \
  --eval_early_stopping_steps=2 \
  --output_dir=$TRAIN_DIR
```
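Since training runs, the model and hparams set are presumably registered correctly; for completeness, the full registry can be listed like this (assuming the --registry_help flag behaves in 1.2.9 as in other releases):

```bash
# Print every registered model, hparams set, and problem, to confirm
# transformer_moe and transformer_moe_8k are present in this install.
t2t-trainer --registry_help
```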
Decoding
```bash
PROBLEM=translate_en_zh_nmt_char
MODEL=transformer_moe
HPARAMS=transformer_moe_8k
BEAM_SIZE=4
ALPHA=0.6

t2t-decoder \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA,batch_size=100" \
  --decode_from_file=$DECODE_FILE
```
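A quick way to reproduce this without preparing an input file would be interactive decoding (a sketch, assuming t2t-decoder's standard --decode_interactive flag behaves the same way in 1.2.9):

```bash
# Read source sentences from stdin; the (currently empty) translation
# is printed back after each line.
t2t-decoder \
  --data_dir=$DATA_DIR \
  --problems=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR \
  --decode_hparams="beam_size=$BEAM_SIZE,alpha=$ALPHA" \
  --decode_interactive
```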
The output:
```
INFO:tensorflow:Inference results INPUT: I have to write down everything that I remember, even if it's in the middle of the night.
INFO:tensorflow:Inference results OUTPUT:
INFO:tensorflow:Inference results INPUT: Open 7:00 a.m. to 10:00 a.m. for breakfast and 5:00 p.m. to 9:00 p.m. for dinner and dessert.
INFO:tensorflow:Inference results OUTPUT:
```
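To rule out a corrupt or incomplete checkpoint, the saved variables can be listed with TensorFlow's checkpoint inspector (a sketch; the step number below is illustrative and should be replaced with the latest model.ckpt-* prefix in $TRAIN_DIR):

```bash
# List every variable name and shape stored in the checkpoint, to
# confirm the MoE/expert weights were actually saved.
python -m tensorflow.python.tools.inspect_checkpoint \
  --file_name=$TRAIN_DIR/model.ckpt-30000
```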
By the way, I compared the transformer_moe model with the plain transformer, and the difference is striking!
With the transformer_moe model:
After about 21,000 steps, the loss is close to 0.006.
```
INFO:tensorflow:Saving dict for global step 21142: global_step = 21142, loss = 0.00608885, metrics-translate_en_zh_nmt_char/accuracy = 0.999784, metrics-translate_en_zh_nmt_char/accuracy_per_sequence = 0.996667, metrics-translate_en_zh_nmt_char/accuracy_top5 = 0.999784, metrics-translate_en_zh_nmt_char/approx_bleu_score = 0.529102, metrics-translate_en_zh_nmt_char/neg_log_perplexity = -0.00323181, metrics-translate_en_zh_nmt_char/rouge_2_fscore = 0.899869, metrics-translate_en_zh_nmt_char/rouge_L_fscore = 0.555063
INFO:tensorflow:Validation (step 21142): metrics-translate_en_zh_nmt_char/rouge_L_fscore = 0.555063, loss = 0.00608885, metrics-translate_en_zh_nmt_char/neg_log_perplexity = -0.00323181, global_step = 21142, metrics-translate_en_zh_nmt_char/accuracy_top5 = 0.999784, metrics-translate_en_zh_nmt_char/accuracy = 0.999784, metrics-translate_en_zh_nmt_char/approx_bleu_score = 0.529102, metrics-translate_en_zh_nmt_char/rouge_2_fscore = 0.899869, metrics-translate_en_zh_nmt_char/accuracy_per_sequence = 0.996667
```
With the transformer model:
After 350,000 steps, the loss is still at 3.66.
But the transformer_moe model still cannot produce any decoded output.
Thanks.