Description
My batch_size is 64. I pretrain my model for about 50,000 iterations and get a better result than pgen's. Then I turn on the coverage mechanism and train the model for another 2,000 iterations. The coverage loss does not decrease to 0.2, the value mentioned for the pgen model. The final result on the ROUGE-1 metric is about 38.90. Are there any tricks for adding the coverage mechanism? How can I get a result similar to the pgen model's?
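For context, the coverage loss in question is the one defined in See et al. (2017); a minimal NumPy sketch of how it is computed (array names are illustrative, not taken from this repo):

```python
import numpy as np

def coverage_loss(attn_dists):
    """Coverage loss from See et al. (2017): covloss_t = sum_i min(a_i^t, c_i^t).
    attn_dists: (dec_steps, src_len) array of per-step attention distributions."""
    coverage = np.zeros_like(attn_dists[0])  # c^0 = 0 (nothing covered yet)
    step_losses = []
    for a_t in attn_dists:
        # Penalize attention mass that lands where coverage has already accumulated.
        step_losses.append(np.sum(np.minimum(a_t, coverage)))
        coverage += a_t  # c^{t+1} = c^t + a^t
    return float(np.mean(step_losses))
```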
Activity
yaserkl commented on Dec 13, 2018
No, this issue is well discussed on the original pointer-generator model page.
Every time you run this model it will generate a different result, because of the multi-threaded batching the model uses.
The only fix that works for me is to use a single thread per queue for batching and to set a seed for every randomizer throughout the framework.
Try setting these parameters to 1:
- `example_queue_threads`
- `batch_queue_threads`
If you vary the seed parameter, you might even manage a better result than the original paper's; I got better results myself, as presented in our latest paper.
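A sketch of both tweaks, assuming the repo reads these as TF flags (the flag names come from the list above; the seeding calls are standard TF 1.x/NumPy, not specific to this codebase):

```python
import random
import numpy as np
import tensorflow as tf

SEED = 111  # illustrative value; any fixed seed works

# Run with single-threaded batching so examples are consumed in a
# deterministic order, e.g.:
#   python run_summarization.py --example_queue_threads=1 --batch_queue_threads=1

# Seed every source of randomness the framework touches (TF 1.x API).
random.seed(SEED)
np.random.seed(SEED)
tf.set_random_seed(SEED)
```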
My personal experience is that the running average loss (at least as it is defined in this paper) is not the best indicator for selecting the best evaluation model. In the paper above, I use the average ROUGE reward during evals as another criterion for saving my best model, and it sometimes works better than the running average loss.
gm0616 commented on Dec 18, 2018
Well, thanks for your response; I'll try the methods you mentioned above to deal with the coverage mechanism.
You said you use the ROUGE reward during evals. As far as I know, computing ROUGE is quite slow, so how do you apply this metric to evaluate a given ckpt? And which ROUGE metric do you use for evaluation: 1, 2, or L?
yaserkl commented on Dec 25, 2018
Yes, it's quite slow and increases the evaluation time per batch by two to three times (without ROUGE-based eval, each evaluation takes around 0.5 s on a P100 GPU with batch size 8; with ROUGE it rises to about 1.5 s, which is still fine for my case). Also, I'm using ROUGE-L to pick the best training ckpt.
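If it helps, here is a sketch of how ROUGE-L-based checkpoint selection might look, using the third-party `rouge` pip package (not the repo's own scorer; `saver` and `sess` are the usual TF 1.x objects):

```python
from rouge import Rouge  # pip install rouge

rouge = Rouge()
best_rouge_l = 0.0

def maybe_save_best(decoded, references, saver, sess, ckpt_path):
    """Score an eval batch with ROUGE-L F1 and checkpoint on improvement."""
    global best_rouge_l
    scores = rouge.get_scores(decoded, references, avg=True)
    rouge_l_f1 = scores['rouge-l']['f']
    if rouge_l_f1 > best_rouge_l:
        best_rouge_l = rouge_l_f1
        saver.save(sess, ckpt_path)  # keep the ckpt with the best eval ROUGE-L
    return rouge_l_f1
```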
xiangriconglin commented on Jul 1, 2019
Excuse me,
when evaluating, is it necessary to include the train op in the fetches passed to run?
The function run_train_steps() uses:
```python
to_return = {
    'train_op': self._shared_train_op,
    'summaries': self._summaries,
    'pgen_loss': self._pgen_loss,
    'global_step': self.global_step,
    'decoder_outputs': self.decoder_outputs,
}
```
However, run_eval_steps() uses:
```python
to_return = {
    'summaries': self._summaries,
    'pgen_loss': self._pgen_loss,
    'global_step': self.global_step,
    'decoder_outputs': self.decoder_outputs,
}
```
When I ran eval steps, the model did not update and the average loss stayed the same. Is anything wrong with my running process?
Looking forward to your reply; thank you very much.
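(For reference: in TF 1.x, variables only update when the train op is among the fetches passed to sess.run, so an eval step without it is expected to leave the weights, and hence the loss, unchanged unless it reloads a newer checkpoint. A minimal illustration with generic names, not taken from this repo:)

```python
import tensorflow as tf  # TF 1.x

x = tf.placeholder(tf.float32, [None, 1])
y = tf.placeholder(tf.float32, [None, 1])
w = tf.Variable([[0.0]])
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))
train_op = tf.train.AdamOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {x: [[1.0]], y: [[2.0]]}
    # Train step: fetching train_op applies gradients, so the loss changes.
    print(sess.run({'train_op': train_op, 'loss': loss}, feed)['loss'])
    print(sess.run({'train_op': train_op, 'loss': loss}, feed)['loss'])
    # Eval-style step: no train_op in the fetches, so the weights stay fixed
    # and the loss is identical across repeated calls.
    print(sess.run({'loss': loss}, feed)['loss'])
    print(sess.run({'loss': loss}, feed)['loss'])
```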