
Cannot reproduce the result in pointer-generator with coverage mechanism; always inferior to pgen model. #23

Open

Description

gm0616 (Author)

My batch size is 64. I pretrained my model for about 50,000 iterations and got a better result than pgen's. Then I turned on the coverage mechanism and trained the model for another 2,000 iterations. The coverage loss does not decrease to 0.2, the value mentioned for the pgen model, and the final result on the ROUGE-1 metric is about 38.90. Are there any tricks to adding the coverage mechanism? How can I get a result similar to the pgen model's?

Activity

yaserkl (Owner) commented on Dec 13, 2018

No, this issue is well discussed on the original pointer-generator model page.
Every run of this model generates a different result due to the multi-process batching it uses.
The only solution I usually use to fix my model parameters is to use a single queue for batching and to make sure every randomizer throughout the framework is seeded.
Try setting these parameters to 1:
example_queue_threads
batch_queue_threads
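
The idea above can be sketched in plain Python: with a single consumer (the analogue of setting example_queue_threads and batch_queue_threads to 1) and an explicitly seeded shuffler, the batch order is identical across runs. The helper make_batches below is hypothetical, not part of the repo.

```python
import random

def make_batches(examples, batch_size, seed):
    """Deterministic batching sketch: one 'queue thread' plus a fixed seed.

    Stands in for example_queue_threads=1 / batch_queue_threads=1 with
    all randomizers seeded, so repeated runs see the same batch order.
    """
    rng = random.Random(seed)   # seed the shuffler explicitly
    shuffled = examples[:]      # don't mutate the caller's list
    rng.shuffle(shuffled)       # single consumer -> reproducible order
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]
```

With multiple queue threads, the interleaving of workers would reorder examples nondeterministically even with a fixed seed, which is why the thread counts matter as much as the seed.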

If you vary the seed parameter, you might even get a better result than the original paper; I've gotten better results myself, as presented in our latest paper.

My personal experience is that the running average loss (at least as it is defined in this paper) is not the best indicator for selecting the best evaluation model. In the paper above, I use the average ROUGE reward during evals as another way of saving my best model, and it sometimes works better than the running average loss.
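
A minimal sketch of that selection rule, assuming each eval run produces a list of per-example ROUGE rewards (the helper and the data layout are hypothetical, not the repo's actual API):

```python
def select_best_checkpoint(eval_history):
    """Pick the checkpoint whose eval run had the highest average ROUGE reward.

    eval_history: list of (ckpt_name, rouge_scores) pairs, where
    rouge_scores is a non-empty list of per-example ROUGE rewards.
    """
    def avg(scores):
        return sum(scores) / len(scores)
    # max over checkpoints by mean reward; return the checkpoint name
    return max(eval_history, key=lambda item: avg(item[1]))[0]
```

In practice this runs alongside (not instead of) loss-based selection, keeping one "best by loss" and one "best by ROUGE" checkpoint.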

gm0616 (Author) commented on Dec 18, 2018

Well, thanks for your response. I'll try the methods you mentioned above to manage the coverage mechanism.
You said you use the ROUGE reward during evals. As far as I know, computing ROUGE is quite slow; how do you implement this metric to evaluate a given checkpoint? And which ROUGE metric do you use for evaluation: 1, 2, or L?

yaserkl (Owner) commented on Dec 25, 2018

Yes, it's quite slow and increases the evaluation time per batch by two to three times (without ROUGE-based eval, each evaluation takes around 0.5 s on a P100 GPU with batch size 8, but with ROUGE it rises to about 1.5 s, which is still fine for my case). Also, I'm using ROUGE-L to pick the best training checkpoint.
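
For reference, ROUGE-L is based on the longest common subsequence (LCS) between the candidate and reference token sequences; a self-contained sketch of the F1 variant is below (this is a generic textbook implementation, not the scoring code used in the repo, which typically calls an external ROUGE package):

```python
def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 between two token lists, via longest common subsequence."""
    m, n = len(candidate), len(reference)
    # dp[i][j] = LCS length of candidate[:i] and reference[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if candidate[i - 1] == reference[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    precision = lcs / m   # fraction of candidate tokens in the LCS
    recall = lcs / n      # fraction of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

The O(m*n) dynamic program is the reason ROUGE-based eval is slow: it runs once per candidate/reference pair in every eval batch.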

xiangriconglin commented on Jul 1, 2019

(Quoting @yaserkl's comment of Dec 13, 2018, above.)

Excuse me,
when evaluating, is it necessary to fetch the train operation as well?
The function run_train_steps() returns:
to_return = {
'train_op': self._shared_train_op,
'summaries': self._summaries,
'pgen_loss': self._pgen_loss,
'global_step': self.global_step,
'decoder_outputs': self.decoder_outputs
}
However, run_eval_steps() returns:
to_return = {
'summaries': self._summaries,
'pgen_loss': self._pgen_loss,
'global_step': self.global_step,
'decoder_outputs': self.decoder_outputs
}
When I ran eval steps, the model did not update and the average loss stayed the same. Is anything wrong in my running process?
Looking forward to your reply; thank you very much.
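
For what it's worth, the contrast between the two fetch dictionaries can be illustrated with a toy sketch (the train_step/eval_step functions below are hypothetical stand-ins, not the repo's code): in TensorFlow, session.run only executes the ops you fetch, so omitting train_op from the eval fetches means the loss is computed but no variables are updated. The eval loss only moves when a separate training job writes a newer checkpoint.

```python
def train_step(param, grad, lr=0.1):
    """Train fetch includes the update op: the parameter moves."""
    loss = param * param          # toy quadratic loss, computed first
    param = param - lr * grad     # the 'train_op' analogue
    return param, loss

def eval_step(param):
    """Eval fetch omits the update op: loss is measured, nothing changes."""
    loss = param * param
    return param, loss
```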


Cannot reproduce the result in pointer-generator with coverage mechanism; always inferior to pgen model · Issue #23 · yaserkl/RLSeq2Seq