
Suggestion for GPT2 decoding speedup #61

Closed
chiragjn opened this issue Nov 14, 2019 · 2 comments
Labels
enhancement New feature or request

Comments

@chiragjn
Contributor

chiragjn commented Nov 14, 2019

According to the transformers documentation, the GPT2 LM head supports an argument called past that speeds up decoding by reusing the attention key/value tensors computed in previous steps. I have started making changes in my fork to get some numbers:

chiragjn#2

I am not entirely sure how to get this working for XLNet, but there is a similar mems argument.

Would you accept such a PR upstream if it speeds up decoding?
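
For context, here is a minimal sketch of what greedy decoding with the past cache looks like, assuming the transformers API from around the time of this issue, where the argument is named past and the LM head returns the cached tensors as the second element of its output tuple (newer versions rename the argument to past_key_values):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("The quick brown fox", return_tensors="pt")
generated = input_ids
past = None

with torch.no_grad():
    for _ in range(20):
        # On the first step feed the whole prompt; afterwards only the
        # newest token, since `past` already caches the attention
        # key/values for all earlier positions.
        step_input = generated if past is None else generated[:, -1:]
        outputs = model(step_input, past=past)
        logits, past = outputs[0], outputs[1]
        next_token = torch.argmax(logits[:, -1, :], dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)

print(tokenizer.decode(generated[0]))
```

The saving comes from each step after the first attending over cached keys/values instead of re-encoding the whole prefix, so the per-step cost stays roughly constant rather than growing with the length of the generated sequence.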

@makcedward
Owner

@chiragjn
It would be great if you could submit a PR, as I am working on improving performance in terms of speed.

@chiragjn
Contributor Author

Closing this, as #63 is now in master.
