

[API DESIGN REVIEW] Recurrent Attention Layer(s) #7633

Closed
andhus opened this issue Aug 13, 2017 · 8 comments

@andhus
Contributor

andhus commented Aug 13, 2017

I've laid out an API suggestion for applying attention mechanisms in a recurrent setting.

Full design review doc: https://docs.google.com/document/d/1psFXnmMlSTg5JapgZKz26ag-zBu3ERrxkKoEzNpzl4w/edit?usp=sharing

There is a proof-of-concept implementation of the API in my (play-ground) add-on library extkeras as well as an example of its use in model training.

There is still a lot of work needed, but if you think it's a good general direction I'm happy to take lead in completing the implementation.

Happy for all feedback at any level!
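For context on the mechanism being discussed, here is a minimal NumPy sketch of one additive (Bahdanau-style) attention step inside a recurrent loop. This is purely illustrative of the computation, not the proposed Keras API; all names (`additive_attention_step`, the weight matrices) are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention_step(h_prev, annotations, W_h, W_a, v):
    """One attention step: score each encoder annotation against the
    previous recurrent state, then return the weighted context vector.

    h_prev:      (d_h,)   previous recurrent state
    annotations: (T, d_a) encoder outputs to attend over
    W_h, W_a, v: projection parameters (hypothetical names)
    """
    # e_t = v . tanh(W_h h_prev + W_a a_t) for each timestep t
    scores = np.tanh(h_prev @ W_h + annotations @ W_a) @ v  # (T,)
    weights = softmax(scores)                               # (T,)
    context = weights @ annotations                         # (d_a,)
    return context, weights

# Tiny example with random parameters
rng = np.random.default_rng(0)
d_h, d_a, d_e, T = 4, 5, 3, 6
h_prev = rng.standard_normal(d_h)
annotations = rng.standard_normal((T, d_a))
W_h = rng.standard_normal((d_h, d_e))
W_a = rng.standard_normal((d_a, d_e))
v = rng.standard_normal(d_e)

context, weights = additive_attention_step(h_prev, annotations, W_h, W_a, v)
print(weights.sum())  # attention weights sum to 1
```

The `context` vector would typically be concatenated with (or replace part of) the recurrent input at each step; how exactly that wiring is exposed is the subject of the API proposal above.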

@hubayirp

hubayirp commented Aug 13, 2017 via email

@hamelsmu

+1

@titu1994
Contributor

This is important. Attention mechanisms are used in various capacities in new papers. It would be highly beneficial to have a standard way to apply attention to general RNNs / LSTMs / GRUs.

@fchollet
Member

I fully agree that we need built-in support for RNN attention in Keras, and I think this is a solid first proposal. What would be some alternative proposals? If none, we will go with this one.

@andhus
Contributor Author

andhus commented Sep 22, 2017

As discussed in the comments of the API Review doc, the introduction of standalone RNN cells (#7943) will simplify the addition of Recurrent Attention and make it more modular. I'll adapt the API suggestion accordingly asap.
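The standalone-cell abstraction referenced here (#7943) reduces a recurrent layer to a cell with a `call(inputs, states) -> (output, new_states)` contract that a generic RNN driver steps over. A plain-Python sketch of how an attention wrapper could compose with any such cell follows; all class and function names are hypothetical, and the "attention" here is a uniform-weight stand-in, not a trained mechanism.

```python
# Plain-Python sketch of the standalone-cell contract discussed in #7943.
# All names are hypothetical; this is not the actual Keras API.

class EchoCell:
    """A trivial cell: output = input + previous state."""
    state_size = 1

    def call(self, inputs, states):
        output = inputs + states[0]
        return output, [output]

class AttentionCellWrapper:
    """Wraps any cell: mixes an attended context value into the input
    before delegating to the inner cell. Real attention would weight
    the annotations by learned scores; here we simply average them."""

    def __init__(self, cell, annotations):
        self.cell = cell
        self.annotations = annotations  # sequence to attend over

    def call(self, inputs, states):
        # Stand-in for a real attention mechanism: uniform weights.
        context = sum(self.annotations) / len(self.annotations)
        return self.cell.call(inputs + context, states)

def run_rnn(cell, sequence, initial_state):
    """Minimal driver playing the role of keras.layers.RNN."""
    states = initial_state
    outputs = []
    for x in sequence:
        out, states = cell.call(x, states)
        outputs.append(out)
    return outputs

cell = AttentionCellWrapper(EchoCell(), annotations=[1.0, 3.0])
outputs = run_rnn(cell, [0.0, 1.0, 2.0], initial_state=[0.0])
print(outputs)  # [2.0, 5.0, 9.0]
```

The point of the composition is that the wrapper is itself just another cell, so any generic recurrent driver can run it unchanged, which is what makes the cell-based proposal more modular than baking attention into each recurrent layer class.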

@andhus
Contributor Author

andhus commented Sep 25, 2017

@fchollet The API doc is now updated; see the PRs above, which cover the most crucial parts. Let me know if you think it's on the right track and I'll continue with steps 3 and 4 according to the document. The concept of standalone cells really makes these things more modular and neat :D

@stabgan

stabgan commented Jul 13, 2018

@andhus I need some help. I've researched a lot but I'm stuck on something: I need to extract the attention weights, but I couldn't find a tutorial or blog post on how to do that. Consider the IMDB dataset: after training, I will input some test data, and I want to know which words in the test data are most important. I only know Keras well. Is there any way? For a better sense of what I mean, I want something like deepmoji.mit.edu, but simpler. Any kind of help by anyone is appreciated.

@gabrieldemarmiesse
Contributor

Closing this issue in favor of #11172 for organization purposes. This issue can be reopened later on if necessary.
