Multi Head Attention Layer #7803
Comments
Thanks @andhus, I'll follow the updates.
Yeah, an attention layer is the de facto standard in NLP for reaching state-of-the-art results, whether generative or classification. I have implemented an attention layer in Keras and have obtained good results with it. It would be much better if the layer were added to Keras, so the public can use it directly. Should I share the implementation in this thread, or what is the procedure?
@soham97 it would be great if you could; I was trying to implement it and couldn't get it working.
Hi, I implemented this some time ago in a fork. It is somewhat dirty and lacks test suites, but it works (there is an NMT example of this). Cheers.
@lvapeab thanks, man!
What's the status on this?
Closing, as there is now a Keras-friendly multi-head attention layer in TensorFlow Addons. Thanks for the feature request!
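For readers landing here, a minimal usage sketch of that Addons layer. The `head_size`/`num_heads` constructor arguments and the list-of-tensors call convention are how I recall the `tfa.layers.MultiHeadAttention` documentation; treat them as assumptions and verify against the installed `tensorflow_addons` version.

```python
import tensorflow as tf
import tensorflow_addons as tfa  # the Addons package mentioned in the closing comment

# Toy batch: 3 examples, query length 5, key/value length 6, 128 features each.
query = tf.random.normal((3, 5, 128))
key = tf.random.normal((3, 6, 128))
value = tf.random.normal((3, 6, 128))

# Assumed signature: head_size per attention head, num_heads heads in parallel.
mha = tfa.layers.MultiHeadAttention(head_size=64, num_heads=8)

# Assumed call convention: a [query, key, value] list of tensors.
attended = mha([query, key, value])
print(attended.shape)  # one attended vector per query position, e.g. (3, 5, feature_dim)
```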
I think it is a good idea to start thinking about how to implement this sort of layer in Keras.
I know it is a very recent technique, but I believe it will be cutting-edge technology in deep learning for the coming years.
Paper: Attention Is All You Need (https://arxiv.org/abs/1706.03762)
Blog showing some results: Google Research Blog
Tensor2Tensor library: tensor2tensor
PyTorch implementation: pytorch-t2t
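The layer this thread asks for is the multi-head scaled dot-product attention from the paper linked above. Below is a minimal, self-contained sketch of that mechanism as a custom tf.keras layer; the class and argument names (`SimpleMultiHeadAttention`, `d_model`, `num_heads`) are illustrative and not part of any existing Keras API.

```python
# A minimal sketch of multi-head scaled dot-product attention
# ("Attention Is All You Need") as a custom tf.keras layer.
import tensorflow as tf


class SimpleMultiHeadAttention(tf.keras.layers.Layer):
    def __init__(self, d_model=64, num_heads=4, **kwargs):
        super().__init__(**kwargs)
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.d_model = d_model
        self.num_heads = num_heads
        self.depth = d_model // num_heads
        # Learned projections for queries, keys, values, and the final output.
        self.wq = tf.keras.layers.Dense(d_model)
        self.wk = tf.keras.layers.Dense(d_model)
        self.wv = tf.keras.layers.Dense(d_model)
        self.wo = tf.keras.layers.Dense(d_model)

    def _split_heads(self, x, batch_size):
        # (batch, seq, d_model) -> (batch, heads, seq, depth)
        x = tf.reshape(x, (batch_size, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, query, value):
        batch_size = tf.shape(query)[0]
        q = self._split_heads(self.wq(query), batch_size)
        k = self._split_heads(self.wk(value), batch_size)
        v = self._split_heads(self.wv(value), batch_size)
        # Scaled dot-product attention: softmax(Q K^T / sqrt(depth)) V
        scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(
            tf.cast(self.depth, tf.float32))
        weights = tf.nn.softmax(scores, axis=-1)
        out = tf.matmul(weights, v)                 # (batch, heads, seq_q, depth)
        out = tf.transpose(out, perm=[0, 2, 1, 3])  # (batch, seq_q, heads, depth)
        out = tf.reshape(out, (batch_size, -1, self.d_model))
        return self.wo(out)


# Self-attention over a toy batch: 2 sequences of length 5 with 64 features.
x = tf.random.normal((2, 5, 64))
layer = SimpleMultiHeadAttention(d_model=64, num_heads=4)
print(layer(x, x).shape)  # (2, 5, 64)
```

Each head attends in a lower-dimensional subspace (depth = d_model / num_heads), and the per-head outputs are concatenated and passed through a final linear projection, as described in the paper.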