which attention architecture is used in NER? #6

Open
omerarshad opened this issue Oct 2, 2018 · 7 comments

@omerarshad

I want to understand how you used attention in the NER task. Is there any paper or article that explains this? Thanks.

@qq547276542

According to the README, the attention mechanism is not well suited to the NER task:
The variant modules include Stack Bidirectional RNN (multi-layer), Multi-RNN Cells (multi-layer), Luong/Bahdanau Attention Mechanism, Self-Attention Mechanism, Residual Connection, Layer Normalization, and so on. However, these modifications did not improve performance significantly (no F1 improvement of >= 1.5, and sometimes even worse than the baseline model). It is easy to apply these variants and train by modifying the config settings in train_conll_ner_blstm_cnn_crf.py.
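For anyone wondering what those variant modules look like in practice, here is a minimal sketch of a self-attention block with a residual connection and layer normalization applied on top of BiLSTM outputs. This is written in PyTorch purely for illustration; it is not this repository's code, and the class and parameter names (SelfAttentionBlock, hidden_dim, etc.) are hypothetical.

```python
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionBlock(nn.Module):
    """Scaled dot-product self-attention + residual connection + layer norm.

    Hypothetical illustration of the 'variant modules' named in the README;
    not the repository's actual implementation.
    """
    def __init__(self, hidden_dim):
        super().__init__()
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)
        self.value = nn.Linear(hidden_dim, hidden_dim)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, x, pad_mask=None):
        # x: (batch, seq_len, hidden_dim) -- e.g. the BiLSTM output states
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)  # (batch, seq, seq)
        if pad_mask is not None:
            # pad_mask: (batch, seq_len), True at padding positions
            scores = scores.masked_fill(pad_mask.unsqueeze(1), float("-inf"))
        attn = F.softmax(scores, dim=-1)
        context = attn @ v
        # residual connection followed by layer normalization
        return self.norm(x + context)
```

The output has the same shape as the BiLSTM states, so a block like this can be dropped between the encoder and the CRF layer without changing anything else in the model.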

@qq547276542

In my own experiments, the attention mechanism really did not work, but Layer Normalization did improve the robustness of the model.

@omerarshad
Author

Well, in my experiments attention-only models achieve results comparable to the LSTM, and even beat the LSTM with much less training time.

@qq547276542

Well, in my experiments attention-only models achieve results comparable to the LSTM, and even beat the LSTM with much less training time.

Maybe there are some problems with my experiment. I have tried BLSTM+SelfAttention+CRF, and the effect is not as good as BLSTM+CRF. So the structure of your model is SelfAttention+CRF, with no LSTM? I want to give it a try.

@omerarshad
Author

Yes, the structure of my model is attention+CRF only.
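The code for this model is not posted in the thread, but an attention+CRF tagger with no recurrent layers would look roughly like the sketch below. This is a guess at the general shape, not omerarshad's actual model: it is written in PyTorch with the third-party pytorch-crf package for the CRF layer, and the embedding layer, head count, and dimensions are all assumptions.

```python
import torch.nn as nn
from torchcrf import CRF  # third-party package: pip install pytorch-crf

class AttentionCRFTagger(nn.Module):
    """Hypothetical attention-only tagger: embedding -> self-attention -> CRF.

    No recurrent layers at all, which is why training can be much faster
    than a BiLSTM-based tagger.
    """
    def __init__(self, vocab_size, num_tags, dim=128, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(dim, num_tags)        # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)

    def _emissions(self, tokens, mask):
        x = self.embed(tokens)                       # (batch, seq_len, dim)
        # key_padding_mask expects True at padding positions
        x, _ = self.attn(x, x, x, key_padding_mask=~mask)
        return self.proj(x)                          # (batch, seq_len, num_tags)

    def loss(self, tokens, tags, mask):
        # the CRF forward pass returns the log-likelihood; negate it for a loss
        return -self.crf(self._emissions(tokens, mask), tags, mask=mask)

    def predict(self, tokens, mask):
        # Viterbi decoding; returns one list of tag indices per sentence
        return self.crf.decode(self._emissions(tokens, mask), mask=mask)
```

Note that this sketch omits positional encodings; without them, pure self-attention has no notion of word order, which a real attention-only NER model would normally need.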

@qq547276542

Yes, the structure of my model is attention+CRF only.

Do you have the relevant code? Can I refer to it?

@VioletJKI

I do not use CRF, and I get the best result.

@MingLunHan Excuse me, what is your model's architecture? BiLSTM + attention only?
