forked from tensorflow/hub
Commit
Showing 2 changed files with 56 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
# Module nascarr/nq/1 | ||
A question answering model trained on the Natural Questions dataset [1, 2]. | ||
|
||
|
||
<!-- asset-path: https://github.com/nascarr/natural-questions/raw/master/models/v1/model.tar.gz --> | ||
<!-- module-type: text-question-answering --> | ||
<!-- network-architecture: Transformer --> | ||
<!-- dataset: Natural Questions --> | ||
<!-- language: en --> | ||
<!-- fine-tunable: true --> | ||
<!-- format: saved_model_2 --> | ||
|
||
## Overview | ||
|
||
The model was created as part of a gold medal solution in the TensorFlow 2.0 Question Answering competition [3]. | ||
The solution develops ideas from the paper "A BERT Baseline for the Natural Questions" [4] and uses BERT Large Uncased (Whole Word Masking) | ||
as the pretrained model [5]. The model is fine-tuned on the Natural Questions dataset. | ||
A more detailed description of the solution can be found on the Kaggle forum [6]. | ||
|
||
|
||
#### Example use | ||
|
||
``` | ||
import tensorflow as tf | ||
import tensorflow_hub as hub | ||
seq_len = 512  # input length in tokens | ||
unique_id = tf.keras.Input(shape=(1,), dtype=tf.int64, name='unique_id') | ||
input_ids = tf.keras.Input(shape=(seq_len,), dtype=tf.int32, name='input_ids')  # input word ids | ||
input_mask = tf.keras.Input(shape=(seq_len,), dtype=tf.int32, name='input_mask') | ||
segment_ids = tf.keras.Input(shape=(seq_len,), dtype=tf.int32, name='segment_ids') | ||
# Load the module as a fine-tunable Keras layer. | ||
model = hub.KerasLayer("https://tfhub.dev/nascarr/nq/1", trainable=True) | ||
# Outputs: unique_id, start and end logits, and the answer type. | ||
unique_id, start_logits, end_logits, ans_type = model([unique_id, input_ids, input_mask, segment_ids]) | ||
``` | ||
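
The example above expects `input_ids`, `input_mask`, and `segment_ids` of length `seq_len`, but does not show how to build them. The sketch below assumes the usual BERT layout, `[CLS] question [SEP] context [SEP]` followed by padding, and the special-token ids of the standard BERT uncased vocabulary (`[CLS]` = 101, `[SEP]` = 102, `[PAD]` = 0). The function name `pack_inputs` and the layout are illustrative assumptions, not something confirmed by the module itself; tokenization is left out.

```python
def pack_inputs(question_ids, context_ids, seq_len=512,
                cls_id=101, sep_id=102, pad_id=0):
    """Pack tokenized question and context ids into fixed-length BERT inputs.

    Hypothetical helper: assumes the [CLS] question [SEP] context [SEP] layout.
    """
    # Reserve room for [CLS] and the two [SEP] tokens; truncate the context.
    max_context = seq_len - len(question_ids) - 3
    if max_context < 0:
        raise ValueError("question is too long for seq_len")
    context_ids = list(context_ids)[:max_context]

    input_ids = [cls_id] + list(question_ids) + [sep_id] + context_ids + [sep_id]
    # Segment 0 covers [CLS] + question + [SEP]; segment 1 covers context + [SEP].
    segment_ids = [0] * (len(question_ids) + 2) + [1] * (len(context_ids) + 1)
    # The mask marks real tokens with 1 and padding with 0.
    input_mask = [1] * len(input_ids)

    # Pad all three lists to seq_len.
    padding = seq_len - len(input_ids)
    input_ids += [pad_id] * padding
    input_mask += [0] * padding
    segment_ids += [0] * padding
    return input_ids, input_mask, segment_ids
```

Each returned list has exactly `seq_len` entries, so a batch of them can be converted to `tf.int32` tensors with the shapes declared in the example.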
|
||
|
||
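
The module returns `start_logits` and `end_logits`, which still need to be turned into an answer span. A minimal sketch of one common approach is to pick the pair of token positions with the highest summed score, subject to the end not preceding the start and a length limit. The name `best_span` and the `max_answer_len` default are assumptions for illustration; the sketch works on the logits of a single example given as plain lists and omits null-answer and answer-type handling.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (score, start, end) of the highest-scoring valid span.

    Hypothetical helper: a span is valid if start <= end and it covers
    at most max_answer_len tokens. Indices are inclusive.
    """
    best = (float("-inf"), 0, 0)
    for start, start_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length limit.
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = start_score + end_logits[end]
            if score > best[0]:
                best = (score, start, end)
    return best


score, start, end = best_span([0.1, 2.0, 0.3, 0.0], [0.0, 0.5, 3.0, 0.2])
print(start, end)  # token positions of the selected span
```

The double loop is quadratic in the worst case, but the length limit keeps the inner loop short for a 512-token input.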
#### References | ||
|
||
|
||
[1] Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Matthew Kelcey, Jacob Devlin, Kenton Lee, Kristina N. Toutanova, Llion Jones, Ming-Wei Chang, Andrew Dai, Jakob Uszkoreit, Quoc Le, Slav Petrov. | ||
[Natural Questions: a Benchmark for Question Answering Research](https://research.google/pubs/pub47761/) | ||
|
||
[2] https://ai.google.com/research/NaturalQuestions | ||
|
||
[3] https://www.kaggle.com/c/tensorflow2-question-answering | ||
|
||
[4] Chris Alberti, Kenton Lee, Michael Collins. | ||
[A BERT Baseline for the Natural Questions](https://arxiv.org/abs/1901.08634) | ||
|
||
[5] https://github.com/google-research/bert | ||
|
||
[6] https://www.kaggle.com/c/tensorflow2-question-answering/discussion/128278 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Publisher nascarr | ||
Anastasiya Karpovich | ||
|
||
## Anastasiya Karpovich | ||
NLP Engineer. Kaggle competitor. | ||
|
||
https://www.kaggle.com/user189546 |