This repository was archived by the owner on Mar 12, 2024. It is now read-only.

Commit b88026e

Merge branch 'master' of https://github.com/athnlp/athnlp-labs
2 parents: d26fffa + d80dc01

3 files changed: +23 −7 lines changed


labs-exercises/neural-encoding-fever.md (+21 −5)

@@ -131,13 +131,29 @@ If you are using `pdb`, you will have to write a simple 2-line wrapper script: s

  ## Exercises

  For the exercises, we have provided a dataset reader (`athnlp/readers/fever_reader.py`), configuration file (`athnlp/experiments/fever.json`), and sample model (`athnlp/models/fever_text_classification.py`). You can complete these exercises by completing the code in the sample model.

+ ### 1. Average Word Embedding Model

  1. Implement a model that
     - represents the claim and the evidence by averaging their word embeddings;
     - concatenates the two representations;
     - uses a multilayer perceptron to decide the label.
- Experiment with the number and the size of hidden layers to find the best settings using the train/dev set and assess your accuracy on the test set.
- 2. Look at the distribution of training data. How does balancing the number of `SUPPORTED` and `REFUTED` training instances affect the model accuracy? (hint, you may have to create a new dataset reader)
- 3. Compare against a discrete feature baseline, i.e., using one-hot vectors or hand-crafted features instead of word embeddings to represent the words?
- 4. Implement a _[hypothesis only](https://www.aclweb.org/anthology/S18-2023)_ version of the model that ignores the evidence and only uses the claim for predicting the label. What accuracy does this model get? Why do you think this?
- 5. Take a look at the training/dev data. Can you design claims that would "fool" your models? You can see this report ([Thorne and Vlachos, 2019](https://arxiv.org/abs/1903.05543)) for inspiration.
+ 2. Experiment with the number and size of the hidden layers to find the best settings using the train/dev sets, and assess your accuracy on the test set. (Note: this model may not reach high accuracy.)
+ 3. Explore: how does fine-tuning the word embeddings affect performance? You can make the word embedding layer trainable by changing the `text_field_embedder` entry in the `fever.json` config file.

+ ### 2. Discrete Feature Baseline

+ 1. Compare against a discrete feature baseline, i.e., use one-hot vectors or hand-crafted features instead of word embeddings to represent the words.

+ ### 3. Alternative Pooling Methods

+ Averaging word embeddings is one example of pooling (see slides 110/111 of Ryan McDonald's talk: [SLIDES](https://github.com/athnlp/athnlp-labs/blob/master/slides/McDonald_classification.pdf)).
+ Try alternative methods for pooling the word embeddings. Which ones yield an improvement?
+ 1. Replace the averaging of word embeddings with max pooling (taking the maximum value of each embedding dimension over all the words in the sentence).
+ 2. Use a `CnnEncoder()` to generate sentence representations. (Hint: you may need to set `"token_min_padding_length": 5` or higher in the `tokens` object in `token_indexers` for large filter sizes.) Filter sizes between 2 and 5 should be sufficient. More filters will make training slower (perhaps train for only 1 or 2 epochs).

+ ### 4. Hypothesis-Only NLI and Biases

+ 1. Implement a _[hypothesis only](https://www.aclweb.org/anthology/S18-2023)_ version of the model that ignores the evidence and uses only the claim to predict the label. What accuracy does this model achieve? Why do you think that is? Think back to slide 7 of Ryan's talk.
+ 2. Take a look at the training/dev data. Can you design claims that would "fool" your models? See this report ([Thorne and Vlachos, 2019](https://arxiv.org/abs/1903.05543)) for inspiration.

  What do you conclude about the ability of your model to understand language?
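For Exercise 1 of the new first section, here is a minimal PyTorch sketch of the intended architecture, written outside the AllenNLP scaffolding of `athnlp/models/fever_text_classification.py`. Class and argument names, dimensions, and the padding-index convention are illustrative assumptions, not the repo's code:

```python
import torch
import torch.nn as nn

class AverageEmbeddingClassifier(nn.Module):
    """Average claim/evidence word embeddings, concatenate, classify with an MLP."""

    def __init__(self, vocab_size: int, embedding_dim: int = 100,
                 hidden_dim: int = 100, num_labels: int = 2):
        super().__init__()
        # Index 0 is assumed to be padding.
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.mlp = nn.Sequential(
            nn.Linear(2 * embedding_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_labels),
        )

    def _average(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len); average only over non-padding positions.
        mask = (token_ids != 0).float().unsqueeze(-1)   # (batch, seq_len, 1)
        embedded = self.embedding(token_ids)            # (batch, seq_len, dim)
        summed = (embedded * mask).sum(dim=1)           # (batch, dim)
        counts = mask.sum(dim=1).clamp(min=1.0)         # avoid division by zero
        return summed / counts

    def forward(self, claim_ids: torch.Tensor,
                evidence_ids: torch.Tensor) -> torch.Tensor:
        claim_vec = self._average(claim_ids)
        evidence_vec = self._average(evidence_ids)
        # Concatenate the two sentence vectors and return label logits.
        return self.mlp(torch.cat([claim_vec, evidence_vec], dim=-1))
```

The hypothesis-only variant in section 4 is the same sketch with the evidence vector dropped and the MLP input size halved.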
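For the fine-tuning exploration (Exercise 3), the `text_field_embedder` entry in `fever.json` would gain a `trainable` flag along these lines. This is a sketch of the usual AllenNLP-0.x config shape, not the repo's actual config; the exact nesting (some versions expect a `token_embedders` key) depends on the AllenNLP version the labs pin:

```json
"text_field_embedder": {
    "tokens": {
        "type": "embedding",
        "embedding_dim": 100,
        "trainable": true
    }
}
```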
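For the discrete feature baseline in section 2, one quick point of comparison is a bag-of-words linear classifier. This sketch uses scikit-learn, which is an assumption (it is not among the dependencies in the `requirements.txt` shown below), with toy stand-in data in place of the FEVER reader's output:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for (claim, evidence) pairs; in the lab these would come
# from the dataset reader.
train_pairs = [("The sky is blue.", "The sky appears blue due to Rayleigh scattering."),
               ("The sky is green.", "The sky appears blue due to Rayleigh scattering.")]
train_labels = ["SUPPORTED", "REFUTED"]

texts = [claim + " " + evidence for claim, evidence in train_pairs]
vectorizer = CountVectorizer(binary=True)   # binary presence features, one per word
features = vectorizer.fit_transform(texts)

classifier = LogisticRegression().fit(features, train_labels)
test = vectorizer.transform(["The sky is blue. The sky appears blue."])
print(classifier.predict(test))
```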
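For Exercise 3.1, max pooling replaces the masked mean in the sketch above with a masked maximum. A minimal helper (the function name is mine):

```python
import torch

def max_pool(embedded: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Max over the sequence dimension, ignoring padding positions.

    embedded: (batch, seq_len, dim); mask: (batch, seq_len), 1 for real tokens.
    """
    # Push padding positions to the smallest representable value so they
    # never win the max.
    very_negative = torch.finfo(embedded.dtype).min
    masked = embedded.masked_fill(mask.unsqueeze(-1) == 0, very_negative)
    return masked.max(dim=1).values  # (batch, dim)
```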
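For Exercise 3.2, `CnnEncoder` is AllenNLP's convolutional `Seq2VecEncoder` and can also be exercised directly on tensors to check shapes. The batch size, sequence length, and filter counts below are illustrative, and on newer AllenNLP versions the mask may need to be a boolean tensor:

```python
import torch
from allennlp.modules.seq2vec_encoders import CnnEncoder

# One convolutional filter bank per n-gram width; the encoder max-pools each
# feature map and concatenates, so the output dimension is
# num_filters * len(ngram_filter_sizes).
encoder = CnnEncoder(embedding_dim=100, num_filters=8,
                     ngram_filter_sizes=(2, 3, 4, 5))

embedded = torch.randn(32, 20, 100)      # (batch, seq_len, embedding_dim)
mask = torch.ones(32, 20)                # 1 for real tokens, 0 for padding
sentence_vec = encoder(embedded, mask)   # (batch, 8 * 4) = (batch, 32)
print(encoder.get_output_dim())          # 32
```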

labs-exercises/pos-tagging-perceptron.md (+1 −1)

@@ -31,7 +31,7 @@ nltk.download('brown')

  #### 1. Perceptron Algorithm

  Implement the standard perceptron algorithm. Use the first 10000/1000/1000 sentences for training/dev/test.
- In order to speed up the process for you, we have implemented a simple dataset reader that automatically converts the Brown corpus using the Universal PoS Tagset: `athnlp/reader/brown_pos_corpus.py` (you may use your own implementation if you want; `athnlp/reader/en-brown.map` provides the mapping from Brown to Universal Tagset).
+ In order to speed up the process for you, we have implemented a simple dataset reader that automatically converts the Brown corpus using the Universal PoS Tagset: `athnlp/readers/brown_pos_corpus.py` (you may use your own implementation if you want; `athnlp/reader/en-brown.map` provides the mapping from Brown to Universal Tagset).

  **Important**: Recall that the perceptron has to predict multiple classes (PoS tags) instead of binary classes:

  ![Multiclass Perceptron](multiclass_perceptron.png)
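As a reminder of the multiclass update the figure illustrates: keep one weight vector per tag and, on a mistake, promote the gold tag's vector and demote the predicted one. A minimal NumPy sketch with a hypothetical `(features, gold_tag)` interface, not the lab's actual reader API:

```python
import numpy as np

def train_multiclass_perceptron(data, num_tags, num_features, epochs=5):
    """data: iterable of (features, gold_tag) pairs, where features is a
    dense vector of length num_features and gold_tag is an int tag index."""
    W = np.zeros((num_tags, num_features))
    for _ in range(epochs):
        for features, gold in data:
            predicted = int(np.argmax(W @ features))  # highest-scoring tag
            if predicted != gold:
                W[gold] += features       # promote the correct tag
                W[predicted] -= features  # demote the wrongly predicted tag
    return W
```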

requirements.txt (+1 −1)

@@ -2,4 +2,4 @@ nltk
  allennlp
  numpy
  ipykernel
- pytorch-transformers
+ pytorch-transformers==1.1.0
