
Update README.md
ArmelRandy authored May 25, 2023
1 parent a26a608 commit aa90766
Showing 1 changed file with 1 addition and 1 deletion.
README.md: 2 changes (1 addition, 1 deletion)
@@ -66,7 +66,7 @@ Using `Instruction:`, `Input:` and `Output:` seems to work well for `text-davinc
How should we select and post-process the instructions generated by prompting a model? In the original work, the instructions are generated iteratively, and we keep those with a ROUGE score strictly less than `0.7` with any previously generated instruction. This preserves diversity in the dataset, at least in terms of how the instructions are worded. According to our experiments, it is still possible to generate the same problem multiple times with a different formulation each time. We propose to take the curation further with several ideas.
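The iterative ROUGE filter above can be sketched in pure Python. The LCS-based `rouge_l_f1` below is a simplified stand-in for the ROUGE implementation used in practice, and the function names are ours, not from the codebase:

```python
def lcs_len(a, b):
    # classic dynamic-programming longest-common-subsequence length over token lists
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if a[i] == b[j] else max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]

def rouge_l_f1(cand, ref):
    # token-level ROUGE-L F1 between two strings
    a, b = cand.lower().split(), ref.lower().split()
    lcs = lcs_len(a, b)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(a), lcs / len(b)
    return 2 * p * r / (p + r)

def filter_by_rouge(instructions, threshold=0.7):
    # keep an instruction only if its ROUGE-L score against every
    # previously kept instruction is strictly below the threshold
    kept = []
    for inst in instructions:
        if all(rouge_l_f1(inst, prev) < threshold for prev in kept):
            kept.append(inst)
    return kept
```

A near-duplicate of an earlier instruction scores close to 1.0 and is dropped, while a reworded but genuinely different instruction passes.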

### Self-consistency
We came up with a strong instruction filtering technique. The idea is simple: we want to test whether the model is consistent with what it generates. We verify that by prompting the model to generate an instruction based on the output. This is a difficult task for a LM, and even for a human, because in many cases it results in an unsolvable task. When the model is able to generate an instruction, we compare it in terms of meaning with the ground truth. For that, we use [Sentence-BERT](https://arxiv.org/pdf/1908.10084.pdf), precisely [All-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), with a threshold of our choice (typically 0.5). This filtering technique is not recommended for models with a weak ability to understand natural language text.
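A minimal sketch of this self-consistency check, with the model call and the sentence embedder abstracted as caller-supplied functions. `regenerate` and `embed` are hypothetical names we introduce here; in practice `embed` could be `SentenceTransformer("all-MiniLM-L6-v2").encode` from the sentence-transformers library:

```python
import math

def cosine(u, v):
    # cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

def self_consistency_filter(samples, regenerate, embed, threshold=0.5):
    """Keep (instruction, output) pairs whose instruction the model can
    recover from the output alone.

    regenerate(output) prompts the model for an instruction given only
    the output; embed(text) returns a sentence embedding. Both are
    supplied by the caller.
    """
    kept = []
    for instruction, output in samples:
        guess = regenerate(output)
        # drop the pair if no instruction was produced, or if the
        # regenerated instruction diverges in meaning from the ground truth
        if guess and cosine(embed(guess), embed(instruction)) >= threshold:
            kept.append((instruction, output))
    return kept
```

The threshold plays the same role as in the text: 0.5 on the cosine similarity between the regenerated instruction and the ground truth.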

### Uniqueness
Another alternative is to post-process the raw dataset by keeping only instructions that are not similar to each other in terms of meaning. Once again we make use of Sentence-BERT. An instruction is kept only if its similarity score with every previously kept instruction is below a threshold (typically 0.5).
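This greedy deduplication can be sketched as follows; `embed` is again a caller-supplied sentence embedder (e.g. an all-MiniLM-L6-v2 encoder), and the function name is ours:

```python
import math

def cosine(u, v):
    # cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

def uniqueness_filter(instructions, embed, threshold=0.5):
    # keep an instruction only if its cosine similarity to every
    # previously kept instruction stays below the threshold
    kept, kept_vecs = [], []
    for inst in instructions:
        v = embed(inst)
        if all(cosine(v, kv) < threshold for kv in kept_vecs):
            kept.append(inst)
            kept_vecs.append(v)
    return kept
```

Because the filter is greedy over a single pass, the order of the instructions matters: of two paraphrases, the one that appears first is the one that survives.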
