Commit

Common misconceptions.
jondurbin committed Aug 24, 2023
1 parent af164b1 commit 3ef610b
Showing 6 changed files with 47 additions and 2 deletions.
7 changes: 7 additions & 0 deletions airoboros/instructors/misconception.py
@@ -0,0 +1,7 @@
from airoboros.instructors.inline_qa import generate as generate_inline


async def generate(instructor, **kwargs):
    """Generator for misconception training data."""
    async for item in generate_inline(instructor, "misconception", **kwargs):
        yield item
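The wrapper above only pins the category name and re-yields whatever the shared inline Q&A generator produces. A minimal sketch of that delegation pattern follows; `generate_inline` here is a hypothetical stand-in that yields placeholder dictionaries, not the real `airoboros.instructors.inline_qa.generate`, and the `count` keyword and item shape are assumptions made for illustration.

```python
import asyncio


async def generate_inline(instructor, category, **kwargs):
    """Hypothetical stand-in for the shared inline Q&A generator."""
    # The real generator calls an LLM; this one just fabricates placeholders.
    for index in range(kwargs.get("count", 2)):
        yield {"category": category, "instruction": f"question {index}"}


async def generate(instructor, **kwargs):
    """Thin wrapper that fixes the category, mirroring misconception.py."""
    async for item in generate_inline(instructor, "misconception", **kwargs):
        yield item


async def collect(instructor=None, **kwargs):
    # Drain the async generator into a list so the items can be inspected.
    return [item async for item in generate(instructor, **kwargs)]


items = asyncio.run(collect(count=3))
```

Because the wrapper is itself an async generator, a caller can consume it with `async for` exactly as it would consume the underlying generator.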
25 changes: 25 additions & 0 deletions airoboros/instructors/prompts/misconception.txt
@@ -0,0 +1,25 @@
I need you to help me generate some question answer pairs related to common misconceptions.

Here are a few examples:
Example 1: QUESTION: What does searing do to meat?
ANSWER: While it is commonly believed that searing helps seal moisture into meat, in reality it causes the meat to lose some moisture. Meat is seared to brown it, to affect its color, flavor, and texture.

Example 2: QUESTION: what do banana flavored candies taste like?
ANSWER: One would think banana-flavored candy would taste like bananas, but as it turns out, banana-flavored candy was not intended to mimic the taste of a formerly popular variety of banana. The reason banana candy tastes different from bananas is that it is mainly flavored with only one of the many flavors a banana has, the ester compound isoamyl acetate.

Example 3: QUESTION: What was the first feature-length animated film?
ANSWER: El Apóstol, a lost 1917 Argentine silent film that used cutout animation, is considered the first feature-length animated film. The film many commonly believe to be the first, however, is Walt Disney Studios' Snow White and the Seven Dwarfs, due to it being the first feature-length animated film to use cel animation.

Please write {batch_size} more examples of a similar format, with "QUESTION: [question]" and "ANSWER: [answer]" with the correct answer.

All output text should be in {language}, but the exact terms "QUESTION" and "ANSWER" are special tokens that must not be translated.

The output format should be:
QUESTION: first question
ANSWER: first question's answer

QUESTION: second question
ANSWER: second question's answer
...

Do not number the question/answer pairs, just include them in the response format indicated.
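The prompt asks for unnumbered `QUESTION:` / `ANSWER:` pairs so that the response can be split mechanically. The commit does not show how `inline_qa` parses the response, so the following is only a sketch of one way to do it, assuming a regular-expression split; `PAIR_PATTERN` and `parse_pairs` are hypothetical names, and the sample answers are shortened for illustration.

```python
import re

# Assumption: each pair starts with "QUESTION:" and its answer runs until the
# next "QUESTION:" marker or the end of the text.
PAIR_PATTERN = re.compile(
    r"QUESTION:\s*(?P<question>.*?)\s*ANSWER:\s*(?P<answer>.*?)\s*(?=QUESTION:|\Z)",
    re.DOTALL,
)


def parse_pairs(response_text):
    """Return a list of (question, answer) tuples found in the response."""
    return [
        (match.group("question"), match.group("answer"))
        for match in PAIR_PATTERN.finditer(response_text)
    ]


sample = (
    "QUESTION: What does searing do to meat?\n"
    "ANSWER: It browns the meat; it does not seal in moisture.\n"
    "\n"
    "QUESTION: What was the first feature-length animated film?\n"
    "ANSWER: El Apóstol (1917).\n"
)
pairs = parse_pairs(sample)
```

Keeping the two marker words untranslated, as the prompt requires, is what lets a pattern like this work regardless of the output language.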
2 changes: 1 addition & 1 deletion airoboros/instructors/prompts/riddle.txt
@@ -19,7 +19,7 @@ ANSWER: Day and night.
Example 6: QUESTION: Jackie has 3 brothers. Each brother has 2 sisters. How many sisters does Jackie have?
ANSWER: Jackie has one sister. If each brother has 2 sisters, assuming Jackie is a female, that would mean there are two total female siblings, of which Jackie is one, so 2 - 1 = 1. The number of brothers is an extraneous detail.

-Please write 20 more examples of a similar format, with "QUESTION: [puzzle]" and "ANSWER: [answer]" with the correct answer, reasoning step-by-step for the answer when appropriate.
+Please write {batch_size} more examples of a similar format, with "QUESTION: [puzzle]" and "ANSWER: [answer]" with the correct answer, reasoning step-by-step for the answer when appropriate.

All output text should be in {language}, but the exact terms "QUESTION" and "ANSWER" are special tokens that must not be translated.
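The change in this hunk replaces a hard-coded `20` with a `{batch_size}` placeholder, alongside the existing `{language}` placeholder. How airoboros fills these is not shown in the commit; the sketch below assumes plain `str.format` substitution, and `TEMPLATE` and `render_prompt` are hypothetical names used only for illustration.

```python
# Abbreviated copy of the two templated lines from the prompt file.
TEMPLATE = (
    "Please write {batch_size} more examples of a similar format, with "
    '"QUESTION: [puzzle]" and "ANSWER: [answer]" with the correct answer.\n\n'
    "All output text should be in {language}."
)


def render_prompt(template, batch_size, language):
    """Fill the named placeholders; assumes no other braces in the template."""
    return template.format(batch_size=batch_size, language=language)


prompt = render_prompt(TEMPLATE, batch_size=25, language="English")
```

With a placeholder in the template, the batch size can come from a per-instructor setting such as the `batch_size: 25` entry in the example configuration rather than from the prompt text itself.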

4 changes: 4 additions & 0 deletions airoboros/self_instruct.py
@@ -763,6 +763,9 @@ async def run(self):
from airoboros.instructors.general import generate as general_generator
from airoboros.instructors.gtkm import generate as gtkm_generator
from airoboros.instructors.joke import generate as joke_generator
from airoboros.instructors.misconception import (
    generate as misconception_generator,
)
from airoboros.instructors.multiple_choice import (
    generate as multiple_choice_generator,
)
@@ -791,6 +794,7 @@ async def run(self):
"experience": experience_generator,
"general": general_generator,
"joke": joke_generator,
"misconception": misconception_generator,
"multiple_choice": multiple_choice_generator,
"plan": plan_generator,
"orca": orca_generator,
9 changes: 9 additions & 0 deletions example-config.yaml
@@ -429,3 +429,12 @@ instructors:
      temperature: 0.7
    question_count: 15
    count: 100

  ##################################################################################
  # Common misconceptions.
  misconception:
    api_params:
      temperature: 0.5
    batch_size: 25
    min_docsearch_score: 0.02
    count: 500
2 changes: 1 addition & 1 deletion setup.py
@@ -6,7 +6,7 @@

setup(
     name="airoboros",
-    version="2.1.5",
+    version="2.1.6",
     description="Updated and improved implementation of the self-instruct system.",
     long_description=long_description,
     long_description_content_type="text/markdown",