Add Question Generation module for evaluation. #1231

Merged: 10 commits, Apr 19, 2023

Conversation

ravi03071991
Contributor

Hi,

Added a Question Generation module to DatasetGenerator for model evaluation and metrics. This module makes it possible to evaluate a model's performance in a QA system by generating questions and then using the evaluation module to evaluate the responses and provide metrics.

Thank you.

from langchain.chat_models import ChatOpenAI
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter

DEFAULT_QUESTION_GENERATION_PROMPT = """"Context information is below.\n"
Collaborator

You have four quote marks at the beginning of the string (keep it to three).
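
To see why the extra quote mark matters, here is a minimal sketch comparing the two openings. The variable names are illustrative, and each string is closed right after its first line only to keep the example short; the rest of the prompt from the PR is not reproduced.

```python
# With four opening quotes, the first three open the triple-quoted string
# and the fourth becomes a literal '"' character at the start of the text.
four_quotes = """"Context information is below.\n"""

# With three opening quotes, the text starts directly with "Context".
three_quotes = """Context information is below.\n"""

print(repr(four_quotes))   # '"Context information is below.\n'
print(repr(three_quotes))  # 'Context information is below.\n'
```

The stray quote would be sent to the model as part of the prompt, which is why trimming the opening to three quote marks is the safer choice.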

@@ -0,0 +1,5 @@
"""Dataset generation module."""
Collaborator

Instead of creating a new folder here, can you add the dataset generation module to the evaluation folder? You can name the file dataset_generation.py.

And add DatasetGenerator to evaluation/__init__.py.

)

response = index.query(
f"You are a Teacher/ Professor. Your task is to setup \
Collaborator

I'd abstract this into a constant string template (defined as a variable at the top of the file).

Contributor Author

But it uses the num_questions_per_chunk variable. Maybe I should initialize it in __init__?
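
One possible way to reconcile the two comments, sketched under assumptions: keep the template as a module-level constant with a named placeholder, and fill in num_questions_per_chunk when the query is built. The names `QUESTION_GEN_QUERY_TMPL`, `QuestionQueryBuilder`, and `build_query` are hypothetical, and only the opening words of the template come from the excerpt; the rest of its wording is invented for illustration and is not the PR's actual prompt.

```python
# Hypothetical module-level template. The placeholder is filled in later,
# so the constant itself does not depend on any instance state.
QUESTION_GEN_QUERY_TMPL = (
    "You are a Teacher/ Professor. Your task is to setup "
    "{num_questions_per_chunk} questions based on the context provided."
)


class QuestionQueryBuilder:
    """Illustrative stand-in for the part of the class that builds the query."""

    def __init__(
        self,
        num_questions_per_chunk: int = 10,
        query_tmpl: str = QUESTION_GEN_QUERY_TMPL,
    ) -> None:
        # Store the count and the template on the instance.
        self.num_questions_per_chunk = num_questions_per_chunk
        self.query_tmpl = query_tmpl

    def build_query(self) -> str:
        # Substitute the per-instance value into the shared template.
        return self.query_tmpl.format(
            num_questions_per_chunk=self.num_questions_per_chunk
        )


builder = QuestionQueryBuilder(num_questions_per_chunk=3)
query = builder.build_query()
```

Accepting the template as a constructor argument also lets callers swap in their own wording without editing the module.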

Generates questions for each document.
"""

def document_question_generator(chunks: List[str]) -> List[str]:
Collaborator

I wouldn't define this as a nested function. Define it on the class as _document_question_generator.
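
A minimal sketch of that refactor, assuming the nested helper only needs the list of chunks. The class name `DatasetGeneratorSketch`, the public method `generate_questions`, and the placeholder body are hypothetical; the real helper would query an index rather than format a string.

```python
from typing import List


class DatasetGeneratorSketch:
    """Illustrative stand-in, not the class from the PR."""

    def _document_question_generator(self, chunks: List[str]) -> List[str]:
        # Private method on the class instead of a function nested inside
        # another method. Placeholder logic: one string per chunk.
        return [f"Question about: {chunk}" for chunk in chunks]

    def generate_questions(self, chunks: List[str]) -> List[str]:
        # The public entry point delegates to the private helper.
        return self._document_question_generator(chunks)


questions = DatasetGeneratorSketch().generate_questions(["chunk A", "chunk B"])
```

As a method, the helper can be overridden in a subclass and called directly in tests, which a nested function does not allow.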

@jerryjliu merged commit cf9f26d into run-llama:main on Apr 19, 2023