When generating a test set for RAG evaluation, how can I include the source file name from which each Q&A pair was generated? #2054

Open
@zaid1212

Description

  • I checked the documentation and related resources and couldn't find an answer to my question.

Question

  • A previous version had a feature that added the name of the source file next to each generated question. I used it to compute an additional metric, "Source Relevancy", which shows whether the RAG application picked the right file. That parameter no longer exists. How can I get the source file name in the generated test set?

Code Example

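# Import path assumed; older LangChain versions expose it as langchain.document_loaders
from langchain_community.document_loaders import DirectoryLoader
from ragas.testset import TestsetGenerator

# path, generator_llm and generator_embeddings are defined elsewhere
loader = DirectoryLoader(path, glob="**/*", silent_errors=True, show_progress=True)
docs = loader.load()

generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)
dataset = generator.generate_with_langchain_docs(docs, testset_size=10)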

Additional context

  • How can I make sure the metadata or the source file name is always generated as a column in my final dataset?
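One possible workaround, sketched here as a post-processing step and not as a built-in ragas option, is to match each row's reference contexts back to the loaded documents and record which file they came from. The helper name `attach_sources`, the `source_files` column, and the toy data are hypothetical; the sketch also assumes the generated rows expose a `reference_contexts` list and that each context appears verbatim in its source document, which may not hold if the generator rewrites or prefixes the text.

```python
from typing import Iterable


def _normalize(text: str) -> str:
    """Collapse whitespace so line breaks or reflowing do not defeat matching."""
    return " ".join(text.split())


def attach_sources(rows: list[dict], docs: Iterable[tuple[str, str]],
                   context_key: str = "reference_contexts") -> list[dict]:
    """Return copies of rows with a hypothetical 'source_files' list added.

    docs is an iterable of (source_name, full_text) pairs. A source is listed
    when any of the row's contexts occurs (after whitespace normalization)
    inside that document's text.
    """
    normalized_docs = [(src, _normalize(text)) for src, text in docs]
    out = []
    for row in rows:
        raw = row.get(context_key)
        contexts = [_normalize(c) for c in (raw if raw is not None else [])]
        sources = []
        for src, text in normalized_docs:
            if src not in sources and any(c and c in text for c in contexts):
                sources.append(src)
        out.append({**row, "source_files": sources})
    return out


# Toy data standing in for loaded documents and generated test-set rows.
docs = [
    ("reports/q1.txt", "Revenue grew 12% in the first quarter.\nCosts were flat."),
    ("reports/q2.txt", "Revenue fell 3% in the second quarter."),
]
rows = [
    {"user_input": "How did revenue change in Q1?",
     "reference_contexts": ["Revenue grew 12% in the first quarter."]},
    {"user_input": "Unmatched example",
     "reference_contexts": ["Text that appears in no document."]},
]
result = attach_sources(rows, docs)
print([r["source_files"] for r in result])
```

With the objects from the snippet above, the inputs could presumably be built as `[(d.metadata.get("source"), d.page_content) for d in docs]` and `dataset.to_pandas().to_dict("records")`, assuming the loader fills `metadata["source"]` and the test set has a `reference_contexts` column. Rows whose contexts cannot be matched get an empty list, which makes silent mismatches easy to spot.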

Labels: module-testsetgen (Module testset generation), question (Further information is requested)