Testset Generation: bringing continual learning to RAG pipelines #136

@jjmachan

We started Ragas with ground-truth-free evaluations so that you didn't have to put significant upfront effort into building an ideal test set before running evaluations. Creating a good test set requires substantial investment in time, money, human hours, and expertise, and it is a continuous process as your product and ML models evolve to cater to diverse use cases. We are exploring synthetic test set generation because:

  1. As RAG users mature and move into production, a solid test set and evaluation strategy become critical to giving their users a seamless experience, which means investing more time in building both.
  2. Ground-truth-free evaluation has its limitations. It is very effective at quantifying aspects like faithfulness, but it cannot ensure aspects like answer correctness, which is just as important. Here a synthetic test set with ground truth is of high utility (see the sketch after this list).
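
To make the distinction concrete, here is a minimal sketch of the two kinds of metrics side by side. It assumes the `evaluate()` API and the `faithfulness` / `answer_correctness` metric names as they later stabilized in the library; relative to this issue those names are forward-looking assumptions, and column names may differ across versions.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_correctness

# A tiny evaluation set. The `ground_truth` column is only needed by
# reference-based metrics such as answer_correctness; faithfulness is
# ground-truth-free and only checks the answer against the contexts.
eval_data = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["Paris is the capital and largest city of France."]],
    "ground_truth": ["Paris is the capital of France."],
})

result = evaluate(eval_data, metrics=[faithfulness, answer_correctness])
print(result)  # per-metric scores, e.g. faithfulness and answer_correctness
```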

The whole focus of the Ragas library is to help you build more reliable RAG applications, which is why the next leg of Ragas will focus much more on test set generation and continual learning for RAG pipelines. The goal is to leverage custom LLMs and data-centric AI techniques to:

  1. Build more robust paradigms for test set generation.
    1. Many libraries already have some form of test set generation, but with a few shortcomings. Ideally, the test set should have a good distribution of easy -> hard questions across the different tasks/situations seen in production (see the sketch after this list).
  2. Tools to scale up and reduce the cost of test set generation.
    1. Works like Self-Instruct and Evol-Instruct have shown that LLMs can generate human-quality synthetic data. We are working on paradigms to generate high-quality synthetic data specific to RAG. Ref [1] [2]
  3. Methodologies to continuously add to and improve the test set as your RAG pipeline evolves, using other data points like production logs and user feedback.
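
To sketch where this is headed, here is what generating a synthetic test set with a controlled difficulty distribution could look like. Since this issue predates the release, the class and method names below (`TestsetGenerator`, `generate_with_langchain_docs`, and the `simple`/`reasoning`/`multi_context` question types) match what eventually shipped in Ragas v0.1 and should be read as a forward-looking sketch, not the API at the time of writing.

```python
from langchain_community.document_loaders import DirectoryLoader
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

# Load the same documents your RAG pipeline indexes.
documents = DirectoryLoader("docs/", glob="**/*.md").load()

# Generator backed by OpenAI models; other LLMs can be plugged in.
generator = TestsetGenerator.with_openai()

# Ask for a question mix spanning easy -> hard: simple lookups,
# reasoning questions, and multi-context questions that need
# information drawn from several chunks.
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
print(testset.to_pandas())
```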

There is a lot of work to be done, but with the v0.1 release of Ragas we'll be shipping features in this direction. In the meantime, we would love to hear your opinions, expectations, suggestions, and ideas :)

Team Ragas
