Problem & Motivation
There is a huge wave of interest in high-accuracy Q&A, such as via Retrieval Augmented Generation (RAG). RAG accuracy is largely driven by how well vector search retrieves the correct context for an LLM to answer questions. When evaluating embedding models, vector search retrieval metrics are helpful but insufficient because they don't reveal how well the retrieved content actually answers the target questions.
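To make the gap concrete, here is a minimal toy sketch in plain Python. Everything in it is hypothetical (the three-document corpus, the relevance labels, and the helper names `precision_at_k` and `answer_supported`), and the substring check is only a crude stand-in for a real answer-level judgment. It shows a retrieval result that scores perfectly on a retrieval metric while containing nothing that answers the question.

```python
# Toy illustration only: corpus, labels, and helper names are hypothetical.

def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are labeled relevant."""
    top_k = retrieved_ids[:k]
    if not top_k:
        return 0.0
    relevant = set(relevant_ids)
    return sum(1 for doc_id in top_k if doc_id in relevant) / len(top_k)


def answer_supported(retrieved_texts, expected_answer):
    """Crude stand-in for answer-level evaluation: is the expected
    answer string present in any retrieved chunk?"""
    needle = expected_answer.lower()
    return any(needle in text.lower() for text in retrieved_texts)


corpus = {
    "d1": "The Eiffel Tower is a landmark in Paris.",
    "d2": "The Eiffel Tower was completed in 1889.",
    "d3": "The Eiffel Tower is made of wrought iron.",
}

question = "When was the Eiffel Tower completed?"
expected_answer = "1889"

# Topic-level relevance labels: all three documents are about the tower.
relevant_ids = ["d1", "d2", "d3"]

# Assumed output of some vector search for the question (top 2 hits).
retrieved_ids = ["d1", "d3"]

precision = precision_at_k(retrieved_ids, relevant_ids, k=2)
supported = answer_supported([corpus[i] for i in retrieved_ids], expected_answer)

print(f"precision@2 = {precision:.2f}")
print(f"retrieved context contains the answer: {supported}")
```

Both retrieved documents count as relevant, so the retrieval metric looks perfect, yet neither one states the completion year. A real evaluation would replace the substring check with an LLM-based or human judgment of the generated answer.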
Pitch
I'd love to see an integration with a tool like our new ragulate library (Apache 2 licensed) that would simplify model evaluation on RAG Q&A: https://github.com/epinzur/ragulate/tree/main
Additional context
I was going to suggest that you integrate with trulens, but then I discovered that we built ragulate to automate much of the process of using trulens, and we'd love feedback on it.
@devinbost, thank you for your suggestion. Indeed, it would be nice to have such metrics available in TM. That said, I would love to see a PR from you that adds them as self-contained code rather than relying on an external package... 🦩