Commit 028d76d

docs: redo readme (#1480)
1 parent c06b131 commit 028d76d

File tree

2 files changed: +15 −42 lines

README.md

+14 −40

@@ -3,7 +3,7 @@
     src="./docs/_static/imgs/logo.png">
 </h1>
 <p align="center">
-  <i>Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines</i>
+  <i>Evaluation library for your LLM applications</i>
 </p>

 <p align="center">
@@ -16,33 +16,31 @@
   <a href="https://github.com/explodinggradients/ragas/blob/master/LICENSE">
     <img alt="License" src="https://img.shields.io/github/license/explodinggradients/ragas.svg?color=green">
   </a>
-  <a href="https://colab.research.google.com/github/explodinggradients/ragas/blob/main/docs/quickstart.ipynb">
-    <img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg">
+  <a href="https://pypi.org/project/ragas/">
+    <img alt="Open In Colab" src="https://img.shields.io/pypi/dm/ragas">
   </a>
   <a href="https://discord.gg/5djav8GGNZ">
     <img alt="discord-invite" src="https://dcbadge.vercel.app/api/server/5djav8GGNZ?style=flat">
   </a>
-  <a href="https://github.com/explodinggradients/ragas/">
-    <img alt="Downloads" src="https://badges.frapsoft.com/os/v1/open-source.svg?v=103">
-  </a>
 </p>

 <h4 align="center">
   <p>
     <a href="https://docs.ragas.io/">Documentation</a> |
-    <a href="#shield-installation">Installation</a> |
-    <a href="#fire-quickstart">Quickstart</a> |
-    <a href="#-community">Community</a> |
-    <a href="#-open-analytics">Open Analytics</a> |
-    <a href="https://huggingface.co/explodinggradients">Hugging Face</a>
+    <a href="#Quickstart">Quick start</a> |
+    <a href="https://dcbadge.vercel.app/api/server/5djav8GGNZ?style=flat">Join Discord</a> |
   <p>
 </h4>

-> 🚀 Dedicated solutions to evaluate, monitor and improve performance of LLM & RAG application in production including custom models for production quality monitoring.[Talk to founders](https://cal.com/shahul-ragas/30min)
+[Ragas](https://www.ragas.io/) supercharges your LLM application evaluations with tools to objectively measure performance, synthesize test case scenarios, and gain insights by leveraging production data.
+
+Evaluating and testing LLM applications is a challenging, time-consuming, and often boring process. Ragas aims provide a suite of tools that could supercharge your evaluation workflows and make it more efficient and fun using state-of-the-art research. We are also building an open ecosystem, that fosters sharing of ideas to make the evaluation process better and collaborates with other tools in the market to make it seamless you.

-Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. There are existing tools and frameworks that help you build these pipelines but evaluating it and quantifying your pipeline performance can be hard. This is where Ragas (RAG Assessment) comes in.
+## Key Features

-Ragas provides you with the tools based on the latest research for evaluating LLM-generated text to give you insights about your RAG pipeline. Ragas can be integrated with your CI/CD to provide continuous checks to ensure performance.
+- **Metrics**: Different LLM based and non LLM based metrics to objectively evaluate your LLM applications such as RAG, Agents, etc.
+- **Test Data Generation**: Synthesize high-quality datasets covering wide variety of scenarios for comprehensive testing of your LLM applications.
+- **Integrations**: Seamless integration with all major LLM applications frameworks like langchain and observability tools.

 ## :shield: Installation

@@ -60,33 +58,9 @@ pip install git+https://github.com/explodinggradients/ragas

 ## :fire: Quickstart

-This is a small example program you can run to see ragas in action!
-
-```python
-
-from datasets import Dataset
-import os
-from ragas import evaluate
-from ragas.metrics import faithfulness, answer_correctness
-
-os.environ["OPENAI_API_KEY"] = "your-openai-key"
-
-data_samples = {
-    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
-    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
-    'contexts' : [['The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,'],
-    ['The Green Bay Packers...Green Bay, Wisconsin.','The Packers compete...Football Conference']],
-    'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
-}
-
-dataset = Dataset.from_dict(data_samples)
-
-score = evaluate(dataset,metrics=[faithfulness,answer_correctness])
-score.to_pandas()
-```
-
-Refer to our [documentation](https://docs.ragas.io/) to learn more.

+- [Run ragas metrics for evaluating RAG](https://docs.ragas.io/en/latest/getstarted/rag_evaluation/)
+- [Generate test data for evaluating RAG](https://docs.ragas.io/en/latest/getstarted/rag_testset_generation/)

 ## 🫂 Community

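The quickstart removed above still illustrates the `evaluate` API that the new getting-started links point to. Below is a minimal sketch assembled from those removed lines; it assumes `ragas` and `datasets` are installed, an OpenAI key is available for the default judge LLM, and the metric names shown in the removed snippet (`faithfulness`, `answer_correctness`) are still exposed in your installed version.

```python
# Minimal sketch based on the quickstart block removed in this commit; the
# imports and metric names come from that removed snippet and may differ in
# newer ragas releases.
import os

from datasets import Dataset

from ragas import evaluate
from ragas.metrics import answer_correctness, faithfulness

os.environ["OPENAI_API_KEY"] = "your-openai-key"  # default judge LLM is OpenAI-backed

data_samples = {
    "question": ["When was the first super bowl?", "Who won the most super bowls?"],
    "answer": [
        "The first superbowl was held on Jan 15, 1967",
        "The most super bowls have been won by The New England Patriots",
    ],
    "contexts": [
        [
            "The First AFL–NFL World Championship Game was an American football game "
            "played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles,"
        ],
        ["The Green Bay Packers...Green Bay, Wisconsin.", "The Packers compete...Football Conference"],
    ],
    "ground_truth": [
        "The first superbowl was held on January 15, 1967",
        "The New England Patriots have won the Super Bowl a record six times",
    ],
}

dataset = Dataset.from_dict(data_samples)

# Each metric issues one or more judge-LLM calls per row; scores are per sample.
score = evaluate(dataset, metrics=[faithfulness, answer_correctness])
print(score.to_pandas())
```

`to_pandas()` yields one row per sample with a column per metric, which makes it easy to see which individual answers dragged an aggregate score down.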
docs/concepts/metrics/overview/index.md

+1 −2

@@ -10,11 +10,10 @@ A metric is a quantitative measure used to evaluate the performance of a AI appl
 ## Different types of metrics

 <figure markdown="span">
-  ![Metrics Mind map](../../../_static/imgs/metrics_mindmap.png){width="600"}
+  ![Component-wise Evaluation](../../../_static/imgs/metrics_mindmap.png){width="600"}
   <figcaption>Metrics Mind map</figcaption>
 </figure>

-
 **Metrics can be classified into two categories based on the mechanism used underneath the hood**:

 &nbsp;&nbsp;&nbsp;&nbsp; **LLM-based metrics**: These metrics use LLM underneath to do the evaluation. There might be one or more LLM calls that are performed to arrive at the score or result. These metrics can be somewhat non deterministic as the LLM might not always return the same result for the same input. On the other hand, these metrics has shown to be more accurate and closer to human evaluation.
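A rough, self-contained illustration of the non-determinism described in that paragraph, again assuming the `evaluate`/`faithfulness` API shown in the README diff above, an OpenAI key for the judge LLM, and that the score column in the result DataFrame carries the metric name: two identical runs can produce slightly different scores.

```python
# Illustration only: LLM-based metrics such as faithfulness call a judge LLM,
# so repeated runs on identical inputs may not score identically.
import os

from datasets import Dataset

from ragas import evaluate
from ragas.metrics import faithfulness

os.environ["OPENAI_API_KEY"] = "your-openai-key"

# One toy sample is enough to see the effect.
dataset = Dataset.from_dict({
    "question": ["When was the first super bowl?"],
    "answer": ["The first superbowl was held on Jan 15, 1967"],
    "contexts": [["The First AFL–NFL World Championship Game was played on January 15, 1967."]],
})

# Run the same LLM-based metric twice on the same data.
run_a = evaluate(dataset, metrics=[faithfulness]).to_pandas()
run_b = evaluate(dataset, metrics=[faithfulness]).to_pandas()

# The two scores are usually close but not guaranteed to be equal.
print(run_a["faithfulness"].iloc[0], run_b["faithfulness"].iloc[0])
```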

0 commit comments
