
Commit 9403320

docs: caching in ragas (#1779)
1 parent e8f9232 commit 9403320

File tree

7 files changed

+392 -5 lines changed

Makefile

+5-4
```diff
@@ -36,10 +36,11 @@ test-e2e: ## Run end2end tests
 run-ci: format lint type test ## Running all CI checks
 
 # Docs
-rewrite-docs: ## Use GPT4 to rewrite the documentation
-	@echo "Rewriting the documentation in directory $(DIR)..."
-	$(Q)python $(GIT_ROOT)/docs/python alphred.py --directory $(DIR)
-docsite: ## Build and serve documentation
+build-docsite: ## Build the documentation site
+	@echo "convert ipynb notebooks to md files"
+	$(Q)python $(GIT_ROOT)/docs/ipynb_to_md.py
+	$(Q)mkdocs build
+serve-docsite: ## Build and serve documentation
 	$(Q)mkdocs serve --dirtyreload
 
 # Benchmarks
```
+100
@@ -0,0 +1,100 @@

# Caching in Ragas

You can use caching to speed up your evaluations and testset generation by avoiding redundant computations. We use Exact Match Caching to cache the responses from the LLM and Embedding models.

You can use the [DiskCacheBackend][ragas.cache.DiskCacheBackend], which uses a local disk cache to store the cached responses. You can also implement your own custom cacher by implementing the [CacheInterface][ragas.cache.CacheInterface].
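
For instance, here is a minimal sketch of a custom in-memory cacher. It assumes [CacheInterface][ragas.cache.CacheInterface] requires the same `get`/`set`/`has_key` methods that `DiskCacheBackend` provides; check the cache reference docs for the exact interface.

```python
from typing import Any

from ragas.cache import CacheInterface


class InMemoryCacheBackend(CacheInterface):
    """A hypothetical dict-backed cacher, assuming CacheInterface
    expects get/set/has_key like DiskCacheBackend."""

    def __init__(self):
        self._store: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        # return the cached value, or None on a cache miss
        return self._store.get(key)

    def set(self, key: str, value) -> None:
        self._store[key] = value

    def has_key(self, key: str) -> bool:
        return key in self._store
```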

## Using DiskCacheBackend

Let's see how you can use the [DiskCacheBackend][ragas.cache.DiskCacheBackend] with LLM and Embedding models.

```python
from ragas.cache import DiskCacheBackend

cacher = DiskCacheBackend()

# check if the cache is empty and clear it
print(len(cacher.cache))
cacher.cache.clear()
print(len(cacher.cache))
```

    DiskCacheBackend(cache_dir=.cache)
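
As the repr above suggests, the cache lives in the `.cache` directory by default. Assuming the constructor mirrors the repr (check [DiskCacheBackend][ragas.cache.DiskCacheBackend] for the exact signature), you can point it at a different directory:

```python
# cache_dir is assumed from the repr above; verify against the reference docs
cacher = DiskCacheBackend(cache_dir="ragas_cache")
```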

Create an LLM and Embedding model with the cacher. Here I'm using `ChatOpenAI` from [langchain-openai](https://github.com/langchain-ai/langchain-openai) as an example.

```python
from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

cached_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"), cache=cacher)
```
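
For the embedding side, a sketch that assumes `LangchainEmbeddingsWrapper` accepts the same `cache` argument as the LLM wrapper:

```python
from langchain_openai import OpenAIEmbeddings
from ragas.embeddings import LangchainEmbeddingsWrapper

# assuming the embeddings wrapper takes the same cache argument as the LLM wrapper
cached_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings(), cache=cacher)
```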

```python
# if you want to see the cache in action, set the logging level to debug
import logging
from ragas.utils import set_logging_level

set_logging_level("ragas.cache", logging.DEBUG)
```

Now let's run a simple evaluation.

```python
from ragas import evaluate
from ragas import EvaluationDataset

from ragas.metrics import FactualCorrectness, AspectCritic
from datasets import load_dataset

# Define Answer Correctness with AspectCritic
answer_correctness = AspectCritic(
    name="answer_correctness",
    definition="Is the answer correct? Does it match the reference answer?",
    llm=cached_llm,
)

metrics = [answer_correctness, FactualCorrectness(llm=cached_llm)]

# load the dataset
dataset = load_dataset(
    "explodinggradients/amnesty_qa", "english_v3", trust_remote_code=True
)
eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

# evaluate the dataset
results = evaluate(
    dataset=eval_dataset,
    metrics=metrics,
)

results
```

This took almost 2 minutes to run on our local machine. Now let's run it again to see the cache in action.

```python
results = evaluate(
    dataset=eval_dataset,
    metrics=metrics,
)

results
```

Runs almost instantaneously.

You can also use this with testset generation by replacing the `generator_llm` with a cached version of it. Refer to the [testset generation](../../getstarted/rag_testset_generation.md) section for more details.
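
As a sketch, assuming the same wrapper pattern as above (see the testset generation guide for the generator's actual setup), the swap looks like:

```python
from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

# wrap the generator LLM with the same cacher, then pass it wherever
# the testset generation guide passes generator_llm
generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"), cache=cacher)
```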
+173
@@ -0,0 +1,173 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Caching in Ragas\n",
8+
"\n",
9+
"You can use caching to speed up your evaluations and testset generation by avoiding redundant computations. We use Exact Match Caching to cache the responses from the LLM and Embedding models.\n",
10+
"\n",
11+
"You can use the [DiskCacheBackend][ragas.cache.DiskCacheBackend] which uses a local disk cache to store the cached responses. You can also implement your own custom cacher by implementing the [CacheInterface][ragas.cache.CacheInterface].\n",
12+
"\n",
13+
"\n",
14+
"## Using DefaultCacher\n",
15+
"\n",
16+
"Let's see how you can use the [DiskCacheBackend][ragas.cache.DiskCacheBackend] LLM and Embedding models.\n"
17+
]
18+
},
19+
{
20+
"cell_type": "code",
21+
"execution_count": 1,
22+
"metadata": {},
23+
"outputs": [
24+
{
25+
"data": {
26+
"text/plain": [
27+
"DiskCacheBackend(cache_dir=.cache)"
28+
]
29+
},
30+
"execution_count": 1,
31+
"metadata": {},
32+
"output_type": "execute_result"
33+
}
34+
],
35+
"source": [
36+
"from ragas.cache import DiskCacheBackend\n",
37+
"\n",
38+
"cacher = DiskCacheBackend()\n",
39+
"\n",
40+
"# check if the cache is empty and clear it\n",
41+
"print(len(cacher.cache))\n",
42+
"cacher.cache.clear()\n",
43+
"print(len(cacher.cache))"
44+
]
45+
},
46+
{
47+
"cell_type": "markdown",
48+
"metadata": {},
49+
"source": [
50+
"Create an LLM and Embedding model with the cacher, here I'm using the `ChatOpenAI` from [langchain-openai](https://github.com/langchain-ai/langchain-openai) as an example.\n"
51+
]
52+
},
53+
{
54+
"cell_type": "code",
55+
"execution_count": 2,
56+
"metadata": {},
57+
"outputs": [],
58+
"source": [
59+
"from langchain_openai import ChatOpenAI\n",
60+
"from ragas.llms import LangchainLLMWrapper\n",
61+
"\n",
62+
"cached_llm = LangchainLLMWrapper(ChatOpenAI(model=\"gpt-4o\"), cache=cacher)"
63+
]
64+
},
65+
{
66+
"cell_type": "code",
67+
"execution_count": null,
68+
"metadata": {},
69+
"outputs": [],
70+
"source": [
71+
"# if you want to see the cache in action, set the logging level to debug\n",
72+
"import logging\n",
73+
"from ragas.utils import set_logging_level\n",
74+
"\n",
75+
"set_logging_level(\"ragas.cache\", logging.DEBUG)"
76+
]
77+
},
78+
{
79+
"cell_type": "markdown",
80+
"metadata": {},
81+
"source": [
82+
"Now let's run a simple evaluation."
83+
]
84+
},
85+
{
86+
"cell_type": "code",
87+
"execution_count": null,
88+
"metadata": {},
89+
"outputs": [],
90+
"source": [
91+
"from ragas import evaluate\n",
92+
"from ragas import EvaluationDataset\n",
93+
"\n",
94+
"from ragas.metrics import FactualCorrectness, AspectCritic\n",
95+
"from datasets import load_dataset\n",
96+
"\n",
97+
"# Define Answer Correctness with AspectCritic\n",
98+
"answer_correctness = AspectCritic(\n",
99+
" name=\"answer_correctness\",\n",
100+
" definition=\"Is the answer correct? Does it match the reference answer?\",\n",
101+
" llm=cached_llm,\n",
102+
")\n",
103+
"\n",
104+
"metrics = [answer_correctness, FactualCorrectness(llm=cached_llm)]\n",
105+
"\n",
106+
"# load the dataset\n",
107+
"dataset = load_dataset(\n",
108+
" \"explodinggradients/amnesty_qa\", \"english_v3\", trust_remote_code=True\n",
109+
")\n",
110+
"eval_dataset = EvaluationDataset.from_hf_dataset(dataset[\"eval\"])\n",
111+
"\n",
112+
"# evaluate the dataset\n",
113+
"results = evaluate(\n",
114+
" dataset=eval_dataset,\n",
115+
" metrics=metrics,\n",
116+
")\n",
117+
"\n",
118+
"results"
119+
]
120+
},
121+
{
122+
"cell_type": "markdown",
123+
"metadata": {},
124+
"source": [
125+
"This took almost 2mins to run in our local machine. Now let's run it again to see the cache in action."
126+
]
127+
},
128+
{
129+
"cell_type": "code",
130+
"execution_count": null,
131+
"metadata": {},
132+
"outputs": [],
133+
"source": [
134+
"results = evaluate(\n",
135+
" dataset=eval_dataset,\n",
136+
" metrics=metrics,\n",
137+
")\n",
138+
"\n",
139+
"results"
140+
]
141+
},
142+
{
143+
"cell_type": "markdown",
144+
"metadata": {},
145+
"source": [
146+
"Runs almost instantaneously.\n",
147+
"\n",
148+
"You can also use this with testset generation also by replacing the `generator_llm` with a cached version of it. Refer to the [testset generation](../../getstarted/rag_testset_generation.md) section for more details."
149+
]
150+
}
151+
],
152+
"metadata": {
153+
"kernelspec": {
154+
"display_name": ".venv",
155+
"language": "python",
156+
"name": "python3"
157+
},
158+
"language_info": {
159+
"codemirror_mode": {
160+
"name": "ipython",
161+
"version": 3
162+
},
163+
"file_extension": ".py",
164+
"mimetype": "text/x-python",
165+
"name": "python",
166+
"nbconvert_exporter": "python",
167+
"pygments_lexer": "ipython3",
168+
"version": "3.10.15"
169+
}
170+
},
171+
"nbformat": 4,
172+
"nbformat_minor": 2
173+
}

docs/references/cache.md

+3
@@ -0,0 +1,3 @@
::: ragas.cache
    options:
      members_order: "source"

mkdocs.yml

+4
```diff
@@ -77,6 +77,7 @@ nav:
     - General:
       - Customise models: howtos/customizations/customize_models.md
       - Run Config: howtos/customizations/_run_config.md
+      - Caching: howtos/customizations/_caching.md
     - Metrics:
       - Modify Prompts: howtos/customizations/metrics/_modifying-prompts-metrics.md
       - Adapt Metrics to Languages: howtos/customizations/metrics/_metrics_language_adaptation.md
@@ -88,6 +89,7 @@ nav:
       - Persona Generation: howtos/customizations/testgenerator/_persona_generator.md
       - Custom Single-hop Query: howtos/customizations/testgenerator/_testgen-custom-single-hop.md
       - Custom Multi-hop Query: howtos/customizations/testgenerator/_testgen-customisation.md
+
     - Applications:
       - howtos/applications/index.md
     - Metrics:
@@ -107,6 +109,7 @@ nav:
     - Embeddings: references/embeddings.md
     - RunConfig: references/run_config.md
     - Executor: references/executor.md
+    - Cache: references/cache.md
     - Evaluation:
       - Schemas: references/evaluation_schema.md
       - Metrics: references/metrics.md
@@ -237,3 +240,4 @@ extra_javascript:
   - _static/js/header_border.js
   - https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js
   - _static/js/toggle.js
+  - https://cdn.octolane.com/tag.js?pk=c7c9b2b863bf7eaf4e2a # octolane for analytics
```
