Skip to content

docs: add zeno visualization integration #359

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 7, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/howtos/integrations/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,5 @@ llamaindex.ipynb
langchain.ipynb
langsmith.ipynb
langfuse.ipynb
zeno.ipynb
:::
228 changes: 228 additions & 0 deletions docs/howtos/integrations/zeno.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,228 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Visualizing Ragas Results with Zeno\n",
"\n",
"You can use the [Zeno](https://zenoml.com) evaluation platform to easily visualize and explore the results of your Ragas evaluation.\n",
"\n",
"> Check out what the result of this tutorial looks like [here](https://hub.zenoml.com/project/b35c83b8-0b22-4b9c-aedb-80964011d7a7/ragas%20FICA%20eval)\n",
"\n",
"First, install the `zeno-client` package:\n",
"\n",
"```bash\n",
"pip install zeno-client\n",
"```\n",
"\n",
"Next, create an account at [hub.zenoml.com](https://hub.zenoml.com) and generate an API key on your [account page](https://hub.zenoml.com/account).\n",
"\n",
"We can now pick up the evaluation where we left off at the [Getting Started](../../getstarted/evaluation.md) guide:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from datasets import load_dataset\n",
"from ragas import evaluate\n",
"from ragas.metrics import (\n",
" answer_relevancy,\n",
" context_precision,\n",
" context_recall,\n",
" faithfulness,\n",
")\n",
"from zeno_client import ZenoClient, ZenoMetric"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set API keys\n",
"os.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\"\n",
"os.environ[\"ZENO_API_KEY\"] = \"your-zeno-api-key\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fiqa_eval = load_dataset(\"explodinggradients/fiqa\", \"ragas_eval\")\n",
"result = evaluate(\n",
" fiqa_eval[\"baseline\"],\n",
" metrics=[\n",
" context_precision,\n",
" faithfulness,\n",
" answer_relevancy,\n",
" context_recall,\n",
" ],\n",
")\n",
"\n",
"df = result.to_pandas()\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now take the `df` with our data and results and upload it to Zeno.\n",
"\n",
"We first create a project with a custom RAG view specification and the metric columns we want to do evaluation across:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"client = ZenoClient(os.environ[\"ZENO_API_KEY\"])\n",
"\n",
"project = client.create_project(\n",
" name=\"Ragas FICA eval\",\n",
" description=\"Evaluation of RAG model using Ragas on the FICA dataset\",\n",
" view={\n",
" \"data\": {\n",
" \"type\": \"vstack\",\n",
" \"keys\": {\n",
" \"question\": {\"type\": \"markdown\"},\n",
" \"texts\": {\n",
" \"type\": \"list\",\n",
" \"elements\": {\"type\": \"markdown\"},\n",
" \"border\": True,\n",
" \"pad\": True,\n",
" },\n",
" },\n",
" },\n",
" \"label\": {\n",
" \"type\": \"markdown\",\n",
" },\n",
" \"output\": {\n",
" \"type\": \"vstack\",\n",
" \"keys\": {\n",
" \"answer\": {\"type\": \"markdown\"},\n",
" \"ground_truths\": {\n",
" \"type\": \"list\",\n",
" \"elements\": {\"type\": \"markdown\"},\n",
" \"border\": True,\n",
" \"pad\": True,\n",
" },\n",
" },\n",
" },\n",
" \"size\": \"large\",\n",
" },\n",
" metrics=[\n",
" ZenoMetric(\n",
" name=\"context_precision\", type=\"mean\", columns=[\"context_precision\"]\n",
" ),\n",
" ZenoMetric(name=\"faithfulness\", type=\"mean\", columns=[\"faithfulness\"]),\n",
" ZenoMetric(name=\"answer_relevancy\", type=\"mean\", columns=[\"answer_relevancy\"]),\n",
" ZenoMetric(name=\"context_recall\", type=\"mean\", columns=[\"context_recall\"]),\n",
" ],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we upload the base dataset with the questions and ground truths:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data_df = pd.DataFrame(\n",
" {\n",
" \"data\": df.apply(\n",
" lambda x: {\"question\": x[\"question\"], \"texts\": list(x[\"contexts\"])}, axis=1\n",
" ),\n",
" \"label\": df[\"ground_truths\"].apply(lambda x: \"\\n\".join(x)),\n",
" }\n",
")\n",
"data_df[\"id\"] = data_df.index\n",
"\n",
"project.upload_dataset(\n",
" data_df, id_column=\"id\", data_column=\"data\", label_column=\"label\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Lastly, we upload the RAG outputs and Ragas metrics. \n",
"\n",
"You can run this for any number of models when doing comparison and iteration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"output_df = df[\n",
" [\n",
" \"context_precision\",\n",
" \"faithfulness\",\n",
" \"answer_relevancy\",\n",
" \"context_recall\",\n",
" ]\n",
"].copy()\n",
"\n",
"output_df['output'] = df.apply(\n",
" lambda x: {\"answer\": x[\"answer\"], \"ground_truths\": list(x[\"ground_truths\"])}, axis=1\n",
")\n",
"output_df[\"id\"] = output_df.index\n",
"\n",
"project.upload_system(\n",
" output_df, name=\"Base System\", id_column=\"id\", output_column=\"output\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reach out to the Zeno team on [Discord](https://discord.gg/km62pDKAkE) or at [hello@zenoml.com](mailto:hello@zenoml.com) if you have any questions!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "zeno-build",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 2
}