Commit 2aca534

Added the batch llm evaluator docs
1 parent 7c51ada commit 2aca534

2 files changed: 53 additions, 1 deletion

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+---
+title: "Batch LLM Evaluator"
+sidebarTitle: "Batch LLM Evaluator"
+description: "This example project evaluates multiple LLM models using the Vercel AI SDK and streams updates to the frontend using Trigger.dev Realtime."
+---
+
+import RealtimeLearnMore from "/snippets/realtime-learn-more.mdx";
+
+## Overview
+
+This demo is a full-stack example that uses the following:
+
+- A [Next.js](https://nextjs.org/) app with [Prisma](https://www.prisma.io/) for the database.
+- Trigger.dev [Realtime](https://trigger.dev/launchweek/0/realtime) to stream updates to the frontend.
+- The Vercel [AI SDK](https://sdk.vercel.ai/docs/introduction) to work with multiple LLM models (OpenAI, Anthropic, xAI).
+- The new [`batch.triggerByTaskAndWait`](https://trigger.dev/docs/triggering#batch-triggerbytaskandwait) method to distribute work across multiple tasks.
+
+## GitHub repo
+
+<Card
+  title="View the Batch LLM Evaluator repo"
+  icon="GitHub"
+  href="https://github.com/triggerdotdev/examples/tree/main/batch-llm-evaluator"
+>
+  Click here to view the full code for this project in our examples repository on GitHub. You can
+  fork it and use it as a starting point for your own project.
+</Card>
+
+## Video
+
+<video
+  controls
+  className="w-full aspect-video"
+  src="https://content.trigger.dev/batch-llm-evaluator.mp4"
+></video>
+
+## Relevant code
+
+- View the Trigger.dev task code in the [src/trigger/batch.ts](https://github.com/triggerdotdev/examples/blob/main/batch-llm-evaluator/src/trigger/batch.ts) file.
+  - The `evaluateModels` task uses the `batch.triggerByTaskAndWait` method to distribute the work across the different LLM models.
+  - It then passes the results through to a `summarizeEvals` task that calculates some dummy "tags" for each LLM response.
+- We use the [useRealtimeRunsWithTag](https://trigger.dev/docs/frontend/react-hooks/realtime#userealtimerunswithtag) hook to subscribe to the different evaluation task runs in the [src/components/llm-evaluator.tsx](https://github.com/triggerdotdev/examples/blob/main/batch-llm-evaluator/src/components/llm-evaluator.tsx) file.
+  - We then pass the relevant run down into three different components, one for each model:
+    - The `AnthropicEval` component: [src/components/evals/Anthropic.tsx](https://github.com/triggerdotdev/examples/blob/main/batch-llm-evaluator/src/components/evals/Anthropic.tsx)
+    - The `XAIEval` component: [src/components/evals/XAI.tsx](https://github.com/triggerdotdev/examples/blob/main/batch-llm-evaluator/src/components/evals/XAI.tsx)
+    - The `OpenAIEval` component: [src/components/evals/OpenAI.tsx](https://github.com/triggerdotdev/examples/blob/main/batch-llm-evaluator/src/components/evals/OpenAI.tsx)
+  - Each of these components then uses [useRealtimeRunWithStreams](https://trigger.dev/docs/frontend/react-hooks/realtime#userealtimerunwithstreams) to subscribe to the different LLM responses.
+
+<RealtimeLearnMore />
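
Reviewer note: the flow the new page describes for `evaluateModels` and `summarizeEvals` (fan out to one evaluator per model, wait for all of them, then derive tags from the responses) can be sketched in plain TypeScript. This is a minimal illustration, not the repository's code. It uses `Promise.all` as a stand-in for `batch.triggerByTaskAndWait`, and the `Provider`, `EvalResult`, and `Evaluator` types, as well as the tagging rule, are hypothetical.

```typescript
// Hypothetical stand-ins: in the example project each evaluator is a
// Trigger.dev task that calls a provider through the Vercel AI SDK.
type Provider = "openai" | "anthropic" | "xai";

interface EvalResult {
  provider: Provider;
  ok: boolean;
  output?: string;
  error?: string;
}

type Evaluator = (prompt: string) => Promise<string>;

// Fan out: start every evaluator at once and wait for all of them.
// Failures are captured per provider, so one failing model does not
// discard the other models' results.
async function evaluateModels(
  prompt: string,
  evaluators: Record<Provider, Evaluator>
): Promise<EvalResult[]> {
  const providers = Object.keys(evaluators) as Provider[];
  return Promise.all(
    providers.map(async (provider): Promise<EvalResult> => {
      try {
        const output = await evaluators[provider](prompt);
        return { provider, ok: true, output };
      } catch (err) {
        return { provider, ok: false, error: String(err) };
      }
    })
  );
}

// Fan in: derive placeholder "tags" from each response.
// The word-count rule is invented purely for illustration.
function summarizeEvals(results: EvalResult[]): Record<string, string[]> {
  const tags: Record<string, string[]> = {};
  for (const result of results) {
    if (!result.ok || result.output === undefined) {
      tags[result.provider] = ["failed"];
      continue;
    }
    const wordCount = result.output.trim().split(/\s+/).filter(Boolean).length;
    tags[result.provider] = [wordCount > 20 ? "verbose" : "concise"];
  }
  return tags;
}
```

In the real project the fan-out happens across separate task runs rather than in-process promises, which is what lets the frontend subscribe to each run individually.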

docs/mint.json

Lines changed: 4 additions & 1 deletion
@@ -348,7 +348,10 @@
       },
       {
         "group": "Example projects",
-        "pages": ["guides/example-projects/realtime-fal-ai"]
+        "pages": [
+          "guides/example-projects/realtime-fal-ai",
+          "guides/example-projects/batch-llm-evaluator"
+        ]
       },
       {
         "group": "Example tasks",
