LLMaJ for evaluation of InstructLab fine-tuned LLM with and without RAG #177

ktam3 · 2024-11-14T16:51:17Z

Feature Overview (mandatory - Complete while in New status)

LLM-as-Judge (LLMaJ) pipeline for the side-by-side evaluation of POCs using InstructLab fine-tuned LLMs with and without RAG.

Goals (mandatory - Complete while in New status)
Provide an LLMaJ pipeline users can execute for evaluation comparing one or more combinations:

Starter model
Starter model w/RAG
InstructLab fine-tuned model
InstructLab fine-tuned model w/RAG
External LLM provider (e.g. OpenAI)

The LLMaJ model should be served by an OpenAI-compatible endpoint such that the user can use GPT4 or a custom or on-premise LLMaJ model (e.g. Mixtral)

Requirements (mandatory -_ Complete while in Refinement status):

The LLMaJ pipeline for evaluating models with and without RAG should be available for downstream users.
It should store the evaluations in a format that can be consumed by data scientists interested in further analysis (e.g., parquet table)

Done - Acceptance Criteria (mandatory - Complete while in Refinement status):

The pipeline is available for RHEL AI users
The docs should include information on how to use/run this new pipeline

Out of Scope {}(Initial completion while in Refinement status):{}

This is not about an exhaustive evaluation of RAG methodologies.
This is not about creating a LoRA layer or fine-tuned model for LLMaJ

Tasks Needed:

Create design doc for LLMaJ #178

ktam3 added the enhancement New feature or request label Nov 14, 2024

ktam3 assigned alimaredia Nov 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LLMaJ for evaluation of InstructLab fine-tuned LLM with and without RAG #177

LLMaJ for evaluation of InstructLab fine-tuned LLM with and without RAG #177

ktam3 commented Nov 14, 2024 •

edited

Loading

LLMaJ for evaluation of InstructLab fine-tuned LLM with and without RAG #177

LLMaJ for evaluation of InstructLab fine-tuned LLM with and without RAG #177

Comments

ktam3 commented Nov 14, 2024 • edited Loading

ktam3 commented Nov 14, 2024 •

edited

Loading