A collection of hands-on notebooks for LLM practitioners
Intended for familiarization and learning. The notebooks use the LangChain framework, LangSmith for tracing, OpenAI LLM models, and a Pinecone serverless vector database, all written in Python with Jupyter Notebook.
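A minimal sketch of how those pieces typically fit together; the model names, index name, and placeholder API keys below are assumptions for illustration, not values taken from these notebooks.

```python
import os

# LangSmith tracing is switched on through environment variables.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "..."   # placeholder
os.environ["OPENAI_API_KEY"] = "..."      # placeholder
os.environ["PINECONE_API_KEY"] = "..."    # placeholder

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

llm = ChatOpenAI(model="gpt-4o-mini")
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# "llm-notebooks" is a hypothetical serverless index name.
vectorstore = PineconeVectorStore(index_name="llm-notebooks", embedding=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# Retrieve context from Pinecone and answer with the OpenAI model;
# the whole run is traced in LangSmith.
question = "What does this collection cover?"
docs = retriever.invoke(question)
answer = llm.invoke(f"Answer using this context:\n{docs}\n\nQuestion: {question}")
print(answer.content)
```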
Notebooks for evaluating LLM outputs with a range of metrics, covering scenarios both with and without known ground truth. Criteria include correctness, coherence, relevance, and more, giving a systematic way to assess LLM performance.
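One way to run such criteria-based checks is with LangChain's built-in evaluators, sketched below under the assumption of an OpenAI judge model; the specific criteria and model choice are illustrative, not necessarily what these notebooks use.

```python
from langchain.evaluation import load_evaluator
from langchain_openai import ChatOpenAI

judge = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Reference-free: judge coherence of an answer against the prompt alone.
coherence_eval = load_evaluator("criteria", criteria="coherence", llm=judge)
print(coherence_eval.evaluate_strings(
    prediction="Paris is the capital of France.",
    input="What is the capital of France?",
))

# Reference-based: judge correctness against a known ground-truth answer.
correctness_eval = load_evaluator("labeled_criteria", criteria="correctness", llm=judge)
print(correctness_eval.evaluate_strings(
    prediction="Paris is the capital of France.",
    reference="Paris",
    input="What is the capital of France?",
))
```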
This repo contains my coding notebook for the tutorial series I created for the beginner-level bias bounty challenge hosted by Humane Intelligence, where I am an AI Ethics Fellow.
Scoring LLM chatbot responses from LMSYS Chatbot Arena with SCBN and RQTL metrics, unpacking Chatbot Arena prompts, setting up a quick chatbot in a Jupyter notebook, and more: a repo for all things chatbot.
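For the "quick chatbot setup" part, a minimal sketch with the OpenAI Python client is shown below; the model name and system prompt are assumptions for illustration, not the repo's actual configuration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_message: str) -> str:
    """Send one turn to the model and keep the running conversation history."""
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat("Summarize what the LMSYS Chatbot Arena is in one sentence."))
```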
Code, datasets, and replication notebook for the preprint Anchors in the Machine: Behavioral and Attributional Evidence of Anchoring Bias in LLMs. The project replicates and extends Tversky & Kahneman’s classic anchoring experiments across six open-source LLMs (GPT-2, GPT-Neo-125M, Falcon-RW-1B, Phi-2, Gemma-2B, LLaMA-2-7B).
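A hedged illustration of a behavioral anchoring probe in the spirit of those experiments: sample completions after a low versus a high numeric anchor and compare the extracted estimates. The prompts, parsing, and sampling settings below are assumptions and do not reproduce the preprint's protocol.

```python
import re
from transformers import pipeline

# GPT-2 is one of the six models listed above; used here only as a small example.
generator = pipeline("text-generation", model="gpt2")

def estimate(prompt: str) -> list[int]:
    """Sample a few completions and pull out the first number in each."""
    outputs = generator(prompt, max_new_tokens=10, num_return_sequences=5,
                        do_sample=True, pad_token_id=50256)
    numbers = []
    for out in outputs:
        match = re.search(r"\d+", out["generated_text"][len(prompt):])
        if match:
            numbers.append(int(match.group()))
    return numbers

low_anchor = ("Is the percentage of African countries in the UN higher or lower "
              "than 10%? My best estimate of the percentage is")
high_anchor = ("Is the percentage of African countries in the UN higher or lower "
               "than 65%? My best estimate of the percentage is")

print("low anchor:", estimate(low_anchor))
print("high anchor:", estimate(high_anchor))
```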