Intro to Q&A Systems with Large Language Models

Setting Up Dependencies

source setup.sh

And when done,

deactivate

Setting up environment variables

Create a .env file in this repo. Add your API keys and the secrets needed to download your data there:

OPENAI_API_KEY=
ANTHROPIC_API_KEY=
MLOPS_DATA_URL=
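
Before running the labs, it can help to confirm that the three variables above are actually filled in. The following is a minimal sketch using only the standard library; it assumes plain KEY=VALUE lines, and the helper names (parse_env_text, missing_keys) are hypothetical rather than part of this repo. The course code may load the file differently.

```python
from pathlib import Path

# Variable names taken from the .env template above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "MLOPS_DATA_URL")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env_text(text))
    print("Missing:", ", ".join(missing) if missing else "none")
```

Running this from the repo root prints the names of any variables that still need a value.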

Hello Milo

streamlit run introduction/hello_milo.py

Course Proof-of-concept Prototype

The proof of concept prototype for the course is in the folder poc/:

First run the notebook here to understand the code.
Then, run the PoC:
a. First download the data with python poc/download_chats.py.
b. Then, build the index with the data pre-processing pipeline in python poc/build_index.py.
c. Run the Milo assistant with streamlit run poc/milo.py.
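
The PoC steps have to run in order, since each one depends on the output of the previous one. As a rough sketch, they could be chained from a single Python script that stops at the first failure. The commands are the ones listed above; the run_steps helper and its runner parameter are hypothetical and not part of this repo.

```python
import subprocess

# Commands copied from the PoC steps above, in the order they must run.
POC_STEPS = [
    ["python", "poc/download_chats.py"],
    ["python", "poc/build_index.py"],
    ["streamlit", "run", "poc/milo.py"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; raise on the first non-zero exit code."""
    completed = []
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            raise RuntimeError(f"Step failed: {' '.join(cmd)}")
        completed.append(cmd)
    return completed
```

Calling run_steps(POC_STEPS) from the repo root would download the data, build the index, and then start the Streamlit app, which keeps running until it is stopped.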

Optional Labs

Hello Milo

A simple MLOps Q&A bot using OpenAI directly. Note: DOES NOT USE RETRIEVAL-AUGMENTED GENERATION.

streamlit run introduction/hello_milo.py

Q&A on Video

A Q&A bot that answers questions based on a video transcript. Note: there is no retrieval step over an index here.

This is the simplest form of RAG, where the entire transcript is the retrieved context. Since transcripts are long, we need an LLM with a large context window; for this we use Anthropic's Claude.

Make sure you have your ANTHROPIC_API_KEY set in your .env file.

streamlit run video/video_milo.py

For example, use https://www.youtube.com/watch?v=0e5q4zCBtBs and ask questions about the panel discussion.
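
To illustrate what passing an entire transcript as context means, here is a minimal sketch that builds a single prompt from a transcript and a question, then checks a rough size estimate against a context window. The prompt wording, the four-characters-per-token heuristic, and the function names are all assumptions for illustration; video_milo.py may build its prompt differently, and real tokenizers give different counts.

```python
def build_prompt(transcript, question):
    """Place the whole transcript in the prompt, followed by the question."""
    return (
        "Answer the question using only the transcript below.\n\n"
        f"<transcript>\n{transcript.strip()}\n</transcript>\n\n"
        f"Question: {question.strip()}"
    )


def rough_token_count(text, chars_per_token=4):
    """Very rough size estimate (assumed heuristic, not a real tokenizer)."""
    return len(text) // chars_per_token + 1


def fits_in_window(prompt, window_tokens, reserve_for_answer=1000):
    """Check that the prompt plus room for the answer fits the window."""
    return rough_token_count(prompt) + reserve_for_answer <= window_tokens
```

If fits_in_window returns False for the model's window size, the transcript would need to be shortened or split, which is where an index-based retrieval step becomes useful.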

Q&A from blog articles

Another example of RAG, this time on blog data: we answer questions based on publicly available blog posts.

a. First download the data with python blog/download_blogs.py.
b. Then, build the index with the data pre-processing pipeline in python blog/build_index.py.
c. Run the Milo assistant with streamlit run blog/blog_milo.py.
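
The download, index, and query steps above follow the usual RAG shape: split the text into chunks, index the chunks, and retrieve the most relevant ones for each question. The sketch below shows that shape with a toy word-count index and cosine similarity from the standard library. It is a stand-in for illustration only; it is not the code in blog/build_index.py, which may well use embeddings and a vector store, and all function names here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, max_words=50):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_index(chunks):
    """Toy index: a list of (chunk, word-count vector) pairs."""
    return [(chunk, Counter(tokenize(chunk))) for chunk in chunks]


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(index, question, k=2):
    """Return the k chunks most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

The retrieved chunks would then be placed in the prompt as context for the LLM, instead of the full blog text.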

You can also change the blog in download_blogs.py:

PAGES = [
    "https://mlops.community/building-the-future-with-llmops-the-main-challenges/",
]

NOTE: the HTML page contains a lot of data beyond the article text. This is where data cleanup comes in. Feel free to clean up the data manually or with a script to see improved performance.
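
As one possible starting point for a cleanup script, the sketch below strips scripts, styles, and navigation elements from a page and keeps only the visible text, using Python's built-in html.parser. The list of tags to skip is an assumption and would need tuning for the actual blog pages; the class and function names are hypothetical and not part of this repo.

```python
from html.parser import HTMLParser

# Tags whose contents are assumed to be noise for Q&A purposes.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "noscript"}


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring anything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text outside skipped tags, collapsing whitespace.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(" ".join(data.split()))


def html_to_text(html):
    """Return the visible text of an HTML document, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Running html_to_text on the downloaded page before building the index keeps menus, scripts, and footers out of the retrieved context.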

sourishkrout/course-intro-to-qa-systems-with-llms
