RAG_Pipeline

This repository provides a step-by-step walkthrough of a RAG (Retrieval-Augmented Generation) pipeline. The pipeline is implemented as a series of Jupyter notebooks. Follow the steps below to understand and run it.

Prerequisites

Before you begin, ensure you have the following installed:

Python 3.11.11
Jupyter Notebook
Required Python packages (listed in requirements.txt)
A .env file: rename .env.example to .env and fill in the required values
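
Before running the notebooks, it can help to confirm that no value in the environment file was left blank. The following is a minimal sketch using only the standard library. It assumes the file holds plain KEY=VALUE lines; the helper names `read_env` and `missing_keys` are hypothetical and not part of this repository.

```python
from pathlib import Path


def read_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the keys whose values are still empty, in sorted order."""
    return sorted(key for key, value in env.items() if not value)
```

Calling `missing_keys(read_env(Path(".env")))` returns an empty list once every value has been filled in.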

Setup

Clone the Repository

git clone https://github.com/devzohaib/RAG_Pipeline.git
cd RAG_Pipeline

Install Dependencies
```
pip install -r requirements.txt
```

Notebooks Overview

1. Data Preparation

Notebook: 1-Data_Collection.ipynb

Objective: Prepare and preprocess the dataset for the RAG pipeline.
Steps:
1. Load the dataset.
2. Clean and preprocess the text data.
3. Save the processed data for further use.
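
The three steps above can be sketched roughly as follows. This is an illustration, not the notebook's actual code: the record layout (a list of dicts with a `text` field), the cleaning rules, and the JSON Lines output format are all assumptions, and `clean_text` and `prepare` are hypothetical names.

```python
import json
import re
from pathlib import Path


def clean_text(text: str) -> str:
    """Strip HTML-like tags and collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)  # replace anything that looks like a tag
    text = re.sub(r"\s+", " ", text)      # collapse whitespace, including newlines
    return text.strip()


def prepare(records: list[dict], out_path: Path) -> list[dict]:
    """Clean each record's 'text' field, drop empty records, save as JSON Lines."""
    cleaned = []
    for record in records:
        text = clean_text(record.get("text", ""))
        if text:  # skip records that are empty after cleaning
            cleaned.append({**record, "text": text})
    with out_path.open("w", encoding="utf-8") as handle:
        for record in cleaned:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return cleaned
```

Writing one JSON object per line keeps the processed data easy to read back in batches during the embedding step.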

2. Data Embeddings & Storage

Notebook: 2-Data_Embedding_and_Storage.ipynb

Objective: Create embeddings for the processed dataset and store them in the vector store.
Steps:
1. Load a batch of the processed data.
2. Create embeddings for the batch using OpenAI's text-embedding-3-small embedding model.
3. Add the data and its embeddings to the vector store.
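
A rough sketch of the batch, embed, and store loop is shown below. The notebook's actual vector store is not named here, so `InMemoryVectorStore` is a hypothetical stand-in, and `batched`, `openai_embed`, and `index_texts` are illustrative names. The `openai_embed` function assumes the `openai` package is installed and that `OPENAI_API_KEY` is set in the environment.

```python
import math
from typing import Callable, Iterator

EMBEDDING_MODEL = "text-embedding-3-small"
Embedder = Callable[[list[str]], list[list[float]]]


def batched(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def openai_embed(texts: list[str]) -> list[list[float]]:
    """Embed one batch via the OpenAI API (requires network access and an API key)."""
    from openai import OpenAI  # imported lazily so the rest of the sketch runs offline
    client = OpenAI()
    response = client.embeddings.create(model=EMBEDDING_MODEL, input=texts)
    return [item.embedding for item in response.data]


class InMemoryVectorStore:
    """Hypothetical stand-in for a real vector store; keeps rows in a list."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, list[float]]] = []

    def add(self, texts: list[str], vectors: list[list[float]]) -> None:
        self.rows.extend(zip(texts, vectors))

    def search(self, query_vector: list[float], k: int = 3) -> list[str]:
        """Return the k stored texts most similar to the query (cosine similarity)."""
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(self.rows, key=lambda row: cosine(query_vector, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def index_texts(texts: list[str], embed: Embedder, store: InMemoryVectorStore,
                batch_size: int = 100) -> int:
    """Embed `texts` batch by batch, add them to `store`, return the row count."""
    for batch in batched(texts, batch_size):
        store.add(batch, embed(batch))
    return len(store.rows)
```

Passing the embedder in as a function means `index_texts(texts, openai_embed, store)` calls the real API, while a fake embedder can be substituted for offline experiments.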
