A multi-model Retrieval-Augmented Generation (RAG) pipeline combining ColPali, Qdrant, and Qwen2-VL.

erkara/MultiModel-RAG-ColPali-Qdrant-Qwen

MultiModel-RAG-ColPali-Qdrant-Qwen

This repository contains two notebooks that demonstrate a vision-based Retrieval-Augmented Generation (RAG) pipeline built with ColPali, Qdrant, and Qwen2-VL. The pipeline retrieves document pages as images and then generates answers to user queries from the retrieved pages. Here is what we have:

ColPali: A state-of-the-art Vision Language Model (VLM) for document retrieval. By treating each PDF page as an image, ColPali removes the need for complicated OCR and layout detection pipelines. It generates multi-vector embeddings for each page and has shown significant improvements over traditional approaches on several benchmarks.
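To make "multi-vector embeddings" a bit more concrete, here is a minimal sketch of late-interaction (MaxSim) scoring, the ColBERT-style scheme ColPali uses to compare a query against a page. The function name `maxsim_score` and the tiny 2-dimensional toy vectors are assumptions for illustration only; real ColPali embeddings come from the model and are much larger.

```python
import numpy as np


def maxsim_score(query_emb: np.ndarray, page_emb: np.ndarray) -> float:
    """Late-interaction score between one query and one page.

    query_emb: (n_query_tokens, dim) array, one vector per query token.
    page_emb:  (n_patches, dim) array, one vector per image patch.
    """
    # Similarity of every query token against every page patch.
    sim = query_emb @ page_emb.T  # shape: (n_query_tokens, n_patches)
    # For each query token keep its best-matching patch, then sum over tokens.
    return float(sim.max(axis=1).sum())


# Toy data (hypothetical): 2 query tokens, pages with 3 and 2 patches.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
page_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
page_b = np.array([[0.5, 0.5], [0.6, 0.0]])

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)
```

Here `page_a` scores higher than `page_b` because each query token finds an exact match among its patches. Ranking pages by this score is roughly what retrieval without a vector store amounts to.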

Qdrant: A fast and scalable vector database. Qdrant supports multi-vector embeddings, which makes it a great fit for ColPali, since ColPali creates an embedding for each image patch. It's an open-source solution, with a free-tier option, that handles large-scale similarity search efficiently.

Qwen2-VL: You probably know this one. It is a well-known Vision Language Model from Alibaba, used here to generate detailed and contextually rich answers from the retrieved images.

There are two notebooks:

  • colpali_intro.ipynb: Sets up a retrieval pipeline using ColPali without requiring a vector store. It also includes interpretability features to visualize query-image similarities.

  • colpali_qdrant.ipynb: Extends the pipeline by integrating Qdrant to handle large-scale retrieval.
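The query-image similarity visualization mentioned for the first notebook can be thought of as a per-token heatmap over image patches. The sketch below shows the idea with NumPy only. It assumes patch embeddings are stored in row-major order over a regular grid with no extra special tokens, which may not match the actual model output; `similarity_map` and the toy 2x2 grid are hypothetical.

```python
import numpy as np


def similarity_map(
    token_emb: np.ndarray, patch_embs: np.ndarray, grid_shape: tuple
) -> np.ndarray:
    """Similarity of one query token to every patch, laid out as a 2D grid.

    token_emb:  (dim,) embedding of a single query token.
    patch_embs: (rows * cols, dim) patch embeddings, assumed row-major.
    """
    rows, cols = grid_shape
    sims = patch_embs @ token_emb  # one similarity per patch
    return sims.reshape(rows, cols)


# Toy page split into a 2x2 grid of patches (hypothetical values).
patch_embs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]])
token_emb = np.array([0.0, 1.0])

heat = similarity_map(token_emb, patch_embs, (2, 2))
# Grid position of the patch that best matches this query token.
hot = np.unravel_index(heat.argmax(), heat.shape)
```

Upsampling `heat` to the page resolution and overlaying it on the image gives a picture of which regions drive the match for a given query token.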

It would have been hard to put these notebooks together without some great resources. Check out the ColPali cookbooks, the Qdrant tutorial, and the Vespa blog for more cool stuff. I hope you enjoy!

Vision-based RAG Pipeline
