
What is this git repository all about?

This git repo shares a minimal working demo of RAG (Retrieval-Augmented Generation) powered by the DeepSeek LLM and the LangChain framework.

Please Note - it is an offline system, meaning it is designed to run entirely on your local machine.

It implements an AI-powered question-and-answer system. This is how the user experience looks -

  1. The system allows the user to upload a PDF file.
  2. Once the file is uploaded successfully, the user can ask questions, and the system answers them based only on the uploaded document.
  3. The system is expected not to respond to questions that are not relevant to the uploaded document.

System Components

  1. User Interaction: Users interact with the system via a Streamlit UI.
  2. Document Processing: PDFs are uploaded, processed using PDFPlumber, and split into chunks.
  3. Embedding Generation: The DeepSeek model generates vector embeddings from document chunks.
  4. Vector Storage & Retrieval: FAISS stores and retrieves vector embeddings based on similarity search.
  5. LLM-Based Response: The retrieved chunks are sent to the DeepSeek LLM for answer generation.
  6. Final Answer Display: The AI-generated response is displayed in the Streamlit UI.
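
To make the flow concrete, here is a minimal sketch of such a pipeline built on LangChain's community integrations for PDFPlumber, Ollama, and FAISS. The file name, model tag, chunk sizes, and retrieval depth below are illustrative assumptions, not necessarily what this repo uses.

# Minimal RAG pipeline sketch (assumed dependencies: langchain,
# langchain-community, langchain-text-splitters, faiss-cpu, pdfplumber,
# plus an Ollama server running locally).
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Steps 1-2: load the uploaded PDF and split it into overlapping chunks.
docs = PDFPlumberLoader("uploaded.pdf").load()  # "uploaded.pdf" is a placeholder
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# Steps 3-4: embed the chunks and index them in FAISS for similarity search.
embeddings = OllamaEmbeddings(model="deepseek-r1:14b")  # model tag is an assumption
store = FAISS.from_documents(chunks, embeddings)

# Step 5: send the retrieved chunks to the DeepSeek LLM for answer generation.
llm = Ollama(model="deepseek-r1:14b")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever(search_kwargs={"k": 3}))

# Step 6: the Streamlit UI would display this result to the user.
print(qa.invoke({"query": "What is this document about?"})["result"])

Because the chain only sees the top-k retrieved chunks, answers stay grounded in the uploaded document; questions unrelated to it find no supporting context.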

How to set this up and run locally on your machine

Step 1 - Clone the repository and build the Docker image.

docker build -t poc_deepseek_rag:latest .

Step 2 - Run the container.

docker run --rm --name deepseek_qna -p 8501:8501 -p 11434:11434 poc_deepseek_rag:latest

Then access the app in your browser at http://localhost:8501/ (port 8501 serves the Streamlit UI; 11434 is the Ollama server).

Important

If your machine cannot handle the deepseek-r1:14b model version configured in the Dockerfile, visit the Ollama website and update the Dockerfile to use a lighter variant such as deepseek-r1:7b. Pick a configuration that matches the hardware available on your machine. To give an idea - on my MacBook with 16GB RAM, the 14b model performs with reasonable latency.

Alternatives

With the technology available today, you can build an AI-powered question-and-answer system in several ways -

  1. Offline RAG - as demonstrated in this repo.
  2. Online RAG - check out the companion Git repo and blog.
  3. Solely LLM-based - you can build this use case by relying solely on an LLM, for example via the OpenAI LLM APIs. See the other repo - AI Assisted Q&A prototype - which demonstrates how to achieve it in a non-RAG way (a minimal sketch follows below). In fact, in many cases that may produce far better results. Be mindful of the pros and cons of both approaches.
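
To illustrate the LLM-only alternative from point 3, here is a minimal non-RAG sketch against the OpenAI Chat Completions API. The model name, file path, and prompt wording are illustrative assumptions; note there is no retrieval step - the entire document rides along in the prompt, which only works while it fits in the model's context window.

# Non-RAG sketch: no embeddings, no vector store - the whole document
# is passed to the model in a single prompt (assumes the openai package
# and an OPENAI_API_KEY environment variable).
from openai import OpenAI

client = OpenAI()
document_text = open("document.txt").read()  # placeholder: pre-extracted text

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "Answer only from the document below. If the question "
                    "is unrelated to it, say you cannot answer.\n\n" + document_text},
        {"role": "user", "content": "What is this document about?"},
    ],
)
print(response.choices[0].message.content)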

References

Nariman Codes does an amazing job of explaining a RAG-with-LangChain prototype. The implementation in this repo is, in a nutshell, very similar, packaged within Docker: https://www.youtube.com/watch?v=M6vZ6b75p9k&list=PLp01ObP3udmq2quR-RfrX4zNut_t_kNot

Got any questions or feedback?

Feel free to reach out via LinkedIn.
