Welcome to the LLM & Semantic Search Playground! This repository is a hands-on guide and collection of scripts demonstrating how to interact with various Large Language Models (LLMs) and Embedding Models. It showcases the foundational concepts behind modern AI applications like RAG (Retrieval-Augmented Generation).
This project explores:
- Invoking different LLMs (Open-Source vs. Closed-Source).
- Using models via APIs (like OpenAI, Gemini) vs. running them locally.
- Generating text embeddings to capture semantic meaning.
- Performing semantic search using cosine similarity to find the most relevant documents.
Here are some screenshots showing the key scripts in action.
1. OpenAI Chat Model (`openaichatmodel.py`)
Shows a simple conversation with the GPT model via API.
2. Hugging Face Chat Model (`huggingface_chatmodel_local.py`)
Demonstrates running an open-source model locally on your machine.
3. Generating Embeddings (`embedding_openai_docs.py`)
Displays the numerical vector representations (embeddings) of text documents.
4. Semantic Search (`document_similarity.py`)
Shows the script taking a query and finding the most relevant document using cosine similarity.
This repository is structured around three key concepts of modern AI development:
- 🤖 LLM & Chat Model Invocation:
- Closed-Source (API-based): Scripts to interact with powerful models like OpenAI's GPT and Google's Gemini through their APIs.
- Open-Source (Local & API): Examples of running open-source models from platforms like Hugging Face and DeepSeek, both by downloading them locally and using their APIs.
- 🔍 Text Embedding Generation:
- Demonstrates how to convert text (documents and user queries) into numerical vectors (embeddings) using models like OpenAI's `text-embedding-ada-002` and local Hugging Face models. These embeddings capture the meaning of the text, not just the keywords.
- 💡 Semantic Search:
- A practical mini-project (`document_similarity.py`) that shows how embeddings are used. It converts a user's query into a vector and uses cosine similarity to find the most contextually relevant document from a knowledge base. This is the core mechanism behind vector databases and RAG applications.
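The concepts above can be illustrated with a minimal, self-contained sketch. The real scripts use trained OpenAI or Hugging Face embedding models; the `toy_embed` function below is a hypothetical character-frequency stand-in that only demonstrates the text-to-vector interface and the cosine-similarity retrieval step:

```python
import math

def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in embedder: a normalized character-frequency
    vector. Real embedding models produce dense vectors that capture
    meaning; this toy only demonstrates the text -> vector interface."""
    vocab = "abcdefghijklmnopqrstuvwxyz"
    counts = [text.lower().count(ch) for ch in vocab]
    total = sum(counts) or 1
    return [c / total for c in counts]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

docs = [
    "Virat Kohli is an Indian cricketer known for aggressive batting.",
    "Narendra Modi is the Prime Minister of India.",
]
query = "Narendra Modi is the Prime Minister of India."  # matches docs[1] exactly

query_vec = toy_embed(query)
scores = [cosine_similarity(query_vec, toy_embed(d)) for d in docs]
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])  # the exact-match document wins with score ~1.0
```

In the actual scripts, `toy_embed` would be replaced by a call to a real embedding model, and the document vectors would typically be computed once and stored up front — which is exactly what a vector database automates.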
- Core Framework: LangChain
- LLM/Chat Model Providers: OpenAI, Google (Gemini), Hugging Face, DeepSeek
- Embedding Models: OpenAI, Sentence-Transformers (from Hugging Face)
- Core Libraries: `langchain`, `python-dotenv`, `numpy`, `scikit-learn`
- Local Model Execution: `transformers`, `torch`
- Clone the repository:

  ```bash
  git clone https://github.com/jsonusuman351/langchainmodel.git
  cd langchainmodel
  ```
- Create and activate a virtual environment:

  ```bash
  # It is recommended to use Python 3.10 or higher
  python -m venv venv
  .\venv\Scripts\activate
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
- Set Up Environment Variables: To use API-based models (OpenAI, Gemini, etc.), you need to provide your API keys.
  - Create a file named `.env` in the root directory of the project.
  - Add your keys to this file like so:

    ```
    OPENAI_API_KEY="your-openai-api-key"
    GOOGLE_API_KEY="your-google-api-key"
    HF_TOKEN="your-huggingface-api-key"
    DEEPSEEK_API_KEY="your-deepseek-api-key"
    ```
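For context, `load_dotenv()` from `python-dotenv` (listed in the stack above) is what reads this file into the process environment at runtime. Conceptually it does something like the following simplified sketch; the real library also handles comments, quoting rules, and `export` prefixes, and `load_env_file` here is a hypothetical name used only for illustration:

```python
import os
import tempfile

def load_env_file(path: str) -> None:
    """Simplified sketch of python-dotenv's load_dotenv():
    parse KEY="value" lines and put them into os.environ."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"')

# Demo with a throwaway file standing in for the project's .env
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write('OPENAI_API_KEY="your-openai-api-key"\n')
    path = fh.name

load_env_file(path)
print(os.environ["OPENAI_API_KEY"])  # your-openai-api-key
os.unlink(path)
```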
This repository is a collection of standalone scripts. You can run each one to explore a specific concept.
These scripts show how to get responses from different models.
- OpenAI Chat Model (API):

  ```bash
  python chatModels/openaichatmodel.py
  ```
- Google Gemini Chat Model (API):

  ```bash
  python chatModels/gemini_chatmodel.py
  ```
- Hugging Face Chat Model (Local Download):
  Note: This will download the model (~1.5 GB) the first time you run it.

  ```bash
  python chatModels/huggingface_chatmodel_local.py
  ```
- DeepSeek Chat Model (API):

  ```bash
  python chatModels/deepseekchatmodel.py
  ```
These scripts demonstrate how to convert text into vector embeddings.
- Using OpenAI's API to embed documents:

  ```bash
  python EmbeddedModels/embedding_openai_docs.py
  ```
- Using a local Hugging Face model to embed text:
  Note: This will download the embedding model the first time you run it.

  ```bash
  python EmbeddedModels/embedding_hf_local.py
  ```
This script is a complete example of using embeddings for semantic search. It takes a user query, finds the most similar document from a list, and returns it.
- Run the semantic search demo:

  ```bash
  python document_similarity.py
  ```

  Example Interaction:

  ```
  (D:\Projects\langchainmodel\venv) D:\Projects\langchainmodel>python document_similarity.py
  tell me about narendra modi
  Narendra Modi is the Prime Minister of India known for his charismatic leadership and economic reforms.
  similarity score is: 0.6063302711097277
  ```
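The script itself is the authoritative version; as a hedged sketch, the core retrieval step behind a transcript like the one above typically looks like this with scikit-learn (already in this project's stack). The vectors below are hard-coded stand-ins — in `document_similarity.py` they would come from an embedding model — and the variable names are illustrative:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in vectors; in the real script these come from an embedding model.
doc_vectors = np.array([
    [0.9, 0.1, 0.0],  # e.g. a document about Narendra Modi
    [0.1, 0.8, 0.1],  # e.g. a document about cricket
])
query_vector = np.array([[0.85, 0.2, 0.05]])  # e.g. "tell me about narendra modi"

# cosine_similarity returns a (1, n_docs) matrix of scores in [-1, 1]
scores = cosine_similarity(query_vector, doc_vectors)[0]
best = int(np.argmax(scores))
print(f"best document index: {best}, similarity score is: {scores[best]}")
```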
Click to view the folder structure
```
langchainmodel/
│
├── LLMs/                              # Scripts for basic LLMs
│   └── llm_demo.py
│
├── chatModels/                        # Scripts for various chat models
│   ├── openaichatmodel.py             # (OpenAI API)
│   ├── gemini_chatmodel.py            # (Google Gemini API)
│   ├── huggingface_chatmodel_local.py # (Local open-source model)
│   └── ...
│
├── EmbeddedModels/                    # Scripts for text embedding models
│   ├── embedding_openai_docs.py       # (Using OpenAI API)
│   └── embedding_hf_local.py          # (Using local open-source model)
│
├── document_similarity.py             # Mini-project for semantic search
├── requirements.txt
├── .env                               # (create this for API keys)
└── README.md
```