This Validated Pattern deploys a Retrieval-Augmented Generation (RAG) Large Language Model (LLM) infrastructure on a Single Node OpenShift (SNO) cluster. It provides a GPU-accelerated environment for running LLM inference services using vLLM with both IBM Granite 4 Small and GPT-OSS 120B models.
In addition to the LLM inference services, the pattern deploys Qdrant as a vector database, pre-populated with the Validated Patterns documentation. A frontend application is also included, allowing users to select an LLM, configure retrieval settings, and query the complete RAG pipeline.
The diagram below shows the flow of a query through the pattern:

```mermaid
flowchart LR
    User[User Query] --> Frontend
    Frontend --> Qdrant[(Vector DB)]
    Qdrant --> |Relevant Docs| Frontend
    Frontend --> LLM[LLM Service]
    LLM --> |Response| Frontend
    Frontend --> User
```
The pattern provides the following workloads:

- IBM Granite 4 Small - Served via vLLM with GPU acceleration
- GPT-OSS 120B - Served via vLLM with GPU acceleration
- Qdrant - Vector database pre-populated with Validated Patterns documentation for retrieval
- RAG Frontend Application - Web interface for selecting an LLM, configuring retrieval settings, and querying the RAG pipeline
These workloads are backed by the following operators and services:

- Red Hat OpenShift AI (RHOAI) - AI/ML platform for model serving and management
- NVIDIA GPU Operator - Provides GPU support for the inference services
- Node Feature Discovery (NFD) - Identifies node hardware capabilities
- Local Volume Management Service (LVMS) - Manages local storage volumes
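Once the pattern is installed, a quick way to confirm these operators came up is to check their ClusterServiceVersions. A minimal sketch; the name fragments in the `grep` are heuristics and the actual CSV names may differ by release:

```sh
# Check that the operators' ClusterServiceVersions report Succeeded;
# the name fragments below are heuristics and may vary between releases
oc get csv -A | grep -iE 'gpu-operator|nfd|lvms|rhods'
```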
Deploying the pattern requires:

- OpenShift Cluster - Single Node OpenShift (SNO) deployment
- GPU Hardware - NVIDIA GPU-enabled node with sufficient VRAM for LLM inference (at least 80GB to run the GPT-OSS 120B model)
This pattern was developed and tested on a Lenovo ThinkSystem SR650a V4 with 2 NVIDIA RTX Pro 6000 GPUs. If your hardware does not meet these requirements, you will need to modify this pattern accordingly.
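Once the pattern has installed NFD and the GPU Operator, you can sanity-check that the node's GPUs are visible to Kubernetes. A minimal sketch; the `nvidia.com/gpu.present` label is applied by GPU Feature Discovery:

```sh
# Nodes labelled by GPU Feature Discovery as having NVIDIA GPUs
oc get nodes -l nvidia.com/gpu.present=true

# Allocatable GPU count reported by the NVIDIA device plugin
oc get nodes -o jsonpath='{.items[*].status.allocatable.nvidia\.com/gpu}'
```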
To install the pattern:

- Clone this repository:

  ```sh
  git clone https://github.com/validatedpatterns-sandbox/rag-llm-sno.git
  cd rag-llm-sno
  ```

- Log into your OpenShift cluster:

  ```sh
  export KUBECONFIG=/path/to/your/kubeconfig
  ```

  Or:

  ```sh
  oc login --token=<your-token> --server=<your-cluster-api>
  ```

- Install the pattern:

  ```sh
  ./pattern.sh make install
  ```
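The install command bootstraps OpenShift GitOps and hands the rest of the deployment to Argo CD, so it returns before everything is ready. One way to watch the rollout converge; the exact application names depend on the pattern's values files:

```sh
# Confirm you are logged into the right cluster, then watch the
# Argo CD applications created by the pattern reach Synced/Healthy
oc whoami --show-server
oc get applications.argoproj.io -A -w
```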
If your hardware differs from the tested configuration or you need to modify the pattern:
- Fork this repository and clone your fork:

  ```sh
  git clone https://github.com/<your-username>/rag-llm-sno.git
  cd rag-llm-sno
  ```

- Create a branch for your changes:

  ```sh
  git checkout -b my-customizations
  ```

- Make your modifications (e.g., adjust model configurations or resource limits); see the sketch after these steps for an example

- Commit and push your changes:

  ```sh
  git add .
  git commit -m "Customize pattern for my environment"
  git push -u origin my-customizations
  ```

- Log into your OpenShift cluster:

  ```sh
  export KUBECONFIG=/path/to/your/kubeconfig
  ```

  Or:

  ```sh
  oc login --token=<your-token> --server=<your-cluster-api>
  ```

- Install the pattern:

  ```sh
  ./pattern.sh make install
  ```
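The Validated Patterns framework typically derives the target repository and branch from your local git checkout, so running the install from your fork and branch deploys your customizations. As an illustration of the kind of edit the modification step refers to, here is a hypothetical values override; the file path and key names are assumptions (consult the repository's actual values files), with only `nvidia.com/gpu` being a standard Kubernetes resource name:

```yaml
# values-hub.yaml (hypothetical path and keys -- check the repo's real values files)
llmServing:
  granite:
    resources:
      limits:
        nvidia.com/gpu: "1"   # pin the smaller model to a single GPU
        memory: 64Gi
      requests:
        cpu: "8"
        memory: 48Gi
```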
After installation, access the pattern components from the application launcher (the bento box icon) in the OpenShift console:
- Cluster Argo CD / Prod ArgoCD - View the GitOps installation and sync status of the pattern
- RAG LLM Demo UI - Launch the frontend application
- Red Hat OpenShift AI - Access the RHOAI dashboard
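If the launcher entries have not appeared yet (they are created as the applications sync), the same endpoints can be found directly from the cluster's routes:

```sh
# List all exposed routes; the frontend, Argo CD, and RHOAI consoles appear here
oc get routes -A
```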
The RAG LLM Demo UI provides an interface to query the RAG pipeline:
- Select an LLM - Choose between the available models (IBM Granite 4 Small or GPT-OSS 120B)
- Configure Retrieval Settings - Adjust the search type (similarity, similarity_score_threshold, or mmr, i.e. maximal marginal relevance) and parameters such as the number of documents to retrieve
- Submit your query - Enter a question and view the response along with retrieved documents
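The models can also be exercised directly, since vLLM exposes an OpenAI-compatible API. A minimal sketch, assuming you have first located the inference route and model ID, both of which are deployment-specific placeholders here:

```sh
# Discover the served model IDs, then send a chat completion request.
# <vllm-route> stands in for the route host of one of the model services.
curl -ks https://<vllm-route>/v1/models

curl -ks https://<vllm-route>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<model-id-from-/v1/models>",
        "messages": [{"role": "user", "content": "What is a Validated Pattern?"}]
      }'
```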

