A security-first RAG framework featuring bi-directional PII masking and local embeddings. Designed for secure technical enablement and compliance-aware AI integration in high-discretion enterprise environments.


Secure AI RAG Framework

RAG Framework with Bi-Directional PII Masking

Project Overview

This project is a Secure AI RAG Framework designed to accelerate technical onboarding for complex software platforms. It provides a blueprint for deploying RAG-based (Retrieval-Augmented Generation) assistants in high-compliance environments, where protecting intellectual property and preventing PII leakage is crucial.

This gateway is built with a Security-First architecture, ensuring that sensitive enterprise data (API keys, secrets, PII) is neutralized before it ever reaches an inference engine.

System Architecture

Figure 1: Secure AI Enablement Gateway Architecture (bi-directional PII masking and local RAG pipeline flow).


Technical Approach & Architecture

1. Local-First Data Sovereignty

To align with the absolute requirement for data privacy in enterprise environments, the system utilizes:

  • Local Embeddings: Leveraging the sentence-transformers/all-MiniLM-L6-v2 model to vectorize documentation fragments locally. This ensures that a customer’s proprietary documentation is never transmitted to a third-party embedding provider.
  • Distributed Inference: The framework is configured to communicate with a remote inference node (hosted via LM Studio) over a private LAN, mimicking a secure VPC or air-gapped environment.

2. Masking Engine

Inspired by the logic of high-discretion administrative systems, the PII Privacy Gateway acts as a bi-directional shield:

  • Inbound Shielding: Uses hardened regex patterns to intercept user queries. If a developer accidentally includes an API_KEY or PASSWORD in their prompt, the gateway masks it in real-time.
  • Outbound Shielding: A final output parser scrubs the LLM’s response. This prevents "echo leaks", where the model repeats a sensitive value it saw in the prompt history, ensuring the final response delivered to the user is compliant.
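A minimal sketch of the bi-directional shield, assuming a regex-based gateway (the pattern, function names, and placeholder string are illustrative, not the exact ones in the source):

```python
import re

# Hypothetical pattern: catches KEY=value, KEY: value, and KEY value forms
SECRET_PATTERN = re.compile(r"(API_KEY|PASSWORD|SECRET)[\s=:]+(\S+)")
MASK = "MASKED_BY_GATEWAY"

def mask_inbound(query: str) -> str:
    """Inbound shielding: neutralize secrets before the prompt leaves the host."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}={MASK}", query)

def scrub_outbound(response: str, seen_secrets: set[str]) -> str:
    """Outbound shielding: remove echo leaks of values seen in prompt history."""
    for secret in seen_secrets:
        response = response.replace(secret, MASK)
    return response

query = "Log a dataset artifact. My API_KEY=12345-SECRET"
print(mask_inbound(query))
# -> Log a dataset artifact. My API_KEY=MASKED_BY_GATEWAY
```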

3. Modular Chain Orchestration

The orchestration is built using LCEL (LangChain Expression Language). This modular approach allows for:

  • Rapid Prototyping: The "pipe" operator (|) allows for quick insertion of additional guardrails or document loaders.
  • Observability: Each step of the chain can be logged (e.g., via W&B Prompts) to monitor for hallucinations or retrieval friction.

Critical Evaluation (For Post-Sales Engineering)

During the development of this asset, a critical edge case was identified: Pattern-Match Bypassing. Initial regex patterns failed to catch secrets when they were formatted without standard spacing (e.g., API_KEY=12345). I iterated on the masking engine to use a more aggressive character class ([\s=:]+), ensuring the gateway remains robust against "messy" real-world developer inputs. This proactive approach to Inference DLP (Data Loss Prevention) is essential for building trust with enterprise customers who are hesitant to adopt LLM workflows due to security concerns.
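The before/after can be reproduced in isolation (both patterns below are illustrative reconstructions, not the exact expressions used in the source):

```python
import re

# Initial pattern: only matched the spaced "API_KEY = value" form
naive = re.compile(r"API_KEY\s+=\s+(\S+)")
# Hardened pattern: [\s=:]+ absorbs any mix of spaces, '=' and ':'
hardened = re.compile(r"API_KEY[\s=:]+(\S+)")

messy = "here you go: API_KEY=12345"
print(bool(naive.search(messy)), bool(hardened.search(messy)))  # False True
```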


Key Features

  • Secure Ingestion: Automated scraping of Weights & Biases documentation into a local FAISS vector store.
  • Environment Isolation: Utilizes .env management to separate infrastructure configurations (Remote IPs) from the application logic.
  • Developer Acceleration: Provides executable Python code snippets for W&B Artifacts and Run logging.

Technical Stack

  • Orchestration: LangChain, LCEL
  • Language: Python 3.x
  • Embeddings: HuggingFace (Local)
  • Vector Store: FAISS
  • Inference: Local LLM (OpenAI-Compatible API)

Installation

This project is designed to run in a controlled virtual environment to ensure dependency stability.

1. Clone the Repository

git clone https://github.com/5tev3G/secure-ai-rag-framework.git
cd secure-ai-rag-framework

2. Environment Setup

Create a fresh virtual environment and install only the necessary top-level dependencies:

# macOS/Linux
python3 -m venv venv && source venv/bin/activate

# Windows
python -m venv venv
.\venv\Scripts\activate

# Install requirements
pip install --upgrade pip
pip install python-dotenv langchain langchain-community langchain-core \
            langchain-openai langchain-text-splitters langchain-huggingface \
            sentence-transformers faiss-cpu beautifulsoup4 tiktoken

3. Configuration

Create a .env file in the root directory to manage your infrastructure variables:

LLM_STUDIO_IP=192.168.x.xxx  # IP address of your local inference node
OPENAI_API_KEY=not-needed    # Required by LangChain but unused by local LLM


Usage

1. Initialize the Gateway

The gateway automatically scrapes documentation, generates local embeddings, and prepares the vector store upon execution. Ensure your local inference server (e.g., LM Studio) is running and accessible via the IP defined in your .env.

2. Run the Framework

Execute the main script to test the bi-directional masking and retrieval:

python secure_enablement_rag.py

3. Expected Output

The framework will demonstrate the Security Proxy in action. When a query contains a sensitive string, the gateway intercepts it before transmission:

Ingesting technical documentation...
Initializing local embedding model (sentence-transformers)...

Connected to secure inference node at: http://192.168.8.237:1234/v1

User Query: Give me a Python code snippet to log a dataset as an artifact in W&B. My API_KEY=12345-SECRET
Processing through Secure Gateway...

[DEBUG] Outbound Question to API: Give me a Python code snippet to log a dataset as an artifact in W&B. My API_KEY=MASKED_BY_GATEWAY

** Response **
Here's the Python code snippet to log a dataset as an artifact in W&B:

```python
import wandb

# Initialize a W&B run (authentication comes from your environment)
run = wandb.init(project="your-project-name", job_type="your-job-type")

# Create a dataset artifact object
artifact = wandb.Artifact(name="my-dataset-artifact", type="dataset")

# Add a file to the artifact (e.g., a CSV file)
artifact.add_file(local_path="path/to/dataset.csv", name="dataset.csv")

# Log the artifact to W&B
run.log_artifact(artifact)

# Close the run
run.finish()
```

In this code snippet:

  1. We initialize a W&B run using wandb.init().
  2. We create a dataset artifact object using wandb.Artifact(), specifying the name and type of the artifact.
  3. We add a file to the artifact using artifact.add_file(). In this example, we're adding a CSV file named "dataset.csv".
  4. We log the artifact to the run using run.log_artifact(artifact).
  5. Finally, we close the run with run.finish().
