A production-grade MLOps system providing low-latency fraud detection with stateful feature hydration and human-in-the-loop explainability.
The system follows a Lambda Architecture pattern, decoupling the batch (offline) model-training path from the low-latency online inference path.
```mermaid
graph TD
    User[Payment Gateway] -->|POST /predict| API[FastAPI Inference Service]

    subgraph "Online Serving Layer (<50ms)"
        API -->|1. Hydrate Velocity| Redis[(Redis Feature Store)]
        API -->|2. Preprocess| Pipeline[Scikit-Learn Pipeline]
        Pipeline -->|3. Predict| Model[XGBoost Classifier]
        Model -->|4. Explain| SHAP[SHAP Engine]
    end

    subgraph "Offline Training Layer"
        Data[(Transactional Data)] -->|Batch Load| Trainer[Training Pipeline]
        Trainer -->|CV & Tuning| MLflow[MLflow Tracking]
        Trainer -->|Export Artifact| Registry[Model Registry]
    end

    API -->|Async Log| ShadowLogs[Shadow Mode Logs]
    API -->|Visualize| Dashboard[Analyst Dashboard]
```
Traditional stateless APIs struggle with "velocity features" (e.g., how many times has this user swiped in the last 24 hours?). Our engine uses Redis Sorted Sets (ZSET) to maintain rolling time windows, allowing feature hydration in <2ms.
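As a minimal sketch of the idea (the key format `velocity:<user_id>` and the function names below are illustrative, not the repository's actual API), each transaction is added to a per-user sorted set scored by timestamp, stale members are trimmed, and the window count is read with a single range query:

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

WINDOW_SECONDS = 24 * 60 * 60  # 24-hour rolling window


def record_transaction(user_id: str, txn_id: str) -> None:
    """Add a transaction to the user's velocity window, scored by its timestamp."""
    now = time.time()
    key = f"velocity:{user_id}"
    pipe = r.pipeline()
    pipe.zadd(key, {txn_id: now})
    # Drop members older than the window so the set stays small
    pipe.zremrangebyscore(key, 0, now - WINDOW_SECONDS)
    pipe.expire(key, WINDOW_SECONDS)
    pipe.execute()


def get_txn_count_24h(user_id: str) -> int:
    """Count transactions in the last 24 hours with a single ZCOUNT."""
    now = time.time()
    return r.zcount(f"velocity:{user_id}", now - WINDOW_SECONDS, now)
```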
To mitigate the risk of model drift or false positives, the system supports a Shadow Mode configuration. The model runs in production, receives real traffic, and logs decisions, but never blocks a transaction. This allows for risk-free A/B testing against legacy rule engines.
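Conceptually, the gate sits at the response layer rather than the model layer: the model always scores and logs, but its verdict only reaches the caller when shadow mode is off. A hypothetical sketch, assuming an environment flag named `SHADOW_MODE` (the flag and function names are illustrative):

```python
import logging
import os

logger = logging.getLogger("shadow")
SHADOW_MODE = os.getenv("SHADOW_MODE", "true").lower() == "true"


def decide(probability: float, threshold: float = 0.5) -> dict:
    """Return the API verdict; in shadow mode the model never blocks traffic."""
    model_decision = "DECLINE" if probability >= threshold else "APPROVE"
    # The model's real verdict is always logged for offline comparison
    # against the legacy rule engine.
    logger.info("shadow_decision=%s probability=%.4f", model_decision, probability)
    return {
        "decision": "APPROVE" if SHADOW_MODE else model_decision,
        "probability": probability,
        "shadow_mode": SHADOW_MODE,
    }
```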
Standard accuracy is misleading in fraud detection due to class imbalance. We implement a Recall-Constraint Strategy (Target: 80% Recall) ensuring the model captures the vast majority of fraud while maintaining a strict upper bound on False Positive Rates, as required by financial compliance.
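One way to operationalise this constraint (a sketch using scikit-learn, not necessarily the exact routine in `src/models/`) is to sweep the precision-recall curve and pick the largest decision threshold whose recall still meets the floor:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def pick_threshold(y_true: np.ndarray, y_scores: np.ndarray,
                   min_recall: float = 0.80) -> float:
    """Return the largest threshold whose recall still meets the floor."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
    # precision/recall have one more entry than thresholds; align by dropping the last point
    meets_floor = recall[:-1] >= min_recall
    if not meets_floor.any():
        raise ValueError("No threshold satisfies the recall constraint")
    # Recall is non-increasing in the threshold, so the last qualifying threshold
    # typically yields the highest precision that still hits the recall target.
    return float(thresholds[meets_floor][-1])
```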
Engineered logic to handle new users with zero history by defaulting to global medians and "warm-up" priors, preventing the system from unfairly blocking legitimate first-time customers.
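A hypothetical illustration of that fallback (the feature names and `GLOBAL_MEDIANS` values are made up for the example; the real priors would be computed offline from the training set):

```python
# Global medians computed offline from historical transactions (illustrative values).
GLOBAL_MEDIANS = {"amt_mean_7d": 67.50, "txn_count_24h": 2.0, "dist_from_home_km": 12.3}


def hydrate_features(user_features: dict | None) -> dict:
    """Return per-user velocity features, falling back to global medians
    (warm-up priors) when the user has no transaction history yet."""
    if not user_features:  # brand-new user: no keys in the feature store
        return dict(GLOBAL_MEDIANS)
    # Fill any individually missing feature from the priors.
    return {k: user_features.get(k, default) for k, default in GLOBAL_MEDIANS.items()}
```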
The following metrics were achieved on a hold-out temporal test set (out-of-time validation):
| Metric | Result | Target / Benchmark |
|---|---|---|
| PR-AUC | 0.9245 | Excellent |
| Precision | 93.18% | Low False Positives |
| Recall | 80.06% | Target: 80% |
| Inference Latency (p95) | ~30ms | < 50ms |
- Docker & Docker Compose
- (Optional) Python 3.12+ (managed via `uv`)

- **Clone the Repository**

  ```bash
  git clone https://github.com/Sibikrish3000/realtime-fraud-engine.git
  cd realtime-fraud-engine
  ```

- **Launch the Stack**

  ```bash
  docker-compose up --build
  ```

  This starts the FastAPI Backend, Redis Feature Store, and Streamlit Dashboard.

- **Access the Services**
  - Analyst Dashboard: http://localhost:8501
  - API Documentation: http://localhost:8000/docs
  - Redis Instance: `localhost:6379`
```
src/
├── api/              # FastAPI service, schemas (Pydantic), and config
├── features/         # Redis feature store logic and sliding window constants
├── models/           # Training pipelines, metrics calculation, and XGBoost wrappers
├── frontend/         # Streamlit-based analyst workbench
├── data/             # Data ingestion and cleaning utilities
└── explainability.py # SHAP-based waterfall plots and global importance
```
See the Development & MLOps Guide for detailed instructions on training and local development.
`POST /v1/predict`

Request:

```bash
curl -X 'POST' \
  'http://localhost:8000/v1/predict' \
  -H 'Content-Type: application/json' \
  -d '{
    "user_id": "u12345",
    "trans_date_trans_time": "2024-01-20 14:30:00",
    "amt": 150.75,
    "lat": 40.7128,
    "long": -74.0060,
    "merch_lat": 40.7306,
    "merch_long": -73.9352,
    "category": "grocery_pos"
  }'
```

Response:
```json
{
  "decision": "APPROVE",
  "probability": 0.12,
  "risk_score": 12.0,
  "latency_ms": 28.5,
  "shadow_mode": false,
  "shap_values": {
    "amt": 0.05,
    "dist": 0.02
  }
}
```

- Kafka Integration: Transition to an asynchronous event-driven architecture for high-throughput stream processing.
- KServe Deployment: Migrate from standalone FastAPI to KServe for automated scaling and model versioning.
- Graph Features: Incorporate Neo4j-based features to detect fraud rings and synthetic identities.
Author: Sibi Krishnamoorthy
Machine Learning Engineer | Fintech & Risk Analytics

