A production-grade MLOps system providing low-latency fraud detection with stateful feature hydration and human-in-the-loop explainability.
The system follows a Lambda Architecture pattern, decoupling the batch (offline) model-training path from the low-latency online inference path.
```mermaid
graph TD
    User[Payment Gateway] -->|POST /predict| API[FastAPI Inference Service]

    subgraph "Online Serving Layer (<50ms)"
        API -->|1. Hydrate Velocity| Redis[(Redis Feature Store)]
        API -->|2. Preprocess| Pipeline[Scikit-Learn Pipeline]
        Pipeline -->|3. Predict| Model[XGBoost Classifier]
        Model -->|4. Explain| SHAP[SHAP Engine]
    end

    subgraph "Offline Training Layer"
        Data[(Transactional Data)] -->|Batch Load| Trainer[Training Pipeline]
        Trainer -->|CV & Tuning| MLflow[MLflow Tracking]
        Trainer -->|Export Artifact| Registry[Model Registry]
    end

    API -->|Async Log| ShadowLogs[Shadow Mode Logs]
    API -->|Visualize| Dashboard[Analyst Dashboard]
```
Traditional stateless APIs struggle with "velocity features" (e.g., how many times has this user swiped in the last 24 hours?). Our engine uses Redis Sorted Sets (ZSET) to maintain rolling time windows, allowing feature hydration in <2ms.
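As a minimal sketch of the idea (the key format `velocity:<user_id>` and the function names below are illustrative, not the repository's actual API), each transaction is added to a per-user sorted set scored by timestamp, stale members are trimmed, and the window count is read with a single range query:

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

WINDOW_SECONDS = 24 * 60 * 60  # 24-hour rolling window


def record_transaction(user_id: str, txn_id: str) -> None:
    """Add a transaction to the user's velocity window, scored by its timestamp."""
    now = time.time()
    key = f"velocity:{user_id}"
    pipe = r.pipeline()
    pipe.zadd(key, {txn_id: now})
    # Drop members older than the window so the set stays small
    pipe.zremrangebyscore(key, 0, now - WINDOW_SECONDS)
    pipe.expire(key, WINDOW_SECONDS)
    pipe.execute()


def get_txn_count_24h(user_id: str) -> int:
    """Count transactions in the last 24 hours with a single ZCOUNT."""
    now = time.time()
    return r.zcount(f"velocity:{user_id}", now - WINDOW_SECONDS, now)
```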
To mitigate the risk of model drift or false positives, the system supports a Shadow Mode configuration. The model runs in production, receives real traffic, and logs decisions, but never blocks a transaction. This allows for risk-free A/B testing against legacy rule engines.
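Conceptually, the gate sits at the response layer rather than the model layer: the model always scores and logs, but its verdict only reaches the caller when shadow mode is off. A hypothetical sketch, assuming an environment flag named `SHADOW_MODE` (the flag and function names are illustrative):

```python
import logging
import os

logger = logging.getLogger("shadow")
SHADOW_MODE = os.getenv("SHADOW_MODE", "true").lower() == "true"


def decide(probability: float, threshold: float = 0.5) -> dict:
    """Return the API verdict; in shadow mode the model never blocks traffic."""
    model_decision = "DECLINE" if probability >= threshold else "APPROVE"
    # The model's real verdict is always logged for offline comparison
    # against the legacy rule engine.
    logger.info("shadow_decision=%s probability=%.4f", model_decision, probability)
    return {
        "decision": "APPROVE" if SHADOW_MODE else model_decision,
        "probability": probability,
        "shadow_mode": SHADOW_MODE,
    }
```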
Standard accuracy is misleading in fraud detection due to class imbalance. We implement a Recall-Constraint Strategy (Target: 80% Recall) ensuring the model captures the vast majority of fraud while maintaining a strict upper bound on False Positive Rates, as required by financial compliance.
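One way to operationalise this constraint (a sketch using scikit-learn, not necessarily the exact routine in `src/models/`) is to sweep the precision-recall curve and pick the largest decision threshold whose recall still meets the floor:

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def pick_threshold(y_true: np.ndarray, y_scores: np.ndarray,
                   min_recall: float = 0.80) -> float:
    """Return the largest threshold whose recall still meets the floor."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
    # precision/recall have one more entry than thresholds; align by dropping the last point
    meets_floor = recall[:-1] >= min_recall
    if not meets_floor.any():
        raise ValueError("No threshold satisfies the recall constraint")
    # Recall is non-increasing in the threshold, so the last qualifying threshold
    # typically yields the highest precision that still hits the recall target.
    return float(thresholds[meets_floor][-1])
```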
Engineered logic to handle new users with zero history by defaulting to global medians and "warm-up" priors, preventing the system from unfairly blocking legitimate first-time customers.
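A hypothetical illustration of that fallback (the feature names and `GLOBAL_MEDIANS` values are made up for the example; the real priors would be computed offline from the training set):

```python
# Global medians computed offline from historical transactions (illustrative values).
GLOBAL_MEDIANS = {"amt_mean_7d": 67.50, "txn_count_24h": 2.0, "dist_from_home_km": 12.3}


def hydrate_features(user_features: dict | None) -> dict:
    """Return per-user velocity features, falling back to global medians
    (warm-up priors) when the user has no transaction history yet."""
    if not user_features:  # brand-new user: no keys in the feature store
        return dict(GLOBAL_MEDIANS)
    # Fill any individually missing feature from the priors.
    return {k: user_features.get(k, default) for k, default in GLOBAL_MEDIANS.items()}
```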
The following metrics were achieved on a hold-out temporal test set (out-of-time validation):
| Metric | Result | Target / Benchmark |
|---|---|---|
| PR-AUC | 0.9245 | Excellent |
| Precision | 93.18% | Low False Positives |
| Recall | 80.06% | Target: 80% |
| Inference Latency (p95) | ~30ms | < 50ms |
- Docker & Docker Compose
- (Optional) Python 3.12+ (managed via `uv`)

- **Clone the Repository**

  ```bash
  git clone https://github.com/Sibikrish3000/realtime-fraud-engine.git
  cd realtime-fraud-engine
  ```

- **Launch the Stack**

  ```bash
  docker-compose up --build
  ```

  This starts the FastAPI Backend, Redis Feature Store, and Streamlit Dashboard.

- **Access the Services**
  - Analyst Dashboard: http://localhost:8501
  - API Documentation: http://localhost:8000/docs
  - Redis Instance: `localhost:6379`
```
src/
├── api/              # FastAPI service, schemas (Pydantic), and config
├── features/         # Redis feature store logic and sliding window constants
├── models/           # Training pipelines, metrics calculation, and XGBoost wrappers
├── frontend/         # Streamlit-based analyst workbench
├── data/             # Data ingestion and cleaning utilities
└── explainability.py # SHAP-based waterfall plots and global importance
```
See the Development & MLOps Guide for detailed instructions on training and local development.
`POST /v1/predict`

Request:

```bash
curl -X 'POST' \
  'http://localhost:8000/v1/predict' \
  -H 'Content-Type: application/json' \
  -d '{
    "user_id": "u12345",
    "trans_date_trans_time": "2024-01-20 14:30:00",
    "amt": 150.75,
    "lat": 40.7128,
    "long": -74.0060,
    "merch_lat": 40.7306,
    "merch_long": -73.9352,
    "category": "grocery_pos"
  }'
```

Response:
```json
{
  "decision": "APPROVE",
  "probability": 0.12,
  "risk_score": 12.0,
  "latency_ms": 28.5,
  "shadow_mode": false,
  "shap_values": {
    "amt": 0.05,
    "dist": 0.02
  }
}
```

- Kafka Integration: Transition to an asynchronous event-driven architecture for high-throughput stream processing.
- KServe Deployment: Migrate from standalone FastAPI to KServe for automated scaling and model versioning.
- Graph Features: Incorporate Neo4j-based features to detect fraud rings and synthetic identities.
Author: Sibi Krishnamoorthy
Machine Learning Engineer | Fintech & Risk Analytics

