Owner: Ananya Shukla
Status: Active development (production-aware prototype)
Focus: Fraud detection as a decision system, not a standalone classifier
Real-world fraud detection violates many assumptions of standard machine learning pipelines:
- Extreme class imbalance (fraud prevalence ≪ 1%)
- Delayed, noisy, and asymmetric labels (e.g., chargebacks)
- Regulatory and audit requirements for explainability
- Strict latency constraints for real-time decisions
- Unequal and business-critical error costs (false positives vs. false negatives)
Most portfolio projects ignore these constraints and frame fraud as a static classification task.
This project intentionally does not.
The goal is to design and implement a production-aware fraud detection system that mirrors how real organizations deploy, monitor, govern, and retrain ML-driven risk decisions—while explicitly documenting what cannot be replicated without institutional access.
What this project is:
- An end-to-end fraud decision engine
- A cost-sensitive, explainable ML system
- A simulation of delayed-label reality and temporal leakage constraints
- An MLOps-oriented system emphasizing lifecycle ownership
- A system designed to be auditable, monitorable, and retrainable
What this project is not:
- A Kaggle-style notebook
- A single “best model” benchmark
- A purely real-time demo without decision logic
- A claim of real production deployment or regulatory approval
The emphasis is on correct system design and trade-offs, not inflated performance metrics.
```mermaid
flowchart TB
    A[Incoming Transaction Stream]
    A --> B[Feature Pipeline<br/>Stateless & Versioned]
    B --> C[Fraud Model<br/>Cost-Sensitive]
    C --> D[Explainability Engine<br/>SHAP]
    C --> E[Risk Score]
    E --> F[Decision Policy Layer]
    F -->|Approve| G[Auto Approve]
    F -->|Step-Up| H[Additional Authentication]
    F -->|Block| I[Manual Review Queue]
    I --> J[Analyst Feedback]
    J --> K[Label Store<br/>Delayed Ground Truth]
    K --> L[Retraining Pipeline]
    L --> C
```
Design principles:
- Accuracy is not a primary metric under extreme imbalance
- Decisions are optimized for expected financial loss
- Explainability is a first-class artifact
- Time and label availability are modeled explicitly
- Predictions must be auditable and reproducible
- System realism is prioritized over algorithm novelty
Each component is built incrementally and versioned independently.
Data foundation:
- IEEE-CIS dataset ingestion
- Unified transaction event table
- Simulated label-delay distribution
- Time-aware train / validation / stream splits
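The label-delay simulation and time-aware splitting can be sketched as below. This is an illustrative sketch, not the project's actual implementation: the exponential delay distribution, the mean of 30 days, and the 60/20/20 split ratios are all assumptions, and the synthetic hourly events stand in for the IEEE-CIS transaction table.

```python
import random
from datetime import datetime, timedelta

random.seed(0)

def simulate_label_delay(event_time, mean_delay_days=30):
    """A label (e.g. a chargeback) becomes observable only after a random delay."""
    delay = random.expovariate(1.0 / mean_delay_days)
    return event_time + timedelta(days=delay)

start = datetime(2024, 1, 1)
events = [start + timedelta(hours=i) for i in range(1000)]  # synthetic event times
label_times = [simulate_label_delay(t) for t in events]

# Time-aware split: train on the oldest events, validate on the middle,
# replay the newest as a pseudo-live stream. Never shuffle across time.
n = len(events)
train = events[: int(n * 0.6)]
valid = events[int(n * 0.6): int(n * 0.8)]
stream = events[int(n * 0.8):]

# At the training cutoff, only labels that have already matured are usable.
cutoff = train[-1]
mature = [t for t, lt in zip(events, label_times) if lt <= cutoff]
print(f"{len(mature)} of {n} labels mature at cutoff {cutoff:%Y-%m-%d}")
```

The key property is that the training set sees only labels that would actually have arrived by the cutoff, which is what makes the evaluation honest about delayed ground truth.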
Feature engineering:
- Rolling-window aggregates
- Velocity and frequency features
- Strict leakage prevention
- Feature computation aligned with event time
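A minimal sketch of an event-time-aligned velocity feature, "transactions by the same card in the previous hour". The field names and one-hour window are illustrative; the leakage rule it demonstrates is the real one: the feature for an event may only use events that happened strictly before it.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def card_velocity_1h(events):
    """events: list of (timestamp, card_id), sorted by timestamp."""
    window = defaultdict(deque)  # card_id -> timestamps inside the 1h window
    feats = []
    for ts, card in events:
        q = window[card]
        while q and q[0] <= ts - timedelta(hours=1):
            q.popleft()          # expire events older than one hour
        feats.append(len(q))     # count uses only strictly earlier events
        q.append(ts)             # current event enters the window afterwards
    return feats

t0 = datetime(2024, 1, 1)
events = [(t0, "c1"), (t0 + timedelta(minutes=10), "c1"),
          (t0 + timedelta(minutes=20), "c2"), (t0 + timedelta(minutes=65), "c1")]
print(card_velocity_1h(events))  # -> [0, 1, 0, 1]
```

Appending the current event only after its feature is computed is what prevents the classic leak of a transaction "seeing" itself.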
Modeling:
- Interpretable baseline (logistic regression)
- Imputation-aware pipelines
- Precision–Recall–centric evaluation
- Explicit handling of delayed labels
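The baseline can be sketched with scikit-learn as below. The data is synthetic and the feature count, missingness rate, and class balance are placeholders; what the sketch shows is the shape of the pipeline: imputation inside the estimator (so it is fit only on training folds), class weighting for imbalance, and PR-AUC rather than accuracy as the headline metric.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic imbalanced data with missing values (stands in for real features).
n = 5000
X = rng.normal(size=(n, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=2.0, size=n) > 4.5).astype(int)
X[rng.random(X.shape) < 0.05] = np.nan  # ~5% missing entries

clf = make_pipeline(
    SimpleImputer(strategy="median"),   # imputation lives inside the pipeline
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
clf.fit(X[:4000], y[:4000])             # time-ordered split: oldest 80% trains
scores = clf.predict_proba(X[4000:])[:, 1]
ap = average_precision_score(y[4000:], scores)
print(f"positive rate: {y.mean():.3%}, PR-AUC: {ap:.3f}")
```

PR-AUC is reported because under extreme imbalance a model can reach very high accuracy while catching no fraud at all.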
Decision layer:
- Cost-based risk-to-action mapping
- Threshold tuning under asymmetric costs
- Separation of prediction and business decision logic
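A sketch of the separation between prediction and business decision: the model emits a risk score, and a policy maps it to approve / step-up / block by minimizing expected cost per transaction. Every number below (the dollar costs and the 90% step-up effectiveness) is an illustrative placeholder, not a project constant.

```python
# Illustrative per-outcome costs; real values come from the business.
COSTS = {
    "approve_fraud": 200.0,   # missed fraud: chargeback loss
    "stepup_friction": 2.0,   # extra authentication annoys the customer
    "block_legit": 15.0,      # false positive: lost sale plus review cost
}

def expected_cost(action, p_fraud):
    if action == "approve":
        return p_fraud * COSTS["approve_fraud"]
    if action == "step_up":
        # Assume step-up stops 90% of fraud but always adds friction.
        return COSTS["stepup_friction"] + 0.1 * p_fraud * COSTS["approve_fraud"]
    if action == "block":
        return (1 - p_fraud) * COSTS["block_legit"]
    raise ValueError(action)

def decide(p_fraud):
    """Pick the action with the lowest expected cost for this transaction."""
    return min(("approve", "step_up", "block"),
               key=lambda a: expected_cost(a, p_fraud))

for p in (0.001, 0.05, 0.6):
    print(p, decide(p))  # low risk approves, mid risk steps up, high risk blocks
```

Because the thresholds fall out of the cost structure rather than being hand-tuned, changing a cost assumption changes the policy without retraining the model.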
Inference service:
- FastAPI inference service
- Production model loading via registry
- Health and scoring endpoints
- Request-level latency-safe inference
Monitoring:
- Feature drift via Population Stability Index (PSI)
- Prediction drift monitoring
- Stream vs. train distribution comparisons
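The PSI check can be sketched as below: bin the training (expected) distribution, then compare live (actual) bin proportions against it. The bin edges and the common "alert above 0.2" convention are generic choices, not project-specific values.

```python
import math

def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples over fixed bin edges."""
    def proportions(xs):
        counts = [0] * (len(edges) + 1)
        for x in xs:
            i = sum(x > e for e in edges)  # index of the bin containing x
            counts[i] += 1
        return [c / len(xs) for c in counts]
    p, q = proportions(expected), proportions(actual)
    # eps guards against log(0) when a bin is empty in one sample.
    return sum((pi - qi) * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q))

train_scores = [i / 100 for i in range(100)]          # uniform reference sample
drifted = [min(1.0, x + 0.3) for x in train_scores]   # shifted live sample
edges = [0.2, 0.4, 0.6, 0.8]
print(f"self PSI:  {psi(train_scores, train_scores, edges):.4f}")
print(f"drift PSI: {psi(train_scores, drifted, edges):.4f}")
```

The same function applies to both feature drift (inputs) and prediction drift (risk scores), which is why the monitoring layer can share one utility.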
Retraining:
- Mature-label data selection
- Candidate vs. production model evaluation
- Automated promotion via PR-AUC improvement
- Model registry updates
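The promotion gate can be sketched as below. The in-memory registry dict and the 0.02 minimum PR-AUC margin are illustrative stand-ins for the real registry and whatever improvement threshold the project settles on.

```python
MIN_IMPROVEMENT = 0.02  # candidate must beat production by this PR-AUC margin

# Stand-in for the model registry; the real one persists artifacts on disk.
registry = {"production": {"version": "v3", "pr_auc": 0.41}}

def promote_if_better(candidate_version, candidate_pr_auc):
    """Promote the candidate only on a clear PR-AUC win over production."""
    prod = registry["production"]
    if candidate_pr_auc >= prod["pr_auc"] + MIN_IMPROVEMENT:
        registry["production"] = {"version": candidate_version,
                                  "pr_auc": candidate_pr_auc}
        return True
    return False

promote_if_better("v4", 0.415)  # within noise of production: rejected
promote_if_better("v5", 0.47)   # clear improvement: promoted
print(registry["production"])
```

Requiring a margin rather than any improvement keeps the registry from churning on evaluation noise, since both models are scored on the same mature-label slice.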
Audit & governance:
- Request-level audit logging
- Unique request IDs
- Model artifact path + immutable SHA256 hash
- Explainability linkage
- Size-based audit log rotation
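One audit record can be sketched as below: every scoring request gets a unique ID and is tied to the exact model artifact by its SHA-256 hash, so any past decision can be traced to the bytes that produced it. The record fields and the artifact path shown are illustrative, and the serialized model is a placeholder for the real joblib file.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone

def sha256_of(data: bytes) -> str:
    """Immutable fingerprint of the model artifact bytes."""
    return hashlib.sha256(data).hexdigest()

model_bytes = b"serialized-model-placeholder"  # stands in for the joblib artifact

def audit_record(features, risk_score, decision):
    return {
        "request_id": str(uuid.uuid4()),       # unique per request
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_artifact": "models/registry/fraud_v3.joblib",  # illustrative path
        "model_sha256": sha256_of(model_bytes),
        "features": features,
        "risk_score": risk_score,
        "decision": decision,
    }

rec = audit_record({"amount": 120.0}, 0.07, "approve")
print(json.dumps(rec, indent=2))
```

For the size-based rotation bullet, the standard library's `logging.handlers.RotatingFileHandler` already provides size-triggered rollover, so no custom rotation code is needed.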
This project intentionally does not claim to replicate:
- Real chargeback or dispute pipelines
- Legal responsibility or regulatory approval
- Live customer friction costs
- Production SLAs or on-call operations
- Organization-specific fraud heuristics
These limitations are explicitly acknowledged, not ignored.
The repository is organized to reflect a real-world ML system rather than a model-centric workflow.
```text
fraud-detection-mlops/
├── docs/            # Design notes and diagrams
├── data/            # Ingestion, label delay simulation, time splits
├── features/        # Feature engineering pipelines
├── models/          # Training, evaluation, registry
├── decision/        # Cost-based decision policies
├── explainability/  # SHAP-based explanations
├── monitoring/      # Drift detection utilities
├── retraining/      # Retraining and promotion logic
├── api/             # FastAPI inference service
├── audit/           # Request-level audit logging
└── README.md
```
Development philosophy:
- Structure precedes implementation
- Each system phase is developed and committed independently
- Design changes are preserved in Git history
- No silent assumptions or hidden shortcuts
Tech stack:
- Python
- scikit-learn
- SHAP
- FastAPI
- Joblib
- Git + GitHub
- VS Code
Specific libraries may evolve as the system matures; architectural intent will not.
- v1: System design, end-to-end MLOps pipeline, real-time inference, retraining, drift monitoring, and governance completed