
DAFU - Data Analytics Functional Utilities

License: AGPL v3 Python FastAPI Docker Kubernetes

DAFU is a comprehensive fraud detection and e-commerce analytics platform designed for enterprise deployment. Currently in active development, it provides advanced machine learning-based fraud detection capabilities with a focus on anomaly detection and sequence analysis.

🌟 Platform Overview

DAFU is a fraud detection platform that combines multiple machine learning algorithms to provide comprehensive fraud detection and prevention solutions. The platform is built with modern technologies and follows enterprise best practices, with core ML capabilities fully implemented and enterprise features in development.

Current Capabilities (Implemented)

  • πŸš€ Unified CLI with API Integration: All-in-one command-line interface for authentication, logs, reports, products, and ML models NEW
  • πŸ” Authentication & User Management: JWT-based auth with role-based access control (RBAC) NEW
  • πŸ“‹ Logging System: Structured logging with analytics and statistics NEW
  • πŸ“Š Report Management: Fraud detection report generation and tracking NEW
  • πŸ›οΈ Product Risk Management: E-commerce product management with fraud risk tracking NEW
  • 🎯 Unified Model Interface: Single entry point for all fraud detection models
  • 🧠 Advanced ML Algorithms: Isolation Forest and LSTM/GRU sequence models fully implemented
  • πŸ“‘ Stream Processing: Real-time data stream processing with pre-trained models
  • πŸ’Ύ Model Persistence: Save and load trained models for production deployment
  • πŸ”€ Dual Prediction Modes: Both batch and stream prediction capabilities
  • 🌐 FastAPI Backend: Complete REST API with auth, logs, reports, products endpoints NEW
  • πŸ—„οΈ Database Layer: PostgreSQL with SQLAlchemy ORM, complete schema NEW
  • 🐳 Docker Infrastructure: PostgreSQL containerization ready
  • πŸ§ͺπŸŽ“ Dual Learning Modes: Both supervised and unsupervised learning approaches
  • πŸ“ŠπŸ“ˆ Comprehensive Analysis: 4-panel visualization with detailed performance metrics
  • πŸš€ Production-Ready Core: Complete fraud detection pipeline with evaluation
  • πŸ•΅οΈβ€β™€οΈπŸŽ― Flexible Detection: Classic and risk-score based detection methods
  • 🧹 Data Processing: Automatic preprocessing with missing value handling
  • ⚑ Fast Startup: Lazy loading for instant model selection interface

Planned Capabilities ROADMAP

  • ⚑ Real-time API: Sub-50ms fraud scoring endpoints for ultra-low latency decisioning.
    Enables the system to detect fraud instantly in live payment flows, ensuring compliance with real-time financial transaction requirements.

  • πŸ” Enterprise Security: OAuth2, JWT, RBAC implementation.
    Adds enterprise-grade authentication, token-based access, and role-based authorization to secure deployments in regulated environments.

  • ☸️ Scalable Architecture: Kubernetes deployment with auto-scaling.
    Provides seamless horizontal scaling based on traffic load, supporting both small-scale PoCs and large enterprise production clusters.

  • πŸ“ˆ Advanced Monitoring: Prometheus, Grafana, Jaeger integration.
    Full observability with metrics collection, real-time dashboards, and distributed tracing for faster issue detection and resolution.

  • 🚦 High-throughput Processing: 10,000+ TPS optimization.
    Optimized to handle extremely high transaction volumes, scaling to 10,000+ transactions per second to meet the demands of major banks and payment providers.

πŸ“‹ Table of Contents

πŸš€ Getting Started

πŸ—οΈ Architecture & Components

βš™οΈ Configuration & Usage

πŸ§ͺ Development & Testing

πŸ“š API & Documentation

πŸ“Š Project Status

πŸ“š Documentation

πŸ†˜ Support


πŸš€ Quick Start

Prerequisites

  • Python 3.8+ (Python 3.9+ recommended)
  • Docker (for PostgreSQL database)
  • Git (for cloning the repository)
  • 8GB+ RAM (for ML model training)
  • 2GB+ free disk space (for models and data)

For API Features (NEW!):

  • PostgreSQL (Docker container recommended)
  • Port 8000 (for API server)
  • Port 5432 (for PostgreSQL)

Optional (for production):

  • Kubernetes cluster
  • Redis (for caching)

Installation

Option 1: Interactive CLI (Easiest - Recommended) NEW

The fastest way to get started with DAFU!

# Clone the repository
git clone https://github.com/MasterFabric/dafu.git
cd dafu

# Make CLI executable
chmod +x dafu

# Start interactive CLI
./dafu

# You'll see:
# ╔════════════════════════════════════════════════════════════╗
# β•‘  ____    _    _____ _   _                                  β•‘
# β•‘ |  _ \  / \  |  ___| | | |                                 β•‘
# β•‘ | | | |/ _ \ | |_  | | | |                                 β•‘
# β•‘ | |_| / ___ \|  _| | |_| |                                 β•‘
# β•‘ |____/_/   \_\_|    \___/                                  β•‘
# β•‘                                                            β•‘
# β•‘ Data Analytics Functional Utilities - Interactive CLI     β•‘
# β•‘ Enterprise Fraud Detection & Analytics Platform           β•‘
# β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
# 
# Welcome to DAFU Interactive CLI!
# Type 'help' for available commands or 'exit' to quit
# 
# dafu> 

# Try different features
dafu> help                    # See all commands
dafu> auth login              # Login to API (NEW!)
dafu> logs list               # View system logs (NEW!)
dafu> reports list            # View fraud reports (NEW!)
dafu> products stats          # Product statistics (NEW!)
dafu> fraud-detection         # Run ML models
dafu> docker status           # Check Docker services

# The CLI will:
# βœ“ Auto-create virtual environment if needed
# βœ“ Auto-install dependencies
# βœ“ Manage authentication sessions
# βœ“ Provide unified access to all features
# βœ“ Return to CLI prompt after each command

Available CLI Commands:

Category Command Description
API & Auth auth login/logout/whoami/register User authentication & management NEW
Logs logs list/stats System logging & analytics NEW
Reports reports list/create/view/stats Fraud detection reports NEW
Products products list/high-risk/stats Product risk management NEW
ML Models fraud-detection, models, ml Run fraud detection models
Docker docker up/down/restart/status/logs Manage Docker services
System status, info, version Show system information
Utilities help, clear, exit Utility commands

Key Features:

  • βœ… API Integration - Full authentication, logs, reports, products management NEW
  • βœ… Persistent Session - Login once, use everywhere with session management
  • βœ… Auto-Setup - Automatically creates virtual environment and installs dependencies
  • βœ… Error Resilient - CLI stays active even when commands fail
  • βœ… User-Friendly - Color-coded output and helpful messages
  • βœ… Scriptable - Use in automation with single command mode
  • βœ… Role-Based Access - Support for viewer, user, analyst, admin roles NEW


Option 2: With API Features (Full Platform) NEW

Complete platform with authentication, logging, reports, and product management

Step 1: Start PostgreSQL

docker run -d --name dafu-postgres \
  -e POSTGRES_USER=dafu \
  -e POSTGRES_PASSWORD=dafu_secure_password \
  -e POSTGRES_DB=dafu \
  -p 5432:5432 \
  postgres:15-alpine

Step 2: Start API Server (in separate terminal)

cd dafu/core/features/fraud_detection
./start_api.sh

# Wait for:
# INFO:     Uvicorn running on http://0.0.0.0:8000
# INFO:     Application startup complete.

Step 3: Use DAFU CLI

./dafu

dafu> auth register     # First time: register user
dafu> auth login        # Login with credentials
dafu> auth whoami       # Check your user info
dafu> logs list         # View system logs
dafu> reports list      # View fraud reports
dafu> products stats    # Product statistics
dafu> fraud-detection   # Run ML models

πŸ“– Complete Guide: See docs/USAGE_GUIDE.md for detailed instructions

Features Available:

  • βœ… JWT authentication with RBAC
  • βœ… System logging and analytics
  • βœ… Fraud detection report generation
  • βœ… Product risk management
  • βœ… All ML models
  • βœ… RESTful API endpoints
  • βœ… Database persistence
  • βœ… Session management

API Documentation: http://localhost:8000/docs (Swagger UI)
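Once the server is running, you can verify it programmatically. The sketch below relies only on FastAPI defaults (every FastAPI app serves its schema at /openapi.json), so no DAFU-specific endpoint names are assumed:

```python
# Minimal reachability check for a locally running DAFU API.
# Uses only the FastAPI default /openapi.json route.
import json
import urllib.error
import urllib.request


def api_is_up(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if a FastAPI server answers at base_url with a valid schema."""
    try:
        with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=timeout) as resp:
            schema = json.load(resp)
            # Any FastAPI app exposes its routes under "paths".
            return "paths" in schema
    except (urllib.error.URLError, OSError, ValueError):
        return False


if __name__ == "__main__":
    print("API reachable:", api_is_up())
```

If this prints `False`, confirm the PostgreSQL container and `start_api.sh` steps above completed successfully.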

Option 3: ML Models Only (Standalone)

Step 1: Clone and Setup Environment

# Clone the repository
git clone https://github.com/MasterFabric/dafu.git
cd dafu

# Create virtual environment
python3 -m venv dafu_env
source dafu_env/bin/activate  # On Windows: dafu_env\Scripts\activate

# Expected output:
# (dafu_env) masterfabric@machine:dafu$ 

Step 2: Install Dependencies

# Navigate to fraud detection module
cd core/features/fraud_detection

# Install minimal dependencies (recommended for first-time users)
pip install -r requirements-minimal.txt

# Expected output:
# Collecting numpy>=1.21.0
#   Downloading numpy-1.24.3-cp39-cp39-macosx_10_9_x86_64.whl (20.1 MB)
#      β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 20.1/20.1 MB 2.1 MB/s eta 0:00:00
# Collecting pandas>=1.3.0
#   Downloading pandas-1.5.3-cp39-cp39-macosx_10_9_x86_64.whl (11.3 MB)
#      β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ 11.3/11.3 MB 2.8 MB/s eta 0:00:00
# ...
# Successfully installed numpy-1.24.3 pandas-1.5.3 scikit-learn-1.3.0 ...

Step 3: Verify Installation

# Test the installation
python -c "from src.models.anomaly_detection import IsolationForestFraudDetector; print('βœ… Installation successful!')"

# Expected output:
# βœ… Installation successful!

Step 4: Run Unified Model Interface

# Run the unified model selection interface
cd core/features/fraud_detection/src/models
python main.py

# Expected terminal interaction:
# ========================================
# πŸ” ENTERPRISE FRAUD DETECTION PLATFORM
# ========================================
# Advanced Machine Learning Models for Fraud Detection
# Version: 1.0.0
# ========================================
# 
# This platform offers multiple fraud detection approaches:
# β€’ Traditional ML: Isolation Forest with Risk Score analysis
# β€’ Deep Learning: LSTM and GRU sequence-based models
# β€’ Both supervised and unsupervised learning modes
# β€’ Real-time streaming and batch processing capabilities
# ========================================
# 
# ⚑ Fast startup - models load only when selected!
# 
# ============================================================
# 🎯 SELECT FRAUD DETECTION MODEL
# ============================================================
# Choose the type of fraud detection model you want to use:
# 
# 1. πŸ” ISOLATION FOREST & RISK SCORE
#    β€’ Traditional machine learning approach
#    β€’ Excellent for tabular data with numerical features
#    β€’ Supports both supervised and unsupervised learning
#    β€’ Risk score based anomaly detection
#    β€’ Fast training and prediction
# 
# 2. 🧠 SEQUENCE MODELS (LSTM & GRU)
#    β€’ Deep learning approach for sequential data
#    β€’ Captures temporal patterns and dependencies
#    β€’ Autoencoder architecture for anomaly detection
#    β€’ Best for time-series and transaction sequences
#    β€’ More complex but potentially more accurate
# 
# 3. ℹ️  MODEL COMPARISON
#    β€’ Compare different models on the same dataset
#    β€’ Get recommendations based on your data
# 
# 4. ❓ HELP & INFORMATION
#    β€’ Detailed information about each model
#    β€’ Data requirements and recommendations
# 
# 5. πŸšͺ EXIT
#    β€’ Exit the application
# ============================================================
# 
# Enter your choice (1-5): 

Alternative: Run Individual Model Tests

# Run individual model tests (legacy method)
cd core/features/fraud_detection
python test_anomaly_detection.py
python test_sequence_models_interactive.py

Option 4: Docker Compose Setup ⚠️ PLANNED

Status: Infrastructure prepared, services not integrated yet

What's Ready:

  • βœ… Docker configuration files
  • βœ… Database schemas
  • βœ… Service definitions
  • ⚠️ ML models NOT integrated with API yet

Current Limitation: Docker Compose services are commented out until API-ML integration is complete. For now, use Option 1 (Interactive CLI) or Option 3 (ML Models Only) to run ML models.

Future Setup (when ready):

# Clone and navigate
git clone https://github.com/MasterFabric/dafu.git
cd dafu

# Uncomment services in docker-compose.yml
# Then start services
docker-compose up -d

Why Services Are Commented Out:

The ML models (Isolation Forest, LSTM/GRU) work perfectly standalone, but the FastAPI endpoints need ML integration. All infrastructure (database schemas, service configs, monitoring) is prepared and ready to be activated once the integration is complete.

What You Can Do Now:

  • βœ… Use all ML models via Python (Option 3) or the CLI (Option 1)
  • βœ… Train and save models
  • βœ… Stream and batch processing
  • βœ… See Docker Status for integration roadmap

Next Step: Integrate ML models with FastAPI, then uncomment services in docker-compose.yml.

Option 5: Docker Deployment (Coming Soon) PLANNED

For testing individual components:

# Build the fraud detection service
cd core/features/fraud_detection
docker build -f deployment/Dockerfile -t dafu-fraud-detection .

# Run with sample data
docker run -it --rm \
  -v $(pwd)/sample_fraud_data.csv:/app/data.csv \
  dafu-fraud-detection \
  python test_anomaly_detection.py

Option 6: Kubernetes Deployment (Production)

Step 1: Deploy with Helm

# Deploy using Helm (when API is ready)
cd core/features/fraud_detection/deployment
helm install dafu-fraud-detection ./helm-charts/ \
  --set image.tag=latest \
  --set replicas=3 \
  --set resources.requests.memory=512Mi

# Expected output:
# NAME: dafu-fraud-detection
# LAST DEPLOYED: Mon Jan 15 10:30:00 2024
# NAMESPACE: default
# STATUS: deployed
# REVISION: 1
# TEST SUITE: None

Step 2: Verify Deployment

# Check pod status
kubectl get pods -l app=dafu-fraud-detection

# Expected output:
# NAME                                READY   STATUS    RESTARTS   AGE
# dafu-fraud-detection-7d4b8c9f-abc   1/1     Running   0          2m
# dafu-fraud-detection-7d4b8c9f-def   1/1     Running   0          2m
# dafu-fraud-detection-7d4b8c9f-ghi   1/1     Running   0          2m

First-Time User Guide

🎯 Quick Demo (5 minutes)

1. Run the Unified Model Interface

cd core/features/fraud_detection/src/models
python main.py

Expected Terminal Interface:

πŸ” ENTERPRISE FRAUD DETECTION PLATFORM
========================================
Advanced Machine Learning Models for Fraud Detection
Version: 1.0.0
========================================

This platform offers multiple fraud detection approaches:
β€’ Traditional ML: Isolation Forest with Risk Score analysis
β€’ Deep Learning: LSTM and GRU sequence-based models
β€’ Both supervised and unsupervised learning modes
β€’ Real-time streaming and batch processing capabilities
========================================

⚑ Fast startup - models load only when selected!

============================================================
🎯 SELECT FRAUD DETECTION MODEL
============================================================
Choose the type of fraud detection model you want to use:

1. πŸ” ISOLATION FOREST & RISK SCORE
2. 🧠 SEQUENCE MODELS (LSTM & GRU)
3. ℹ️  MODEL COMPARISON
4. ❓ HELP & INFORMATION
5. πŸšͺ EXIT

Enter your choice (1-5): 

2. Select Your Model

Choose option 1 for Isolation Forest or option 2 for Sequence Models. The system will:

  • Load the selected model (with progress indicator)
  • Guide you through configuration
  • Handle all setup automatically

3. View Results

After completion, you'll see comprehensive results with visualizations and exported data.

πŸ”„ Unified Model Interface NEW

The new unified interface provides:

  • Single Entry Point: One command to access all fraud detection models
  • Smart Model Selection: Interactive guidance for choosing the right model
  • Fast Startup: Lazy loading ensures instant interface response
  • Model Comparison: Built-in comparison tools and recommendations
  • Help System: Comprehensive information and decision trees
  • Seamless Navigation: Easy switching between models and options

πŸ”„ Stream Processing Demo NEW

1. Train a Model First

python test_sequence_models_interactive.py

Expected Questions:

🎯 Prediction Mode Selection:
1. Batch Prediction (train and evaluate)
2. Stream Prediction (use pre-trained model)

Please select prediction mode (1 or 2): 1

🎯 Learning Mode Selection:
1. Supervised Learning
2. Unsupervised Learning

Please select learning mode (1 or 2): 1

🎯 Model Selection:
Available models: ['LSTM', 'GRU', 'Both']
Please select models (comma-separated): LSTM,GRU

2. Test Stream Processing

# Run stream prediction with pre-trained model
python test_sequence_models_interactive.py

Select Stream Mode:

🎯 Prediction Mode Selection:
1. Batch Prediction (train and evaluate)
2. Stream Prediction (use pre-trained model)

Please select prediction mode (1 or 2): 2

πŸ“ Model Package Selection:
Available models: ['my_fraud_model', 'production_model']
Please select model: my_fraud_model

βœ… Model loaded successfully!
πŸ“Š Processing stream data...
βœ… Stream processing complete! Results saved to: stream_results/

🐳 Docker Quick Start

1. One-Command Demo

# Run complete demo in Docker
docker run -it --rm \
  -v $(pwd)/sample_fraud_data.csv:/app/data.csv \
  -v $(pwd)/results:/app/results \
  dafu-fraud-detection \
  python test_anomaly_detection.py

Expected Output:

πŸš€ DAFU Fraud Detection System - Docker Demo
========================================

πŸ“Š Data Analysis Results:
- Dataset shape: (1000, 8)
- Missing values: 0
- Fraud rate: 5.0%

🎯 Running unsupervised anomaly detection...
βœ… Analysis complete! Results saved to: /app/results/

πŸ”§ Troubleshooting

Common Issues and Solutions:

Issue 1: Import Error

ModuleNotFoundError: No module named 'src.models.anomaly_detection'

Solution:

# Make sure you're in the fraud_detection directory
cd core/features/fraud_detection
python -c "from src.models.anomaly_detection import IsolationForestFraudDetector; print('βœ… Fixed!')"

Issue 2: Memory Error

MemoryError: Unable to allocate array

Solution:

# Use a smaller dataset or reduce model complexity.
# For example, take the first 10,000 rows (plus header) as a sample:
head -n 10001 large_dataset.csv > sample_dataset.csv
python test_anomaly_detection.py

Issue 3: Docker Build Fails

ERROR: failed to solve: failed to resolve source

Solution:

# Clean Docker cache and rebuild
docker system prune -f
docker build --no-cache -f deployment/Dockerfile -t dafu-fraud-detection .

πŸ“Š Expected Performance

System Requirements:

  • Minimum: 4GB RAM, 2 CPU cores
  • Recommended: 8GB RAM, 4 CPU cores
  • Production: 16GB+ RAM, 8+ CPU cores

Processing Times:

  • Small dataset (1K records): 10-30 seconds
  • Medium dataset (10K records): 2-5 minutes
  • Large dataset (100K records): 10-20 minutes
  • Stream processing: <1 second per record

Memory Usage:

  • Training: 2-4GB RAM
  • Prediction: 500MB-1GB RAM
  • Stream mode: 200-500MB RAM

Basic Usage

1. Unified Model Interface (Recommended)

# Start the unified interface
cd core/features/fraud_detection/src/models
python main.py

# Follow the interactive prompts:
# 1. Choose your model (Isolation Forest or Sequence Models)
# 2. Select prediction mode (Batch or Stream)
# 3. Configure parameters
# 4. Run analysis

2. Direct Model Usage (Advanced)

from fraud_detection.src.models.anomaly_detection import IsolationForestFraudDetector

# Initialize the detector
detector = IsolationForestFraudDetector(random_state=42)

# Load and analyze your data
detector.load_and_analyze_data('transaction_data.csv')

# Setup learning mode (supervised/unsupervised)
detector.setup_learning_mode()

# Choose detection method
# - Classic: Binary classification with contamination levels
# - Risk Score: Custom threshold-based detection

# Preprocess data
detector.preprocess_data()

# Train models
detector.train_models([0.01, 0.05, 0.1])  # Multiple contamination levels

# Evaluate and visualize
if detector.is_supervised:
    detector.evaluate_models()

detector.create_visualizations(save_plots=True)
detector.export_results('fraud_analysis_results')

3. Run Individual Model Tests (Legacy)

# Run comprehensive anomaly detection tests
cd core/features/fraud_detection
python test_anomaly_detection.py

# Run sequence model tests
python test_sequence_models_interactive.py

Note: The unified interface (main.py) is now the recommended way to access all fraud detection capabilities. Individual model tests are still available for advanced users.

4. Sequence Model Analysis (NEW! Enhanced)

import pandas as pd

from fraud_detection.src.models.sequence_models import SequenceFraudDetector

# Initialize sequence detector
sequence_detector = SequenceFraudDetector()

# Setup prediction mode (NEW!)
sequence_detector.setup_prediction_mode()
# Choose: 1. Batch Prediction or 2. Stream Prediction

# For Batch Prediction Mode
if sequence_detector.prediction_mode == 'batch':
    # Load and analyze data
    sequence_detector.load_and_analyze_data('user_sequences.csv')
    
    # Setup learning mode
    sequence_detector.setup_learning_mode()
    
    # Preprocess data
    sequence_detector.preprocess_data()
    
    # Train models
    sequence_detector.train_models(['LSTM', 'GRU'])
    
    # Save trained models (NEW!)
    sequence_detector.save_model_package('my_fraud_model')
    
    # Evaluate and export
    sequence_detector.evaluate_models()
    sequence_detector.export_results('batch_results')

# For Stream Prediction Mode (NEW!)
elif sequence_detector.prediction_mode == 'stream':
    # Load pre-trained model
    sequence_detector.load_model_package('my_fraud_model')
    
    # Load new stream data
    stream_data = pd.read_csv('new_stream_data.csv')
    
    # Preprocess stream data
    processed_stream = sequence_detector.preprocess_stream_data(stream_data)
    
    # Make predictions on stream
    predictions = sequence_detector.predict_stream(processed_stream)
    
    # Export stream results (NEW!)
    sequence_detector.export_stream_results(stream_data, predictions)

5. Stream Prediction with Isolation Forest NEW

import pandas as pd

from fraud_detection.src.models.anomaly_detection import IsolationForestFraudDetector

# Initialize detector
detector = IsolationForestFraudDetector()

# Setup prediction mode (NEW!)
detector.setup_prediction_mode()
# Choose: 1. Batch Prediction or 2. Stream Prediction

# For Stream Prediction Mode
if detector.prediction_mode == 'stream':
    # Load pre-trained model
    detector.load_model_package('trained_fraud_model')
    
    # Load new stream data
    stream_data = pd.read_csv('new_transactions.csv')
    
    # Preprocess stream data
    processed_stream = detector.preprocess_stream_data(stream_data)
    
    # Make predictions on stream
    results = detector.predict_stream(processed_stream, contamination=0.1)
    
    # Export stream results (NEW!)
    detector.export_stream_results(stream_data, results)

πŸ“Š Supported Data Formats

Transaction Data

transaction_id,user_id,amount,merchant_id,timestamp,category,is_fraud
tx_001,user_123,150.00,merchant_456,2024-01-15 10:30:00,electronics,0
tx_002,user_124,2500.00,merchant_789,2024-01-15 11:45:00,jewelry,1

User Behavior Data

user_id,timestamp,action_type,device_id,location,amount
user_123,2024-01-15 10:30:00,login,mobile_device_001,location_A,0
user_123,2024-01-15 10:31:00,purchase,mobile_device_001,location_A,150.00

Time Series Data

timestamp,user_id,transaction_count,daily_amount,risk_score
2024-01-15,user_123,5,750.00,0.2
2024-01-16,user_123,8,1200.00,0.4
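Before feeding a file to the models, it is worth validating that it matches one of the formats above. A minimal pandas sanity check for the transaction format (column names taken from the sample; adjust to your own export):

```python
# Validate the transaction CSV format shown above using pandas.
import io

import pandas as pd

SAMPLE = """transaction_id,user_id,amount,merchant_id,timestamp,category,is_fraud
tx_001,user_123,150.00,merchant_456,2024-01-15 10:30:00,electronics,0
tx_002,user_124,2500.00,merchant_789,2024-01-15 11:45:00,jewelry,1
"""

REQUIRED = {"transaction_id", "user_id", "amount", "timestamp"}

df = pd.read_csv(io.StringIO(SAMPLE), parse_dates=["timestamp"])
missing = REQUIRED - set(df.columns)
assert not missing, f"missing columns: {missing}"

print(df.dtypes)
print("fraud rate:", df["is_fraud"].mean())
```

Replace `io.StringIO(SAMPLE)` with the path to your own CSV file when checking real data.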

🎯 Use Cases and Scenarios

1. Unified Model Selection and Analysis

Scenario: Choose the right fraud detection model for your data
Solution: Interactive model selection interface

# Start the unified interface
cd core/features/fraud_detection/src/models
python main.py

# Interactive model selection:
# 1. πŸ” ISOLATION FOREST & RISK SCORE - For tabular data
# 2. 🧠 SEQUENCE MODELS (LSTM & GRU) - For sequential data
# 3. ℹ️  MODEL COMPARISON - Compare different approaches
# 4. ❓ HELP & INFORMATION - Get detailed guidance
# 5. πŸšͺ EXIT - Exit the application

2. Real-time E-commerce Fraud Detection

Scenario: Detect fraudulent transactions in real-time during checkout
Solution: Risk Score API with sub-50ms response time

# Real-time scoring
import requests

response = requests.post('https://api.masterfabric.co/dafu/v1/score', json={
    'transaction_id': 'tx_123',
    'amount': 150.00,
    'user_id': 'user_456',
    'merchant_id': 'merchant_789',
    'device_fingerprint': 'fp_abc123',
    'ip_address': '192.168.1.1',
    'user_agent': 'Mozilla/5.0...'
})

fraud_score = response.json()['risk_score']
is_fraud = fraud_score > 0.7  # Custom threshold

3. Batch Fraud Analysis

Scenario: Analyze historical data for fraud patterns and model retraining
Solution: Batch Processing API with large-scale data handling

# Batch analysis
import requests

batch_request = {
    'data_source': 's3://fraud-data/transactions_2024.csv',
    'analysis_type': 'comprehensive',
    'models': ['isolation_forest', 'lstm', 'xgboost'],
    'output_format': 'detailed_report'
}

response = requests.post('https://api.masterfabric.co/dafu/v1/batch/analyze', json=batch_request)

4. User Behavior Analysis

Scenario: Detect anomalous user behavior patterns over time
Solution: Sequence models with LSTM/GRU for temporal pattern recognition

# User behavior analysis
from fraud_detection.src.models.sequence_models import UserBehaviorAnalyzer

analyzer = UserBehaviorAnalyzer()
analyzer.load_user_sequences('user_behavior_data.csv')

# Detect anomalies in user patterns
anomalies = analyzer.detect_behavioral_anomalies(
    sequence_length=30,
    threshold=0.8
)

5. Merchant Risk Assessment

Scenario: Evaluate merchant risk profiles for payment processing
Solution: Multi-model ensemble with business rules

# Merchant risk assessment
from fraud_detection.src.rules_engine.rule_processor import MerchantRiskProcessor

processor = MerchantRiskProcessor()
merchant_risk = processor.assess_merchant_risk(
    merchant_id='merchant_123',
    transaction_history='merchant_transactions.csv',
    risk_factors=['chargeback_rate', 'transaction_patterns', 'location_anomalies']
)

6. Real-time Stream Processing NEW

Scenario: Process incoming transactions in real-time using pre-trained models
Solution: Stream prediction mode with model persistence

# Real-time stream processing
from fraud_detection.src.models.anomaly_detection import IsolationForestFraudDetector

detector = IsolationForestFraudDetector()
detector.setup_prediction_mode()
# Choose: 2. Stream Prediction

# Load pre-trained model
detector.load_model_package('production_fraud_model')

# Process incoming transactions. get_next_transaction(), block_transaction()
# and approve_transaction() are placeholders for your own stream source and
# decision hooks.
while True:
    # Get new transaction from stream
    new_transaction = get_next_transaction()
    
    # Preprocess and predict
    processed = detector.preprocess_stream_data(new_transaction)
    result = detector.predict_stream(processed, contamination=0.1)
    
    # Take action based on prediction
    if result['predictions'][0] == 1:
        block_transaction(new_transaction)
    else:
        approve_transaction(new_transaction)

7. Model Training and Deployment Pipeline NEW

Scenario: Train models on historical data and deploy for production use
Solution: Batch training with model persistence

# Training and deployment pipeline
from fraud_detection.src.models.sequence_models import SequenceFraudDetector

# Phase 1: Train models
trainer = SequenceFraudDetector()
trainer.setup_prediction_mode()
# Choose: 1. Batch Prediction

trainer.load_and_analyze_data('historical_fraud_data.csv')
trainer.setup_learning_mode()
trainer.preprocess_data()
trainer.train_models(['LSTM', 'GRU'])

# Save trained models for production
trainer.save_model_package('production_lstm_model')
print("βœ… Models trained and saved for production use")

# Phase 2: Deploy for stream processing
deployer = SequenceFraudDetector()
deployer.setup_prediction_mode()
# Choose: 2. Stream Prediction

deployer.load_model_package('production_lstm_model')
print("βœ… Models loaded and ready for stream processing")

8. Network Analysis and Graph-based Detection

Scenario: Detect fraud rings and coordinated attacks
Solution: Network features with graph analysis

# Network fraud detection
from fraud_detection.src.feature_engineering.network_features import NetworkAnalyzer

analyzer = NetworkAnalyzer()
fraud_rings = analyzer.detect_fraud_networks(
    transaction_data='network_transactions.csv',
    similarity_threshold=0.8,
    min_ring_size=3
)

πŸ—οΈ Architecture

High-Level Architecture

Core Components

1. Fraud Detection Service (/fraud_detection/)

The main fraud detection microservice with end-to-end ML capabilities, exposed via FastAPI for real-time and batch use cases.


2. Feature Engineering Pipeline

A modular preprocessing and feature extraction pipeline designed for real-time and offline analytics.

  • Transaction Features: Amount distributions, frequency of transactions, merchant category profiling.
  • User Features: Historical behavioral patterns, device fingerprints, account age/risk indicators.
  • Network Features: Graph-based entity relationships (shared IPs, merchants, accounts).
  • Temporal Features: Time-series analysis (sliding windows, session duration, peak-time anomalies).
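The transaction and temporal feature families above can be sketched in a few lines of pandas. This is an illustrative example, not the actual DAFU pipeline API; the column and variable names are ours:

```python
# Illustrative feature engineering: per-user amount z-scores (transaction
# features) and a 1-hour sliding transaction count (temporal features).
import pandas as pd

tx = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2"],
    "amount": [50.0, 60.0, 500.0, 20.0],
    "timestamp": pd.to_datetime([
        "2024-01-15 10:00", "2024-01-15 10:20",
        "2024-01-15 10:40", "2024-01-15 11:00",
    ]),
})

# Transaction features: how far each amount sits from the user's own mean.
stats = tx.groupby("user_id")["amount"].agg(["mean", "std"])
tx = tx.join(stats, on="user_id")
tx["amount_z"] = (tx["amount"] - tx["mean"]) / tx["std"].fillna(1.0)

# Temporal features: transactions per user within a sliding 1-hour window.
tx = tx.sort_values("timestamp")
tx["tx_last_hour"] = (
    tx.set_index("timestamp")
      .groupby("user_id")["amount"]
      .rolling("1h").count()
      .values
)

print(tx[["user_id", "amount", "amount_z", "tx_last_hour"]])
```

In production, features like these would be computed incrementally per event rather than over a full DataFrame, but the definitions are the same.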

3. API Services

Enterprise-grade RESTful APIs providing low-latency endpoints for real-time scoring and large-scale data ingestion.
All APIs are implemented using FastAPI, leveraging OpenAPI/Swagger for documentation and schema validation.

  • Fraud Scoring API: Real-time fraud detection endpoint.

    • Built on FastAPI for async performance.
    • Supports REST and optionally gRPC for low-latency scenarios.
    • Designed for sub-50ms response times with Redis caching.
  • Batch Processing API: Bulk scoring and data ingestion.

    • Optimized for large datasets with Dask / Apache Spark integration.
    • Used for offline analysis, backfills, reporting, and model monitoring.
    • Supports scheduled jobs (via Apache Airflow).
  • Model Management API: Centralized model lifecycle control.

    • Provides endpoints for deployment, versioning, and rollback of fraud detection models.
    • Integrates with object storage (MinIO / Amazon S3) for model registry.
    • Secured with OAuth2 / JWT for enterprise compliance.

4. Enterprise Infrastructure

A cloud-native, microservices-based foundation, optimized for scalability and observability.

  • Containerization: Docker multi-stage builds for lightweight, reproducible services (Podman is another option).
  • Orchestration: Kubernetes with Helm charts for deployment, scaling, and service discovery.
  • Monitoring: Prometheus (metrics), Grafana (dashboards), Jaeger (distributed tracing).
  • Security: OAuth2 / JWT authentication, RBAC policies, and API key management for multi-tenant enterprise compliance.

πŸ”§ Configuration and Customization

Environment Configuration

# .env file
DATABASE_URL=postgresql://user:password@localhost:5432/dafu
REDIS_URL=redis://localhost:6379/0
MODEL_STORAGE_PATH=/models
LOG_LEVEL=INFO
API_RATE_LIMIT=1000
FRAUD_THRESHOLD=0.7
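For illustration, here is a minimal loader for a file in the format above; in practice a library such as python-dotenv handles this (and more):

```python
# Minimal .env parser: KEY=VALUE lines, skipping blanks and # comments.
import os


def load_env(path: str) -> dict:
    """Parse a .env-style file into a dict of string values."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


if os.path.exists(".env"):
    config = load_env(".env")
    print("Fraud threshold:", config.get("FRAUD_THRESHOLD", "0.7"))
```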

Model Configuration

# config/models.json
{
    "isolation_forest": {
        "contamination": [0.01, 0.05, 0.1],
        "n_estimators": 100,
        "random_state": 42
    },
    "lstm": {
        "sequence_length": 10,
        "hidden_units": 64,
        "dropout": 0.2,
        "epochs": 50
    },
    "xgboost": {
        "n_estimators": 100,
        "max_depth": 6,
        "learning_rate": 0.1
    }
}
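A file like this could drive model construction directly. The sketch below (assuming scikit-learn; `build_isolation_forests` is a hypothetical helper, not project code) builds one estimator per candidate contamination level, since `contamination` is given as a list of levels to evaluate.

```python
import json
from sklearn.ensemble import IsolationForest

def build_isolation_forests(config_path: str) -> list[IsolationForest]:
    """Build one IsolationForest per configured contamination level."""
    with open(config_path) as f:
        cfg = json.load(f)["isolation_forest"]
    return [
        IsolationForest(
            contamination=c,
            n_estimators=cfg["n_estimators"],
            random_state=cfg["random_state"],
        )
        for c in cfg["contamination"]
    ]
```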

Business Rules Configuration

# config/rules.json
{
    "amount_threshold": {
        "condition": "amount > 10000",
        "risk_score": 0.8,
        "action": "flag_for_review"
    },
    "velocity_check": {
        "condition": "transactions_per_hour > 10",
        "risk_score": 0.6,
        "action": "additional_verification"
    },
    "location_anomaly": {
        "condition": "distance_from_home > 1000km",
        "risk_score": 0.5,
        "action": "location_verification"
    }
}
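A minimal evaluator for rules of this shape might look like the following. This is a sketch, not the platform's rules engine: it handles only the simple `<field> > <number>` pattern used in the examples above (ignoring units such as "km"), and `evaluate_rules` is a hypothetical helper.

```python
import re

# Matches conditions of the form "<field> > <number>[unit]", a simplified
# subset of what a full rules engine would support.
_COND = re.compile(r"(\w+)\s*>\s*([\d.]+)\s*\w*")

def evaluate_rules(rules: dict, transaction: dict) -> list[dict]:
    """Return the rules triggered by a transaction, with score and action."""
    triggered = []
    for name, rule in rules.items():
        m = _COND.match(rule["condition"])
        if not m:
            continue  # unsupported condition syntax in this sketch
        field, threshold = m.group(1), float(m.group(2))
        if transaction.get(field, 0) > threshold:
            triggered.append({
                "rule": name,
                "risk_score": rule["risk_score"],
                "action": rule["action"],
            })
    return triggered
```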

πŸ“ˆ Performance and Scalability

Target Performance Metrics

These are design targets for the production deployment (see the Current Performance Metrics section for what the platform achieves today):

  • Latency: <50ms for real-time scoring
  • Throughput: 10,000+ transactions per second
  • Accuracy: 95%+ fraud detection accuracy
  • Availability: 99.9% uptime SLA

Scaling Options

Horizontal Scaling

# Kubernetes HPA configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dafu-fraud-detection
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: dafu-fraud-detection
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Caching Strategy

# Redis caching for model predictions
import hashlib
import json
from functools import wraps

import redis

redis_client = redis.Redis(host='localhost', port=6379, db=0)

def cache_prediction(expiration=3600):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Use a stable digest for the key: the built-in hash() is salted
            # per process, so its values differ between runs.
            key_material = (str(args) + str(kwargs)).encode("utf-8")
            cache_key = f"prediction:{hashlib.sha256(key_material).hexdigest()}"
            cached_result = redis_client.get(cache_key)

            if cached_result:
                return json.loads(cached_result)

            result = func(*args, **kwargs)
            redis_client.setex(cache_key, expiration, json.dumps(result))
            return result
        return wrapper
    return decorator

πŸ”’ Security and Compliance

Security Features

  • Authentication: OAuth2/JWT token-based authentication
  • Authorization: Role-based access control (RBAC)
  • Data Encryption: TLS 1.3 for data in transit, AES-256 for data at rest
  • Input Validation: Comprehensive data validation and sanitization
  • Audit Logging: Complete audit trail for compliance

Compliance Standards

  • GDPR: Data processing transparency and user rights
  • PCI DSS: Secure payment card data handling
  • SOC 2: Security controls and monitoring
  • ISO 27001: Information security management

Security Configuration

# Security middleware
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.trustedhost import TrustedHostMiddleware

app = FastAPI()

# CORS configuration
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://dafu.masterfabric.co"],
    allow_credentials=True,
    allow_methods=["GET", "POST"],
    allow_headers=["*"],
)

# Trusted hosts
app.add_middleware(
    TrustedHostMiddleware,
    allowed_hosts=["dafu.masterfabric.co", "*.masterfabric.co"]
)

πŸ§ͺ Testing and Quality Assurance

Test Suite

# Run all tests
cd fraud_detection
pytest tests/ -v --cov=src --cov-report=html

# Run specific test categories
pytest tests/test_anomaly_detection.py -v
pytest tests/test_api_endpoints.py -v
pytest tests/test_feature_engineering.py -v

# Performance testing
pytest tests/test_performance.py -v --benchmark-only

Code Quality

# Linting and formatting
black src/ tests/
flake8 src/ tests/
mypy src/
pylint src/

# Pre-commit hooks
pre-commit install
pre-commit run --all-files

πŸ“Š Monitoring and Observability

Metrics Collection

# Prometheus metrics
from prometheus_client import Counter, Histogram, Gauge

fraud_predictions_total = Counter('fraud_predictions_total', 'Total fraud predictions', ['model', 'result'])
prediction_latency = Histogram('prediction_latency_seconds', 'Prediction latency')
active_models = Gauge('active_models_total', 'Number of active models')
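These metrics can be wired into a prediction path as follows. This is a sketch: `predict` and its amount-based stand-in logic are hypothetical, but the counter and histogram usage matches the metric definitions above.

```python
import time
from prometheus_client import Counter, Histogram

fraud_predictions_total = Counter(
    "fraud_predictions_total", "Total fraud predictions", ["model", "result"]
)
prediction_latency = Histogram("prediction_latency_seconds", "Prediction latency")

def predict(transaction: dict, model_name: str = "isolation_forest") -> bool:
    start = time.perf_counter()
    # Stand-in for a real model call.
    is_fraud = transaction.get("amount", 0) > 10000
    prediction_latency.observe(time.perf_counter() - start)
    fraud_predictions_total.labels(
        model=model_name, result="fraud" if is_fraud else "legit"
    ).inc()
    return is_fraud
```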

Logging Configuration

# Structured logging
import structlog

logger = structlog.get_logger()

# Usage
logger.info(
    "fraud_prediction_completed",
    transaction_id="tx_123",
    model="isolation_forest",
    risk_score=0.85,
    processing_time_ms=45
)

Grafana Dashboards

  • Fraud Detection Metrics: Prediction accuracy, latency, throughput
  • Model Performance: Model accuracy, drift detection, retraining triggers
  • System Health: CPU, memory, disk usage, API response times
  • Business Metrics: Fraud rates, false positives, cost analysis

πŸš€ Deployment

Current Deployment Options

Local Python Deployment (Active Now) βœ…

Best for: Development, testing, ML model training

# Clone repository
git clone https://github.com/MasterFabric/dafu.git
cd dafu/core/features/fraud_detection

# Setup environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Run ML models
cd src/models
python main.py  # Interactive model selection

Features Available:

  • βœ… All ML models (Isolation Forest, LSTM, GRU)
  • βœ… Training and prediction
  • βœ… Stream/batch processing
  • βœ… Model persistence
  • βœ… Visualization and export

Docker Compose (Infrastructure Ready) ⚠️

Status: Configuration complete, services commented out until API-ML integration

The complete Docker Compose setup is prepared in docker-compose.yml but all services are currently commented out. See Docker Status for details.

What's Prepared:

  • Complete service definitions (API, PostgreSQL, Redis, RabbitMQ, Celery, Prometheus, Grafana)
  • Database schemas
  • Network and volume configuration
  • Health checks and monitoring

When Active (after integration):

# Uncomment services in docker-compose.yml
docker-compose up -d

Kubernetes Production

# Deploy to Kubernetes
kubectl apply -f core/features/fraud_detection/deployment/k8s-manifests/

# Or using Helm
helm install dafu ./core/features/fraud_detection/deployment/helm-charts/ \
  --set image.tag=latest \
  --set replicas=3 \
  --set resources.requests.memory=512Mi \
  --set resources.requests.cpu=250m

CI/CD Pipeline

# .github/workflows/deploy.yml
name: Deploy to Production
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run tests
        run: |
          cd core/features/fraud_detection
          pip install -r requirements.txt
          pytest tests/
  
  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/dafu-fraud-detection \
            fraud-detection=masterfabric/dafu:latest

🀝 Contributing

Development Setup

  1. Fork the repository
  2. Create a feature branch
    git checkout -b feature/your-feature-name
  3. Install development dependencies
    cd core/features/fraud_detection
    pip install -r requirements.txt
    pip install -r requirements-dev.txt  # If available
  4. Run tests and linting
    pytest tests/
    black src/ tests/
    flake8 src/ tests/
  5. Submit a pull request

Code Standards

  • Python Style: PEP 8 compliance with Black formatting
  • Type Hints: Comprehensive type annotations
  • Documentation: Google-style docstrings
  • Testing: 90%+ test coverage required
  • ASCII Only: No non-ASCII characters in code (enforced by pre-commit hooks)

Commit Convention

Follow the conventional commit format:

type(scope): description

[optional body]

[optional footer]

Types: feat, fix, docs, style, refactor, test, chore

πŸ“š API Documentation

Fraud Scoring API

POST /dafu/v1/score

Real-time fraud scoring endpoint.

Request:

{
  "transaction_id": "tx_123",
  "amount": 150.00,
  "user_id": "user_456",
  "merchant_id": "merchant_789",
  "timestamp": "2024-01-15T10:30:00Z",
  "device_fingerprint": "fp_abc123",
  "ip_address": "192.168.1.1"
}

Response:

{
  "transaction_id": "tx_123",
  "risk_score": 0.85,
  "is_fraud": true,
  "model_used": "isolation_forest",
  "processing_time_ms": 45,
  "confidence": 0.92,
  "explanations": {
    "amount_risk": 0.3,
    "user_behavior_risk": 0.7,
    "merchant_risk": 0.2
  }
}
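A client call against this endpoint might look like the following. This is a standard-library sketch that assumes the API is running locally on port 8000; `build_score_request` and `score_transaction` are hypothetical helper names.

```python
import json
import urllib.request

def build_score_request(payload: dict, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the real-time scoring endpoint."""
    return urllib.request.Request(
        f"{base_url}/dafu/v1/score",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def score_transaction(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """Send a transaction for scoring and return the parsed JSON response."""
    with urllib.request.urlopen(build_score_request(payload, base_url)) as resp:
        return json.load(resp)
```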

POST /dafu/v1/batch/analyze

Batch fraud analysis endpoint.

Request:

{
  "data_source": "s3://bucket/data.csv",
  "analysis_type": "comprehensive",
  "models": ["isolation_forest", "lstm", "xgboost"],
  "output_format": "detailed_report"
}

Model Management API

GET /dafu/v1/models

List available models.

POST /dafu/v1/models/deploy

Deploy a new model version.

GET /dafu/v1/models/{model_id}/performance

Get model performance metrics.

Postman Collection

A complete Postman collection is available for testing all API endpoints:

πŸ“¦ DAFU_API.postman_collection.json

What's Included:

| Category | Endpoints | Description |
|---|---|---|
| 1. Authentication | 7 endpoints | Register, Login, Logout, Token refresh, Password change, API keys |
| 2. Log Management | 6 endpoints | CRUD operations, Statistics, Filtering |
| 3. Report Management | 6 endpoints | Report generation, Tracking, Statistics |
| 4. Product Management | 7 endpoints | Product CRUD, High-risk detection, Statistics |
| 5. Health & System | 3 endpoints | Health check, API info, OpenAPI schema |

Features:

  • βœ… Auto-save tokens: Login automatically saves access_token to environment
  • βœ… Complete examples: All requests include sample data
  • βœ… Test scripts: Automated token management
  • βœ… Documentation: Each endpoint documented with descriptions
  • βœ… Environment variables: Pre-configured base_url and tokens

How to Use:

  1. Import into Postman

    # Option 1: Import file directly
    File β†’ Import β†’ Select DAFU_API.postman_collection.json
    
    # Option 2: Import from URL (if hosted)
    File β†’ Import β†’ Link β†’ Paste collection URL
  2. Create Environment (Optional but recommended)

    Environment Name: DAFU Local
    Variables:
    - base_url: http://localhost:8000
    - access_token: (will be set automatically after login)
    - refresh_token: (will be set automatically after login)
    
  3. Start API Server

    cd core/features/fraud_detection
    ./start_api.sh
  4. Test Workflow

    Step 1: Health Check β†’ Verify API is running
    Step 2: Register β†’ Create new user account
    Step 3: Login β†’ Get access token (auto-saved)
    Step 4: Try any endpoint β†’ Use authenticated requests
    

Quick Start with Postman:

  1. Start API: ./start_api.sh
  2. Import collection: DAFU_API.postman_collection.json
  3. Run "Register New User" β†’ Create account
  4. Run "Login" β†’ Token saved automatically βœ…
  5. Try any authenticated endpoint!

Alternative: Swagger UI

If you prefer browser-based testing, FastAPI serves interactive documentation while the API is running: Swagger UI at http://localhost:8000/docs and ReDoc at http://localhost:8000/redoc.

πŸ† Current Implementation Status

βœ… Implemented Features (Production Ready)

| Feature | Description | Status | Implementation Level |
|---|---|---|---|
| Unified Model Interface | Single entry point for all models | βœ… NEW! Fully Implemented | Complete with interactive selection |
| Isolation Forest Detection | Core anomaly detection algorithm | βœ… Fully Implemented | Complete with evaluation & visualization |
| Sequence Models (LSTM/GRU) | Time-series fraud detection | βœ… Fully Implemented | Complete with TensorFlow implementation |
| Stream Prediction Mode | Real-time data stream processing | βœ… NEW! Fully Implemented | Complete with model persistence |
| Batch Prediction Mode | Batch data processing | βœ… NEW! Fully Implemented | Complete with training & prediction |
| Model Persistence | Save/load trained models | βœ… NEW! Fully Implemented | Complete with .joblib & .h5 support |
| Data Preprocessing | Automatic data analysis & feature engineering | βœ… Fully Implemented | Complete with missing value handling |
| Supervised/Unsupervised Modes | Dual learning approaches | βœ… Fully Implemented | Complete with mode selection |
| Risk Score Detection | Custom threshold-based detection | βœ… Fully Implemented | Complete with business interpretation |
| Comprehensive Evaluation | Performance metrics & visualization | βœ… Fully Implemented | Complete with 4-panel analysis |
| Enhanced Result Export | CSV, JSON output with stream support | βœ… Enhanced | Complete with stream & batch exports |
| Docker Infrastructure | Docker Compose configuration | πŸ“‹ Prepared | All services configured, not integrated yet |
| FastAPI Basic Structure | REST API framework | πŸ“‹ Prepared | Basic endpoints exist, ML integration pending |
| Database Schema | PostgreSQL schema design | πŸ“‹ Prepared | Complete schema ready, not connected yet |
| Docker Support | Containerization | βœ… Fully Implemented | Dockerfile with multi-stage build |
| Fast Startup Interface | Lazy loading for instant response | βœ… NEW! Fully Implemented | Complete with optimized imports |

🚧 In Development (Infrastructure Ready, Integration Pending)

| Feature | Description | Status | Implementation Level |
|---|---|---|---|
| API-ML Integration | Connect ML models to FastAPI | 🚧 Next Priority | API structure ready, needs ML integration |
| Database Integration | PostgreSQL connection | 🚧 In Development | Schema ready, ORM integration pending |
| Redis Caching | Performance optimization | 🚧 In Development | Config ready, not implemented |
| Celery Tasks | Background job processing | 🚧 In Development | Not implemented yet |
| Feature Engineering Pipeline | Advanced feature extraction | 🚧 Basic Structure | Framework exists, needs implementation |
| Rules Engine | Business rule processing | 🚧 Basic Structure | Framework exists, needs implementation |
| Ensemble Models | XGBoost, Random Forest | 🚧 Basic Structure | Framework exists, needs implementation |

πŸ“‹ Planned Features (Roadmap)

| Feature | Description | Status | Target Timeline |
|---|---|---|---|
| Real-time API | Sub-50ms fraud scoring API | πŸ“‹ Planned | In Development |
| Enterprise Security | OAuth2, JWT, RBAC | πŸ“‹ Planned | In Development |
| Monitoring & Observability | Prometheus, Grafana, Jaeger | πŸ“‹ Planned | In Development |
| Auto-scaling | Kubernetes HPA | πŸ“‹ Planned | In Development |
| Advanced Analytics | Graph-based fraud detection | πŸ“‹ Planned | In Development |
| Model Management | Versioning, A/B testing | πŸ“‹ Planned | In Development |
| Compliance Features | GDPR, PCI DSS compliance | πŸ“‹ Planned | In Development |
| High-throughput Processing | 10,000+ TPS optimization | πŸ“‹ Planned | In Development |

🎯 Current Performance Metrics

Achieved Metrics

  • Accuracy: 90%+ fraud detection accuracy (based on test results)
  • Model Training: Complete end-to-end pipeline
  • Data Processing: Handles large datasets efficiently
  • Visualization: Comprehensive 4-panel analysis plots
  • Export Capability: Structured results with timestamps

Target Metrics (Roadmap)

  • Latency: <50ms for real-time scoring (planned)
  • Throughput: 10,000+ transactions per second (planned)
  • Availability: 99.9% uptime SLA (planned)
  • Security: Zero data breaches (planned)

πŸ—ΊοΈ Development Roadmap

Phase 1 - API & Security Foundation

  • Real-time API Implementation: Complete FastAPI endpoints for fraud scoring
  • Authentication & Authorization: OAuth2/JWT implementation
  • Input Validation: Comprehensive request validation
  • API Documentation: OpenAPI/Swagger documentation
  • Basic Security: HTTPS, CORS, rate limiting

Phase 2 - Enterprise Infrastructure

  • Kubernetes Production Deployment: Full K8s manifests and Helm charts
  • Monitoring & Observability: Prometheus, Grafana, Jaeger integration
  • Auto-scaling: Kubernetes HPA with custom metrics
  • Message Queuing: RabbitMQ/Celery for async processing
  • Advanced Feature Engineering: Complete pipeline implementation

Phase 3 - Advanced Features

  • Model Management: Versioning, A/B testing, model registry
  • Ensemble Methods: XGBoost, Random Forest implementation
  • Graph-based Detection: Network analysis for fraud rings
  • Business Rules Engine: Complete rule processing system
  • Advanced Analytics: Dashboard and reporting system

Phase 4 - Scale & Optimization

  • High-throughput Optimization: 10,000+ TPS processing
  • Performance Tuning: Memory optimization, caching strategies
  • Compliance Features: GDPR, PCI DSS compliance tools
  • Machine Learning Pipeline: Automated model training and deployment
  • Multi-tenant Architecture: Enterprise multi-tenancy support

πŸ“Š Current Test Results

Based on the existing test results in the project:

Anomaly Detection Performance

  • Accuracy: 90%+ on test datasets
  • Detection Methods: Both classic and risk-score based detection working
  • Contamination Levels: Multiple levels (0.01, 0.05, 0.1) tested successfully
  • Visualization: 4-panel analysis plots generated successfully
  • Stream Processing: 100,000 records processed successfully NEW

Sequence Model Performance NEW

  • LSTM/GRU Models: Successfully trained and evaluated
  • Time-series Analysis: User behavior patterns detected
  • Model Architecture: Configurable sequence length and hidden units
  • Training: TensorFlow-based implementation with early stopping
  • Stream Prediction: 10,000 sequence records processed in stream mode NEW
  • Model Persistence: Models saved and loaded successfully NEW

Stream Processing Capabilities NEW

  • Real-time Processing: Stream data processed with pre-trained models
  • Model Loading: Pre-trained models loaded successfully for prediction
  • Data Preprocessing: Stream data preprocessed using saved transformers
  • Prediction Accuracy: High accuracy maintained in stream mode
  • Export Capabilities: Stream results exported with timestamps

Data Processing Capabilities

  • Automatic Analysis: Column detection and data suitability assessment
  • Preprocessing: Missing value handling, categorical encoding, scaling
  • Export Formats: CSV and JSON outputs with timestamps
  • Large Datasets: Efficient processing of substantial data volumes
  • Batch vs Stream: Both processing modes working efficiently NEW

πŸ†˜ Support and Community

Getting Help

  • Documentation: Comprehensive guides and API documentation
  • GitHub Issues: Bug reports and feature requests
  • Community Forum: Discussions and Q&A
  • Feedback & Support: dafu@masterfabric.co
  • Enterprise Support: Contact the platform support team

Common Issues

  1. High Memory Usage: Configure chunked processing for large datasets
  2. Slow Predictions: Enable model caching and optimize feature engineering
  3. False Positives: Adjust risk score thresholds and retrain models
  4. API Rate Limiting: Configure appropriate rate limits for your use case

DAFU Enterprise Fraud Detection Platform v1.0.0
Built with ❀️ for secure, scalable, and intelligent fraud detection

πŸ“„ License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

License Summary

  • Commercial Use: βœ… Allowed with restrictions
  • Modification: βœ… Allowed
  • Distribution: βœ… Allowed with source code disclosure
  • Patent Use: βœ… Allowed
  • Private Use: βœ… Allowed
  • Sublicensing: ❌ Not allowed

Key Requirements

  1. Source Code Disclosure: Any distribution of the software must include the complete source code
  2. Network Interaction: If you run the software on a server and provide services over a network, you must make the source code available to users
  3. License Compatibility: Any derivative works must be licensed under the same AGPL-3.0 license
  4. Attribution: You must retain all copyright notices and license text

Full License Text

The complete license text is available in the LICENSE file in this repository.

Enterprise Licensing

For commercial enterprises requiring different licensing terms, please contact MasterFabric for enterprise licensing options.

MasterFabric - Enterprise-level fraud detection and e-commerce analytics solutions.

Contact: dafu@masterfabric.co
