FraudDetectionSystem

Overview

Real-time fraud detection pipeline for financial transactions using Kafka, Spark Structured Streaming, PostgreSQL, Isolation Forest, Grafana, and Streamlit.

This project simulates transactions, detects anomalies, stores them in a database, and provides both real-time dashboards (Grafana) and machine learning insights (Streamlit).

Project Features

Kafka Producer: generates synthetic transactions, each carrying a fraud flag, is_fraud. Learn why synthetic data is often used instead of real data for ML model training here
Spark Structured Streaming: two consumers (transaction_consumer & fraud_consumer).
PostgreSQL sink: stores transactions and flagged frauds.
Grafana dashboards: real-time monitoring & SQL analytics.
Isolation Forest ML model: trained offline, saved to jobs/isolation_forest.pkl.
Streamlit app: visualizes model predictions & performance.
Docker Setup: services run end-to-end with docker compose up.
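
As an illustration of the first feature, here is a minimal sketch of how a synthetic transaction could be generated and serialized before being sent to Kafka. Only the is_fraud flag comes from this project; every other field name, the fraud rate, and the amount ranges are assumptions made for the example, and the project's own producer may differ.

```python
import json
import random
import uuid
from datetime import datetime, timezone


def make_transaction(rng: random.Random, fraud_rate: float = 0.02) -> dict:
    """Build one synthetic transaction; every field except is_fraud is hypothetical."""
    is_fraud = rng.random() < fraud_rate
    # Assumption: fraudulent transactions skew toward larger amounts.
    amount = rng.uniform(500, 5000) if is_fraud else rng.uniform(1, 300)
    return {
        "transaction_id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
        "user_id": rng.randint(1, 1000),
        "amount": round(amount, 2),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "is_fraud": is_fraud,
    }


def encode(transaction: dict) -> bytes:
    """Serialize a transaction to UTF-8 JSON, a common format for Kafka message values."""
    return json.dumps(transaction).encode("utf-8")


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs draw the same random values
    for _ in range(3):
        print(encode(make_transaction(rng)))
```

The bytes returned by encode() are what a Kafka client would typically send as the message value; the actual send call depends on the client library in use.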

Project Architecture:

Project Setup

1. Clone this repository

git clone https://github.com/dkkinyua/FraudDetectionSystem.git
cd FraudDetectionSystem

2. Set up virtual environment and install required dependencies

python3 -m venv myenv
source myenv/bin/activate # macOS/Linux
myenv\Scripts\activate # Windows (PowerShell)

pip install -r requirements.txt

3. Run all services in Docker

docker compose up --build

The services include:

trainer: trains the Isolation Forest model when run
producer: runs the producer, which writes transactions to the Kafka topic
transaction-consumer and fraud-consumer: run the consumers, which read from Kafka and write to Postgres
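
To give a sense of what the trainer service does, the sketch below fits an Isolation Forest with scikit-learn, pickles it, and loads it back to flag anomalies. The output path jobs/isolation_forest.pkl is the one named in this README; the single amount feature, the contamination value, and the function names are assumptions for illustration and not the project's actual training code.

```python
import pickle
from pathlib import Path

import numpy as np
from sklearn.ensemble import IsolationForest


def train_and_save(features: np.ndarray, model_path: Path) -> IsolationForest:
    """Fit an Isolation Forest on the given features and pickle it to model_path."""
    # contamination is the expected share of anomalies; 0.02 is an assumed value.
    model = IsolationForest(n_estimators=100, contamination=0.02, random_state=42)
    model.fit(features)
    model_path.parent.mkdir(parents=True, exist_ok=True)
    with model_path.open("wb") as f:
        pickle.dump(model, f)
    return model


def load_and_flag(model_path: Path, features: np.ndarray) -> np.ndarray:
    """Load the pickled model and return True for rows it flags as anomalous."""
    with model_path.open("rb") as f:
        model = pickle.load(f)
    # predict() returns -1 for anomalies and 1 for inliers.
    return model.predict(features) == -1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical single feature: the transaction amount.
    amounts = rng.uniform(1, 300, size=(1000, 1))
    path = Path("jobs/isolation_forest.pkl")
    train_and_save(amounts, path)
    print(load_and_flag(path, np.array([[50.0], [25000.0]])))
```

Isolation Forest is unsupervised, so this sketch does not use the is_fraud label during training; the label could instead be used afterwards to evaluate the flags.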

4. Set up Kafka on Redpanda

The Kafka cluster used in this project is cloud-based and hosted on Redpanda. Visit the Redpanda website to configure your own Kafka cluster and topic, free for 15 days with $100 credit.
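
Once the cluster exists, its connection details have to reach the producer and consumers. The sketch below shows one way to do that: reading credentials from environment variables into a client configuration. It assumes the cluster uses SASL/SCRAM over TLS and that the client accepts librdkafka-style keys (as confluent-kafka does). The environment variable names are hypothetical and this README does not say which client library or variable names the project uses.

```python
import os
from typing import Mapping

# Hypothetical variable names; adjust to match your own environment.
REQUIRED_VARS = ("KAFKA_BOOTSTRAP_SERVERS", "KAFKA_USERNAME", "KAFKA_PASSWORD")


def kafka_config_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Build a librdkafka-style client config from environment variables."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "bootstrap.servers": env["KAFKA_BOOTSTRAP_SERVERS"],
        # Assumption: the hosted cluster requires SASL/SCRAM over TLS.
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": env.get("KAFKA_SASL_MECHANISM", "SCRAM-SHA-256"),
        "sasl.username": env["KAFKA_USERNAME"],
        "sasl.password": env["KAFKA_PASSWORD"],
    }
```

Keeping credentials in environment variables rather than in source files also fits the Docker setup, since the same values can be passed to each container.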

5. Run Streamlit application

streamlit run app.py

The app runs on localhost:8501 and shows the model's performance.

Dashboards

To access the Grafana dashboard, please follow this link

Conclusion

I have written a blog post about this project; read it here

Do you have any questions or contributions? Please reach out to me on any of my social media platforms or open a pull request.
