Fraud Detection

General info

The project concerns the anomaly detection in credit cards transactions using machine learning models and Autoencoders. The main aim of this project is predict whether a given transaction was a fraud or not. The analysis includes data analysis, data preparation and creation model by using Isolation Forest, Local Outlier Factor, Support Vector Machine (OneClassSVM) algorithms and autoencoder model.

Dataset

The dataset contains transactions made by credit cards in two days in 2013 by European cardholders and contains frauds as well. It comes from Kaggle and can be find here.

Motivation

The aim of this project is predict whether a given transaction was a fraud. The frauds in the credit cards industry are stealing or using stolen cards, or take more an aggressive form such as account takeover, counterfeiting and much more. The magnitude of credit card frauds is growing larger each day due to ever-increasing online transactions. Anomaly detection is an unsupervised data processing technique to detect anomalies from the dataset. Anomalies are data points that stand out amongst other data points in the dataset and do not fit the normal behavior in the data. These data points or observations deviate from the dataset’s normal behavioral patterns. The anomaly detection is very useful to detect fraud transactions just like in this case.

Project contains:

fraud detection using machine learning models - Fraud_detection.ipynb
fraud detection using autoencoder model - Fraud_Autoencoder.ipynb

Summary

The main aim of this project was prediction whether a given transaction was a fraud or not. The analysis included data analysis, data preparation and creation models by using Isolation Forest, Local Outlier Factor, Support Vector Machine (OneClassSVM) algorithms. In the first approach I evaluated our models with a few methods to check which model is the best. I used a accuracy score, f1 score, recall and confusion matrix. Finally the best model was One Class SVM with recall value 84%. That model has the lowest possible False Negatives rate and will be not miss many anomalies.

The second part of analysis I used autoencoders model to fraud anomaly detection. I built the model only on one-class examples (normal transactions) with no suspicious transactions. I evaluated our model with a few methods such as reconstruction error, recall and confusion matrix. The model captures 83% of the anomaly data points (based on recall value) and has the low False Negatives rate and will be not miss many anomalies.

Technologies

The project is created with:

Python 3.9
libraries: tensorflow, pandas, numpy, scikit-learn, seaborn, matplotlib.

Running the project:

To run this project use Jupyter Notebook or Google Colab.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
Fraud_Autoencoder.ipynb		Fraud_Autoencoder.ipynb
Fraud_detection.ipynb		Fraud_detection.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Fraud Detection

General info

Dataset

Motivation

Project contains:

Summary

Technologies

References:

About

Uh oh!

Releases

Packages

Languages

aniass/Fraud-Detection

Folders and files

Latest commit

History

Repository files navigation

Fraud Detection

General info

Dataset

Motivation

Project contains:

Summary

Technologies

References:

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages