This repository contains notebooks and models for detecting cyber threats and classifying attacks using machine learning and deep learning techniques. The trained models are then applied in a real-time setting driven by synthetic data, with self-healing mechanisms triggered by the attack classification results.
This project aims to enhance cybersecurity by implementing the following primary tasks:
- Threat Detection: Using an ensemble of Artificial Neural Networks (ANN) and Long Short-Term Memory (LSTM) models.
- Attack Classification: Using a Decision Tree model to classify different types of attacks.
- Self-Healing Mechanism: Based on attack classification, implementing self-healing mechanisms in real-time.
- Dashboards and Visualisation: Use of dashboards for real-time threat detection and attack classification.
Two datasets were used to train the models in this project. The Network Intrusion Detection dataset (CSV format) was used to train, validate, and test the threat detection models. The CICIDS2017 dataset (CSV format) was used to train and test the attack classification model. Both datasets are included in the repository.
The threat detection notebook includes the following:
- Data Loading and Preprocessing: Loading the dataset, handling missing values, encoding categorical variables, and feature scaling.
- Model Training:
  - ANN Model: Binary classification using a deep neural network.
  - LSTM Model: Sequential data classification using LSTM layers.
  - Ensemble Model: Combining predictions from ANN and LSTM models.
- Model Evaluation: Accuracy, confusion matrix, ROC curve, and AUC score.
- Model Saving: Saving the trained models and ensemble model.
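The ensemble step above can be illustrated with a minimal sketch. It assumes the two models each output a threat probability per sample and that the ensemble is a weighted average followed by a threshold (soft voting). The notebook may combine predictions differently; the function name `soft_vote`, the weight, the threshold, and the sample probabilities are all hypothetical.

```python
from typing import List, Sequence


def soft_vote(
    ann_probs: Sequence[float],
    lstm_probs: Sequence[float],
    ann_weight: float = 0.5,
    threshold: float = 0.5,
) -> List[int]:
    """Combine two models' threat probabilities into binary labels.

    Assumption: the ensemble is a weighted average (soft voting).
    This is an illustration, not necessarily the notebook's method.
    """
    if len(ann_probs) != len(lstm_probs):
        raise ValueError("Both models must score the same number of samples")
    if not 0.0 <= ann_weight <= 1.0:
        raise ValueError("ann_weight must lie in [0, 1]")

    labels = []
    for p_ann, p_lstm in zip(ann_probs, lstm_probs):
        # Weighted average of the two probabilities for this sample.
        combined = ann_weight * p_ann + (1.0 - ann_weight) * p_lstm
        # 1 = threat, 0 = normal traffic.
        labels.append(1 if combined >= threshold else 0)
    return labels


# Made-up probabilities standing in for real model outputs.
ann_probs = [0.9, 0.2, 0.6, 0.4]
lstm_probs = [0.8, 0.1, 0.3, 0.7]
predictions = soft_vote(ann_probs, lstm_probs)
print(predictions)
```

With real models, the two probability lists would come from each model's prediction call on the same preprocessed samples.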
The attack classification notebook includes:
- Data Loading and Preprocessing: Combining multiple datasets, handling missing values, encoding categorical variables, and feature scaling.
- Model Training: Decision Tree model to classify different types of attacks.
- Model Evaluation: Accuracy, confusion matrix, and classification report.
- Model Saving: Saving the trained Decision Tree model.
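The training and evaluation steps above might look roughly like the following sketch. It uses scikit-learn with a synthetic three-class dataset as a stand-in for CICIDS2017, so the features, class count, and hyperparameters (`max_depth=8`, the 75/25 split) are assumptions rather than the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed CICIDS2017 features and labels.
X, y = make_classification(
    n_samples=600,
    n_features=10,
    n_informative=6,
    n_classes=3,
    random_state=42,
)

# Hold out a stratified test split so every class appears in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hypothetical hyperparameters; tune them for the real dataset.
clf = DecisionTreeClassifier(max_depth=8, random_state=42)
clf.fit(X_train, y_train)

# Evaluate with the same metrics the notebook reports.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy:.3f}")
print(cm)
print(classification_report(y_test, y_pred))
```

The fitted classifier could then be persisted with `joblib.dump(clf, "decision_tree_model.joblib")`, matching the file name listed in this README.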
The Dashboard notebook includes:
- Real-Time Monitoring: Streaming data, making predictions using the ensemble model, and logging results.
- Visualization: Displaying real-time predictions and plotting attack distribution.
- Self-Healing Dashboard: Using the Decision Tree model for real-time attack classification and visualization.
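The monitoring and self-healing flow above can be sketched as a simple loop: score each incoming event, classify the ones flagged as threats, and look up a remediation action. Everything below is illustrative. The `threat_score` field stands in for the ensemble model's output, `attack_type` stands in for the Decision Tree's prediction, and the action names in `HEALING_ACTIONS` are hypothetical rather than the project's actual responses.

```python
import random
from collections import Counter
from typing import Dict, Iterator, Optional

# Hypothetical mapping from predicted attack class to a remediation action.
HEALING_ACTIONS: Dict[str, str] = {
    "DDoS": "rate_limit_source",
    "PortScan": "block_source_ip",
    "Bot": "isolate_host",
}
DEFAULT_ACTION = "alert_analyst"  # fallback for classes without a rule


def synthetic_stream(n_events: int, seed: int = 0) -> Iterator[dict]:
    """Yield fake events in place of real traffic and model outputs."""
    rng = random.Random(seed)
    attack_types = list(HEALING_ACTIONS) + ["Unknown"]
    for event_id in range(n_events):
        yield {
            "id": event_id,
            "threat_score": rng.random(),             # stand-in for the ensemble output
            "attack_type": rng.choice(attack_types),  # stand-in for the classifier output
        }


def handle(event: dict, threshold: float = 0.5) -> dict:
    """Return a log entry, choosing a healing action only for detected threats."""
    if event["threat_score"] < threshold:
        return {"id": event["id"], "status": "benign", "attack_type": None, "action": None}
    action: Optional[str] = HEALING_ACTIONS.get(event["attack_type"], DEFAULT_ACTION)
    return {
        "id": event["id"],
        "status": "threat",
        "attack_type": event["attack_type"],
        "action": action,
    }


# Process the stream, keep a log, and summarize the attack distribution.
log = [handle(event) for event in synthetic_stream(20)]
distribution = Counter(entry["attack_type"] for entry in log if entry["status"] == "threat")
print(distribution)
```

Keeping the class-to-action mapping in a plain dictionary makes it easy to review and extend without touching the monitoring loop.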
The following models are trained and saved in this project:
- ANN Model: ANN_model.h5
- LSTM Model: LSTM_model.h5
- Ensemble Model: ensemble_model.pkl
- Decision Tree Model: decision_tree_model.joblib
Note that these models were originally saved on my drive, so the notebooks reference them by paths on that drive. Update the paths to match your own environment before running the notebooks.
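One way to avoid editing hard-coded paths in every notebook is to resolve all model files from a single base directory. The sketch below assumes a hypothetical `MODEL_DIR` environment variable and a default `models` folder; neither is defined by this repository.

```python
import os
from pathlib import Path

# Hypothetical: base directory read from an environment variable,
# falling back to a local "models" folder.
MODEL_DIR = Path(os.environ.get("MODEL_DIR", "models"))

# File names as listed in this README.
MODEL_FILES = {
    "ann": "ANN_model.h5",
    "lstm": "LSTM_model.h5",
    "ensemble": "ensemble_model.pkl",
    "decision_tree": "decision_tree_model.joblib",
}


def model_path(name: str) -> Path:
    """Return the full path of a saved model given its short name."""
    return MODEL_DIR / MODEL_FILES[name]


for name in MODEL_FILES:
    print(name, "->", model_path(name))
```

The resulting paths can then be passed to whichever loader matches the file type, for example `joblib.load(model_path("decision_tree"))` for the Decision Tree model.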
To run the notebooks and scripts in this repository, you need the following libraries: numpy, pandas, seaborn, matplotlib, scikit-learn, tensorflow, and joblib.
You can install these dependencies using the following command:
pip install numpy pandas seaborn matplotlib scikit-learn tensorflow joblib
- Clone the Repository:
  git clone https://github.com/your-username/threat-detection-and-attack-classification.git
  cd threat-detection-and-attack-classification
- Run the Notebooks: Open the Jupyter notebooks in the order mentioned above and run the cells.
- Monitor Real-Time Data: Use the Dashboard notebook to visualize and monitor real-time predictions.
- Analyze Results: Use the saved models to make predictions on new data and analyze the results.
Contributions are welcome! Please fork this repository and submit a pull request for any improvements or new features.
This project is licensed under the MIT License. See the LICENSE file for details.