This system implements a real-time Intrusion Detection System (IDS). It uses the nfstream library to capture network traffic and extract flow features, then classifies each flow as normal or attack with a machine learning model. Results are saved to a CSV file and displayed in a web interface built with Flask.
- Capture and analyze network traffic in real-time using nfstream.
- Extract features from network flows in CICIDS2017 format.
- Predict network behavior (normal or attack) using a machine learning model.
- Save results to a CSV file (predictions.csv) and display them on a Flask web interface.
- Support remote monitoring through the Flask web interface.
- models/: Directory containing pre-trained machine learning model files (customizable).
- best_binary_model.pkl: Binary classification model.
- scaler.pkl: Feature scaler (StandardScaler).
- label_encoder_binary.pkl: Label encoder for binary classification (LabelEncoder).
- templates/: Directory for web interface files.
- index.html: HTML file for displaying prediction results.
- application.py: Main source code for packet processing, prediction, and result display.
- predictions.csv: File storing prediction results (source IP, destination IP, label, probability, timestamp).
- Operating System: Ubuntu (or other Linux-based systems).
- Python 3.6 or higher.
- Required Python libraries:
- nfstream
- pandas
- numpy
- joblib
- scikit-learn
- flask
- flask-socketio
- Ensure Python 3 and pip are installed. If not, run the following commands on Ubuntu:
sudo apt update
sudo apt install python3 python3-pip
- Install the necessary Python libraries using:
pip3 install nfstream pandas numpy joblib scikit-learn flask flask-socketio
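After installing, it can help to confirm that every library is importable before starting a capture. The snippet below is a minimal sketch using only the standard library; the function name missing_packages and the REQUIRED_MODULES mapping are illustrative and not part of application.py. Note that scikit-learn is imported as sklearn and flask-socketio as flask_socketio.

```python
import importlib.util

# Maps each pip package name to the module name used in "import ...".
# This mapping is an illustration based on the pip3 command above.
REQUIRED_MODULES = {
    "nfstream": "nfstream",
    "pandas": "pandas",
    "numpy": "numpy",
    "joblib": "joblib",
    "scikit-learn": "sklearn",
    "flask": "flask",
    "flask-socketio": "flask_socketio",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

Any names it reports can be passed back to pip3 install.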
- Open application.py and update the INTERFACE variable with your network interface (e.g., eth0, wlan0).
INTERFACE = "ens33"  # Replace "ens33" with your network interface
- To check available network interfaces, use ifconfig (or ip link show if ifconfig is not installed):
ifconfig
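The interface names can also be listed from Python, which is convenient when neither command is at hand. This is a small sketch using the standard library's socket.if_nameindex(), available on Linux; the helper name list_interfaces is illustrative only.

```python
import socket


def list_interfaces():
    """Return the names of the network interfaces known to the OS."""
    # socket.if_nameindex() yields (index, name) pairs on Unix systems.
    return [name for _index, name in socket.if_nameindex()]


if __name__ == "__main__":
    for name in list_interfaces():
        print(name)
```

One of the printed names (for example eth0 or ens33) is what the INTERFACE variable expects.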
- Place the best_binary_model.pkl, scaler.pkl, and label_encoder_binary.pkl files in the models/ directory.
- If these files are not available, train a model on a dataset (e.g., CICIDS2017) and save it using joblib.
- Open application.py and update the paths for the model files:
MODEL_BINARY_FILE = "/path/to/your/IDS/models/best_binary_model.pkl"
SCALER_FILE = "/path/to/your/IDS/models/scaler.pkl"
LE_BINARY_FILE = "/path/to/your/IDS/models/label_encoder_binary.pkl"
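If the three files are not yet available, the training step mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual training script: the choice of RandomForestClassifier, the function name train_and_save, and its parameters are assumptions made for illustration. Only the three output file names and the use of StandardScaler, LabelEncoder, and joblib come from this README.

```python
from pathlib import Path

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler


def train_and_save(X, y, model_dir):
    """Fit a scaler, label encoder, and classifier, then save all three.

    X is a 2-D array of flow features; y holds one text label per row.
    The classifier used here is an assumption, not the project's model.
    """
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)

    # Scale features so that each column has zero mean and unit variance.
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

    # Convert text labels into integer class codes.
    encoder = LabelEncoder()
    y_encoded = encoder.fit_transform(y)

    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_scaled, y_encoded)

    # File names match the ones expected in the models/ directory.
    joblib.dump(model, model_dir / "best_binary_model.pkl")
    joblib.dump(scaler, model_dir / "scaler.pkl")
    joblib.dump(encoder, model_dir / "label_encoder_binary.pkl")
    return model, scaler, encoder
```

Calling train_and_save(X, y, "models") with a feature matrix and a label array writes the three files into models/. The features used for training must match, in order and meaning, the features that application.py extracts at run time.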
- In the directory containing application.py, run:
python3 application.py
- The program will:
- Capture packets from the configured network interface.
- Extract features, make predictions, and save results to predictions.csv.
- Launch the Flask web interface at http://localhost:5001.
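The predict-and-save step in the list above can be sketched roughly as follows. The function names classify and append_prediction, the column names in FIELDS, and the assumption that the model exposes predict_proba are all illustrative; application.py may implement this differently (for example with pandas).

```python
import csv
from datetime import datetime
from pathlib import Path

# Assumed column names; the real header of predictions.csv may differ.
FIELDS = ["src_ip", "dst_ip", "label", "probability", "timestamp"]


def classify(features, model, scaler, encoder):
    """Return (label, probability) for one flow's feature vector."""
    scaled = scaler.transform([features])
    probabilities = list(model.predict_proba(scaled)[0])
    # Pick the class with the highest predicted probability.
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    label = encoder.inverse_transform([model.classes_[best]])[0]
    return label, float(probabilities[best])


def append_prediction(csv_path, src_ip, dst_ip, label, probability):
    """Append one result row, writing the header only for a new file."""
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow({
            "src_ip": src_ip,
            "dst_ip": dst_ip,
            "label": label,
            "probability": probability,
            "timestamp": datetime.now().isoformat(timespec="seconds"),
        })
```

Here model, scaler, and encoder stand for the objects loaded with joblib from the three files in models/.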
- Open a browser and navigate to:
http://localhost:5001
- The interface displays processed network flows, including source IP, destination IP, port, predicted label, and probability.
- Real-time Monitoring: The system continuously captures packets and predicts network behavior. Results are updated on the web interface.
- Post-analysis: Review the predictions.csv file for a history of processed network flows.
- Customization:
- Adjust the flow processing timeouts by modifying active_timeout and idle_timeout in application.py.
- Update the machine learning model in the models/ directory to improve accuracy.
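For orientation, active_timeout and idle_timeout are parameters of nfstream's NFStreamer, both given in seconds. The sketch below shows where they are passed; the timeout values, the helper names flow_summary and capture, and the selection of flow fields are placeholders rather than the settings used in application.py. nfstream is imported inside capture so that the helper can be read and tested without a live interface.

```python
INTERFACE = "ens33"    # replace with your network interface
ACTIVE_TIMEOUT = 120   # seconds before a long-lived flow is expired (placeholder)
IDLE_TIMEOUT = 15      # seconds of inactivity before a flow is expired (placeholder)


def flow_summary(flow):
    """Pick a few basic attributes from an nfstream flow object."""
    return {
        "src_ip": flow.src_ip,
        "dst_ip": flow.dst_ip,
        "src_port": flow.src_port,
        "dst_port": flow.dst_port,
        "packets": flow.bidirectional_packets,
        "bytes": flow.bidirectional_bytes,
        "duration_ms": flow.bidirectional_duration_ms,
    }


def capture(interface=INTERFACE,
            active_timeout=ACTIVE_TIMEOUT,
            idle_timeout=IDLE_TIMEOUT):
    """Yield a summary dict for every flow expired by the streamer."""
    from nfstream import NFStreamer  # requires nfstream and capture rights

    streamer = NFStreamer(
        source=interface,
        active_timeout=active_timeout,
        idle_timeout=idle_timeout,
    )
    for flow in streamer:
        yield flow_summary(flow)
```

Shorter timeouts make flows appear in the results sooner, at the cost of splitting long connections into several flows.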
- Ensure you have sufficient permissions to access the network interface (may require sudo):
sudo python3 application.py
- For high network traffic, the system may consume significant CPU/memory. Consider adjusting WINDOW_DURATION or limiting traffic.
- The predictions.csv file is appended to, so its size will grow over time. Periodically archive or delete it as needed.
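One way to handle the growing predictions.csv is to move it aside once it passes a size limit. The following is a sketch only: the function name archive_if_large, the 50 MB default, and the archive naming scheme are assumptions. It is safest to run it while the application is stopped, since whether application.py recreates the file on its next write is not stated here.

```python
import time
from pathlib import Path


def archive_if_large(csv_path, max_bytes=50 * 1024 * 1024):
    """Rename the CSV to a timestamped archive if it exceeds max_bytes.

    Returns the archive path, or None when nothing was archived.
    """
    path = Path(csv_path)
    if not path.exists() or path.stat().st_size < max_bytes:
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # e.g. predictions.csv -> predictions-20250101-120000.csv
    target = path.with_name(f"{path.stem}-{stamp}{path.suffix}")
    path.rename(target)
    return target
```

For example, archive_if_large("predictions.csv") could be called from a scheduled job.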
- Real-time Flow Table: Displays network flows with details (Flow ID, Source/Destination IP, Ports, Predicted Label, Confidence, Timestamp, Age).
- Filtering: Filter flows by type (All, Attacks Only, Benign Only).
- Chart Visualization: Line chart showing the number of attack and benign flows over time using Chart.js.
- Export to CSV: Download the displayed flows as a CSV file.
- Timeout Configuration: Adjust active_timeout and idle_timeout directly from the interface.
- Pagination: Navigate through flow history with a paginated table.
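The filtering and pagination features above can be illustrated with a short server-side sketch. The function names, the "label" key, the BENIGN label value, and the page size are assumptions for illustration; the actual interface may do this work in the browser instead.

```python
import math

BENIGN_LABEL = "BENIGN"  # assumed label text for normal traffic


def filter_flows(flows, mode="all"):
    """Return flows matching mode: "all", "attacks", or "benign"."""
    if mode == "all":
        return list(flows)
    if mode == "attacks":
        return [f for f in flows if f["label"] != BENIGN_LABEL]
    if mode == "benign":
        return [f for f in flows if f["label"] == BENIGN_LABEL]
    raise ValueError(f"unknown filter mode: {mode!r}")


def paginate(items, page, per_page=20):
    """Return (items on the 1-based page, total number of pages)."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    total_pages = max(1, math.ceil(len(items) / per_page))
    start = (page - 1) * per_page
    return items[start:start + per_page], total_pages
```

Applying filter_flows first and paginate second keeps the page count consistent with the selected filter.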
- Công Quân
  Email: 22521190@gm.uit.edu.vn
- Quốc Minh
  Email: 22520855@gm.uit.edu.vn
This project was developed as part of the research topic "Applying Machine Learning Techniques to Detect Malicious Network Traffic in Cloud Computing". The source code is provided for educational and research purposes.