This project implements a machine-learning-based Network Intrusion Detection System (NIDS) using the KDD Cup 1999 (10%) dataset.
The system is capable of:
- Training an intrusion detection model offline
- Detecting network intrusions in near-realtime
- Classifying traffic into attack categories
- Displaying live predictions in a Streamlit dashboard
- Simulating live network traffic for demonstration purposes
The model predicts high-level attack categories:
- normal
- dos
- probe
- r2l
- u2r
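To make the grouping concrete, here is a minimal sketch of how raw KDD Cup 1999 labels are conventionally folded into these five categories. The names `ATTACK_CATEGORY` and `to_category` are illustrative and are not taken from nids_training.py; the mapping follows the usual grouping of the 22 attack types in the training data, so the project's own table may differ in detail.

```python
# Conventional KDD Cup 1999 grouping of raw attack labels into four families.
# ATTACK_CATEGORY and to_category are hypothetical names used for illustration.
ATTACK_CATEGORY = {
    "back": "dos", "land": "dos", "neptune": "dos",
    "pod": "dos", "smurf": "dos", "teardrop": "dos",
    "ipsweep": "probe", "nmap": "probe", "portsweep": "probe", "satan": "probe",
    "ftp_write": "r2l", "guess_passwd": "r2l", "imap": "r2l", "multihop": "r2l",
    "phf": "r2l", "spy": "r2l", "warezclient": "r2l", "warezmaster": "r2l",
    "buffer_overflow": "u2r", "loadmodule": "u2r", "perl": "u2r", "rootkit": "u2r",
}


def to_category(raw_label: str) -> str:
    """Map a raw KDD label such as 'smurf.' to normal/dos/probe/r2l/u2r."""
    name = raw_label.strip().rstrip(".")  # raw labels end with a trailing dot
    if name == "normal":
        return "normal"
    # Unknown labels raise KeyError rather than being silently mislabeled.
    return ATTACK_CATEGORY[name]
```

Raising on an unknown label is a deliberate choice in this sketch: a typo in the mapping shows up at training time instead of skewing the class counts.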
.
├── dashboard.py
├── incoming.csv
├── incoming copy.csv
├── kddcup.data_10_percent
├── nids_kdd_pipeline.joblib
├── nids_label_encoder.joblib
├── nids_training.py
├── realtime_predict.py
├── traffic_simulator.py
└── README.md
kddcup.data_10_percent: the KDD Cup 1999 (10%) dataset.
- Each row = one network connection
- 41 traffic features + 1 label
- Used for model training and traffic simulation
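As a rough illustration of that row layout, the sketch below splits each connection record into its 41 features and its label using only the standard library. The helper names `split_row` and `read_connections` are hypothetical. It assumes the raw file is comma-separated with no header row and that labels carry a trailing dot (for example `smurf.`), which is how the dataset is normally distributed.

```python
import csv

N_FEATURES = 41  # per the dataset description: 41 features + 1 label


def split_row(row):
    """Split one parsed CSV row into (features, label)."""
    if len(row) != N_FEATURES + 1:
        raise ValueError(f"expected {N_FEATURES + 1} fields, got {len(row)}")
    # Strip the trailing dot that the raw labels carry.
    return row[:N_FEATURES], row[N_FEATURES].rstrip(".")


def read_connections(handle):
    """Yield (features, label) pairs from an open KDD-format file."""
    for row in csv.reader(handle):
        if row:  # skip blank lines
            yield split_row(row)
```

A typical call would be `with open("kddcup.data_10_percent", newline="") as fh: pairs = list(read_connections(fh))`. All values come back as strings, so numeric columns still need converting before training.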
nids_training.py: trains the intrusion detection model.
- Loads the KDD dataset
- Groups attack labels into categories
- Applies preprocessing (scaling + one-hot encoding)
- Trains a machine-learning classifier
- Saves trained artifacts
Outputs:
- nids_kdd_pipeline.joblib: serialized trained model pipeline (preprocessing + classifier).
- nids_label_encoder.joblib: maps numeric class IDs to readable labels (normal, dos, probe, r2l, u2r).
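The training steps and the two output artifacts could be sketched roughly as follows. This is not the code in nids_training.py: the function names are hypothetical, and `RandomForestClassifier` is an assumed stand-in because the text does not say which classifier the project trains. `X` is expected to be a pandas DataFrame and `y` a sequence of category labels.

```python
import joblib
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler


def build_pipeline(numeric_cols, categorical_cols):
    """Scale numeric columns, one-hot encode categorical ones, then classify."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), list(numeric_cols)),
        # handle_unknown="ignore" keeps prediction from failing on unseen values.
        ("cat", OneHotEncoder(handle_unknown="ignore"), list(categorical_cols)),
    ])
    # Assumption: the project's actual classifier is not named in the text.
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    return Pipeline([("preprocess", preprocess), ("clf", clf)])


def train_and_save(X, y, numeric_cols, categorical_cols,
                   pipeline_path="nids_kdd_pipeline.joblib",
                   encoder_path="nids_label_encoder.joblib"):
    """Fit the pipeline on category labels and persist both artifacts."""
    encoder = LabelEncoder()
    y_ids = encoder.fit_transform(y)  # e.g. 'dos' -> integer class ID
    pipeline = build_pipeline(numeric_cols, categorical_cols)
    pipeline.fit(X, y_ids)
    joblib.dump(pipeline, pipeline_path)
    joblib.dump(encoder, encoder_path)
    return pipeline, encoder
```

Keeping preprocessing inside the pipeline means the saved file can be applied directly to raw feature rows at prediction time, and `encoder.inverse_transform` turns the predicted IDs back into readable labels.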
incoming.csv: live input file for realtime detection.
- 41 feature columns only
- No label column
- Continuously updated by the traffic simulator
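Because a stray label column would break prediction, a quick sanity check of the file can help. The following is a minimal sketch using only the standard library; `find_bad_rows` is a hypothetical helper and is not part of the project. It checks field counts only, so it works whether or not the file has a header row.

```python
import csv

EXPECTED_FIELDS = 41  # feature columns only, no label


def find_bad_rows(path):
    """Return 1-based row numbers of non-empty rows without exactly 41 fields."""
    bad = []
    with open(path, newline="") as handle:
        for row_no, row in enumerate(csv.reader(handle), start=1):
            if row and len(row) != EXPECTED_FIELDS:
                bad.append(row_no)
    return bad
```

An empty result from `find_bad_rows("incoming.csv")` means every row has the expected width; a row with 42 fields usually indicates that a label was left in.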
incoming copy.csv: backup/reference copy of incoming.csv.
traffic_simulator.py: simulates realtime network traffic by appending rows to incoming.csv.
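One way such a simulator might work is sketched below: it replays rows from a KDD-format file, drops the label, and appends one row at a time with a delay. This is an illustrative guess at the behavior, not the contents of traffic_simulator.py. The function name, the `delay_s` and `max_rows` parameters, and the sequential row order are all assumptions.

```python
import csv
import time


def simulate_traffic(source_path, target_path, delay_s=1.0, max_rows=None):
    """Append label-free rows from a KDD-format file to target_path, one at a time."""
    sent = 0
    with open(source_path, newline="") as src:
        for row in csv.reader(src):
            if not row:
                continue  # skip blank lines
            if max_rows is not None and sent >= max_rows:
                break
            features = row[:41]  # drop the trailing label column
            # Reopen in append mode per row so readers see each write promptly.
            with open(target_path, "a", newline="") as dst:
                csv.writer(dst).writerow(features)
            sent += 1
            time.sleep(delay_s)
    return sent
```

Calling `simulate_traffic("kddcup.data_10_percent", "incoming.csv")` would then add roughly one connection per second until the source file is exhausted.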
realtime_predict.py: terminal-based realtime intrusion detection script.
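A detection loop of this kind has to pick up only the rows added since its last pass. The sketch below shows one possible approach, tracking a byte offset into incoming.csv; `read_new_rows` is a hypothetical helper, and the actual script may poll the file differently.

```python
import csv


def read_new_rows(path, offset=0):
    """Return (rows, new_offset) for complete lines appended after byte `offset`."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Only consume up to the last newline so a half-written row is left for later.
    end = chunk.rfind(b"\n") + 1
    text = chunk[:end].decode("utf-8")
    rows = [row for row in csv.reader(text.splitlines()) if row]
    return rows, offset + end
```

In a polling loop, each batch of rows would be passed to the pipeline loaded from nids_kdd_pipeline.joblib, and the predicted class IDs decoded with the encoder from nids_label_encoder.joblib before printing. That wiring is left out here because it depends on the column names used at training time.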
dashboard.py: Streamlit-based GUI dashboard for realtime intrusion detection.
- Python 3.9 or higher recommended
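Install dependencies:
    python -m pip install pandas numpy scikit-learn joblib streamlit
Open a terminal in the project folder and check the Python version:
    cd path\to\your\project\folder
    python --version
Train the model:
    python nids_training.py --data "kddcup.data_10_percent" --use_categories
Reset the live input file from the backup copy:
    copy "incoming copy.csv" "incoming.csv"
Start the traffic simulator:
    python traffic_simulator.py
Run terminal-based detection:
    python realtime_predict.py
Launch the dashboard:
    python -m streamlit run dashboard.py

KDD Dataset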
↓
Model Training
↓
Saved Model (.joblib)
↓
Traffic Simulator → incoming.csv
↓
Realtime Prediction
↓
Streamlit Dashboard
- Always run Streamlit using python -m streamlit run dashboard.py
- Do not add a label column to incoming.csv
- Keep the traffic simulator running during demos
This project demonstrates realtime intrusion detection, attack classification, and live visualization using machine learning and Streamlit.