This is a showcase project demonstrating the development and deployment of an end-to-end machine learning pipeline, focused on cybersecurity and malicious URL detection. It covers the full ML lifecycle — from data ingestion to production-ready deployment.
- **Modular Pipeline** using custom components for:
  - ✅ Data ingestion from MongoDB, validation, and transformation
  - ✅ Model training, tuning (via `GridSearchCV`), and evaluation
  - ✅ Overfitting/underfitting checks and drift detection
  - ✅ MLflow logging for experiment tracking
  - ✅ Batch prediction support for incoming CSVs
  - ✅ Streamlit app for interactive use
  - ✅ Docker support
  - ✅ CI/CD with GitHub Actions
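The MongoDB ingestion step can be sketched as a small helper that flattens raw documents into a DataFrame; the `records_to_frame` helper and the field names are illustrative, not the project's actual code:

```python
import pandas as pd

def records_to_frame(records):
    """Convert raw MongoDB documents to a clean DataFrame (drop Mongo's _id)."""
    df = pd.DataFrame(records)
    return df.drop(columns=["_id"], errors="ignore")

# In the pipeline this would be fed by pymongo, e.g.:
#   from pymongo import MongoClient
#   records = list(MongoClient(uri)[db][coll].find())
docs = [{"_id": 1, "url_length": 54, "label": 1},
        {"_id": 2, "url_length": 23, "label": 0}]
frame = records_to_frame(docs)
```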
- **Preprocessing**
  - Missing value imputation with `KNNImputer`
  - Feature scaling & label normalization
  - YAML schema-driven pipeline logic
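A minimal sketch of the preprocessing above, assuming a scikit-learn `Pipeline` that chains `KNNImputer` with `StandardScaler` (the exact scaler and schema handling in the project may differ):

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

preprocessor = Pipeline([
    ("imputer", KNNImputer(n_neighbors=3)),   # fill NaNs from nearest rows
    ("scaler", StandardScaler()),             # zero mean, unit variance
])

X = np.array([[1.0, 2.0], [np.nan, 4.0], [3.0, 6.0]])
X_clean = preprocessor.fit_transform(X)       # no NaNs remain after this
```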
- **Modeling**
  - Ensemble methods (`RandomForest`, `GradientBoosting`, `AdaBoost`) and `LogisticRegression`
  - Custom evaluation metrics with `f1_score`, precision, recall, and accuracy
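Tuning one of the ensembles with `GridSearchCV` and scoring it with `f1_score` might look like the sketch below; the parameter grid and the synthetic dataset are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# Cross-validated search over a small illustrative grid, optimizing F1
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    scoring="f1",
    cv=3,
)
search.fit(X_tr, y_tr)
test_f1 = f1_score(y_te, search.best_estimator_.predict(X_te))
```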
- **Monitoring**
  - Data drift detection using the Kolmogorov–Smirnov test
  - Drift reports saved in timestamped YAML files
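The drift check can be sketched with SciPy's two-sample KS test; the `detect_drift` helper and the 0.05 threshold are assumptions, and in the pipeline the resulting dicts would be dumped into the timestamped YAML reports:

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(ref: np.ndarray, cur: np.ndarray, alpha: float = 0.05) -> dict:
    """Return KS statistic, p-value, and a drift flag for one feature column."""
    stat, p_value = ks_2samp(ref, cur)
    return {"ks_statistic": float(stat), "p_value": float(p_value),
            "drift": bool(p_value < alpha)}

rng = np.random.default_rng(0)
same = detect_drift(rng.normal(0, 1, 500), rng.normal(0, 1, 500))
shifted = detect_drift(rng.normal(0, 1, 500), rng.normal(3, 1, 500))
```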
- **Deployment**
  - ✅ Final model serialization (including preprocessor)
  - ✅ `batch_prediction.py` for real-world inference
  - ✅ Streamlit app for CSV-based prediction and visualization
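Bundling the fitted preprocessor with the trained model into one serialized artifact — so inference code can reload a single file — might look like this sketch; the `NetworkModel` wrapper and the use of `joblib` are illustrative choices, not necessarily what the project uses:

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

class NetworkModel:
    """Bundle preprocessor + estimator behind a single predict()."""
    def __init__(self, preprocessor, model):
        self.preprocessor = preprocessor
        self.model = model

    def predict(self, X):
        return self.model.predict(self.preprocessor.transform(X))

# Tiny illustrative fit, then round-trip through serialization
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]]
y = [0, 0, 1, 1]
prep = StandardScaler().fit(X)
clf = LogisticRegression().fit(prep.transform(X), y)

path = Path(tempfile.mkdtemp()) / "model.pkl"
joblib.dump(NetworkModel(prep, clf), path)
preds = joblib.load(path).predict(X)
```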
- Python 3.12
- Scikit-learn, Pandas, NumPy
- MLflow for experiment tracking
- Streamlit for UI
- Docker for containerization
- GitHub Actions for CI
- YAML-based configuration
```
.
├── src/
│   ├── components/       # Data & model pipeline steps
│   ├── utils/            # Utility functions
│   ├── entity/           # Config & artifact classes
│   ├── constants/        # Static paths and values
│   ├── pipeline/         # Training & prediction pipelines
│   └── monitoring/       # Drift checking logic
├── artifacts/            # Timestamped pipeline outputs
├── app.py                # Streamlit app
├── batch_prediction.py   # Inference logic
├── main.py               # Training pipeline trigger
└── requirements.txt
```
Install dependencies:

```bash
pip install -r requirements.txt
```

Trigger the training pipeline:

```bash
python main.py
```

Run the Streamlit app for prediction:

```bash
streamlit run app.py
```

Track all training metrics, parameters, and models via MLflow by setting:

```bash
MLFLOW_TRACKING_URI=<your_tracking_uri>
```

Build the Docker image:

```bash
docker build -t network_security_app .
```

Run batch prediction using mounted volumes:

```bash
docker run -v /local/input:/data/in -v /local/output:/data/out network_security_app
```

This project uses GitHub Actions to:
- Install dependencies
- Run unit tests
- Ensure reproducibility of builds
See `.github/workflows/main.yml`.
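A minimal workflow of that shape (illustrative only, not the project's actual `main.yml`) could be:

```yaml
name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: pytest
```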
Built to simulate a production-grade ML system in the security domain. This project reflects real-world challenges like data quality, model drift, and deployment readiness — all handled with modular, testable, and extensible code.
Feel free to explore, fork, or ask questions!
Author: Shahriyar A. | 2025