ML-Based Web Application Firewall (WAF)

This project implements a machine learning-powered Web Application Firewall (WAF) to detect and prevent web-based attacks, including threats from the OWASP Top 10. Built using Python, FastAPI, and Scikit-learn, the WAF combines NLP-based supervised learning with anomaly detection to provide real-time threat classification and mitigation.

Overview

Traditional WAFs rely heavily on static rules and signatures, which makes them less effective against zero-day attacks and evasive payloads. This project supplements rule-based detection with machine learning models trained on both synthetic and real-world attack traffic.

The firewall is integrated into a modular FastAPI backend and can be used as a standalone microservice or within an existing web stack.

Features

| Capability | Description |
| --- | --- |
| ML-Based Threat Detection | Detects malicious payloads using classification models trained on attack patterns. |
| OWASP Coverage | Focuses on high-impact vulnerability classes, including SQL injection (SQLi), cross-site scripting (XSS), server-side request forgery (SSRF), and remote code execution (RCE). |
| Anomaly Detection | Uses unsupervised models to identify outliers and previously unseen attacks. |
| Real-Time Classification | Classifies live traffic and returns an immediate verdict. |
| FastAPI Integration | Easy-to-deploy REST API for modular usage or CI/CD testing. |
| Burp Suite Compatibility | Works with attack traffic simulated or replayed through Burp Suite, which is used for model evaluation. |
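
As a rough illustration of the Anomaly Detection capability listed above, the sketch below fits scikit-learn's `IsolationForest` on a handful of benign payloads and then labels unseen payloads as inliers or outliers. The three features (length, character entropy, share of non-alphanumeric characters), the helper names `char_entropy`, `payload_stats`, and `is_anomalous`, and the sample payloads are all assumptions made for this example; this README does not specify the project's actual features or training data.

```python
import math
from collections import Counter

import numpy as np
from sklearn.ensemble import IsolationForest


def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def payload_stats(payload: str) -> list[float]:
    """Hypothetical numeric features: length, entropy, non-alphanumeric ratio."""
    symbols = sum(1 for ch in payload if not ch.isalnum())
    ratio = symbols / len(payload) if payload else 0.0
    return [float(len(payload)), char_entropy(payload), ratio]


# Toy benign traffic (illustrative only, not taken from the project's dataset).
benign = [
    "id=42&page=2",
    "q=red+shoes&sort=price",
    "user=alice&lang=en",
    "category=books&limit=20",
    "search=weather+today",
    "token=abc123&view=list",
]

# Fit on clean traffic only, so anything unlike it scores as an outlier.
detector = IsolationForest(n_estimators=100, random_state=0)
detector.fit(np.array([payload_stats(p) for p in benign]))


def is_anomalous(payload: str) -> bool:
    """True when the detector labels the payload an outlier (predict() == -1)."""
    return int(detector.predict(np.array([payload_stats(payload)]))[0]) == -1
```

With so few training samples the verdicts here carry no real meaning. In practice the detector would be fit on a large sample of clean traffic, and its threshold tuned against labelled validation data.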

Technology Stack

| Component | Description |
| --- | --- |
| Python | Primary programming language |
| FastAPI | Backend framework for serving the WAF API |
| Scikit-learn | ML models for classification and anomaly detection |
| Regex & NLP | Used for pattern extraction and text feature engineering |
| Burp Suite | Used for payload simulation and traffic replay |

Model Training

Dataset: Mix of real-world payloads, synthetic attack vectors, and clean traffic.
Features:
- Regex patterns
- Token frequency
- Length and entropy measures
Algorithms:
- Supervised: Random Forest, Logistic Regression
- Unsupervised: Isolation Forest, One-Class SVM
Validation: Burp Suite replay to simulate attack traffic and test detection effectiveness.
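
The feature groups and supervised algorithms listed above could be combined roughly as follows. This is a minimal sketch, not the project's training code: the regex signatures, the tokenizer pattern, the helper names (`handcrafted_features`, `build_matrix`, `predict_malicious`), and the six toy payloads are hypothetical. It stacks token counts from `CountVectorizer` with regex-hit, length, and entropy columns, then fits a `RandomForestClassifier`.

```python
import math
import re
from collections import Counter

import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical signatures; the project's real regex rules are not listed here.
SIGNATURES = [
    re.compile(r"(?i)\bunion\b.+\bselect\b"),  # SQL injection
    re.compile(r"(?i)<script\b"),              # cross-site scripting
    re.compile(r"\.\./"),                      # path traversal
]


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def handcrafted_features(payload: str) -> list[float]:
    """Regex hit count, length, and entropy for one payload."""
    hits = sum(1 for sig in SIGNATURES if sig.search(payload))
    return [float(hits), float(len(payload)), shannon_entropy(payload)]


# Token frequency: words, numbers, and single punctuation marks are tokens.
vectorizer = CountVectorizer(token_pattern=r"[A-Za-z_]+|\d+|[^\sA-Za-z_\d]")


def build_matrix(payloads: list[str], fit: bool = False):
    """Stack token counts with the handcrafted columns into one sparse matrix."""
    tokens = vectorizer.fit_transform(payloads) if fit else vectorizer.transform(payloads)
    extra = csr_matrix(np.array([handcrafted_features(p) for p in payloads]))
    return hstack([tokens, extra]).tocsr()


# Toy labelled data (0 = benign, 1 = malicious), for illustration only.
payloads = [
    "id=42&page=2",
    "q=red+shoes&sort=price",
    "user=alice&lang=en",
    "id=1 UNION SELECT password FROM users",
    "<script>alert(1)</script>",
    "file=../../etc/passwd",
]
labels = [0, 0, 0, 1, 1, 1]

classifier = RandomForestClassifier(n_estimators=50, random_state=0)
classifier.fit(build_matrix(payloads, fit=True), labels)


def predict_malicious(payload: str) -> bool:
    """Classify a single payload with the fitted model."""
    return bool(classifier.predict(build_matrix([payload]))[0] == 1)
```

A tree-based model is used here because it tolerates unscaled columns such as raw length. Swapping in Logistic Regression would normally call for scaling the handcrafted columns first.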

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/scan` | Accepts a web request payload and classifies it as benign or malicious. |
| GET | `/status` | Health check endpoint. |
| GET | `/rules` | Returns current regex-based rule set. |

License

This project is licensed under the MIT License.
Please refer to the LICENSE file for more details.

Contact

Abhijit Rai

GitHub: https://github.com/aerostorm19
LinkedIn: https://www.linkedin.com/in/abhijit-rai-163214280/
Email: abhi160407@gmail.com
