🛡️ Phishing Detection Model (FastText)
A lightweight FastText-based model to classify domain names as phishing or clean.
It is trained in FastText's supervised mode with wordNgrams=2, so word bigrams are used as features alongside single tokens.
git clone https://github.com/facebookresearch/fastText.git
cd fastText
mkdir build && cd build
cmake ..
make

pip install fasttext

echo "carreeffoursa.site" | ./fasttext predict phishing_model.bin -

You can also run the model via Docker API:
# Run with Docker Hub image
docker run -p 8080:8080 mstfknn/phishing-fasttext:latest
# Or run with GHCR image
docker run -p 8080:8080 ghcr.io/mstfknn/phishing-fasttext:latest

Example request:
curl -X POST http://localhost:8080/predict \
-H "Content-Type: application/json" \
-d '{"domain": "carreeffoursa.site"}'

Example response:
{
"domain": "carreeffoursa.site",
"label": "__label__phishing",
"probability": 0.9734
}

- Framework: FastText
- Labels: __label__phishing, __label__clean
- Epochs: 10
- Learning rate: 0.5
- wordNgrams: 2
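The settings listed above map directly onto the keyword arguments of `fasttext.train_supervised` in the Python bindings. The following is a minimal sketch of how a comparable model might be trained, not the author's actual training script. The helper names (`to_training_line`, `write_training_file`, `train`) and the `example.com` domain are hypothetical, and any preprocessing beyond stripping whitespace is an assumption left out here.

```python
from pathlib import Path

# Hyperparameters taken from the list above; every other option is left
# at the fasttext default (an assumption, the original settings may differ).
TRAIN_PARAMS = {"epoch": 10, "lr": 0.5, "wordNgrams": 2}


def to_training_line(domain: str, is_phishing: bool) -> str:
    """Format one example for fastText's supervised mode:
    the label prefix first, then the text, all on a single line."""
    label = "__label__phishing" if is_phishing else "__label__clean"
    return f"{label} {domain.strip()}"


def write_training_file(rows, path) -> int:
    """Write (domain, is_phishing) pairs to `path` and return the row count."""
    lines = [to_training_line(domain, flag) for domain, flag in rows]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def train(train_path: str, model_path: str = "phishing_model.bin"):
    """Train and save a model. Requires `pip install fasttext`."""
    import fasttext  # imported lazily so the helpers above work without it

    model = fasttext.train_supervised(input=train_path, **TRAIN_PARAMS)
    model.save_model(model_path)
    return model
```

A typical run would call `write_training_file` on the labeled rows and then pass the resulting path to `train`.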
The model was trained on mstfknn/phishing-domain-list-2m-plus, a dataset with 2,000,000 domain names labeled as either phishing or clean.
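For scripted use, the Docker API request and response shown earlier can also be driven from Python with only the standard library. This is a rough sketch that assumes the container is listening on `localhost:8080` and that responses carry the three fields from the example response. The function names (`build_request`, `parse_prediction`, `predict`) are hypothetical.

```python
import json
import urllib.request

# Endpoint from the curl example; assumes the container runs locally.
API_URL = "http://localhost:8080/predict"
LABEL_PREFIX = "__label__"


def build_request(domain: str, url: str = API_URL) -> urllib.request.Request:
    """Build the same JSON POST request as the curl example."""
    body = json.dumps({"domain": domain}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(payload: str):
    """Turn a response body into (domain, label, probability).

    Assumes the fields shown in the example response are present.
    """
    data = json.loads(payload)
    label = data["label"]
    if label.startswith(LABEL_PREFIX):
        label = label[len(LABEL_PREFIX):]  # "__label__phishing" -> "phishing"
    return data["domain"], label, float(data["probability"])


def predict(domain: str, timeout: float = 5.0):
    """Send one domain to the running API and return the parsed prediction."""
    with urllib.request.urlopen(build_request(domain), timeout=timeout) as resp:
        return parse_prediction(resp.read().decode("utf-8"))
```

Keeping request building and response parsing separate from the network call makes both halves easy to check without a running container.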
This project is licensed under the MIT License.