An intelligent system that automatically categorizes SMS messages using Machine Learning and Natural Language Processing (NLP).
The project goes beyond traditional spam filtering by organizing messages into multiple meaningful categories, helping users manage their inbox efficiently.
This project was developed as part of an academic seminar in the Machine Learning / NLP domain.
1️⃣ Multi-class SMS categorization
2️⃣ Categories: Personal, Transactions, Promotions, Star, Spam
3️⃣ Automatic classification using ML algorithms
4️⃣ Reduced inbox clutter and improved message visibility
5️⃣ Highlights important messages like OTPs and bank alerts
6️⃣ Protects users from spam and phishing SMS
7️⃣ User-centric and scalable design
1️⃣ Python – Core programming language
2️⃣ Machine Learning – Model training and prediction
3️⃣ Natural Language Processing (NLP) – Text analysis
4️⃣ Scikit-learn – ML algorithms and evaluation
5️⃣ Pandas & NumPy – Data processing
6️⃣ Frontend – (Optional UI module)
7️⃣ Backend – Model integration and logic

sms-categorization/
├─ sms-frontend/
│  └─ (UI components)
├─ sms-backend/
│  ├─ data/
│  ├─ preprocessing/
│  ├─ models/
│  ├─ train_model.py
│  └─ predict.py
├─ assets/
│  └─ screenshot.png
├─ .gitignore
└─ README.md
1️⃣ Clone or download the repository
2️⃣ Navigate to the backend folder
3️⃣ Install dependencies
pip install -r requirements.txt

- ML Algorithms: Naive Bayes, Logistic Regression, SVM, Decision Tree
- Feature Extraction: Bag-of-Words, TF-IDF
- SMS Categories: Personal, Transactions, Promotions, Star, Spam
- Model Settings: Training parameters and thresholds
- Dataset: Size and language support
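The feature extraction options listed above can be illustrated with a minimal scikit-learn sketch. It is not the project's actual `train_model.py`: the toy messages, their labels, and the variable names are invented for illustration, and "Star" is assumed here to mean important messages a user would want to keep. The sketch builds Bag-of-Words and TF-IDF matrices from the same texts, then trains a Naive Bayes classifier on the TF-IDF features.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy messages and labels (assumptions, not the project's dataset).
messages = [
    "Are we still meeting for lunch tomorrow?",
    "Call me when you get home",
    "Your account was debited by 500.00 on card ending 1234",
    "Your OTP for the transaction is 482913",
    "Flat 50% off on all shoes this weekend only",
    "Exclusive sale: buy one get one free today",
    "Reminder: your flight ticket and boarding pass are attached",
    "Important: your exam hall ticket is ready to download",
    "Congratulations! You won a free prize, click this link now",
    "Urgent: claim your lottery reward, send your bank details",
]
labels = [
    "Personal", "Personal",
    "Transactions", "Transactions",
    "Promotions", "Promotions",
    "Star", "Star",
    "Spam", "Spam",
]

# Bag-of-Words: each column holds the raw count of one vocabulary term.
bow_matrix = CountVectorizer().fit_transform(messages)

# TF-IDF: the same vocabulary, but counts are reweighted so that terms
# appearing in many messages contribute less.
tfidf_matrix = TfidfVectorizer().fit_transform(messages)

print("Bag-of-Words shape:", bow_matrix.shape)
print("TF-IDF shape:", tfidf_matrix.shape)

# A pipeline keeps vectorization and classification together, so new
# messages pass through the same fitted vocabulary at prediction time.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(messages, labels)

prediction = model.predict(["Your OTP is 123456 for the payment"])[0]
print("Predicted category:", prediction)
```

With only ten training messages the prediction is not meaningful; the point is the shape of the workflow, which stays the same when a real labeled dataset is substituted.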
1️⃣ Demonstrate the practical application of Machine Learning and NLP in real-world SMS management
2️⃣ Move beyond traditional binary spam filtering to multi-class SMS categorization
3️⃣ Improve user experience by organizing messages into meaningful categories
4️⃣ Reduce inbox clutter and prevent missing important messages such as OTPs and bank alerts
5️⃣ Build an academic and portfolio-ready Machine Learning project
- Naive Bayes – Efficient and fast for text-based classification
- Logistic Regression – Probabilistic classification using One-vs-Rest for multi-class SMS categorization
- Decision Tree – Rule-based classification for interpretable results
- Support Vector Machine (SVM) – High-accuracy classification for high-dimensional text data
Models are evaluated using standard metrics such as accuracy, precision, recall, and F1-score.
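As a rough illustration of how such a comparison might look, the sketch below trains the four algorithms on a small invented dataset and reports accuracy plus macro-averaged precision, recall, and F1-score. The sample texts, the 75/25 split, the linear SVM kernel, and all variable names are assumptions made for this example; they do not describe the project's actual data or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy dataset: four invented messages per category.
samples = {
    "Personal": ["see you at dinner tonight", "happy birthday my friend",
                 "call me when you reach home", "are you coming to the party"],
    "Transactions": ["your account was debited by 500", "otp for your payment is 4821",
                     "your account was credited with 1200", "otp for login is 7754"],
    "Promotions": ["flat 50 percent off sale today", "buy one get one free offer",
                   "exclusive discount offer this weekend", "big sale on shoes today"],
    "Star": ["your flight ticket is confirmed", "exam hall ticket is ready",
             "your appointment is confirmed for monday", "interview is scheduled for friday"],
    "Spam": ["you won a free prize click now", "claim your lottery reward now",
             "win cash now click this link", "free prize waiting claim now"],
}
texts = [text for group in samples.values() for text in group]
labels = [label for label, group in samples.items() for _ in group]

# Stratified split so every category appears in both train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

classifiers = {
    "Naive Bayes": MultinomialNB(),
    # One-vs-Rest: one binary logistic model per category.
    "Logistic Regression": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(kernel="linear"),
}

results = {}
for name, clf in classifiers.items():
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Macro averaging weights every category equally, regardless of size.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

for name, metrics in results.items():
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```

Scores on a dataset this small say nothing about which model is better. On real data, cross-validation would give more stable estimates than a single split.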
- Automatically organizes SMS into Personal, Transactions, Promotions, Star, and Spam
- Improves productivity by reducing manual message sorting
- Enhances security by