A machine learning–based web application built with Streamlit that classifies SMS or Email messages as Spam or Ham (Not Spam).
This project demonstrates text preprocessing, feature extraction using TF-IDF, and classification with Multinomial Naive Bayes.
- ✅ Classifies SMS/Email into Spam or Ham in real time
- 🎨 Clean and simple Streamlit web interface
- 🔤 Text preprocessing with NLTK (stopwords removal, stemming, tokenization)
- 📊 Feature extraction using TF-IDF Vectorizer
- 🤖 Machine learning model trained with Multinomial Naive Bayes
- 🚀 Deployment-ready with Streamlit Community Cloud
Data Preprocessing
- Lowercasing text
- Removing special characters, punctuation, and stopwords
- Stemming words
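The steps above can be sketched as a single helper. This is only an illustration, not the project's actual code: the name `transform_text`, the regex tokenizer, and the tiny `STOPWORDS` set are assumptions made so the example runs without downloading NLTK data (NLTK's own stopword list needs `nltk.download("stopwords")` first). `PorterStemmer` is one of NLTK's stemmers and is picked here as an assumption as well.

```python
import re

from nltk.stem import PorterStemmer

# Small stand-in stopword set (assumption); swap in NLTK's English
# stopword list once the corpus has been downloaded.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "for", "of", "and", "you", "we"}

stemmer = PorterStemmer()


def transform_text(text: str) -> str:
    """Lowercase, keep alphanumeric tokens, drop stopwords, then stem."""
    # Lowercasing plus a regex split removes punctuation and special characters.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Drop stopwords, then reduce each remaining word to its stem.
    stems = [stemmer.stem(tok) for tok in tokens if tok not in STOPWORDS]
    return " ".join(stems)


print(transform_text("Hey, are we still meeting for lunch today?"))
```

The helper returns a single space-joined string so its output can be passed straight to a vectorizer.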
Feature Engineering
- Transform text into numerical vectors using TF-IDF
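As a rough sketch of this step, scikit-learn's `TfidfVectorizer` turns a list of cleaned messages into a sparse matrix with one row per message and one column per vocabulary term. The three sample messages are made up for illustration, and default vectorizer settings are assumed because the project's parameters are not stated.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up, already-preprocessed messages (illustrative only).
corpus = [
    "win free prize claim now",
    "free gift card click claim",
    "are we still meeting for lunch",
]

vectorizer = TfidfVectorizer()        # default settings (assumption)
X = vectorizer.fit_transform(corpus)  # sparse matrix: messages x terms

print(X.shape)
print(sorted(vectorizer.vocabulary_))  # learned vocabulary terms
```

Terms that appear in many messages (such as "free" here) receive a lower weight than terms that appear in only one.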
Model Training
- Trained a MultinomialNB classifier
- Evaluated with accuracy, precision, and recall
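A minimal sketch of training and evaluation might look like the following. The eight messages and their labels are invented toy data, and the split ratio and `random_state` are assumptions; the real project trains on a full SMS dataset, so the numbers printed here say nothing about its actual performance.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Toy data (assumption): 1 = spam, 0 = ham.
texts = [
    "win a free prize claim now",
    "free gift card click here to claim",
    "urgent you have won cash call now",
    "claim your free vacation today",
    "are we still meeting for lunch today",
    "see you at the office tomorrow",
    "can you send me the report",
    "thanks for the help yesterday",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# Fit the vectorizer on training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

model = MultinomialNB()
model.fit(X_train_vec, y_train)
y_pred = model.predict(X_test_vec)

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

For spam filtering, precision on the spam class is often watched closely, since a false positive means a legitimate message gets flagged.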
Model Persistence
- Saved trained model and vectorizer as `.pkl` files using pickle
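Saving and reloading the two objects could be sketched as below. The file names `model.pkl` and `vectorizer.pkl` match the project structure; the tiny model fitted here and the temporary output directory are assumptions used only to keep the example self-contained.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Fit a tiny stand-in model so there is something to save (toy data).
texts = ["free prize claim now", "lunch at the office today"]
labels = [1, 0]  # 1 = spam, 0 = ham
vectorizer = TfidfVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)

# The project keeps these files at the repository root; a temporary
# directory is used here to avoid overwriting real artifacts.
out_dir = Path(tempfile.mkdtemp())
with open(out_dir / "vectorizer.pkl", "wb") as f:
    pickle.dump(vectorizer, f)
with open(out_dir / "model.pkl", "wb") as f:
    pickle.dump(model, f)

# Load both objects back, as the app would at startup.
with open(out_dir / "vectorizer.pkl", "rb") as f:
    loaded_vectorizer = pickle.load(f)
with open(out_dir / "model.pkl", "rb") as f:
    loaded_model = pickle.load(f)

prediction = loaded_model.predict(loaded_vectorizer.transform(["claim your free prize"]))
print(prediction)
```

Both files are needed at inference time, because the model only understands vectors produced by the same fitted vectorizer. Pickle files should only be loaded from trusted sources, ideally with the same scikit-learn version that created them.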
Deployment
- Interactive UI built with Streamlit
- Deployed on Streamlit Community Cloud
📂 Project Structure
sms-spam-classifier/
│
├── app.py # Streamlit application
├── model.pkl # Trained MultinomialNB model
├── vectorizer.pkl # TF-IDF vectorizer
├── requirements.txt # Dependencies
├── README.md # Project documentation
└── data/ # (Optional) Dataset files
Clone the repository:
git clone https://github.com/<your-username>/sms-spam-classifier.git
cd sms-spam-classifier
Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # Mac/Linux
venv\Scripts\activate # Windows
Install the dependencies:
pip install -r requirements.txt
Run the app:
streamlit run app.py
Input:
Congratulations! You have won a $1000 Walmart gift card. Click here to claim.
Output:
SPAM 🚨
Input:
Hey, are we still meeting for lunch today?
Output:
NOT SPAM ✅
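The path from an input message to one of the labels above can be sketched as a small function. The name `classify`, the label encoding (1 = spam), and the toy training data are assumptions; in the app, the vectorizer and model would instead be loaded from `vectorizer.pkl` and `model.pkl`, and the message would be preprocessed the same way as the training text.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy stand-ins for the pickled vectorizer and model (assumption).
train_texts = [
    "win free prize claim now",
    "free gift card click to claim",
    "are we still meeting for lunch",
    "see you at the office tomorrow",
]
train_labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham
vectorizer = TfidfVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(train_texts), train_labels)


def classify(message: str) -> str:
    """Vectorize one message and map the predicted class to a display label."""
    features = vectorizer.transform([message.lower()])
    is_spam = model.predict(features)[0] == 1
    return "SPAM 🚨" if is_spam else "NOT SPAM ✅"


print(classify("Claim your free prize"))
print(classify("Lunch at the office?"))
```

In a Streamlit app, a function like this would typically be called with the text from an input widget and its result shown on the page.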
Python 3.8+
Streamlit – UI & Deployment
Scikit-learn – Machine Learning Model
NLTK – Text Preprocessing
Pickle – Model persistence
Make sure your requirements.txt includes:
streamlit
scikit-learn
nltk
pandas
numpy
Deployed on Streamlit Community Cloud
Push code & model files (app.py, model.pkl, vectorizer.pkl, requirements.txt) to GitHub
Go to Streamlit Cloud → New App
Select repo & branch → Deploy
This project is licensed under the MIT License.