SMS Spam Collection

This is a text corpus of over 5,500 English SMS messages, with about 13% labeled as spam. The text file contains one message per line, with two columns: the label ("ham" or "spam") and the raw text of the message. Messages labeled "ham" are legitimate, non-spam messages.
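As a rough illustration of the two-column layout described above, the following sketch splits each line into a label and the message text. The tab delimiter, the `parse_line` helper, and the sample lines are assumptions made for this example; adjust the separator to match the actual file.

```python
from collections import Counter


def parse_line(line, sep="\t"):
    """Split one raw line into (label, text).

    `sep` is an assumed delimiter; the README only says the file has two columns.
    Splitting once keeps any later separators inside the message text.
    """
    label, text = line.rstrip("\n").split(sep, 1)
    if label not in ("ham", "spam"):
        raise ValueError(f"unexpected label: {label!r}")
    return label, text


# Made-up sample lines for illustration only (not taken from the corpus).
sample = [
    "ham\tAre we still meeting at noon?",
    "spam\tYou have won a prize! Reply now to claim.",
    "ham\tRunning late,\tsee you soon",
]

rows = [parse_line(line) for line in sample]
counts = Counter(label for label, _ in rows)
spam_share = counts["spam"] / len(rows)
```

On the real file, `spam_share` gives a quick check of the class imbalance before any modeling.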

Background: You work for a telecom company that is launching a new messaging app. The company's previous spam filters are out of date and no longer effective. The company has asked whether you can use the new data it supplied to accurately distinguish spam from regular messages. It is essential that regular messages are rarely, if ever, categorized as spam.
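One way to make the "rarely, if ever" requirement measurable is to track the false positive rate on ham (the share of legitimate messages flagged as spam) alongside precision on the spam class. The sketch below is a minimal, library-free illustration; the function names and the toy labels are hypothetical and not taken from this project.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Return (tp, fp, tn, fn), treating `positive` as the spam class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # a legitimate message flagged as spam
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def false_positive_rate(y_true, y_pred, positive="spam"):
    """Fraction of ham messages that were wrongly flagged as spam."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred, positive)
    return fp / (fp + tn) if (fp + tn) else 0.0


def precision(y_true, y_pred, positive="spam"):
    """Fraction of messages flagged as spam that really are spam."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labels for illustration only.
y_true = ["ham", "ham", "ham", "ham", "spam", "spam"]
y_pred = ["ham", "ham", "ham", "spam", "spam", "ham"]

fpr = false_positive_rate(y_true, y_pred)
prec = precision(y_true, y_pred)
```

Because only about 13% of the messages are spam, overall accuracy can look high even when many ham messages are misclassified, so these two numbers are more informative for this requirement.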

Objective: Build a Streamlit web app to detect spam accurately.
Techniques Used: Exploratory Data Analysis, Data Visualization, Predictive Modeling, Web Frameworks, RESTful APIs, Containerization.
Type of Problem: Binary Classification
Language, Libraries, Technologies Used: Python, pandas, Matplotlib, seaborn, NumPy, WordCloud, string, NLTK, scikit-learn, pickle, Docker, Flask, Streamlit.

Source of the dataset: This corpus was created by Tiago A. Almeida and José María Gómez Hidalgo.

Citations:

Almeida, T.A., Gómez Hidalgo, J.M., Yamakami, A. Contributions to the Study of SMS Spam Filtering: New Collection and Results. Proceedings of the 2011 ACM Symposium on Document Engineering (DOCENG'11), Mountain View, CA, USA, 2011.
Gómez Hidalgo, J.M., Almeida, T.A., Yamakami, A. On the Validity of a New SMS Spam Collection. Proceedings of the 11th IEEE International Conference on Machine Learning and Applications (ICMLA'12), Boca Raton, FL, USA, 2012.
Almeida, T.A., Gómez Hidalgo, J.M., Silva, T.P. Towards SMS Spam Filtering: Results under a New Dataset. International Journal of Information Security Science (IJISS), 2(1), 1-18, 2013.


grascya/Sms-Spam-Detection

Folders and files

Latest commit

History

Repository files navigation

SMS Spam Collection

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages