GitHub - arjunravi26/Spam-Detection: A Spam Detection Project using gensim(Word2Vec)

End-to-End Spam Detection Using Word2Vec and SVC

This project presents an end-to-end solution for message spam detection. Messages are vectorized by averaging the word vectors from Google's pretrained Word2Vec model, and the resulting features are classified with a Support Vector Classifier (SVC). The model is designed to achieve high accuracy in identifying spam messages. The key components of the project include:

Data Preprocessing: Comprehensive text cleaning, tokenization, and handling of missing values to ensure data quality.
Feature Extraction: Each message is represented by the average of the pretrained Word2Vec vectors of its tokens.
Model Training: An SVC trained on a labeled dataset with a defined train-test split ratio, with extensive hyperparameter tuning to optimize performance.
Evaluation: Assessment of the model's performance using accuracy, precision, recall, and F1-score, along with cross-validation to check that the model is not overfitting.
Accuracy Improvement: The initial model achieved an accuracy of 70%. Integrating the pretrained Word2Vec model and tuning the algorithm and its hyperparameters improved the accuracy by 10%.
Deployment: Seamless integration with a Flask web application and deployment on AWS Elastic Beanstalk, ensuring scalable and reliable access to the model.
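
The preprocessing and feature-extraction steps above can be illustrated with a minimal sketch. It is not the project's code: `TOY_VECTORS`, `tokenize`, and `average_vector` are hypothetical names, and the tiny 3-dimensional table stands in for the pretrained Google News vectors, which with gensim would typically be a `KeyedVectors` object supporting the same `word in kv` and `kv[word]` lookups.

```python
import re

import numpy as np

# Hypothetical toy embedding table standing in for pretrained Word2Vec vectors.
TOY_VECTORS = {
    "free": np.array([1.0, 0.0, 0.0]),
    "prize": np.array([0.0, 1.0, 0.0]),
    "meeting": np.array([0.0, 0.0, 1.0]),
}
DIM = 3


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens; a missing value yields no tokens."""
    return re.findall(r"[a-z]+", (text or "").lower())


def average_vector(text, vectors=TOY_VECTORS, dim=DIM):
    """Return the mean vector of in-vocabulary tokens, or zeros if none are known."""
    known = [vectors[t] for t in tokenize(text) if t in vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)
```

Out-of-vocabulary tokens are simply skipped here, and a message with no known tokens maps to the zero vector; other fallbacks are possible and the project may handle this case differently.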

All these processes are encapsulated within a scikit-learn pipeline, which streamlines processing and aids reproducibility.
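
As a rough sketch of how such a pipeline could be assembled, the example below wraps the averaging step in a custom transformer and tunes an SVC with a grid search. Everything in it is an assumption made for illustration: `MeanEmbeddingVectorizer`, the 2-dimensional toy vectors, the eight toy messages, and the parameter grid are hypothetical and not taken from the repository.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


class MeanEmbeddingVectorizer(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: maps each text to the mean of its known word vectors."""

    def __init__(self, vectors, dim):
        self.vectors = vectors
        self.dim = dim

    def fit(self, X, y=None):
        return self  # nothing to learn; the embeddings are pretrained

    def transform(self, X):
        rows = []
        for text in X:
            known = [self.vectors[t] for t in str(text).lower().split() if t in self.vectors]
            rows.append(np.mean(known, axis=0) if known else np.zeros(self.dim))
        return np.vstack(rows)


# Toy 2-D embeddings (hypothetical) in place of the pretrained model.
vectors = {
    "free": np.array([1.0, 0.0]), "prize": np.array([0.9, 0.1]), "win": np.array([0.8, 0.2]),
    "meeting": np.array([0.0, 1.0]), "team": np.array([0.1, 0.9]), "project": np.array([0.2, 0.8]),
}

texts = [
    "free prize", "win free prize", "free win", "prize win now",
    "team meeting", "project meeting today", "meeting team project", "project team",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = ham

pipeline = Pipeline([
    ("embed", MeanEmbeddingVectorizer(vectors, dim=2)),
    ("svc", SVC()),
])

# Small illustrative grid; a real search would cover more values and use more folds.
param_grid = {"svc__C": [1, 10], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(texts, labels)

predictions = search.predict(["free prize win", "team meeting project"])
```

Keeping the vectorizer inside the pipeline means the same transformation is applied during cross-validation, final training, and prediction, so a fitted `search` object can be serialized and served by the web application as a single unit.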

Repository contents (37 commits):
.github/workflow
Data
config
setup_nltk
src/spam_detection
templates
train_test_model
.gitignore
LICENSE
README.md
app.py
main.py
params.yaml
requirements.txt
setup.py
template.py
