Email Spam Detection

A machine learning project for detecting spam emails using LSTM neural networks with TensorFlow and Keras.

Overview

This project implements a binary classification model to distinguish between spam and legitimate emails. The model uses natural language processing techniques and a Long Short-Term Memory (LSTM) architecture to achieve high accuracy in spam detection.

Features

Data preprocessing with stopword removal and punctuation cleaning
Dataset balancing to prevent bias toward the majority class
LSTM-based deep learning model
Comprehensive evaluation metrics including confusion matrix and classification report
Model persistence for future predictions
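
The preprocessing step listed above (punctuation cleaning and stopword removal) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `clean_text` and the tiny `STOPWORDS` set are hypothetical stand-ins, and the project itself presumably draws its stopwords from NLTK, whose English list is much larger.

```python
import string

# Hypothetical, deliberately tiny stopword set used only for illustration.
# The project lists NLTK as a requirement; its English list is far larger.
STOPWORDS = {"a", "an", "and", "the", "is", "to", "of", "in", "for", "you", "this"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Lowercase one email body, strip punctuation, and drop stopwords."""
    no_punct = text.lower().translate(_PUNCT_TABLE)
    tokens = [tok for tok in no_punct.split() if tok not in STOPWORDS]
    return " ".join(tokens)


cleaned = clean_text("Congratulations!!! You have WON a prize; click here to claim.")
print(cleaned)
```

Swapping the hard-coded set for a larger stopword list changes only the `STOPWORDS` definition; the cleaning logic stays the same.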

Dataset

The project uses spam_ham_dataset.csv, which contains labeled email samples. The dataset is balanced before training so that spam and non-spam emails are equally represented.
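
One common way to balance a dataset is to downsample the larger class to the size of the smaller one. The README does not say which balancing method the notebook uses, so treat the sketch below as an assumption; the function name `balance_by_downsampling`, the column names `text` and `label`, and the toy rows are all hypothetical.

```python
import pandas as pd


def balance_by_downsampling(df, label_col="label", seed=42):
    """Return a shuffled copy of df with every class cut to the smallest class size."""
    smallest = df[label_col].value_counts().min()
    # Draw the same number of rows from each class.
    balanced = df.groupby(label_col).sample(n=smallest, random_state=seed)
    # Shuffle so the classes are not stored in contiguous blocks.
    return balanced.sample(frac=1, random_state=seed).reset_index(drop=True)


# Toy data standing in for the real CSV (assumed column names).
toy = pd.DataFrame(
    {
        "text": [f"ham message {i}" for i in range(6)] + ["spam offer 0", "spam offer 1"],
        "label": ["ham"] * 6 + ["spam"] * 2,
    }
)
balanced = balance_by_downsampling(toy)
print(balanced["label"].value_counts())
```

Downsampling discards data from the larger class; oversampling the smaller class is the usual alternative when the dataset is small.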

Model Architecture

Embedding Layer: 32-dimensional word embeddings
LSTM Layer: 16 units for sequence processing
Dense Layer: 32 units with ReLU activation
Output Layer: Single unit with sigmoid activation for binary classification
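
The four layers above map naturally onto a Keras `Sequential` model. The following is a sketch under stated assumptions rather than the notebook's exact definition: the vocabulary size, the padded sequence length, the optimizer, and the loss are not given in this README and are chosen here only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 10_000  # assumed; not stated in the README
MAX_LEN = 100        # assumed padded sequence length; not stated in the README


def build_model(vocab_size=VOCAB_SIZE, max_len=MAX_LEN):
    """Embedding(32) -> LSTM(16) -> Dense(32, relu) -> Dense(1, sigmoid)."""
    model = keras.Sequential(
        [
            layers.Input(shape=(max_len,)),
            layers.Embedding(input_dim=vocab_size, output_dim=32),
            layers.LSTM(16),
            layers.Dense(32, activation="relu"),
            layers.Dense(1, activation="sigmoid"),
        ]
    )
    # Optimizer and loss are assumptions; binary cross-entropy is the usual
    # pairing for a single sigmoid output.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


model = build_model()
model.summary()
```

With these assumed sizes almost all parameters sit in the embedding table, so the vocabulary size is the main lever on model size.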

Results

Test Accuracy: 96.33%
Test Loss: 0.1537

On the test set, the model separates spam from legitimate email reliably, and the classification report shows high precision and recall for both classes.
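
For readers unfamiliar with the confusion matrix and classification report mentioned in this README, here is a small sketch of how they are typically produced from sigmoid outputs with scikit-learn. The labels and probabilities are made-up toy values, not this project's results, and the 0.5 threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Toy ground truth (0 = ham, 1 = spam) and toy sigmoid outputs.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_prob = np.array([0.1, 0.6, 0.8, 0.4, 0.9, 0.2])

# Turn probabilities into hard labels with an assumed 0.5 threshold.
y_pred = (y_prob >= 0.5).astype(int)

# Rows are true classes, columns are predicted classes:
# [[true ham, ham flagged as spam], [spam missed, true spam]].
cm = confusion_matrix(y_true, y_pred)
report = classification_report(y_true, y_pred, target_names=["ham", "spam"])

print(cm)
print(report)
```

Raising the threshold trades missed spam for fewer legitimate emails flagged as spam, which is often the preferable error in a mail filter.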

Requirements

Python 3.12+
TensorFlow 2.16.2
NLTK
WordCloud
scikit-learn
pandas
numpy
matplotlib
seaborn

Usage

Open notebook/notebook.ipynb in Jupyter and run it to:

Load and preprocess the dataset
Train the LSTM model
Evaluate performance metrics
Save the trained model

The trained model is saved in models/spam_detection_model.keras and can be loaded for future predictions.
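
Saving and reloading a model in the `.keras` format can be sketched as a round trip. The tiny model below is a hypothetical stand-in so the example runs without the trained file; in the project you would pass models/spam_detection_model.keras to `load_model` instead of the temporary path.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical miniature model, used only to demonstrate the round trip.
tiny = keras.Sequential(
    [
        layers.Input(shape=(5,)),
        layers.Embedding(input_dim=20, output_dim=4),
        layers.LSTM(2),
        layers.Dense(1, activation="sigmoid"),
    ]
)

# Two already tokenized and padded sequences (toy integer ids).
batch = np.array([[1, 2, 3, 0, 0], [4, 5, 6, 7, 8]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "spam_detection_model.keras")
    tiny.save(path)                            # write the .keras archive
    reloaded = keras.models.load_model(path)   # read it back

before = tiny.predict(batch, verbose=0)
after = reloaded.predict(batch, verbose=0)
print(np.allclose(before, after, atol=1e-6))
```

New emails must go through the same cleaning and tokenization as the training data before they reach the model. This README does not describe how the tokenizer is stored, so that step is left out here.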

Project Structure

Email-Spam-Detection-ML/
├── data/
│   └── spam_ham_dataset.csv
├── models/
│   └── spam_detection_model.keras
├── notebook/
│   └── notebook.ipynb
└── README.md


FTokarek/Email-Spam-Detection-ML
