Email Classification using NLP

This project focuses on classifying emails into spam and non-spam categories using Natural Language Processing (NLP) techniques. We'll preprocess the text data, visualize the label distribution, perform feature engineering, and train machine learning models for classification.

Dataset

The dataset used for this project is available on Kaggle under the title "Email Classification - NLP".

Columns

Message Body: Contains the email content.
Label: Indicates whether the email is spam or non-spam.

Tasks

Task 1: Text Cleaning and Preprocessing

Load the training and testing datasets.
Check for missing values and drop any rows that contain them.
Convert all text to lowercase.
Remove stop words.
Remove punctuation.
Perform stemming or lemmatization.
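
The cleaning steps above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the stop-word list is a tiny hand-picked sample, `crude_stem` is a toy suffix stripper standing in for a real stemmer (for example NLTK's `PorterStemmer`), and the helper names and the file name in the usage comment are hypothetical. Punctuation is stripped before stop words are filtered so that tokens such as "the," still match the list.

```python
import csv
import string

# Tiny illustrative stop-word list (an assumption); a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "this", "to", "you", "your"}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def drop_missing(rows, columns=("Message Body", "Label")):
    """Keep only rows where every required column is present and non-empty."""
    return [row for row in rows if all((row.get(col) or "").strip() for col in columns)]


def crude_stem(token):
    """Strip one common suffix, leaving at least three characters (toy stand-in for a stemmer)."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean_text(text):
    """Lowercase, remove punctuation, drop stop words, then stem each remaining token."""
    text = text.lower().translate(PUNCT_TABLE)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(crude_stem(tok) for tok in tokens)


# Usage with a hypothetical file name:
# rows = drop_missing(load_rows("train.csv"))
# cleaned = [clean_text(row["Message Body"]) for row in rows]
```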

Task 2: Data Visualization

Visualize the distribution of the labels in the training dataset using a histogram, bar chart, or pie chart.
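
One way to produce the bar chart variant is sketched below. It assumes matplotlib is installed and that the labels are available as a plain list of strings. The function names, the output file name, and the toy label values are illustrative only.

```python
from collections import Counter


def label_counts(labels):
    """Count how many times each label occurs."""
    return Counter(labels)


def plot_label_distribution(counts, output_path="label_distribution.png"):
    """Draw a bar chart of label counts and save it to output_path."""
    # Imported lazily so the counting helper works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, safe on headless machines
    import matplotlib.pyplot as plt

    labels = sorted(counts)
    fig, ax = plt.subplots()
    ax.bar(labels, [counts[label] for label in labels])
    ax.set_xlabel("Label")
    ax.set_ylabel("Number of emails")
    ax.set_title("Label distribution (training set)")
    fig.savefig(output_path, bbox_inches="tight")
    plt.close(fig)


# Toy labels standing in for the training set's Label column.
example_counts = label_counts(["spam", "non-spam", "non-spam", "non-spam", "spam"])
# plot_label_distribution(example_counts)
```

A bar chart makes class imbalance easy to spot, which matters later when reading precision and recall alongside accuracy.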

Task 3: Feature Engineering

Apply text representation techniques:
- Bag of words
- TF-IDF
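
To make the two representations concrete, here is a small pure-Python sketch. It uses the textbook weighting tf * log(N / df), which is an assumption about the variant intended. Library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalisation by default, so their numbers will differ. The function names and toy documents are illustrative.

```python
import math
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bag_of_words(documents, vocabulary):
    """One row per document, one column per vocabulary term; cells hold raw counts."""
    rows = []
    for doc in documents:
        counts = Counter(doc.split())
        rows.append([counts.get(term, 0) for term in vocabulary])
    return rows


def tf_idf(documents, vocabulary):
    """Scale each raw count by log(N / df), where df is the term's document frequency."""
    counts = bag_of_words(documents, vocabulary)
    n_docs = len(documents)
    doc_freq = [sum(1 for row in counts if row[col] > 0) for col in range(len(vocabulary))]
    # Terms that appear in no document get weight 0 to avoid dividing by zero.
    idf = [math.log(n_docs / df) if df else 0.0 for df in doc_freq]
    return [[tf * weight for tf, weight in zip(row, idf)] for row in counts]


# Toy, already-cleaned documents.
docs = ["free prize now", "meeting now", "free free offer"]
vocab = build_vocabulary(docs)
bow = bag_of_words(docs, vocab)
weighted = tf_idf(docs, vocab)
```

Bag of words keeps raw counts, so frequent words dominate. TF-IDF scales those counts down for terms that appear in many documents.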

Task 4: Model Training

Train an SVM model on the Bag of Words features.
Train an SVM model on the TF-IDF features.
Train a Random Forest model on the Bag of Words features.
Train a Random Forest model on the TF-IDF features.
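
The four combinations can be sketched with scikit-learn pipelines, assuming scikit-learn is installed. The linear kernel, the number of trees, the dictionary keys, and the toy training data are all assumptions made for illustration; the README does not specify hyperparameters.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def build_models(random_state=0):
    """Return one unfitted pipeline per (representation, classifier) pair."""
    return {
        "svm_bow": make_pipeline(CountVectorizer(), SVC(kernel="linear")),
        "svm_tfidf": make_pipeline(TfidfVectorizer(), SVC(kernel="linear")),
        "rf_bow": make_pipeline(
            CountVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=random_state),
        ),
        "rf_tfidf": make_pipeline(
            TfidfVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=random_state),
        ),
    }


def train_all(train_texts, train_labels, random_state=0):
    """Fit every pipeline on the same training data and return them by name."""
    models = build_models(random_state)
    for model in models.values():
        model.fit(train_texts, train_labels)
    return models


# Toy training data standing in for the cleaned Message Body and Label columns.
train_texts = [
    "free prize claim now",
    "win free cash offer",
    "meeting agenda attached",
    "lunch tomorrow at noon",
]
train_labels = ["spam", "spam", "non-spam", "non-spam"]

fitted = train_all(train_texts, train_labels)
predictions = {name: model.predict(["free cash prize"]) for name, model in fitted.items()}
```

Wrapping the vectorizer and classifier in one pipeline ensures the vocabulary is learned from the training set only and then reused unchanged on the testing set.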

Evaluation Metrics

Evaluate the performance of the trained models on the testing dataset using the following metrics:

Accuracy
Precision
Recall
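
For reference, the three metrics can be computed directly from the confusion counts. The sketch below treats "spam" as the positive class, which is an assumption. The function names and toy predictions are illustrative. In practice, `sklearn.metrics` provides `accuracy_score`, `precision_score`, and `recall_score`.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Return (tp, fp, fn, tn) with respect to the chosen positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, and recall; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy ground truth and predictions.
y_true = ["spam", "spam", "non-spam", "non-spam", "spam"]
y_pred = ["spam", "non-spam", "non-spam", "spam", "spam"]
scores = evaluate(y_true, y_pred)
```

Precision is the fraction of emails flagged as spam that really are spam. Recall is the fraction of actual spam that the model catches.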
