Comparative Study of Classification Algorithms for Spam Email Detection.
Compared Naïve Bayes, Support Vector Machine (SVM), Decision Tree, and Linear Logistic Regression models. Improved performance through data preprocessing (lemmatization and cosine normalization of log-transformed TF-IDF feature vectors). Built a system achieving 99% accuracy on the SMS Spam Collection dataset using an SVM classifier with a linear kernel.
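A minimal sketch of the feature weighting step described above, using only the Python standard library. The exact formulas are assumptions, since the entry does not give them: term frequency is log-scaled as 1 + log(tf), inverse document frequency is smoothed as log((1 + N) / (1 + df)) + 1, and each vector is divided by its L2 norm (cosine normalization). Lemmatization is omitted, whitespace tokenization stands in for real preprocessing, and the names `tfidf_vectors` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse vector (dict of term -> weight) per document.

    Weights are log-scaled TF times smoothed IDF, then L2-normalized.
    The weighting formulas are illustrative assumptions, not the
    project's confirmed configuration.
    """
    # Placeholder tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {}
        for term, count in counts.items():
            tf = 1.0 + math.log(count)  # log-transformed term frequency
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0  # smoothed IDF
            weights[term] = tf * idf
        # Cosine (L2) normalization so every non-empty vector has unit length.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {t: w / norm for t, w in weights.items()}
        vectors.append(weights)
    return vectors


def cosine_similarity(u, v):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * v.get(term, 0.0) for term, w in u.items())
```

Because every vector has unit length, message length no longer dominates the feature values, which tends to suit a linear-kernel classifier on short texts such as SMS messages.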