Basic text classification techniques are applied to the 20 Newsgroups data set (available at http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html) using multiple classifiers. Precision, recall, F1, and accuracy are calculated for each model.
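As a rough illustration of the four reported metrics, the sketch below computes accuracy together with macro-averaged precision, recall, and F1 from lists of true and predicted labels. The function name `macro_metrics` and the choice of macro averaging are assumptions made for this example; the project does not state which averaging scheme it used.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging (an assumption here) gives every class equal weight.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Calling `macro_metrics(true_labels, predicted_labels)` on the predictions of any one classifier yields one row of a results table.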
-
Features - Five feature representations were tried, including bag-of-words (BoW) and TF-IDF.
-
Classifiers - Three classifiers were tried, including Naive Bayes and SVM.
-
10-fold cross-validation is used for evaluation.
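The evaluation loop can be sketched as follows: a minimal, standard-library-only example that runs k-fold cross-validation with a multinomial Naive Bayes classifier over bag-of-words counts. It is not the project's actual code. All names (`tokenize`, `train_nb`, `predict_nb`, `k_fold_indices`, `cross_validate`), the whitespace tokenizer, the Laplace smoothing value, and the unstratified shuffled folds are assumptions made for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def train_nb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on bag-of-words counts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_docs": class_docs, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha, "n_docs": len(docs)}


def predict_nb(model, doc):
    """Return the class with the highest log posterior for one document."""
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_class in model["class_docs"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n_class / model["n_docs"])  # log prior
        for tok in tokenize(doc):
            if tok in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[tok] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds (requires n >= k)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(docs, labels, k=10, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(docs), k, seed):
        test = set(held_out)
        train_docs = [d for i, d in enumerate(docs) if i not in test]
        train_labels = [y for i, y in enumerate(labels) if i not in test]
        model = train_nb(train_docs, train_labels)
        correct = sum(predict_nb(model, docs[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies
```

Averaging the list returned by `cross_validate(docs, labels)` gives a single cross-validated accuracy. Swapping in a different feature representation or classifier only changes `train_nb` and `predict_nb`, while the fold logic stays the same.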