TextClassification

Basic text classification techniques applied with multiple classifiers to the 20 Newsgroups data set, available at http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html. Precision, recall, F1, and accuracy are calculated for each of the models listed below.
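As a rough illustration of the four reported metrics, here is a minimal sketch that computes them from true and predicted labels. The function name `classification_metrics` is hypothetical, and macro-averaging over classes is an assumption: this README does not say how per-class scores were combined.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro-averaging (unweighted mean over classes) is an assumption;
    the averaging scheme used in this project is not documented.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels for illustration only, not results from this project.
    print(classification_metrics(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

With 20 classes, macro-averaging gives every newsgroup equal weight regardless of size; a weighted or micro average would be an equally reasonable choice.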

  1. Features - Five feature types were tried, including bag-of-words (BoW) and TF-IDF.

  2. Classifiers - Three classifiers were tried, including Naive Bayes and SVM.

  3. Evaluation - 10-fold cross-validation is used.
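
The pieces listed above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the code in this repository: all function names are hypothetical, tokenization is a simple lowercase whitespace split, the IDF formula is one common smoothed variant, the Naive Bayes model is multinomial with add-one smoothing over BoW counts, and the folds are contiguous and unshuffled.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Count terms after lowercasing and splitting on whitespace (assumed tokenizer)."""
    return Counter(text.lower().split())


def tfidf(docs):
    """Weight BoW counts by a smoothed IDF: log((1 + N) / (1 + df)) + 1."""
    bows = [bag_of_words(d) for d in docs]
    df = Counter(term for bow in bows for term in bow)  # document frequency
    n = len(docs)
    return [
        {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in bow.items()}
        for bow in bows
    ]


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds.

    No shuffling or stratification is done here; real data would
    normally be shuffled first.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def train_nb(docs, labels):
    """Fit multinomial Naive Bayes on BoW counts."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        bow = bag_of_words(doc)
        term_counts[label].update(bow)
        vocab.update(bow)
    return class_docs, term_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    bow = bag_of_words(doc)
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        total_terms = sum(term_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for term, count in bow.items():
            if term in vocab:  # terms unseen in training are ignored
                likelihood = (term_counts[label][term] + 1) / (total_terms + len(vocab))
                score += count * math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(docs, labels, k=10):
    """Return accuracy pooled over all k held-out folds."""
    correct = 0
    for train, test in k_fold_indices(len(docs), k):
        model = train_nb([docs[i] for i in train], [labels[i] for i in train])
        correct += sum(predict_nb(model, docs[i]) == labels[i] for i in test)
    return correct / len(docs)


if __name__ == "__main__":
    # Toy corpus for illustration only; it is not the 20 Newsgroups data.
    docs = ["ball game", "cpu disk"] * 5
    labels = ["sport", "tech"] * 5
    print(tfidf(docs[:2]))
    print(cross_validate(docs, labels, k=10))
```

The classifier here consumes raw counts; swapping in the TF-IDF weights, or an SVM in place of Naive Bayes, changes only the feature and model functions while the fold loop stays the same.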