Implementing different text classification models, starting with logistic regression and ending with BERT-based language models.
The documents used are news articles taken from Lebanon Files (www.lebanonfiles.com).
These documents are classified into 5 categories:
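- أخبار محلية (local news)
- أخبار اقليمية ودولية (regional and international news)
- أخبار اقتصادية (economic news)
- أخبار رياضة (sports news)
- أخبار فنية (arts and entertainment news)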
The goal is to report the accuracy of each model at the end and to examine how the state of the art (SOTA) in text classification has progressed over time.
Models implemented (so far):
- Logistic Regression + TF-IDF features
- Simple Neural Network model from fastText
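
As a rough illustration of the first baseline, here is a minimal sketch of TF-IDF features feeding a logistic regression classifier, assuming scikit-learn is installed. The toy texts, the label names, and the `build_baseline` helper are hypothetical placeholders and are not taken from this repository; the real data would be the Lebanon Files articles and their five categories.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the news articles and their categories.
train_texts = [
    "the team won the football match last night",
    "the striker scored two goals in the final",
    "the central bank raised interest rates again",
    "stock markets fell as inflation increased",
]
train_labels = ["sports", "sports", "economy", "economy"]

test_texts = [
    "the goalkeeper saved a penalty in the match",
    "inflation pushed the bank to change rates",
]
test_labels = ["sports", "economy"]


def build_baseline():
    """Return a pipeline: TF-IDF vectorizer followed by logistic regression."""
    return make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))


model = build_baseline()
model.fit(train_texts, train_labels)       # learns the vocabulary and the classifier
predicted = model.predict(test_texts)      # one predicted label per test document
accuracy = accuracy_score(test_labels, predicted)
print(f"accuracy: {accuracy:.2f}")
```

Wrapping both steps in one pipeline keeps the TF-IDF vocabulary fitted on the training split only, so the reported accuracy is not inflated by information from the test documents.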