Studying the changes in the dominant political views on the subreddit r/canada. The work is still in progress. Honours research project - Concordia University - BA Statistics

khaledfouda/A-probabilistic-model-for-text-categorization

A probabilistic model for text categorization

Classifying political content on Reddit

Abstract

Classifying social media content has interested researchers over the last decade. This paper proposes a probabilistic representation of topic-related keywords on social media. We aim to estimate the conditional likelihood of a class given a short text such as a tweet. We use Reddit as a case study, with a focus on identifying political content. We report the model's performance and compare it with machine learning methods. Our model achieves a precision of 97% and takes only a few seconds to fit on over 500,000 data points.
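To make the idea of scoring a class given a short text concrete, here is a minimal sketch. It is not the model from the report: it assumes a plain naive Bayes formulation with Laplace smoothing, and the function names (`fit`, `predict_proba`), the `alpha` parameter, and the toy data are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def fit(docs, labels, alpha=1.0):
    """Estimate class priors and smoothed per-class word probabilities.

    Assumption: a naive Bayes stand-in, not the report's actual model.
    """
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d.lower().split()}
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d.lower().split())
    priors = {c: labels.count(c) / len(labels) for c in classes}
    likelihoods = {}
    for c in classes:
        # Laplace smoothing: add alpha to every word count in the vocabulary.
        total = sum(counts[c].values()) + alpha * len(vocab)
        likelihoods[c] = {w: (counts[c][w] + alpha) / total for w in vocab}
    return priors, likelihoods


def predict_proba(text, priors, likelihoods):
    """Return P(class | text) via Bayes' rule; unseen words are ignored."""
    log_scores = {}
    for c, prior in priors.items():
        score = math.log(prior)
        for w in text.lower().split():
            if w in likelihoods[c]:
                score += math.log(likelihoods[c][w])
        log_scores[c] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp_scores.values())
    return {c: v / z for c, v in exp_scores.items()}


# Toy data (hypothetical), standing in for labelled Reddit comments.
docs = [
    "election vote parliament",
    "tax policy election",
    "hockey game tonight",
    "nice weather today",
]
labels = ["political", "political", "other", "other"]

priors, likelihoods = fit(docs, labels)
probs = predict_proba("election vote", priors, likelihoods)
print(probs)
```

On this toy data the text "election vote" receives a higher probability for the "political" class than for "other", because both words occur only in the political examples.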


For full report: A probabilistic model for text categorization

For the code structure and how to use the library, see: code structure
