We implemented topic classification for news articles that runs in real time. A deep learning model, trained on news articles, assigns each article to one of 42 categories. We then apply the trained model to real-time Tweets from various authorized Twitter news handles to predict the topics being discussed at any given time. Users can also view the top 'N' most popular Twitter topics at any given time, along with the related Tweets.
View the report here.
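The top 'N' view described above amounts to counting predicted topics and keeping the Tweets that belong to each. The following is only a minimal sketch of that idea, not the project's actual code: the function name `top_n_topics`, the `(tweet_text, topic_label)` input shape, and the sample labels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def top_n_topics(classified_tweets, n):
    """Return the n most frequent topics with their counts and Tweets.

    classified_tweets: iterable of (tweet_text, topic_label) pairs, where
    topic_label is whatever the classifier predicted (hypothetical shape).
    """
    counts = Counter()
    tweets_by_topic = defaultdict(list)
    for text, topic in classified_tweets:
        counts[topic] += 1
        tweets_by_topic[topic].append(text)
    # most_common(n) orders topics by descending count.
    return [
        (topic, count, tweets_by_topic[topic])
        for topic, count in counts.most_common(n)
    ]


# Hypothetical sample data; the labels are illustrative only.
sample = [
    ("Election results are in", "POLITICS"),
    ("Team wins final", "SPORTS"),
    ("New bill passes", "POLITICS"),
]
```

Calling `top_n_topics(sample, 1)` on this sample returns the single most frequent topic together with its count and the two Tweets assigned to it.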
We are using multiple datasets for this project:
- HuffPost Dataset: https://www.kaggle.com/datasets/rmisra/news-category-dataset
- RealNews Dataset: https://paperswithcode.com/dataset/realnews
- News Aggregator Dataset: https://www.kaggle.com/datasets/uciml/news-aggregator-dataset
- A Million News Headlines: https://www.kaggle.com/datasets/therohk/million-headlines
- All the News 2.0: https://components.one/datasets/all-the-news-2-news-articles-dataset/
- India News Headlines Dataset: https://www.kaggle.com/datasets/therohk/india-headlines-news-dataset
- Create a .env file in the root directory with the following fields for Tweepy user authentication:
bearer_token=YOUR_BEARER_TOKEN
consumer_key=YOUR_CONSUMER_KEY
consumer_secret=YOUR_CONSUMER_SECRET
access_token=YOUR_ACCESS_TOKEN
access_token_secret=YOUR_ACCESS_TOKEN_SECRET
- Install required libraries:
pip install -r requirements.txt
- In the deep_learning_clustering/twitter_dash directory, run the server:
flask --app main.py run
- Open a web browser and visit the following URL:
http://127.0.0.1:5000/api/docs
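Before starting the server, it can help to confirm that the .env file from the first step contains all five fields. The sketch below is a standalone check using only the standard library; it is not part of the project, and the helper names `parse_env` and `missing_keys` are hypothetical. It assumes simple KEY=VALUE lines without quoting, as in the template above.

```python
REQUIRED_KEYS = (
    "bearer_token",
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text):
    """Return the required keys that are absent or have an empty value."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys(open(".env").read())` returns an empty list when every field is filled in, and otherwise lists the fields that still need a value.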
Advith Chegu
Vipul Gharde
Diksha Wuthoo