aifenaike/Topic-Modelling-Using-LDA-and-NMF


Topic-Modelling-Using-LDA-and-NMF

Topic Modelling and Recommendation System for News Articles using Non-Negative Matrix Factorization (NMF) and Latent Dirichlet Allocation (LDA).

The project also includes an article recommendation engine built on TF-IDF: given a keyword, the engine ranks the documents in the pool by cosine similarity and suggests the top matches.

Topic modeling is a type of statistical modeling for discovering the abstract “topics” that occur in a collection of documents.

Latent Dirichlet Allocation (LDA)

LDA is an example of a topic model and is used to assign the text in a document to one or more topics. It builds a topics-per-document model and a words-per-topic model, both modeled as Dirichlet distributions.
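
The two models described above can be illustrated with a small sketch of LDA's generative story. This is not the code used in this repository, only an illustration written with NumPy: the vocabulary, the number of topics, and the hyperparameters `alpha` and `beta` are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary and sizes, chosen only for illustration.
vocab = ["election", "vote", "party", "match", "goal", "team"]
n_topics, n_docs, doc_len = 2, 3, 8

# Words-per-topic model: one distribution over the vocabulary per topic,
# each drawn from a Dirichlet prior (beta is an assumed hyperparameter).
beta = np.full(len(vocab), 0.5)
topic_word = rng.dirichlet(beta, size=n_topics)   # shape (n_topics, vocab_size)

# Topics-per-document model: one distribution over topics per document,
# each drawn from a Dirichlet prior (alpha is an assumed hyperparameter).
alpha = np.full(n_topics, 0.5)
doc_topic = rng.dirichlet(alpha, size=n_docs)     # shape (n_docs, n_topics)


def generate_document(topic_mix, length):
    """Sample a document: pick a topic for each word, then a word from that topic."""
    words = []
    for _ in range(length):
        topic = rng.choice(n_topics, p=topic_mix)
        word_id = rng.choice(len(vocab), p=topic_word[topic])
        words.append(vocab[word_id])
    return words


documents = [generate_document(doc_topic[d], doc_len) for d in range(n_docs)]
for mix, doc in zip(doc_topic, documents):
    print(np.round(mix, 2), " ".join(doc))
```

Fitting LDA runs this story in reverse: given only the documents, it estimates the `doc_topic` and `topic_word` distributions that most plausibly produced them.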

Non-Negative Matrix Factorization (NMF)

NMF is an unsupervised technique, so the model is not trained on labeled topics. It decomposes (or factorizes) high-dimensional vectors into a lower-dimensional representation. These lower-dimensional vectors are non-negative, which also means their coefficients are non-negative.
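
As a rough sketch of the factorization described above, the snippet below approximates a non-negative document-term matrix `V` as the product `W @ H` using multiplicative updates. It is a minimal NumPy illustration, not the solver used by Scikit-learn or by this repository; the toy matrix, the term list, and the function name `nmf` are made up for the example.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-10, seed=0):
    """Factorize non-negative V (docs x terms) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_docs, n_terms = V.shape
    W = rng.random((n_docs, n_components)) + eps    # document-topic weights
    H = rng.random((n_components, n_terms)) + eps   # topic-term weights
    for _ in range(n_iter):
        # Multiplicative updates: every factor is non-negative, so W and H
        # never leave the non-negative orthant.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy document-term count matrix (rows: documents, columns: terms).
terms = ["election", "vote", "party", "match", "goal", "team"]
V = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 1, 3, 2, 2],
], dtype=float)

W, H = nmf(V, n_components=2)
for k, row in enumerate(H):
    top_terms = [terms[i] for i in np.argsort(row)[::-1][:3]]
    print(f"topic {k}: {top_terms}")
print("reconstruction error:", np.linalg.norm(V - W @ H))
```

Each row of `H` can be read as a topic (a weighting over terms), and each row of `W` gives the weight of those topics in one document.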

Approach

  • Topic Modelling Using LDA.
  • Topic Modelling Using NMF.
  • Cosine Similarity as a means for recommending articles.
  • Given a keyword, the document recommender suggests the best-matching documents from the pool.
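
The keyword-based recommendation step can be sketched as follows. This is a from-scratch illustration using only the Python standard library, not the repository's implementation: the whitespace tokenizer, the smoothed IDF formula, the helper names, and the sample articles are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_tfidf(docs):
    """Return the IDF table and one sparse TF-IDF vector (dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so terms that occur in every document keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(keyword, docs, top_n=2):
    """Rank docs by cosine similarity to the keyword query; drop zero scores."""
    idf, vectors = build_tfidf(docs)
    query = {
        t: tf * idf[t]
        for t, tf in Counter(tokenize(keyword)).items()
        if t in idf  # terms unseen in the pool carry no information
    }
    scores = [(cosine(query, v), i) for i, v in enumerate(vectors)]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [(docs[i], round(score, 3)) for score, i in scores[:top_n] if score > 0]


# Hypothetical article pool.
articles = [
    "the party won the election after a close vote",
    "the team scored a late goal to win the match",
    "voters queued for hours to vote in the election",
]
print(recommend("election vote", articles))
```

In practice the same pipeline is usually built from a library vectorizer and a vectorized cosine similarity, which scale better than dictionaries on a large pool of articles.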

Frameworks

  • Gensim
  • NLTK
  • Scikit-learn
  • NumPy
