Topic modeling using Word2Vec and LDA algorithm with Python

lprtk/nlp-topic-modeling

Topic modeling using word2vec & LDA

Content

The goal is to extract information and value from large volumes of text using Natural Language Processing (NLP). The notebook first uses the word2vec algorithm to represent words as vectors and to study the similarities between the words of several documents. It then combines word2vec with Latent Dirichlet Allocation (LDA), an unsupervised learning algorithm, to perform topic modeling: grouping the documents by topic and listing the keywords of each document.
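To make the topic-modeling step concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling, written in pure Python so it runs without extra packages. It is a stand-in for illustration only, not the notebook's code: the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions, and the word2vec step is not covered.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration; names and defaults are assumptions.
    Returns (per-document topic distributions, top words per topic).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens assigned to topic k
    assignments = []                                     # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        topics = []
        for w in doc:
            k = rng.randrange(num_topics)
            topics.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(topics)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed topic proportions for each document.
    doc_dists = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Up to three most frequent words per topic (zero counts dropped).
    top_words = [
        [w for w, c in counts.most_common(3) if c > 0] for counts in topic_word
    ]
    return doc_dists, top_words


# Toy corpus (made up for this example, not taken from papers.csv).
docs = [
    ["neural", "network", "training", "gradient"],
    ["gradient", "descent", "neural", "training"],
    ["market", "price", "stock", "trading"],
    ["stock", "market", "trading", "price"],
]
doc_dists, top_words = fit_lda(docs, num_topics=2)
for dist, doc in zip(doc_dists, docs):
    print([round(p, 2) for p in dist], doc)
print(top_words)
```

Each document gets a probability distribution over topics, and each topic is summarized by its most frequent words, which matches the "group by topic, list keywords" goal described above. On real data, a library implementation of LDA would normally be preferable to this hand-rolled sampler.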

Requirements

  • Python version 3.9.7

File details

  • nlp-topic-modeling
    • The Jupyter notebook (.ipynb file) that contains the code.
  • data
    • The folder that contains the data (papers.csv).

Here is the project structure:

- project
    > nlp-topic-modeling
        - nlp-topic-modeling.ipynb
        > data 
            - papers.csv

Features

  • My profile
  • My GitHub
  • Original Kaggle dataset