Project is hosted live on Heroku
This project applies machine learning models for Natural Language Processing (NLP) to a collection of French books. Visualization is done with Plotly Dash: hovering over a data point shows the book's properties (metadata), its similarity score, a horizontal bar chart, and the book imprint. The books are processed to extract tokenized and lemmatized features, principal component analysis (PCA) reduces the dimensionality, and K-means clustering is used to visualize the relationships among books.
- Import and preprocess all 148 French books
- Stemming & Lemmatization of extracted tokens
- Visualize the most frequent words on hover, returned as an ordered bar plot
- TF-IDF Model
- Document similarity using the cosine distance between book contents
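
The TF-IDF and document similarity steps can be illustrated with a minimal sketch in plain Python. This is not the project's actual code: the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`), the smoothed IDF formula, and the three sample sentences are assumptions made for illustration only. Cosine distance is simply `1 - cosine_similarity`.

```python
import math
from collections import Counter

def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, keep alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]

def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict of term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors

def cosine_similarity(a, b):
    # Cosine of the angle between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up sample texts standing in for book contents.
books = [
    "le chat dort sur le canapé",
    "le chat mange sur le canapé",
    "la révolution industrielle transforme la ville",
]
vectors = tfidf_vectors(books)
sim_close = cosine_similarity(vectors[0], vectors[1])  # texts sharing most words
sim_far = cosine_similarity(vectors[0], vectors[2])    # texts sharing no words
print(sim_close, sim_far)
```

Texts that share vocabulary score closer to 1, and texts with no words in common score 0. On a real corpus, a library vectorizer with French stop-word removal would likely be a better fit than this hand-rolled version.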
Topic Models
- Principal component analysis
- K-Means clustering
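
As a rough sketch of how PCA and K-means can be combined, the example below projects a feature matrix to two dimensions and then clusters the projected points. It uses plain NumPy to stay dependency-light; the helper names (`pca_project`, `kmeans`), the synthetic data, and the choice to cluster after projecting are assumptions for illustration, not the project's implementation.

```python
import numpy as np

def pca_project(X, n_components=2):
    # Center the data, then project onto the top principal axes found via SVD.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

def kmeans(points, k, n_iter=100, seed=0):
    # Basic Lloyd's algorithm: assign points to the nearest centroid, then update.
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids

# Synthetic stand-in for book feature vectors: two well-separated groups.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=0.1, size=(5, 6))
group_b = rng.normal(loc=3.0, scale=0.1, size=(5, 6))
features = np.vstack([group_a, group_b])

coords = pca_project(features, n_components=2)  # 2-D points suitable for a scatter plot
labels, centroids = kmeans(coords, k=2)
print(coords.shape, labels)
```

The two-dimensional `coords` are what a scatter plot would display, with `labels` used to color the points. In practice, scikit-learn's `PCA` and `KMeans` classes would be the usual choice over hand-written versions.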
git clone https://github.com/kennedyCzar/NLP-PROJECT-BOOK-INSIGHTS-WITH-PLOTLY
Open the script folder in your terminal and run the following command:
python mplot_script.py
Then navigate to http://127.0.0.1:8050/ in your browser.