- IMDb 50k movie reviews dataset released by Stanford
- Reviews of The Shawshank Redemption scraped from IMDb using Selenium
A complete EDA of the above datasets can be found here.
Check out the 3-minute summary of the movie for a better understanding of the results.
Two kinds of analysis were performed on the reviews of the movie, The Shawshank Redemption:
- Traditional supervised ML models with BOW and TFIDF vector representations
  - Naive Bayes
  - Decision Trees
  - Random Forest
- Deep Learning models like LSTM and RNN
- LDA with BOW and TFIDF vector representations
- NMF with BOW and TFIDF vector representations
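
To make the BOW and TFIDF representations and the Naive Bayes classifier listed above more concrete, here is a minimal pure-Python sketch. It is not the code from the notebooks: the toy reviews, the labels, and all function names (`build_vocab`, `bow_vector`, `idf_weights`, `tfidf_vector`, `train_naive_bayes`, `predict`) are assumptions made for illustration, and the TF-IDF formula is one common smoothed variant.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()

def build_vocab(docs):
    """Sorted list of every distinct token in the corpus."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})

def bow_vector(text, vocab):
    """Bag-of-words: raw count of each vocabulary term in the text."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocab]

def idf_weights(docs, vocab):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n_docs = len(docs)
    doc_tokens = [set(tokenize(doc)) for doc in docs]
    weights = []
    for term in vocab:
        df = sum(term in toks for toks in doc_tokens)
        weights.append(math.log((1 + n_docs) / (1 + df)) + 1)
    return weights

def tfidf_vector(text, vocab, idf):
    """TF-IDF: term counts rescaled so that rarer terms weigh more."""
    return [tf * w for tf, w in zip(bow_vector(text, vocab), idf)]

def train_naive_bayes(vectors, labels):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    log_prior, log_likelihood = {}, {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        log_prior[c] = math.log(len(rows) / len(vectors))
        totals = [sum(col) + 1 for col in zip(*rows)]
        denom = sum(totals)
        log_likelihood[c] = [math.log(t / denom) for t in totals]
    return log_prior, log_likelihood

def predict(model, vector):
    """Return the class with the highest log-posterior score."""
    log_prior, log_likelihood = model
    scores = {
        c: log_prior[c] + sum(x * lp for x, lp in zip(vector, log_likelihood[c]))
        for c in log_prior
    }
    return max(scores, key=scores.get)

# Hypothetical toy reviews, not taken from either dataset.
train_docs = [
    "a great and moving film",
    "brilliant acting and a great story",
    "boring plot and terrible acting",
    "a dull and terrible film",
]
train_labels = ["pos", "pos", "neg", "neg"]

vocab = build_vocab(train_docs)
idf = idf_weights(train_docs, vocab)
model = train_naive_bayes([bow_vector(d, vocab) for d in train_docs], train_labels)

print(predict(model, bow_vector("great story", vocab)))
```

The classifier here is trained on BOW counts; passing `tfidf_vector` outputs to `train_naive_bayes` instead gives the TFIDF variant, since the smoothing also works with fractional weights.
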
The code for each analysis can be found in Jupyter notebooks inside the respective folders.
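
For the NMF topic-modelling step mentioned above, the idea can be sketched as factorising a document-term matrix `V` into non-negative document-topic weights `W` and topic-term weights `H`. The snippet below uses the standard multiplicative update rules in pure Python. It is an illustrative sketch rather than the notebooks' implementation: the tiny matrix, the four terms, and the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`, `top_terms`) are all assumptions, and LDA is not shown.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorise V (docs x terms) into W (docs x topics) and H (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h

def top_terms(h, terms, n=2):
    """For each topic, list the n terms with the largest weights."""
    return [[terms[j] for j in sorted(range(len(terms)), key=lambda j: -row[j])[:n]]
            for row in h]

# Hypothetical BOW counts: 4 documents x 4 terms.
terms = ["hope", "friendship", "prison", "escape"]
V = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
]

W, H = nmf(V, n_topics=2)
for topic_id, words in enumerate(top_terms(H, terms)):
    print(topic_id, words)
```

The same routine accepts a TFIDF-weighted matrix in place of raw counts, since the updates only require non-negative entries.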