Skip to content

Sentimentally analyze product reviews to predict opinion honesty.

License

Notifications You must be signed in to change notification settings

rdadrl/reviewlytics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Reviewlytics

Predicts accuracy of Amazon reviews by sentiment analysis.

Strategy

  • Select sample data
    • Picked software related categories for the sake of context
    • Pareto's Principle: the law of the vital few requires proper selection of vital group.
      • Selected as a sample generalization technique.
      • Dataset offers review votes; higher vote count with high helpfulness rate can be considered more "truthful".
    • Verified Purchases
      • An Amazon verified purchase tends to be more credible.
  • Form vocabulary
    • Preprocces
      • remove punctuations
      • tokenize
      • remove stop words
      • stem
    • Select most frequent N-words
    • Fill Bag of Words (BOW)
  • Model data
    • Logistic Regression
    • Multinomial/Bernoulli Naive Bayesian
    • Support Vector Classifier
  • Run statistical analysis' on data
    • Use model to find and filter outliers (in respect to stdev of actual score to predicted score)
    • Use model to predict overall score
  • Compare findings

Datasets