Sentiment Classification

This project builds a sentiment classifier for restaurant reviews. The data is in reviews.csv.

It takes two approaches: a Bag of Words model in scikit-learn, and ULMFiT in fastai.

Bag of Words Approach:

Tokenization: split each sentence into words.
Lemmatization: reduce each word to its base form, for example went and goes both become go. Examples here.
Remove stop words: drop common words such as the and a, which carry little sentiment and can add noise to the model.
TF-IDF transform: count how often each term occurs in a review (term frequency), then multiply that count by the inverse document frequency, so terms that appear in many reviews carry less weight. Details here.
Build the model using an SGD classifier.
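
The steps above can be sketched as a single scikit-learn pipeline. This is a minimal illustration under stated assumptions, not the notebook's actual code: the toy reviews and labels are made up for the example (they are not taken from reviews.csv), the classifier uses default settings, and lemmatization is left out because it needs an extra library; it could be plugged in through the vectorizer's tokenizer argument.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for reviews.csv.
texts = [
    "The pasta was delicious and the staff were friendly",
    "Great food, lovely atmosphere, will come back",
    "Amazing dessert and excellent service",
    "The soup was cold and the waiter was rude",
    "Terrible service and bland, overpriced food",
    "Awful experience, the chicken was undercooked",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

model = Pipeline([
    # Tokenizes, lowercases, drops English stop words, and applies TF-IDF weighting.
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    # Linear classifier trained with stochastic gradient descent.
    ("clf", SGDClassifier(random_state=0)),
])

model.fit(texts, labels)
preds = model.predict(["the food was delicious", "rude waiter and cold soup"])
print(list(preds))
```

Keeping the vectorizer and the classifier in one Pipeline means the same preprocessing is applied at training and prediction time.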

Fastai Text Approach:

The fastai.text module provides a language model pretrained on the WikiText-103 dataset. All you need to do is fine-tune the pretrained model on your dataset and make predictions.

ULMFiT achieves good results by relying on techniques like:

Discriminative fine-tuning (layer-specific learning rates)
Slanted triangular learning rates (the learning rate first increases linearly, then decays linearly over the rest of training)
Gradual unfreezing (gradually unfreeze layers, starting from the last)
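
The three techniques above can be sketched in plain Python. This is an illustration of the schedules as described in the ULMFiT paper, not fastai's implementation: the function names are hypothetical, and the defaults (cut_frac=0.1, ratio=32, lr_max=0.01, a per-layer factor of 2.6) are the values suggested in the paper rather than settings taken from this project.

```python
import math

def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at iteration t (0-based) out of total_steps.

    Assumes total_steps * cut_frac >= 1 so the warm-up phase is not empty.
    """
    cut = math.floor(total_steps * cut_frac)  # iteration where the rate peaks
    if t < cut:
        p = t / cut  # short linear warm-up
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # long linear decay
    return lr_max * (1 + p * (ratio - 1)) / ratio

def discriminative_lrs(lr_last, n_layers, factor=2.6):
    """One rate per layer, first layer to last; each lower layer gets the rate above it divided by factor."""
    return [lr_last / factor ** (n_layers - 1 - i) for i in range(n_layers)]

def unfrozen_layers(epoch, n_layers):
    """Indices of trainable layers at a 0-based epoch: the last layer first, then one more per epoch."""
    first = max(n_layers - 1 - epoch, 0)
    return list(range(first, n_layers))

schedule = [slanted_triangular_lr(t, total_steps=100) for t in range(100)]
print(schedule[0], schedule[10], schedule[99])
print(discriminative_lrs(0.01, n_layers=4))
print([unfrozen_layers(e, n_layers=4) for e in range(4)])
```

With 100 iterations the rate climbs for the first 10 and decays over the remaining 90, which is what gives the schedule its slanted shape.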

Runtime

This is deep learning for NLP, so the hardware matters. Colab provides both GPUs (graphics processing units) and TPUs (tensor processing units). If you have an NVIDIA GPU, NVIDIA CUDA will help speed up training.
