COT6930 Natural Language Processing, Spring 2019

Assignment 1

Text processing with OpenNLP.

Assignment report here.

Detect sentences with SentenceDetectorME
Extract tokens with SimpleTokenizer, WhitespaceTokenizer, and TokenizerME, and compare their performance
Detect parts-of-speech with POSTaggerME
Find named entities with the generic entity finder NameFinderME, initialized with persons (en-ner-person.bin), locations (en-ner-location.bin), money/currencies (en-ner-money.bin), and percentages (en-ner-percentage.bin).
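
To give a feel for why the tokenizers in the list above can disagree, here is a minimal Python sketch. It is not OpenNLP code (OpenNLP is a Java library) and does not reproduce its exact rules. It assumes a whitespace tokenizer splits only on whitespace, and a character-class tokenizer starts a new token whenever the input switches between letters, digits, and other symbols. The function names and the sample sentence are hypothetical.

```python
import re

# Assumption: a run of letters, a run of digits, or one single symbol.
# This approximates character-class tokenization; it is not OpenNLP's
# actual implementation.
_CHAR_CLASS = re.compile(r"[^\W\d_]+|\d+|[^\w\s]|_")


def whitespace_tokenize(text):
    """Split on runs of whitespace only; punctuation stays attached."""
    return text.split()


def char_class_tokenize(text):
    """Emit letter runs, digit runs, and individual symbols as tokens."""
    return _CHAR_CLASS.findall(text)


sentence = "Mr. Smith paid $4.50, didn't he?"
print(whitespace_tokenize(sentence))
print(char_class_tokenize(sentence))
```

The whitespace version keeps "$4.50," as one token, while the character-class version breaks it into the symbol, the digits, and the punctuation. A trained tokenizer such as TokenizerME is meant to sit between those two extremes.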

Document classification using Weka.

Assignment report here (the README file).

Create ARFF train and test files from a plain-text file (already tokenized and stemmed)
Use Weka's StringToWordVector to create word vectors, wrapped in a FilteredClassifier so that the same filter is applied to the train and test datasets
Use Weka's AttributeSelection to select attributes (words) from the text, to fine-tune the classifiers
Compare the NaiveBayesMultinomial and LibSVM classifiers
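
The first step above, producing an ARFF file from labeled text, can be sketched in a few lines of Python. This is an illustration only, not the assignment's own converter. The relation name, the attribute names ("text" and "class"), the (label, text) input shape, and the sample documents are all assumptions made for the example.

```python
def quote(value):
    """Quote a string for ARFF, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_arff(relation, documents, labels=None):
    """Build ARFF text from (label, text) pairs.

    Assumes labels are simple tokens without spaces or commas.
    Pass `labels` explicitly to declare the same class values in the
    train and test files.
    """
    if labels is None:
        labels = sorted({label for label, _ in documents})
    lines = [
        "@relation " + relation,
        "",
        "@attribute text string",
        "@attribute class {" + ",".join(labels) + "}",
        "",
        "@data",
    ]
    for label, text in documents:
        lines.append(quote(text) + "," + label)
    return "\n".join(lines) + "\n"


# Hypothetical, already tokenized and stemmed documents.
train = [
    ("sport", "team win final match"),
    ("tech", "phone batteri last long"),
]
print(to_arff("news_train", train))
```

The text is kept as a single string attribute, so a filter such as StringToWordVector can turn it into word attributes later.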

Sentiment analysis with TextBlob.

Assignment report here.

Compare the performance of PatternAnalyzer and NaiveBayesAnalyzer in sentiment analysis of restaurant reviews.
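
One way to frame such a comparison is a small accuracy harness that treats each analyzer as a function from review text to a label. The sketch below uses only the standard library and a toy stand-in classifier. The word list, the reviews, the 0.0 threshold, and all function names are hypothetical, and nothing here reflects the results in the report.

```python
def polarity_to_label(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to 'pos' or 'neg'."""
    return "pos" if polarity > threshold else "neg"


def accuracy(classify, labeled_reviews):
    """Fraction of (text, label) pairs where classify(text) == label."""
    if not labeled_reviews:
        raise ValueError("labeled_reviews must not be empty")
    correct = sum(
        1 for text, label in labeled_reviews if classify(text) == label
    )
    return correct / len(labeled_reviews)


# Toy stand-in for a real analyzer (hypothetical word list).
POSITIVE_WORDS = {"great", "delicious", "friendly"}


def toy_classifier(text):
    """Label a review 'pos' if it contains any word from the toy list."""
    words = set(text.lower().split())
    return "pos" if words & POSITIVE_WORDS else "neg"


# Made-up reviews for illustration only.
reviews = [
    ("great food and friendly staff", "pos"),
    ("cold soup and slow service", "neg"),
    ("delicious but overpriced", "neg"),
]
print(accuracy(toy_classifier, reviews))
```

With TextBlob installed, the stand-in could be replaced by wrappers around the real analyzers, for example `polarity_to_label(TextBlob(text).sentiment.polarity)` for the default PatternAnalyzer and `TextBlob(text, analyzer=NaiveBayesAnalyzer()).sentiment.classification` for NaiveBayesAnalyzer.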

TensorFlow introduction and applications for natural language processing (NLP).

Introduction here and slide deck used for presentation here.