A model combining dimensionality reduction, SMOTE, and tuning is expected to achieve results comparable to off-the-shelf models in Sentiment Analysis of Citations (Athar 2011).

HuyTu7/sentiment_citation

Citation Sentiment Classifier

Description:

Detailed Report can be accessed HERE!

Sentiment analysis of scientific citations has been well studied recently for bibliometrics (measures of the popularity and impact of published research). Most research on this topic has only attempted to use complex, slow, and thorough off-the-shelf models.

Proposed Hypothesis: Dimensionality reduction, combined with undersampling and oversampling techniques and parameter tuning, will achieve results comparable to these off-the-shelf models.

Proposed methods:

  • Principal Component Analysis (PCA)
  • Synthetic Minority Over-sampling Technique (SMOTE)
  • Tuning with Differential Evolution Algorithms

Dataset:

The data for citation sentiment classification was reported in the paper Sentiment Analysis of Citations using Sentence Structure-Based Features.
The file test.arff contains only the test set, with dependency triplets generated by Stanford CoreNLP.
The full corpus is available here.

  1. 7261 citation contexts
  2. 209795 word/phrase features + 88031 dependency features
  3. Class distribution:
  • Objective [‘o’] - 6276 (86.43%)
  • Positive [‘p’] - 742 (10.22%)
  • Negative [‘n’] - 243 (3.35%)
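
The percentages above follow directly from the class counts. The short check below recomputes them and also derives the ratio between the largest and smallest class, which is the imbalance that motivates resampling. The function name `class_shares` is a hypothetical helper introduced only for this illustration.

```python
# Class counts as reported in the list above.
counts = {"o": 6276, "p": 742, "n": 243}
total = sum(counts.values())  # should match the 7261 citation contexts


def class_shares(counts):
    """Return each class's share of the corpus as a percentage."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


shares = class_shares(counts)
for label in ("o", "p", "n"):
    print("%s: %d (%.2f%%)" % (label, counts[label], shares[label]))

# Ratio of objective to negative citations (largest vs. smallest class).
imbalance = counts["o"] / float(counts["n"])
print("objective per negative: %.1f" % imbalance)
```

With roughly 26 objective citations for every negative one, a classifier can score well on accuracy while almost never predicting the negative class.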

Implementation:

Model:

Feature Reduction - PCA:
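
As a rough illustration of this step, the sketch below fits scikit-learn's `PCA` on a training split and applies the same projection to the test split. The synthetic matrix, the 50 input columns, and `n_components=10` are assumptions made for the example; they are not the repository's data or settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the feature matrix (not the citation corpus).
rng = np.random.RandomState(0)
X = rng.rand(200, 50)
y = rng.randint(0, 3, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit the projection on the training split only, then reuse it on the
# test split so no test information leaks into the components.
pca = PCA(n_components=10)
X_train_reduced = pca.fit_transform(X_train)
X_test_reduced = pca.transform(X_test)

# Fraction of the training variance kept by the 10 components.
retained = pca.explained_variance_ratio_.sum()
print(X_train_reduced.shape, X_test_reduced.shape, retained)
```

For a sparse matrix with hundreds of thousands of columns, scikit-learn's `TruncatedSVD` is a common substitute because it does not require a dense input; whether the repository takes that route is not stated here.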

SMOTE:
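
The idea behind SMOTE can be sketched in a few lines of NumPy: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. The function `smote` and its parameters (`n_new`, `k`, `seed`) are hypothetical names for this illustration and are not taken from the repository.

```python
import numpy as np


def smote(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours."""
    rng = np.random.RandomState(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = X_min.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise squared distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample, one of its k neighbours, and a random gap in [0, 1).
    base = rng.randint(0, n, size=n_new)
    picked = neighbours[base, rng.randint(0, k, size=n_new)]
    gap = rng.rand(n_new, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Synthetic stand-in for the minority class (not the citation corpus).
X_min = np.random.RandomState(1).rand(20, 3)
synthetic = smote(X_min, n_new=30)
print(synthetic.shape)
```

Oversampling should be applied to the training split only, so that synthetic points never appear in the evaluation data. The dense pairwise-distance step here is only suitable for small minority classes; the imbalanced-learn package provides a maintained `SMOTE` implementation, although it is not among the requirements listed in this README.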

Params to be Optimized per Learner:

  • RF: max_features, max_leaf_nodes, min_samples_split, & n_estimators
  • SVM: kernel, coef0, & C
  • CART: max_features, max_depth, min_samples_split, & n_estimators
  • KNN: n_neighbors & weights
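
To make the tuning step concrete, here is a minimal sketch of a basic DE/rand/1/bin loop that searches the two KNN parameters listed above. It is not the code in `de_tuner.py`: the names `decode`, `score`, and `de_tune`, the bounds, the synthetic data, and the settings `pop_size=8`, `generations=5`, `f=0.75`, and `cr=0.3` are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, imbalanced stand-in data (not the citation corpus).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)

WEIGHTS = ["uniform", "distance"]
BOUNDS = np.array([[1.0, 30.0],   # n_neighbors, rounded to an integer
                   [0.0, 1.0]])   # weights, decoded by thresholding at 0.5


def decode(vector):
    """Map a real-valued DE vector to KNN keyword arguments."""
    return {"n_neighbors": int(round(vector[0])),
            "weights": WEIGHTS[int(vector[1] >= 0.5)]}


def score(vector):
    """Mean 3-fold macro-F1 of a KNN built from the decoded vector."""
    model = KNeighborsClassifier(**decode(vector))
    return cross_val_score(model, X, y, cv=3, scoring="f1_macro").mean()


def de_tune(score, bounds, pop_size=8, generations=5, f=0.75, cr=0.3, seed=0):
    """Maximise `score` over `bounds` with a basic DE/rand/1/bin loop."""
    rng = np.random.RandomState(seed)
    low, high = bounds[:, 0], bounds[:, 1]
    pop = low + rng.rand(pop_size, len(bounds)) * (high - low)
    fitness = np.array([score(ind) for ind in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct members other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + f * (b - c), low, high)
            # Binomial crossover, keeping at least one mutant gene.
            cross = rng.rand(len(bounds)) < cr
            cross[rng.randint(len(bounds))] = True
            trial = np.where(cross, mutant, pop[i])
            # Selection: keep the trial if it is at least as good.
            trial_fitness = score(trial)
            if trial_fitness >= fitness[i]:
                pop[i], fitness[i] = trial, trial_fitness
    best = int(np.argmax(fitness))
    return decode(pop[best]), fitness[best]


best_params, best_score = de_tune(score, BOUNDS)
print(best_params, best_score)
```

The same loop extends to the other learners by adding one row to `BOUNDS` per parameter and decoding categorical choices, such as the SVM kernel, from a numeric range. SciPy also ships `scipy.optimize.differential_evolution`, which minimises its objective, so a score to be maximised would need to be negated.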

Results:

Files:

sentiment_citation/
├── document/
│   ├── proposal.pdf
│   └── final.pdf
├── data/
│   └── test.arff
├── preprocess/
│   ├── features_engineer.py
│   └── readarff.py
└── work/
    ├── experiment_v1.ipynb
    ├── experiment_v2.ipynb
    ├── de_tuner.py
    ├── learners.py
    └── tuning.py

Requirements:

  • Python 2.7
  • scikit-learn
  • pandas
  • numpy
