SiddharthGC/SENSYN

SENSYN

SENtence SYNonymity

Siddharth Chandrasekar and Kavya Jampani, 2019

A Natural Language Processing engine to rank the similarity of two sentences on a scale of 1-5.

Please do not plagiarize my code. You may use it as a reference but please make sure to cite it as a source in your bibliography.

SETUP

Create the following directories if they are not already present:

  1. Navigate to the project folder in a command prompt
  2. mkdir data
  3. mkdir data\csv
  4. mkdir data\predictions
  5. mkdir data\tmp

The train-set.txt, dev-set.txt, and test-set.txt files must be present in the data folder.
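The same directory layout can also be created from Python, which avoids the Windows-specific backslash paths. This is only a minimal sketch: the function name create_data_dirs and the project_root parameter are illustrative and not part of the project.

```python
from pathlib import Path

# Sub-directories named in the setup steps; data/ itself is created as their parent.
REQUIRED_DIRS = ["data/csv", "data/predictions", "data/tmp"]


def create_data_dirs(project_root="."):
    """Create the data directories under project_root if they are missing.

    Hypothetical helper for illustration; it is not part of the SENSYN code.
    """
    root = Path(project_root)
    created = []
    for rel in REQUIRED_DIRS:
        path = root / rel
        # parents=True also creates data/; exist_ok=True makes reruns harmless.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for p in create_data_dirs():
        print(p)
```

Running the script from the project folder leaves existing directories untouched, so it is safe to run more than once.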

INSTRUCTIONS

  1. Import the project into PyCharm or any IDE of your choice.
  2. Mark the src/ folder as "Sources Root".
  3. Execute Task2.py.
    • Input: a sentence
    • Output: the results required for Task 2
  4. Model training and evaluation
    4.1 Execute sts_features.py to generate the features (X) and labels (y) for the train and dev partitions.
      • These NumPy arrays are saved to disk in the data/tmp folder.
    4.2 Execute sts_train.py to train a gradient boosting model.
      • Train and dev data are loaded from the data/tmp folder.
      • The model file is saved to data/models/gb-model.sav.
      • Predictions on the dev data are written to data/predictions/dev-set-predicted-answers.txt.
        • They are in the format expected by the evaluation.py script.
    4.3 Execute sts_infer.py to test the trained model on the test data.
      • Predictions are written to data/predictions/test-set-predicted-answers.txt.
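The flow of step 4 (save features, train, reload the model, predict) can be sketched roughly as follows. This is not the project's actual code. It assumes scikit-learn's GradientBoostingRegressor and pickle for the .sav file, since the README only says "a gradient boosting model". The array file names (X_train.npy, y_train.npy), the helper names, and the rounding of predictions to the 1-5 scale are likewise hypothetical.

```python
import pickle
from pathlib import Path

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def save_split(tmp_dir, name, X, y):
    """Step 4.1 (sketch): persist features and labels as .npy files."""
    tmp_dir = Path(tmp_dir)
    tmp_dir.mkdir(parents=True, exist_ok=True)
    # File names are assumptions; the real scripts may use different ones.
    np.save(tmp_dir / f"X_{name}.npy", X)
    np.save(tmp_dir / f"y_{name}.npy", y)


def load_split(tmp_dir, name):
    """Load the features and labels written by save_split."""
    tmp_dir = Path(tmp_dir)
    return np.load(tmp_dir / f"X_{name}.npy"), np.load(tmp_dir / f"y_{name}.npy")


def train_and_save(tmp_dir, model_path):
    """Step 4.2 (sketch): fit a gradient boosting model and pickle it."""
    X_train, y_train = load_split(tmp_dir, "train")
    # Regressor chosen as an assumption; a classifier would also fit the description.
    model = GradientBoostingRegressor(random_state=0)
    model.fit(X_train, y_train)
    model_path = Path(model_path)
    # The setup steps do not create data/models, so create it here if needed.
    model_path.parent.mkdir(parents=True, exist_ok=True)
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    return model


def predict_scores(model_path, X):
    """Step 4.3 (sketch): reload the model and predict integer scores in 1-5."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    # Rounding and clipping to the 1-5 scale is an assumption of this sketch.
    return np.clip(np.rint(model.predict(X)), 1, 5).astype(int)
```

With arrays already saved under data/tmp, calling train_and_save("data/tmp", "data/models/gb-model.sav") followed by predict_scores("data/models/gb-model.sav", X_test) would mirror steps 4.2 and 4.3. Writing the predictions file is left out because the format expected by the evaluation script is not described here.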
