A program that analyzes transfer loss across domains by feeding TF-IDF vectors, filtered with chi-squared feature selection, into a logistic regression model.

chandnii7/Transfer-Loss-NLP

Transfer Loss Across Domains

This program analyzes transfer loss across domains, using book and electronics reviews as the example. Reviews are converted to TF-IDF vectors, the k best features are selected with the chi-squared test, and a logistic regression model is trained on the result.
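
The pipeline described above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names `fit_sentiment_model` and `predict_sentiment`, the toy reviews, and the `max_iter` setting are all assumptions. The default `k=4500` mirrors the feature count reported in the Dataset section.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression


def fit_sentiment_model(texts, labels, k=4500):
    """Fit TF-IDF -> chi-squared k-best -> logistic regression (hypothetical helper)."""
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform(texts)
    # Cap k at the vocabulary size so a small corpus does not raise an error.
    selector = SelectKBest(chi2, k=min(k, tfidf.shape[1]))
    selected = selector.fit_transform(tfidf, labels)
    model = LogisticRegression(max_iter=1000)  # max_iter is an assumed setting
    model.fit(selected, labels)
    return vectorizer, selector, model


def predict_sentiment(fitted, texts):
    """Apply the already-fitted vectorizer and selector, then classify."""
    vectorizer, selector, model = fitted
    return model.predict(selector.transform(vectorizer.transform(texts)))


# Toy reviews for illustration only (1 = positive, 0 = negative).
toy_texts = [
    "a great and moving story",
    "wonderful characters, great plot",
    "boring and dull writing",
    "terrible plot, dull ending",
]
toy_labels = [1, 1, 0, 0]

fitted = fit_sentiment_model(toy_texts, toy_labels, k=5)
predictions = predict_sentiment(fitted, ["a great story"])
```

Reusing the same fitted vectorizer and selector at prediction time keeps every review in one feature space, which is what allows a model trained on one domain to be evaluated on another.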

Dataset:

  1. Source Domain: Books

    • number of positive reviews = 1000
    • number of negative reviews = 1000
    • source domain training set shape (reviews × features): (2000, 4500)
  2. Target Domain: Electronics

    • number of positive reviews = 1000
    • number of negative reviews = 1000
    • target domain training set shape (reviews × features): (1600, 4500)
    • target domain test set shape (reviews × features): (400, 4500)
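
The shapes above imply an 80/20 split of the 2000 Electronics reviews. A sketch of one way to produce such a split is shown below. The README does not say whether the split was random or stratified, so the `stratify` and `random_state` arguments are assumptions, and the review strings are placeholders standing in for the real data.

```python
from sklearn.model_selection import train_test_split

# Placeholder data: 1000 positive (1) and 1000 negative (0) reviews.
electronics_texts = [f"review_{i}" for i in range(2000)]
electronics_labels = [1] * 1000 + [0] * 1000

# Hold out 20% for testing; stratifying keeps the classes balanced (assumed).
train_texts, test_texts, train_labels, test_labels = train_test_split(
    electronics_texts,
    electronics_labels,
    test_size=0.2,
    stratify=electronics_labels,
    random_state=0,
)

print(len(train_texts), len(test_texts))
```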

Result:

  1. Direct Transfer:
    • Training a logistic regression classifier on the Electronics training dataset.
    • Evaluating it on the Electronics test dataset.


  2. Cross-domain Transfer:
    • Training a logistic regression classifier on the Books training dataset.
    • Evaluating it on the Electronics test dataset.


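  3. Transfer Loss Across Domains:
    • LOSS = direct_transfer_accuracy - cross_domain_transfer_accuracy = 0.39
    • On the Electronics test set, the classifier trained on Books is 0.39 less accurate (39 percentage points) than the one trained on Electronics.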
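
The loss formula above can be computed as in the sketch below. The function names and the toy label lists are hypothetical and only illustrate the arithmetic; they are not the repository's data or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def transfer_loss(y_true, direct_pred, cross_pred):
    """Direct-transfer accuracy minus cross-domain accuracy on one test set."""
    return accuracy(y_true, direct_pred) - accuracy(y_true, cross_pred)


# Toy labels for illustration only.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
direct_pred = [1, 1, 0, 0, 1, 0, 1, 1]  # trained on the target domain: 7 of 8 correct
cross_pred = [1, 0, 0, 1, 1, 0, 0, 0]   # trained on the source domain: 5 of 8 correct

loss = transfer_loss(y_true, direct_pred, cross_pred)
print(loss)  # 0.25
```

Both classifiers are scored on the same target-domain test set, so the difference isolates the effect of the training domain.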
