Prediction for StackOverFlow Tags based on the Content(Title + Body).

chandanmalla/StackoverFlow-Tag-Prediction

StackoverFlow-Tag-Prediction

Predicts Stack Overflow tags from a question's content (title + body).

The dataset is taken from https://www.kaggle.com/c/facebook-recruiting-iii-keyword-extraction

I used only a very small subset of the 4 million records.

Steps, in the order I performed them:

  • EDA --> Preprocessing (stemming, tokenization, stopword removal, HTML tag removal, etc.) --> Logistic Regression with a One-vs-Rest classifier. Any classifier could be used here; since I was not expecting good accuracy, I used a simple linear model.
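The modelling step can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the TF-IDF features, the toy questions and tags, and the `predict_tags` helper are all assumptions made for the example. Only the use of Logistic Regression inside a One-vs-Rest classifier comes from the steps above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Toy, made-up questions and tags (NOT taken from the Kaggle data).
questions = [
    "how to read a csv file with pandas in python",
    "python list comprehension with condition",
    "java nullpointerexception when calling method on object",
    "java arraylist remove element while iterating",
    "pandas dataframe groupby sum in python",
    "java string compare equals",
]
tags = [
    ["python", "pandas"],
    ["python"],
    ["java"],
    ["java"],
    ["python", "pandas"],
    ["java"],
]

# Each question can carry several tags, so the targets become a
# binary indicator matrix with one column per tag.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(tags)

# Assumption: TF-IDF features; the README does not say which
# vectorizer was used.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(questions)

# One-vs-Rest fits one independent logistic regression per tag.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, y)


def predict_tags(text):
    """Hypothetical helper: return the predicted tags for one question."""
    row = clf.predict(vectorizer.transform([text]))
    return list(mlb.inverse_transform(row)[0])
```

Because One-vs-Rest trains a separate binary model per tag, any binary classifier can be swapped in for `LogisticRegression` without changing the rest of the sketch.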

Original dataset

[Screenshot: original dataset]

Dataset after preprocessing

  • question = body + title. Code was removed from the body, and a separate block was created to record that code exists.

[Screenshots: dataset after preprocessing]
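A rough sketch of how such a preprocessing step might look, using only the Python standard library. Everything here is illustrative rather than the repository's actual code: the `preprocess` and `naive_stem` names, the tiny stopword list, and the crude suffix stripper (a stand-in for a real stemmer) are assumptions, as is the reading of the "code exists" block as a 0/1 flag.

```python
import html
import re

# Tiny illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "in", "to", "how", "i", "do",
             "of", "and", "it", "this"}

CODE_RE = re.compile(r"<code>.*?</code>", flags=re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")
TOKEN_RE = re.compile(r"[a-z0-9#+]+")


def naive_stem(token):
    """Crude suffix stripper, a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(title, body):
    """Return (cleaned question text, code_exists flag)."""
    # Record whether the body contained code, then drop the code itself.
    code_exists = 1 if CODE_RE.search(body) else 0
    body_no_code = CODE_RE.sub(" ", body)

    # question = body + title
    text = body_no_code + " " + title

    # Strip remaining HTML tags, decode entities, lowercase.
    text = html.unescape(TAG_RE.sub(" ", text)).lower()

    # Tokenize, remove stopwords, stem.
    tokens = [naive_stem(t) for t in TOKEN_RE.findall(text)
              if t not in STOPWORDS]
    return " ".join(tokens), code_exists


question, code_exists = preprocess(
    "Sorting lists in Python",
    "<p>How do I sort this?</p><pre><code>xs = [3, 1, 2]</code></pre>",
)
print(question, code_exists)  # sort sort list python 1
```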
