Predicting Stack Overflow tags from question content (title + body).
Data set is taken from
I used only a very small subset of the full data set, which contains 4 million records.
Steps I performed, in order:
- EDA --> Preprocessing (stemming, tokenization, stop-word removal, HTML tag removal, etc.) --> Logistic Regression with a One-vs-Rest classifier (any classifier would work; since I was not expecting good accuracy, I used a simple linear model)
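
The preprocessing step above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the exact code used in this project: the names `preprocess` and `naive_stem`, the tiny `STOP_WORDS` set, and the regular expressions are all assumptions made for the example, and the crude suffix stripper stands in for a real stemmer such as Porter.

```python
import html
import re

# Tiny illustrative stop-word list (an assumption; a real run would use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "am", "to", "in", "of", "and", "i", "how", "do", "my"}

TAG_RE = re.compile(r"<[^>]+>")           # matches HTML tags such as <p> or </code>
TOKEN_RE = re.compile(r"[a-z0-9+#]+")     # keeps '+' and '#' so "c++" and "c#" survive


def naive_stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(title, body):
    """Combine title and body, strip HTML, tokenize, drop stop words, stem."""
    text = f"{title} {body}"
    text = TAG_RE.sub(" ", text)          # remove HTML tags
    text = html.unescape(text).lower()    # decode entities such as &amp;, then lowercase
    tokens = TOKEN_RE.findall(text)       # tokenize
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


tokens = preprocess(
    "Sorting lists in Python",
    "<p>How do I sort the <code>items</code> &amp; keys?</p>",
)
print(tokens)
```

The resulting token lists would then be vectorized (for example with bag-of-words or TF-IDF counts) before being passed to the One-vs-Rest logistic regression, which trains one binary classifier per tag.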