
This repo stores all the source code created while working on the 66DaysOfData challenge started by Ken Jee.


asad-mahmood/66DaysOfData


Brief Project Descriptions

  • Covid Vaccinations - In this project I analyze data showing the progress of COVID-19 vaccination across the globe, which vaccination sequences are being used, and so on.
  • Data Preprocessing and ML Template - A preprocessing template that I use as the starting point for every classification/regression project. I use the lazy predict package to test which algorithm is best suited to the project.
  • Fake News Detection - This project is about classifying tweet text. The problem statement is to classify whether the given tweets contain misinformation or not. The classification model used is a Passive Aggressive Classifier, which achieved a 92% accuracy score with 585 true positives, 590 true negatives, 48 false positives, and 44 false negatives.
  • Heart Failure Detection - The main objective of this project is to build an effective classification model to predict heart attack based on underlying factors. I felt some features were not as important as others, so I used an Extra Trees Classifier and step forward and backward feature selection methods to determine the most important features. I then chose the top three, standardized them, and used the LazyPredict library to determine which algorithms would be best for building an effective model. I chose the top two, i.e. Extra Trees Classifier and Decision Tree Classifier, and fine-tuned them. In the end the Extra Trees Classifier won because of its higher accuracy of 92% compared to the Decision Tree's 88%. It also had better sensitivity and specificity.
  • Hyperparameter Tuning - I tested different ways of doing hyperparameter tuning. I tried three methods: Grid Search CV, Random Search and Optuna. I found that Optuna performs best, but Random Search is the easiest and quickest to implement.
  • Translating SQL to Pandas - A small project that I used to prepare for interviews where questions about translating SQL statements to pandas, or vice versa, were expected. It walks through how all the major SQL keywords can be translated to pandas statements.
  • Wall Street Bets - This is a sentiment analysis project for the Wall Street Bets subreddit. I downloaded the subreddit using PRAW and carried out the sentiment analysis. The purpose of this project was to determine the sentiments of users in this subreddit towards different companies, trading platforms and so on. I also looked at how the general trend of sentiments changed over time, and much more.
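
As a quick check on the Fake News Detection numbers, the reported accuracy can be recomputed from the four confusion-matrix counts listed in that entry. The sketch below is only an illustration in plain Python, not the notebook's actual code; the function name `confusion_metrics` is made up for this example, and sensitivity, specificity and precision are derived values that the entry itself does not report.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Derive common classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # share of positive predictions that are correct
    }


# Counts quoted in the Fake News Detection entry
metrics = confusion_metrics(tp=585, tn=590, fp=48, fn=44)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The accuracy comes out at roughly 0.927, which is consistent with the 92% figure quoted in the entry.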
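
To make the Hyperparameter Tuning comparison a little more concrete, here is a minimal standard-library sketch of the difference between grid search and random search. Everything in it is an assumption for illustration: the `score` function is a toy stand-in for a cross-validated model score, the parameter names `c` and `depth` are hypothetical, and the project itself used Grid Search CV, Random Search and Optuna rather than these hand-written helpers. Optuna is not shown.

```python
import itertools
import random


def score(params):
    # Hypothetical stand-in for a cross-validated score; highest at c=1.0, depth=5
    return -((params["c"] - 1.0) ** 2) - 0.01 * (params["depth"] - 5) ** 2


def grid_search(grid):
    """Evaluate every combination in the grid and return the best one."""
    keys = list(grid)
    candidates = [
        dict(zip(keys, values))
        for values in itertools.product(*(grid[k] for k in keys))
    ]
    return max(candidates, key=score), len(candidates)


def random_search(grid, n_iter, seed=0):
    """Evaluate n_iter randomly sampled combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [
        {k: rng.choice(v) for k, v in grid.items()} for _ in range(n_iter)
    ]
    return max(candidates, key=score), len(candidates)


grid = {"c": [0.01, 0.1, 1.0, 10.0], "depth": [3, 5, 7, 9]}
best_grid, n_grid = grid_search(grid)
best_rand, n_rand = random_search(grid, n_iter=6)
print(f"grid search:   {n_grid} evaluations, best = {best_grid}")
print(f"random search: {n_rand} evaluations, best = {best_rand}")
```

Grid search always finds the best point in the grid but pays for every combination, while random search caps the number of evaluations and may settle for a slightly worse point. That trade-off is one reason random search tends to be quick to set up and run.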
