Automobiles Sales Data Analysis #576

Closed
mariam7084 opened this issue Feb 9, 2024 · 7 comments · Fixed by #648
Assignees
Labels
Advanced Points 40 - SSOC 2024 Assigned 💻 Issue has been assigned to a contributor SSOC

Comments

@mariam7084
Contributor

ML-Crate Repository (Proposing new issue)

🔴 Project Title : Automobiles Sales Data Analysis
🔴 Aim : Perform EDA
🔴 Dataset : https://www.kaggle.com/datasets/ddosad/auto-sales-data
🔴 Approach : Try to use 3-4 algorithms to implement the models, then compare them by their accuracy scores to find the best-fitting one. Also, do not forget to perform an exploratory data analysis before creating any model.


📍 Follow the Guidelines to Contribute in the Project :

  • You need to create a separate folder named as the Project Title.
  • Inside that folder, there will be four main components.
    • Images - To store the required images.
    • Dataset - To store the dataset or, information/source about the dataset.
    • Model - To store the machine learning model you've created using the dataset.
    • requirements.txt - This file will contain the required packages/libraries to run the project in other machines.
  • Inside the Model folder, the README.md file must be filled up properly, with proper visualizations and conclusions.
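
The folder layout described in the guidelines above can be sketched as a small script. This is only an illustration, not a repository tool: the helper name `scaffold_project` is hypothetical, and placing `requirements.txt` at the project root next to the three folders is an assumption based on the wording of the list.

```python
from pathlib import Path
import tempfile


def scaffold_project(root: Path, title: str) -> Path:
    """Create the layout from the guidelines under `root` (hypothetical helper)."""
    project = root / title
    # The three folders named in the guidelines.
    for sub in ("Images", "Dataset", "Model"):
        (project / sub).mkdir(parents=True, exist_ok=True)
    # The guidelines ask for a filled-in README.md inside the Model folder.
    (project / "Model" / "README.md").touch()
    # Assumed to sit at the project root, next to the folders.
    (project / "requirements.txt").touch()
    return project


if __name__ == "__main__":
    # Build the layout in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        project = scaffold_project(Path(tmp), "Automobiles Sales Data Analysis")
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```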

🔴🟡 Points to Note :

  • The issues will be assigned on a first come first serve basis, 1 Issue == 1 PR.
  • "Issue Title" and "PR Title" should be the same. Include the issue number along with it.
  • Follow the Contributing Guidelines & Code of Conduct before you start contributing.

To be Mentioned while taking the issue :

  • Full name :
  • GitHub Profile Link :
  • Participant ID (If not, then put NA) :
  • Approach for this Project :
  • What is your participant role? (Mention the Open Source Program name. Eg. HRSoC, GSSoC, GSOC etc.)

Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎

@mariam7084 mariam7084 added the Up-for-Grabs ✋ Issues are open to the contributors to be assigned label Feb 9, 2024
@shivansh-2003
Contributor

shivansh-2003 commented May 29, 2024

Can you please assign this issue to me under SSOC 2024 Season 3?
Full name: Shivansh Mahajan
GitHub: https://github.com/shivansh-2003
Participant ID: NA
Approach: I will perform EDA on the dataset using statistical methods such as IQR-based outlier detection, feature distribution analysis, and a correlation matrix. I will then apply feature engineering and train several machine learning models (KNN, Random Forest, Decision Tree, SVM, and boosting algorithms) to find the one with the best accuracy score.
I am well versed in machine learning; you can check my LinkedIn (https://www.linkedin.com/in/shivansh-mahajan-13227824a/) and Git repository.
Could you assign this issue to me, @abhisheks008?
Participant role: SSOC Season 3
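
For readers following the thread, the IQR and correlation-matrix steps mentioned in this proposal could look roughly like the sketch below. It is not code from the eventual pull request: the data is synthetic, the column names (`quantity`, `unit_price`, `sales`) are placeholders rather than the Kaggle dataset's actual columns, and the 1.5 multiplier is just the conventional default.

```python
import numpy as np
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Synthetic stand-in for the sales data (placeholder column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "quantity": rng.integers(20, 50, size=200),
    "unit_price": rng.normal(100.0, 10.0, size=200),
})
df["sales"] = df["quantity"] * df["unit_price"]
df.loc[0, "sales"] = 50_000.0  # planted outlier so the mask has something to find

mask = iqr_outliers(df["sales"])
print("rows flagged as outliers:", int(mask.sum()))

# Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(2))
```

Whether flagged rows are dropped, capped, or kept is a judgment call that depends on the data; the mask only identifies candidates.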

@abhisheks008
Owner

Contributions will start from June 1, 2024. Till then please have some patience.

@DarkRaiderCB
Contributor

Full name : Sanyog Mishra
GitHub Profile Link : https://github.com/DarkRaiderCB
Participant ID: NA
Approach for this Project : I will perform EDA on the provided dataset using techniques such as data cleaning, categorical analysis, outlier detection, visualisation (correlation matrix, heat maps and more), and summary statistics. I will then apply feature engineering, train ML algorithms such as Linear Regression, Decision Tree, XGBoost, and KNN, and identify the best-performing model.
Tools to be used: Pandas, Numpy, Matplotlib, Scikit Learn, XGBoost.
Resume: https://drive.google.com/file/d/1sDVtq69GJd83t4H1-EOlvHEyQc2oat1k/view?usp=drive_link
Participant Role: Contributor SSOC Season 3

@aryan0931

Name: Aryan Yadav
github: https://github.com/aryan0931
Participant id:NA
Approach for this project: I will perform Exploratory Data Analysis (EDA) on the provided dataset using techniques such as data cleaning, categorical analysis, outlier detection, and visualization (including correlation matrices, heat maps, and more). This process will involve generating summary statistics and conducting feature engineering to prepare the data for machine learning.
For the machine learning analysis, I will utilize algorithms like Linear Regression, Decision Trees, XGBoost, and K-Nearest Neighbors (KNN) to identify the best-performing model. The tools and libraries used for this analysis will include Pandas, NumPy, Matplotlib, Scikit-Learn, and XGBoost.
Participant role: SSOC

@abhisheks008
Owner

Implement 5-6 models for this project and compare them based on their accuracy scores.

Assigning this issue to you @DarkRaiderCB
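
The multi-model comparison requested above might be organised along the lines of this sketch. Several things here are assumptions, not details from the thread: the data is synthetic (`make_regression`), the task is treated as regression and scored with R² on a held-out split because the thread does not name a target column or metric beyond "accuracy", and scikit-learn's `GradientBoostingRegressor` stands in for XGBoost to avoid an extra dependency.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the engineered sales features.
X, y = make_regression(n_samples=400, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Five candidate models with mostly default hyperparameters.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Report from best to worst.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>18}: R^2 = {score:.3f}")
```

A single train/test split keeps the sketch short; cross-validation would give a steadier ranking on a small dataset.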

@abhisheks008 abhisheks008 added Assigned 💻 Issue has been assigned to a contributor Intermediate Points 30 - SSOC 2024 SSOC and removed Up-for-Grabs ✋ Issues are open to the contributors to be assigned labels Jun 2, 2024
@DarkRaiderCB
Contributor

Thanks sir!! @abhisheks008

@abhisheks008 abhisheks008 added Advanced Points 40 - SSOC 2024 and removed Intermediate Points 30 - SSOC 2024 labels Jun 11, 2024
abhisheks008 added a commit that referenced this issue Jun 11, 2024
Automobiles Sales Data Analysis #576

Hello @DarkRaiderCB! Your issue #576 has been closed. Thank you for your contribution!

5 participants