Computer Hardware Dataset Analysis #729

Merged
merged 6 commits into from
Aug 3, 2024
329 changes: 329 additions & 0 deletions Computer Hardware Analysis/Dataset/CPUData.csv

Large diffs are not rendered by default.

1,419 changes: 1,419 additions & 0 deletions Computer Hardware Analysis/Dataset/GPUData.csv

Large diffs are not rendered by default.

57 changes: 57 additions & 0 deletions Computer Hardware Analysis/Models/README.md
@@ -0,0 +1,57 @@
# Computer Hardware Analysis Models

## Overview
This document summarizes the machine learning models used in the analysis, along with their performance metrics (RMSE and R2 score).
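
For readers unfamiliar with the two metrics, the following is a minimal sketch of how RMSE and R2 are computed. The helper names and the sample prices are illustrative assumptions, not values from this project's notebook. Note that R2 becomes negative when a model predicts worse than simply using the mean of the targets, which is relevant when reading the results table.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the average squared residual."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up prices and predictions, purely for illustration.
prices = [100.0, 200.0, 300.0, 400.0]
good = [110.0, 190.0, 310.0, 390.0]   # close to the true prices
bad = [400.0, 100.0, 400.0, 100.0]    # worse than predicting the mean

print(rmse(prices, good))                  # 10.0
print(round(r2_score(prices, good), 3))    # 0.992
print(r2_score(prices, bad))               # -3.0
```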

## Models Implemented

1. **Linear Regression**
2. **Ridge Regression**
3. **Lasso Regression**
4. **Decision Tree Regressor**
5. **Random Forest Regressor**
6. **Gradient Boosting Regressor**
7. **XGBoost Regressor**
8. **CatBoost Regressor**
9. **Support Vector Regressor**
10. **K-Nearest Neighbors Regressor**
11. **Extra Trees Regressor**

![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_0.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_1.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_2.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_3.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_4.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_5.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_6.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_7.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_8.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_9.png?raw=true)
![results](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___32_10.png?raw=true)


## Performance of the Models

| Model | Train RMSE | Test RMSE | Train R2 | Test R2 |
|-----------------------------|---------------------|---------------------|---------------------|---------------------|
| Linear Regression | 17.65 | 302016927576.74 | 0.9991 | -1.9900E+017 |
| Ridge Regression | 123.93 | 300.17 | 0.9580 | 0.8034 |
| Lasso Regression | 134.90 | 333.59 | 0.9502 | 0.7572 |
| Decision Tree Regressor | 17.65 | 302.87 | 0.9991 | 0.7999 |
| Random Forest Regressor | 151.01 | 353.12 | 0.9376 | 0.7280 |
| Gradient Boosting Regressor | 105.99 | 307.28 | 0.9693 | 0.7940 |
| XGBoost Regressor | 38.36 | 328.19 | 0.9960 | 0.7650 |
| CatBoost Regressor | 81.89 | 330.35 | 0.9817 | 0.7619 |
| Support Vector Regressor | 626.45 | 696.85 | -0.0733 | -0.0594 |
| K-Nearest Neighbors Regressor | 290.01 | 364.72 | 0.7700 | 0.7098 |
| Extra Trees Regressor | 17.65 | 359.22 | 0.9991 | 0.7185 |
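
One way to read the table above is to rank the models by Test R2 and look at the gap between Train and Test RMSE. The sketch below does this with the standard library only, using a handful of rows copied from the table; the variable names are illustrative and this is not the code used to produce the results.

```python
import csv
import io

# A few rows copied from the results table (rounded values).
RESULTS = """Model,Train RMSE,Test RMSE,Train R2,Test R2
Linear Regression,17.65,302016927576.74,0.9991,-1.9900e17
Ridge Regression,123.93,300.17,0.9580,0.8034
Lasso Regression,134.90,333.59,0.9502,0.7572
XGBoost Regressor,38.36,328.19,0.9960,0.7650
Support Vector Regressor,626.45,696.85,-0.0733,-0.0594
K-Nearest Neighbors Regressor,290.01,364.72,0.7700,0.7098
"""

# Parse every column except "Model" as a float.
rows = [
    {key: (value if key == "Model" else float(value)) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(RESULTS))
]

# Rank by held-out R2, best first.
ranked = sorted(rows, key=lambda r: r["Test R2"], reverse=True)

for r in ranked:
    # A large positive gap means the model fits the training data far better
    # than the test data, which is a sign of overfitting.
    gap = r["Test RMSE"] - r["Train RMSE"]
    print(f"{r['Model']:<30} test R2={r['Test R2']:>12.4g}  RMSE gap={gap:.4g}")

best = ranked[0]["Model"]
worst = ranked[-1]["Model"]
```

Ranking on test metrics rather than train metrics matters here: a model with a very low Train RMSE can still have a very poor Test RMSE.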

## Conclusion
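Comparing the models on RMSE and R2 shows a clear gap between training and test performance. Ridge Regression generalized best, with the lowest test RMSE (300.17) and the highest test R2 (0.8034); Decision Tree Regressor and Gradient Boosting Regressor followed closely, with test R2 of roughly 0.79 to 0.80. Linear Regression, Decision Tree Regressor, and Extra Trees Regressor reached the lowest train RMSE (17.65), but Linear Regression broke down on the test set (test RMSE of about 3.0e11 and a large negative test R2), which suggests a severely overfit or numerically unstable fit. Support Vector Regressor scored a negative R2 on both the training and test sets, so it did not capture the relationship in the data at all.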

## Signature
- **Name:** Aditya D
- **Github:** [https://www.github.com/adi271001](https://www.github.com/adi271001)
- **LinkedIn:** [https://www.linkedin.com/in/aditya-d-23453a179/](https://www.linkedin.com/in/aditya-d-23453a179/)
- **Topmate:** [https://topmate.io/aditya_d/](https://topmate.io/aditya_d/)
- **Twitter:** [https://x.com/ADITYAD29257528](https://x.com/ADITYAD29257528)

86 changes: 86 additions & 0 deletions Computer Hardware Analysis/README.md
@@ -0,0 +1,86 @@
# Computer Hardware Analysis

## Goal
The goal of this project is to analyze the computer hardware dataset and predict GPU prices from features such as HDMI support, boost clock, VRAM, and memory clock. Accurate price predictions can help consumers make informed decisions and help manufacturers optimize pricing strategies.

## Dataset
The dataset used for this project is sourced from [GPUData.csv](https://www.kaggle.com/datasets/dilshaansandhu/general-computer-hardware-dataset), which includes columns like:
- `Name`: GPU model name
- `Producer`: GPU producer
- `HDMI`: HDMI support
- `Boost.Clock`: Boost clock speed
- `Vram`: VRAM size
- `Memory.Clock`: Memory clock speed

## Description
The dataset is preprocessed to handle missing values and encode categorical variables. Several machine learning models are used to predict the price of GPUs, and their performances are evaluated based on metrics such as RMSE and R2 score.

## What I Have Done
1. **Data Preprocessing**: Cleaned and preprocessed the dataset by handling missing values, encoding categorical variables, and scaling features.
2. **Model Training and Evaluation**: Trained multiple regression models and evaluated their performance using RMSE and R2 scores.
3. **Results Visualization**: Plotted model performance metrics to compare their effectiveness.
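
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the data is synthetic, the target column name `Price` is assumed (it is not listed in the Dataset section), and Ridge Regression stands in for the full set of models.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for GPUData.csv; all values here are made up.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Producer": rng.choice(["A", "B", "C"], size=n),
    "HDMI": rng.choice(["Yes", "No"], size=n),
    "Boost.Clock": rng.normal(1700, 150, size=n),
    "Vram": rng.choice([4, 8, 12, 16], size=n).astype(float),
    "Memory.Clock": rng.normal(1800, 200, size=n),
})
# "Price" is an assumed target column name, generated from the features plus noise.
df["Price"] = 40 * df["Vram"] + 0.2 * df["Boost.Clock"] + rng.normal(0, 20, size=n)
# Simulate missing values in one numeric feature.
df.loc[rng.choice(n, size=10, replace=False), "Boost.Clock"] = np.nan

numeric = ["Boost.Clock", "Vram", "Memory.Clock"]
categorical = ["Producer", "HDMI"]

# Step 1: impute and scale numeric features, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([("prep", preprocess), ("reg", Ridge(alpha=1.0))])

# Step 2: train on one split and evaluate on both splits.
X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["Price"], test_size=0.2, random_state=42)
model.fit(X_train, y_train)


def report(X, y):
    """Return (RMSE, R2) of the fitted model on the given data."""
    pred = model.predict(X)
    return float(np.sqrt(mean_squared_error(y, pred))), float(r2_score(y, pred))


train_rmse, train_r2 = report(X_train, y_train)
test_rmse, test_r2 = report(X_test, y_test)
print(f"train RMSE={train_rmse:.2f} R2={train_r2:.4f}")
print(f"test  RMSE={test_rmse:.2f} R2={test_r2:.4f}")
```

Fitting the imputer, scaler, and encoder inside the pipeline keeps them from seeing the test split, so the test metrics stay an honest estimate of performance on unseen data.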

## EDA

![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___5_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___6_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___7_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___8_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___9_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___10_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___11_1.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___12_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___13_0.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Computer-Hardware/Computer%20Hardware%20Analysis/Images/__results___14_1.png?raw=true)

## Models Implemented
1. **Linear Regression**
2. **Ridge Regression**
3. **Lasso Regression**
4. **Decision Tree Regressor**
5. **Random Forest Regressor**
6. **Gradient Boosting Regressor**
7. **XGBoost Regressor**
8. **CatBoost Regressor**
9. **Support Vector Regressor**
10. **K-Nearest Neighbors Regressor**
11. **Extra Trees Regressor**

## Libraries Needed
- `pandas`
- `numpy`
- `matplotlib`
- `seaborn`
- `scikit-learn`
- `xgboost`
- `lightgbm`
- `catboost`

## EDA Results
Preprocessing steps included cleaning missing values, encoding categorical features, and scaling numerical features. The features used for model training include `HDMI`, `Boost.Clock`, `Vram`, and `Memory.Clock`.

## Performance of the Models based on RMSE and R2 Scores

| Model | Train RMSE | Test RMSE | Train R2 | Test R2 |
|-----------------------------|---------------------|---------------------|---------------------|---------------------|
| Linear Regression | 17.65 | 302016927576.74 | 0.9991 | -1.9900E+017 |
| Ridge Regression | 123.93 | 300.17 | 0.9580 | 0.8034 |
| Lasso Regression | 134.90 | 333.59 | 0.9502 | 0.7572 |
| Decision Tree Regressor | 17.65 | 302.87 | 0.9991 | 0.7999 |
| Random Forest Regressor | 151.01 | 353.12 | 0.9376 | 0.7280 |
| Gradient Boosting Regressor | 105.99 | 307.28 | 0.9693 | 0.7940 |
| XGBoost Regressor | 38.36 | 328.19 | 0.9960 | 0.7650 |
| CatBoost Regressor | 81.89 | 330.35 | 0.9817 | 0.7619 |
| Support Vector Regressor | 626.45 | 696.85 | -0.0733 | -0.0594 |
| K-Nearest Neighbors Regressor | 290.01 | 364.72 | 0.7700 | 0.7098 |
| Extra Trees Regressor | 17.65 | 359.22 | 0.9991 | 0.7185 |

## Conclusion
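The models were evaluated on RMSE and R2 for both the training and test sets. Linear Regression, Decision Tree Regressor, and Extra Trees Regressor showed the lowest train RMSE (17.65), but Linear Regression collapsed on the test set (test RMSE of about 3.0e11 and a large negative test R2), so its training score is not a reliable indicator. On the test set, Ridge Regression performed best (RMSE 300.17, R2 0.8034), with Decision Tree Regressor and Gradient Boosting Regressor close behind at a test R2 of roughly 0.79 to 0.80. Support Vector Regressor had a negative R2 on both the training and test sets.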

## Signature
- **Name:** Aditya D
- **Github:** [https://www.github.com/adi271001](https://www.github.com/adi271001)
- **LinkedIn:** [https://www.linkedin.com/in/aditya-d-23453a179/](https://www.linkedin.com/in/aditya-d-23453a179/)
- **Topmate:** [https://topmate.io/aditya_d/](https://topmate.io/aditya_d/)
- **Twitter:** [https://x.com/ADITYAD29257528](https://x.com/ADITYAD29257528)
12 changes: 12 additions & 0 deletions Computer Hardware Analysis/Results/model_results.csv
@@ -0,0 +1,12 @@
Model,Train RMSE,Test RMSE,Train R2,Test R2
Linear Regression,17.647381123395498,302016927576.73517,0.9991482567920041,-1.9900446232960662e+17
Ridge Regression,123.92508660576131,300.16808962390667,0.9579983249054264,0.8034245461894287
Lasso Regression,134.8963445423951,333.58510662503335,0.9502321921750121,0.7572196406459344
Decision Tree Regressor,17.64737830640841,309.87481343277966,0.9991482570639254,0.7905054153630061
Random Forest Regressor,126.92922626266237,338.95991985527183,0.9559372686801817,0.7493331283020898
Gradient Boosting Regressor,105.99150551123878,307.49701578786687,0.9692751146610664,0.7937081572941895
XGBoost Regressor,38.35541713874928,328.19233798284796,0.9959765225688201,0.7650058126646463
CatBoost Regressor,81.89314457208386,330.34568186878647,0.9816581454877237,0.7619119966048983
Support Vector Regressor,626.4523694262396,696.8481267879035,-0.0733077403050506,-0.059440371600980146
K-Nearest Neighbors Regressor,290.0148497472672,364.72292201910454,0.7699678065715658,0.7097806522648277
Extra Trees Regressor,17.64737830640841,358.7695516693132,0.9991482570639254,0.7191778239681932
8 changes: 8 additions & 0 deletions Computer Hardware Analysis/requirements.txt
@@ -0,0 +1,8 @@
pandas==1.5.3
numpy==1.24.2
matplotlib==3.7.1
seaborn==0.12.2
scikit-learn==1.2.2
xgboost==1.7.3
lightgbm==3.3.5
catboost==1.1.1