Merge pull request #489 from Avdhesh-Varshney/gender
Gender Recognition By Voice Model
abhisheks008 authored Jan 9, 2024
2 parents aefd422 + 5c9027c commit 3e30972
Showing 10 changed files with 11,985 additions and 0 deletions.

4 changes: 4 additions & 0 deletions Gender Recognition By Voice/Dataset/README.md
@@ -0,0 +1,4 @@
The dataset used here is taken from Kaggle. You can download it from [Gender Recognition By Voice](https://www.kaggle.com/datasets/alarmanovi/gender-recognition-by-voice-2023).

A copy of the dataset is also included in this folder for direct download: [Link](./Male%20and%20female%20Voice%20data%20creat%20by%20al%20arman%20ovi%20.csv)

Binary file added Gender Recognition By Voice/Images/1.png
Binary file added Gender Recognition By Voice/Images/2.png
Binary file added Gender Recognition By Voice/Images/3.png
Binary file added Gender Recognition By Voice/Images/4.png
Binary file added Gender Recognition By Voice/Images/5.png
5,901 changes: 5,901 additions & 0 deletions Gender Recognition By Voice/Models/Gender_Recognition_By_Voice.ipynb

Large diffs are not rendered by default.

77 changes: 77 additions & 0 deletions Gender Recognition By Voice/README.md
@@ -0,0 +1,77 @@
<h1>Gender Recognition By Voice</h1>

**GOAL**

To build a machine learning model that predicts a speaker's gender from their voice.

**DATASET**

https://www.kaggle.com/datasets/alarmanovi/gender-recognition-by-voice-2023

**DESCRIPTION**

To analyze the Gender Recognition by Voice 2023 dataset, then build and train models that predict the speaker's gender from the voice features.

### Visualization and EDA of different attributes:

<img alt="Screenshot1" src="./Images/1.png">

<img alt="Screenshot2" src="./Images/2.png">

<img alt="Screenshot3" src="./Images/3.png">

<img alt="Screenshot4" src="./Images/4.png">

<img alt="Screenshot5" src="./Images/5.png">

**MODELS USED**

| Model | MSE | R2 |
|------------------------------|------------------|------------------|
| Gradient Boosting Regression | 1.406672e-01 | 4.031331e-01 |
| Random Forest Regression | 1.435905e-01 | 3.907292e-01 |
| XG Boost Regression | 1.505178e-01 | 3.613356e-01 |
| Linear Regression | 1.718679e-01 | 2.707449e-01 |
| Ridge Regression | 1.809122e-01 | 2.323690e-01 |
| Elastic Net Regression | 1.854163e-01 | 2.132574e-01 |
| Deep NN | 2.566216e-01 | -8.887454e-02 |
| Decision Tree Regression | 2.777314e-01 | -1.784462e-01 |
| SGD Regression | 2.557489e+29 | -1.085172e+30 |

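For reference, the two metrics in the table can be illustrated with a few lines of plain Python. This is only a sketch, not code from the notebook, and it assumes the gender label is encoded as 0/1, which this README does not state. It shows why R2 can be negative: a model whose squared error is larger than that of always predicting the mean label scores below zero, as the Deep NN, Decision Tree and SGD rows do.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """R2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical labels, assuming a 0/1 encoding of the two classes.
y_true = [0, 1, 1, 0]
perfect = [0, 1, 1, 0]            # every prediction correct
mean_only = [0.5, 0.5, 0.5, 0.5]  # always predict the mean label
inverted = [1, 0, 0, 1]           # every prediction wrong

print(mse(y_true, perfect), r2(y_true, perfect))      # 0.0 1.0
print(mse(y_true, mean_only), r2(y_true, mean_only))  # 0.25 0.0
print(mse(y_true, inverted), r2(y_true, inverted))    # 1.0 -3.0
```

Lower MSE and higher R2 are better, so the table is ordered from best to worst.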

**WHAT I HAVE DONE**

* Loaded the dataset, which contains 5993 entries and 22 columns.
* Checked for missing values and cleaned the data accordingly.
* Analyzed the data, found insights and visualized them.
* Plotted a correlation heatmap to check the relationships between different features.
* Explored how individual columns relate to the target variable using plotting libraries.
* Trained different models on the dataset and saved their evaluation metrics (MSE and R2) in a DataFrame.

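The steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: it uses a small synthetic table instead of the Kaggle CSV, the column names (`feature_a`, `feature_b`, `label`) are hypothetical, and only three of the listed models are included.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset; column names are hypothetical.
rng = np.random.default_rng(0)
n_rows = 400
data = pd.DataFrame({
    "feature_a": rng.normal(size=n_rows),
    "feature_b": rng.normal(size=n_rows),
})
# Assumed 0/1 target that depends on feature_a plus some noise.
data["label"] = (data["feature_a"] + 0.5 * rng.normal(size=n_rows) > 0).astype(int)

# Check for missing values and inspect feature correlations.
missing_per_column = data.isnull().sum()
correlations = data.corr()

# Hold out a test split for evaluation.
X = data.drop(columns="label")
y = data["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regression": RandomForestRegressor(n_estimators=50, random_state=0),
    "Gradient Boosting Regression": GradientBoostingRegressor(random_state=0),
}

# Fit each model and collect its metrics as one row per model.
rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rows.append({
        "Model": name,
        "MSE": mean_squared_error(y_test, predictions),
        "R2": r2_score(y_test, predictions),
    })

# Best (lowest MSE) first.
results = pd.DataFrame(rows).sort_values("MSE").reset_index(drop=True)
print(results)
```

Collecting one dictionary per model and building the DataFrame at the end keeps the comparison table easy to sort and extend with further models.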

**LIBRARIES NEEDED**

1. Pandas
2. Matplotlib
3. scikit-learn
4. NumPy
5. XGBoost
6. TensorFlow
7. Keras
8. SciPy
9. Seaborn

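To see which of these libraries are available in the current environment, a small standard-library check along these lines may help. It is a sketch only; the distribution names are the ones pinned in `requirements.txt`, and the helper name `installed_versions` is made up for this example.

```python
from importlib import metadata

# Distribution names as pinned in requirements.txt.
REQUIRED = [
    "pandas", "matplotlib", "scikit-learn", "numpy", "xgboost",
    "tensorflow", "keras", "scipy", "seaborn",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'not installed'}")
```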

**CONCLUSION**

- Gradient Boosting Regression and Random Forest Regression fit the dataset best among the models tried.
- Both have the lowest MSE (about 0.141 and 0.144) and the highest R2 (about 0.403 and 0.391) in the comparison.
- The visualization graph for the Gradient Boosting Regression model also shows the best fit to the dataset.


**YOUR NAME**

*Avdhesh Varshney*

[![LinkedIn](https://img.shields.io/badge/linkedin-%230077B5.svg?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/avdhesh-varshney-5314a4233/) [![GitHub](https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white)](https://github.com/Avdhesh-Varshney)

9 changes: 9 additions & 0 deletions Gender Recognition By Voice/requirements.txt
@@ -0,0 +1,9 @@
numpy==1.19.2
pandas==1.4.3
matplotlib==3.7.1
scikit-learn~=1.0.2
scipy==1.5.0
seaborn==0.10.1
xgboost~=1.5.2
tensorflow==2.4.1
keras==2.4.0
