This project aims to predict the price of diamonds based on various attributes such as carat weight, cut, clarity, color, and depth. The goal is to develop a machine learning model that can accurately estimate the price of a diamond given its characteristics.
The dataset used for this project is sourced from https://www.kaggle.com/competitions/playground-series-s3e8/data?select=train.csv. It contains information about various diamonds, including their attributes and corresponding prices. The dataset is provided in CSV format and can be found in the data directory.
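As a rough illustration of how the categorical attributes might be prepared for a model, the sketch below maps each quality grade to an integer rank. The column names (`carat`, `cut`, `color`, `clarity`, `depth`, `price`), the grade orderings, and the helper names `encode_row` and `load_rows` are assumptions made for illustration, not taken from the notebook, and the sample rows are made up. Check all of them against the actual file before relying on this.

```python
import csv
import io

# Assumed grade orderings, worst to best (hypothetical; verify against the data).
CUT_ORDER = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
COLOR_ORDER = ["J", "I", "H", "G", "F", "E", "D"]
CLARITY_ORDER = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]

ORDERINGS = {"cut": CUT_ORDER, "color": COLOR_ORDER, "clarity": CLARITY_ORDER}
NUMERIC_COLUMNS = ["carat", "depth", "price"]  # assumed column names


def encode_row(row):
    """Convert one CSV row (a dict of strings) into a dict of numbers."""
    encoded = {name: float(row[name]) for name in NUMERIC_COLUMNS}
    for name, order in ORDERINGS.items():
        encoded[name] = order.index(row[name])  # rank 0 is the lowest grade
    return encoded


def load_rows(csv_file):
    """Read every row from an open CSV file object and encode it."""
    return [encode_row(row) for row in csv.DictReader(csv_file)]


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE_CSV = """carat,cut,color,clarity,depth,price
0.50,Ideal,E,VS1,61.5,1500
1.20,Fair,J,I1,64.0,3000
"""

rows = load_rows(io.StringIO(SAMPLE_CSV))
print(rows[0])
```

The standard library is used here only to keep the sketch self-contained. In the notebook, `pandas.read_csv` would likely be the more convenient way to load the file, and an ordinal mapping is just one encoding choice among several.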
To run the project locally, please ensure you have the following dependencies installed:
- Python 3.7 or higher
- NumPy
- Pandas
- Scikit-learn
- Matplotlib
- Jupyter Notebook
Once you have the dependencies, follow these steps to set up the project:
- Clone the repository:
git clone https://github.com/your-username/diamond-price-prediction.git
- Navigate to the project directory:
cd diamond-price-prediction
- Create a virtual environment (optional):
python -m venv env
- Activate the virtual environment (optional; the command below is for macOS and Linux, on Windows run env\Scripts\activate instead):
source env/bin/activate
- Install the required packages:
pip install -r requirements.txt
- Launch Jupyter Notebook:
jupyter notebook
- Open the diamond_price_prediction.ipynb notebook.
- Run the cells in the notebook to execute the code and see the results.
- Feel free to modify the code or experiment with different models and parameters.
The model's predictions are evaluated with metrics such as mean absolute error (MAE), root mean squared error (RMSE), and the R-squared score. MAE and RMSE express the typical prediction error in the same units as the price, with RMSE penalizing large errors more heavily, while R-squared indicates the share of the variance in price that the model explains.
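To make these metrics concrete, here is a minimal sketch that computes them from their definitions in plain Python. The price lists are made-up values for illustration, not results from this project. Scikit-learn's `sklearn.metrics` module provides `mean_absolute_error`, `mean_squared_error`, and `r2_score`, which the notebook may use instead of hand-written functions like these.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted prices."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared prediction error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2_score(y_true, y_pred):
    """One minus the ratio of residual variation to total variation."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up prices, used only to demonstrate the calculations.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
r2 = r2_score(actual, predicted)

print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"R^2:  {r2:.4f}")
```

In this toy example the RMSE comes out slightly higher than the MAE because the two larger errors are weighted more heavily once squared.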
Contributions are welcome! If you find any issues or have suggestions for improvement, please open an issue or submit a pull request. Make sure to follow the project's code of conduct.